M0: Security Hardening - Technical Documentation
- Version: 1.0.0
- Date: 2026-02-28
- Author: The Archive and Heritage Group (Pty) Ltd
- Framework Version: 2.8.2
- Issue: #198
1. Overview
Milestone 0 (M0) addresses three security vulnerabilities identified during a security audit of the AtoM Heratio framework, two rated Critical and one rated Medium:
- Unsafe PHP `unserialize()` calls: 14 instances across 8 files with no class restriction, enabling potential PHP Object Injection (POI) attacks
- PHP serialization for data storage: the Getty vocabulary cache and semantic search embeddings used `serialize()`/`unserialize()`, exposing stored data to deserialization attacks
- API file upload with no validation: path traversal via the `type` parameter, no MIME validation, no extension allowlist, no file size limits
2. Vulnerability Details
2.1 PHP Object Injection via unserialize()
Risk: Critical
CWE: CWE-502 (Deserialization of Untrusted Data)
Calling PHP's `unserialize()` without `['allowed_classes' => false]` can instantiate arbitrary objects. If an attacker controls the serialized string (e.g., via database injection or cookie manipulation), they can trigger magic methods (`__wakeup`, `__destruct`) on any class the autoloader can load and, given a suitable gadget chain, potentially achieve Remote Code Execution (RCE).
Affected Files (14 instances):
| # | File | Line(s) | Context |
|---|---|---|---|
| 1 | `ahgUserManagePlugin/lib/Services/UserCrudService.php` | 427 | ACL permission constants |
| 2 | `ahgSettingsPlugin/.../pluginsAction.class.php` | 145 | Plugin list from `setting_i18n` |
| 3 | `ahgThemeB5Plugin/.../themesAction.class.php` | 48 | Enabled plugins (Propel) |
| 4 | `ahgThemeB5Plugin/.../themesAction.class.php` | 58 | Enabled plugins (Laravel QB) |
| 5 | `ahgThemeB5Plugin/.../themesAction.class.php` | 110 | Plugin settings on save |
| 6 | `ahgInformationObjectManagePlugin/.../InformationObjectCrudService.php` | 1100 | Serialized property array |
| 7 | `ahgSettingsPlugin/.../inventoryAction.class.php` | 64 | Inventory level settings |
| 8 | `ahgSettingsPlugin/.../oaiAction.class.php` | 37 | OAI plugin enabled check |
| 9–11 | `ahgMuseumPlugin/.../GettyCacheService.php` | 65, 204, 242 | Cache read fallback |
| 12–14 | `ahgSemanticSearchPlugin/.../EmbeddingService.php` | 258, 366, 423 | Embedding read fallback |
2.2 PHP Serialization for Data Storage
Risk: Medium
CWE: CWE-502
Two subsystems used `serialize()`/`unserialize()` for data persistence:
- Getty Cache (`GettyCacheService.php`): file-based cache for Getty vocabulary API responses, stored as PHP serialized data
- Semantic Search Embeddings (`EmbeddingService.php`): vector embeddings stored as PHP serialized arrays in the `ahg_thesaurus_embedding` database table

PHP serialization is unnecessary for these data types (arrays and scalars only) and introduces deserialization risk. JSON is a safer, more portable, and more compact alternative.
2.3 API File Upload Vulnerabilities
Risk: Critical
CWE: CWE-22 (Path Traversal), CWE-434 (Unrestricted File Upload)
The `apiv2FileUploadAction` had multiple vulnerabilities:
| Vulnerability | Detail |
|---|---|
| Path traversal | The `type` parameter was passed directly into the directory path: `$uploadDir = sf_upload_dir . '/' . $type . '/' . date(...)`. An attacker could send `type=../../../etc` to write files outside the upload directory |
| No MIME validation | Client-supplied `Content-Type` trusted without server-side verification via magic bytes |
| No extension allowlist | Any file extension accepted, including `.php`, `.sh`, `.exe` |
| No size limit | No maximum file size enforced |
| No base64 size pre-check | Base64 content decoded without first estimating its decoded size, enabling memory exhaustion |
3. Fixes Applied
3.1 unserialize() Hardening
All 14 instances now pass `['allowed_classes' => false]`:

    // Before (vulnerable)
    $data = unserialize($input);
    // After (safe: only arrays and scalars are restored, no application objects)
    $data = unserialize($input, ['allowed_classes' => false]);

With this option PHP never instantiates an application class during deserialization. Any serialized object is restored as an inert `__PHP_Incomplete_Class` placeholder, so no magic methods run and the POI attack vector is closed at these call sites.
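The effect of the option can be observed with a small experiment. The sketch below uses a hypothetical `Gadget` class (not part of the framework) whose `__wakeup()` method stands in for a dangerous magic method; the class name and counter are assumptions made purely for illustration.

```php
<?php
// Hypothetical gadget class for illustration only; not part of the framework.
final class Gadget
{
    public static int $wakeups = 0;
    public string $payload = 'x';

    public function __wakeup(): void
    {
        // In a real gadget chain, attacker-controlled side effects would happen here.
        self::$wakeups++;
    }
}

$input = serialize(new Gadget());

// Unrestricted call: the class is instantiated and __wakeup() runs.
$unsafe = unserialize($input);

// Hardened call: the object is restored as an inert __PHP_Incomplete_Class
// placeholder, so __wakeup() is not invoked a second time.
$safe = unserialize($input, ['allowed_classes' => false]);

// Arrays and scalars are unaffected by the restriction.
$plain = unserialize(
    serialize(['a' => 1, 'b' => [2, 3]]),
    ['allowed_classes' => false]
);
```

Callers that previously relied on receiving real objects from these call sites would need a separate review, since the placeholder exposes no usable methods.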
3.2 Serialization Format Migration (PHP → JSON)
Getty Cache (`GettyCacheService.php`):
- Write path: `serialize($data)` → `json_encode($data, JSON_UNESCAPED_UNICODE)`
- Read path: `json_decode()` first, fallback to `@unserialize()` with `['allowed_classes' => false]` for legacy cache files
- Methods updated: `get()`, `set()`, `getStats()`, `prune()`
- Legacy cache files are read correctly and will be replaced with JSON on next write

Semantic Search Embeddings (`EmbeddingService.php`):
- Write path: `serialize($embedding)` → `json_encode($embedding)`
- Read path: `json_decode()` first, fallback to safe `@unserialize()` for legacy data
- Methods updated: `storeEmbedding()`, `getTermEmbedding()`, `findSimilarTerms()`, `findRelatedTerms()`
- Added null/type guards to skip corrupt embeddings gracefully

Backward compatibility: Both subsystems use a try-JSON-first, fallback-to-safe-unserialize pattern. Existing data reads correctly; new writes use JSON. Entries migrate to JSON as they are rewritten, so legacy records that are never rewritten remain readable only through the fallback.
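The try-JSON-first pattern described above might look roughly like the following. This is a minimal sketch, not the code in `GettyCacheService.php` or `EmbeddingService.php`; the function names `decodeStoredArray()` and `encodeStoredArray()` are hypothetical, and returning `null` for unreadable payloads is an assumption about how corrupt entries are skipped.

```php
<?php
/**
 * Decode a stored payload: JSON first, then restricted unserialize()
 * for legacy data. Hypothetical helper; returns null when the payload
 * is unreadable or does not decode to an array.
 */
function decodeStoredArray(string $raw): ?array
{
    $decoded = json_decode($raw, true);
    if (is_array($decoded)) {
        return $decoded;
    }

    // Legacy path: no application class is ever instantiated here.
    $legacy = @unserialize($raw, ['allowed_classes' => false]);

    // Type guard: anything that is not an array is treated as corrupt.
    return is_array($legacy) ? $legacy : null;
}

/** Encode new data as JSON only; the legacy format is never written again. */
function encodeStoredArray(array $data): string
{
    return json_encode($data, JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR);
}
```

Because a PHP-serialized string is not valid JSON, `json_decode()` returns `null` for legacy payloads and the fallback branch handles them without any format marker.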
3.3 API File Upload Hardening
Path traversal fix:

    // Before
    $type = $request->getParameter('type', 'general');
    // After: basename() strips directory components, the regex strips special characters
    $type = basename($request->getParameter('type', 'general'));
    $type = preg_replace('/[^a-zA-Z0-9_-]/', '', $type);

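To see what the two sanitizing steps do to hostile input, they can be wrapped in a small function and exercised directly. The helper name `sanitizeUploadType()` is hypothetical, and the fallback to `'general'` when nothing survives sanitization is an assumption not confirmed by the source.

```php
<?php
/**
 * Apply the same two steps as the fix: basename(), then a character allowlist.
 * Hypothetical helper for illustration.
 */
function sanitizeUploadType(string $type, string $default = 'general'): string
{
    $clean = preg_replace('/[^a-zA-Z0-9_-]/', '', basename($type));

    // Assumption: fall back to the default when the result is empty,
    // e.g. for an input of '..' where every character is stripped.
    return ($clean === null || $clean === '') ? $default : $clean;
}

$examples = [
    '../../../etc' => sanitizeUploadType('../../../etc'),
    'images'       => sanitizeUploadType('images'),
    '..'           => sanitizeUploadType('..'),
    'my type!'     => sanitizeUploadType('my type!'),
];
```

An allowlist of known type names would be a stricter alternative, at the cost of requiring a code or configuration change for every new upload category.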
MIME validation: After the file is saved, its type is detected from magic bytes with `finfo_file()`. If the detected MIME type is not allowed, the file is deleted immediately:

    $mimeCheck = FileValidationService::validateMime($filepath);
    if (!$mimeCheck['valid']) {
        @unlink($filepath);
        return ['error' => true, 'reasons' => $mimeCheck['errors']];
    }

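A magic-byte check of this kind could be sketched as follows. This is not the body of `FileValidationService::validateMime()`; the function name `detectAndCheckMime()`, the explicit allowlist argument, and the extra `mime` key in the result are assumptions. The sketch requires PHP's `fileinfo` extension.

```php
<?php
/**
 * Hypothetical sketch of a magic-byte MIME check. The real service may use
 * a different allowlist and result shape.
 */
function detectAndCheckMime(string $filePath, array $allowedMimes): array
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $detected = $finfo->file($filePath);

    if ($detected === false) {
        return ['valid' => false, 'mime' => null, 'errors' => ['Could not detect MIME type']];
    }
    if (!in_array($detected, $allowedMimes, true)) {
        return ['valid' => false, 'mime' => $detected, 'errors' => ["MIME type {$detected} is not allowed"]];
    }

    return ['valid' => true, 'mime' => $detected, 'errors' => []];
}

// Usage: a 1x1 GIF versus a shell script that a client might label as an image.
$gif = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($gif, base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7', true));

$fake = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($fake, "#!/bin/sh\necho pwned\n");

$allowed = ['image/jpeg', 'image/png', 'image/gif'];
$gifResult = detectAndCheckMime($gif, $allowed);
$fakeResult = detectAndCheckMime($fake, $allowed);

// Clean up the temporary files.
unlink($gif);
unlink($fake);
```

The detection ignores both the filename and any client-supplied `Content-Type`, which is what makes a renamed script fail the check.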
Extension allowlist: 48 safe extensions covering images, documents, audio, video, archives, 3D models, and archival formats. Configurable via the `ahg_settings` key `file_allowed_extensions`.
Size limits: Default 100 MB, configurable via the `ahg_settings` key `file_max_upload_mb`. For base64 uploads, the decoded size is estimated before decoding to prevent memory exhaustion.
Strict base64 decoding: Uses `base64_decode($str, true)`, which returns `false` on invalid input.
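The size pre-check relies on the fact that every 4 base64 characters encode 3 bytes, minus any `=` padding. A minimal sketch follows, assuming the input carries no whitespace or data-URI prefix; the function names are hypothetical and are not the signature of `validateBase64Size()`.

```php
<?php
/**
 * Estimate the decoded size of a base64 string without decoding it.
 * Hypothetical helper; assumes no whitespace or data-URI prefix.
 */
function estimateBase64DecodedSize(string $base64): int
{
    $length = strlen($base64);
    $padding = 0;
    if ($length >= 2 && substr($base64, -2) === '==') {
        $padding = 2;
    } elseif ($length >= 1 && substr($base64, -1) === '=') {
        $padding = 1;
    }

    return max(0, intdiv($length * 3, 4) - $padding);
}

/**
 * Reject oversized input before decoding, then decode strictly.
 */
function decodeBase64Upload(string $base64, int $maxBytes): string
{
    if (estimateBase64DecodedSize($base64) > $maxBytes) {
        throw new LengthException('Upload exceeds the size limit');
    }

    $binary = base64_decode($base64, true);
    if ($binary === false) {
        throw new InvalidArgumentException('Invalid base64 input');
    }

    return $binary;
}
```

Checking the estimate first means an oversized payload is rejected while only the encoded string is in memory, rather than the encoded string plus its decoded copy.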
3.4 FileValidationService (New)
Location: `atom-framework/src/Services/FileValidationService.php`
Namespace: `AtomExtensions\Services`
Centralized file validation service for use across all plugins:
| Method | Purpose |
|---|---|
| `validateUpload(array $file, array $options): array` | Full upload validation: extension + MIME + size |
| `validateMime(string $filePath, ?string $claimedMime): array` | Magic-byte MIME detection via `finfo` |
| `sanitizeFilename(string $filename): string` | Strip path traversal, null bytes, dangerous chars |
| `getAllowedExtensions(): array` | From `ahg_settings` or the 48-extension default list |
| `getMaxSize(): int` | From `ahg_settings` or the 100 MB default |
| `validateBase64Size(string $base64, ?int $maxSize): array` | Estimate decoded size before decoding |

All methods are static, so they can be called without instantiating the service.
Configuration via `ahg_settings`:
| Setting Key | Type | Default | Description |
|---|---|---|---|
| `file_allowed_extensions` | string (comma-separated) | 48 built-in extensions | Custom extension allowlist |
| `file_max_upload_mb` | integer | 100 | Maximum upload size in MB |
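How the two settings might be interpreted is sketched below. The function names are hypothetical, the three-item default list is an illustrative stand-in for the 48 built-in extensions, and treating 1 MB as 1024 × 1024 bytes is an assumption; the actual service may normalize values differently.

```php
<?php
/**
 * Parse a comma-separated allowlist such as the file_allowed_extensions setting.
 * Hypothetical helper; $default is an illustrative subset, not the 48 built-ins.
 */
function parseAllowedExtensions(?string $setting, array $default = ['jpg', 'png', 'pdf']): array
{
    if ($setting === null || trim($setting) === '') {
        return $default;
    }

    // Normalize: trim whitespace, drop a leading dot, lowercase.
    $parts = array_map(
        fn (string $ext): string => strtolower(ltrim(trim($ext), '.')),
        explode(',', $setting)
    );

    // Remove empty entries and duplicates, then reindex.
    return array_values(array_unique(array_filter($parts, fn (string $ext): bool => $ext !== '')));
}

/** Check the final extension of a filename against the allowlist. */
function isExtensionAllowed(string $filename, array $allowed): bool
{
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    return $ext !== '' && in_array($ext, $allowed, true);
}

/** Convert the file_max_upload_mb setting to bytes (assumes 1 MB = 1024 * 1024 bytes). */
function maxUploadBytes(?int $settingMb, int $defaultMb = 100): int
{
    $mb = ($settingMb !== null && $settingMb > 0) ? $settingMb : $defaultMb;

    return $mb * 1024 * 1024;
}
```

Only the last extension is inspected here, so a name like `report.php.pdf` would pass this particular check and still depends on the MIME validation step.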
4. Files Changed
atom-framework (1 new file)
| File | Change |
|---|---|
| `src/Services/FileValidationService.php` | NEW: Centralized file validation |
atom-ahg-plugins (9 files modified)
| File | Change |
|---|---|
| `ahgUserManagePlugin/lib/Services/UserCrudService.php` | unserialize hardened |
| `ahgSettingsPlugin/.../pluginsAction.class.php` | unserialize hardened |
| `ahgSettingsPlugin/.../inventoryAction.class.php` | unserialize hardened |
| `ahgSettingsPlugin/.../oaiAction.class.php` | unserialize hardened |
| `ahgThemeB5Plugin/.../themesAction.class.php` | unserialize hardened (3 instances) |
| `ahgInformationObjectManagePlugin/.../InformationObjectCrudService.php` | unserialize hardened |
| `ahgMuseumPlugin/lib/Services/Getty/GettyCacheService.php` | Converted to JSON + safe fallback |
| `ahgSemanticSearchPlugin/lib/Services/EmbeddingService.php` | Converted to JSON + safe fallback |
| `ahgAPIPlugin/.../fileUploadAction.class.php` | Full security hardening |
5. Verification
Automated Checks

    # Verify no unprotected unserialize remains
    grep -rn "unserialize(" atom-ahg-plugins/ --include="*.php" | grep -v "allowed_classes" | grep -v "json_decode"
    # Expected: zero results (or multi-line false positives where allowed_classes is on the next line)

    # PHP syntax validation on all changed files
    php -l <file>   # for each file above

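The multi-line false positives mentioned above arise because `grep` inspects one line at a time. A token-based scan avoids them. The sketch below is a hypothetical audit helper built on PHP's tokenizer extension, not a tool shipped with the framework; it also reports method calls or declarations named `unserialize`, which would need manual review.

```php
<?php
/**
 * Return the line numbers of unserialize() calls whose argument list does
 * not mention 'allowed_classes'. Hypothetical helper for auditing only.
 */
function findUnrestrictedUnserialize(string $phpSource): array
{
    $tokens = token_get_all($phpSource);
    $count = count($tokens);
    $hits = [];

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];
        if (!is_array($token) || ltrim(strtolower($token[1]), '\\') !== 'unserialize') {
            continue;
        }

        // Skip whitespace and comments to find the opening parenthesis.
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j])
            && in_array($tokens[$j][0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)) {
            $j++;
        }
        if ($j >= $count || $tokens[$j] !== '(') {
            continue;
        }

        // Collect the argument text up to the matching closing parenthesis.
        $depth = 0;
        $args = '';
        for (; $j < $count; $j++) {
            $text = is_array($tokens[$j]) ? $tokens[$j][1] : $tokens[$j];
            if ($text === '(') {
                $depth++;
            } elseif ($text === ')') {
                $depth--;
                if ($depth === 0) {
                    break;
                }
            }
            $args .= $text;
        }

        if (strpos($args, 'allowed_classes') === false) {
            $hits[] = $token[2]; // line on which the call starts
        }
    }

    return $hits;
}
```

Applied to each changed file via `file_get_contents()`, an empty result for every file would correspond to the "zero results" expectation of the `grep` check.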
Manual Test Cases
| # | Test | Expected Result |
|---|---|---|
| 1 | API upload with `type=../../../etc` | 400: type sanitized to `etc`, no path traversal |
| 2 | API upload of `shell.php` | 400: extension `php` not in allowlist |
| 3 | API upload of shell script renamed to `.jpg` | 400: MIME mismatch (`text/x-shellscript` != `image/jpeg`) |
| 4 | API upload of valid JPEG | 201: accepted, MIME confirmed `image/jpeg` |
| 5 | API base64 upload exceeding 100 MB | 400: size limit exceeded (checked before decode) |
| 6 | Getty cache: read old serialized file | Parsed via fallback, returns correct data |
| 7 | Getty cache: write new entry | Written as JSON |
| 8 | Embedding: read old serialized record | Parsed via fallback, returns correct array |
| 9 | Embedding: store new embedding | Stored as JSON |
| 10 | Theme admin: plugin enable/disable | Works correctly with hardened unserialize |