feat(clone): this allows you to use the dotcms image as an init container and clone another dotCMS env #34443
Remove the unconditional `exit 0` so the entrypoint continues to source
startup scripts, clarify the import script’s exit-13 success path, and
install `libarchive-tools` to support asset unpacking during imports.
ref: #34442
I am fine with this. The lack of a Helm chart to help with templating this init container makes integrating it a pain, but it is fine for starting with a few instances. There are a couple of concerns that may need addressing and clarifying, taking into account how it would effectively work when there is more than one replica in the StatefulSet; it is more of an issue when running during an upgrade, where pods are being replaced and we still have active connections.

## PR #34443 Review - dotCMS Environment Cloning Feature

### Summary

This PR adds environment cloning functionality using an init container pattern. The implementation works for initial deployment scenarios with EFS shared storage and OrderedReady pod management, but has critical limitations that should be addressed before production use.

### Overall Assessment

#### ✅ What Works (With EFS + OrderedReady)
### 🚨 Critical Issues (Must Fix)

#### 1. Lock File Race Condition - Fundamental Flaw

**Problem:** The lock file mechanism has a race condition. It only "works" because OrderedReady prevents concurrent execution, not because the lock is correct.

**Current Code:**

```bash
# Check if lock exists
if [ -f "$IMPORT_IN_PROCESS" ]; then
    # ... check lock age ...
fi

# Create lock (NOT ATOMIC with check above)
mkdir -p $IMPORT_DATA_DIR && touch $IMPORT_IN_PROCESS
```

**Why It's Flawed:** The existence check and the `touch` that creates the lock are separate steps, so two pods can both pass the check before either one creates the lock file.

**Impact:** If pods start concurrently (i.e. without OrderedReady), multiple pods can run the import at the same time.

**Recommendation:** Use atomic directory creation:

```bash
LOCK_DIR="$IMPORT_DATA_DIR/.lock"
if mkdir "$LOCK_DIR" 2>/dev/null; then
    # We got the lock
    trap "rmdir '$LOCK_DIR' 2>/dev/null" EXIT
else
    # Lock exists, check if stale
    # ... existing stale lock logic ...
fi
```

**Alternative:** Use PostgreSQL advisory locks (automatically released on connection close).

#### 2. Doesn't Work on Existing StatefulSets with Multiple Replicas

**Problem:** Cannot safely run on existing StatefulSets during rolling updates. Old pods remain active while new pods run the import, causing database conflicts.

**Scenario:** During a rolling update, a new pod's init container starts the import while the old pods are still running and holding connections to the same database.

**Impact:** The import conflicts with those active connections (see the scenario table below).
**Recommendation:** Add an active connection check before import:

```bash
check_active_connections() {
    ACTIVE=$(psql -h "$DB_HOST" -d "$DB_NAME" -U "$DB_USERNAME" -qtAX -c \
        "SELECT count(*) FROM pg_stat_activity
         WHERE datname = '$DB_NAME'
           AND pid != pg_backend_pid()
           AND state != 'idle'" 2>/dev/null || echo "0")

    if [ "$ACTIVE" -gt 0 ]; then
        echo "ERROR: Database has $ACTIVE active connections"
        echo "Cannot import while database is in use"
        echo "This script is designed for initial deployment only"
        echo "Stop all pods before performing refresh"
        exit 1
    fi
}

# Call before import
check_active_connections || exit 1
```

**Also:** Document this limitation clearly in the PR description and script comments.
| Scenario | Risk Level | Works? | Notes |
|---|---|---|---|
| Initial deployment (EFS + OrderedReady) | ✅ Low | Yes | Works correctly |
| Existing StatefulSet (rolling update) | 🚨 High | No | Conflicts with active connections |
| Without OrderedReady | 🚨 High | No | Lock race condition causes failures |
| Per-pod volumes | 🚨 High | No | Each pod imports independently |
### 💡 Additional Suggestions
- Consider database advisory locks instead of file-based locks (more reliable, auto-cleanup)
- Add a pod ordinal check for extra safety (only pod-0 imports on initial deployment); see the sketch after this list
- Add structured logging for better observability
- Add metrics/telemetry for import operations
- Consider resume capability for failed imports (checkpoint progress)
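To make the pod-ordinal suggestion concrete, here is a minimal sketch (not code from this PR) of a guard the import script could run early on. It assumes the standard StatefulSet hostname convention `<statefulset-name>-<ordinal>` and that exiting with code 0 is an acceptable way to skip on non-zero ordinals:

```bash
# Sketch only: restrict the import to pod-0 of the StatefulSet.
# Assumes hostnames follow the "<statefulset-name>-<ordinal>" convention.
POD_NAME="${HOSTNAME:-$(hostname)}"
POD_ORDINAL="${POD_NAME##*-}"

if [ "$POD_ORDINAL" != "0" ]; then
    echo "Pod ordinal is $POD_ORDINAL; only pod-0 runs the import, skipping"
    exit 0
fi
```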
### Final Recommendation

**Request Changes** - Address critical issues before merge:
- Fix lock file race condition (atomic operations)
- Add active connection check (prevent conflicts on existing StatefulSets)
- Document dependencies (OrderedReady, EFS)
The feature is useful and the init container pattern is correct, but the implementation needs these fixes for production robustness.
### Context Notes

- **Storage:** Assumes `/data/shared` is EFS (shared across pods)
- **Pod Management:** Requires `OrderedReady` (default StatefulSet policy)
- **Use Cases:** Initial deployment or total refresh (intentional database/filesystem replacement)
- **Pattern:** Init container (as documented in the PR description)
Below is the documentation:
# dotCMS Environment Cloning

This feature allows you to use the dotCMS Docker image as an init container to clone data and assets from another running dotCMS environment. This is useful for initial deployments and for fully refreshing an existing environment from another one.
## How It Works

The `10-import-env.sh` script runs at container startup (before dotCMS itself starts). It downloads the database and assets from the configured source environment, imports them locally, and writes a completion marker. When the script exits cleanly and is restarted, it will skip re-importing and run dotCMS. This enables the "init container" pattern in Kubernetes, where the import runs once and then the main dotCMS container starts with the imported data.
## Environment Variables

### Required

| Variable | Notes |
|---|---|
| `DOT_IMPORT_ENVIRONMENT` | URL of the source environment (e.g. `https://demo.dotcms.com`) |
| `DOT_IMPORT_API_TOKEN` | API token for the source environment |
| `DOT_IMPORT_USERNAME_PASSWORD` | Credentials in `user:password` format |

**Note:** Either `DOT_IMPORT_API_TOKEN` or `DOT_IMPORT_USERNAME_PASSWORD` is required.

### Database Connection

| Variable | Notes |
|---|---|
| `DB_BASE_URL` | JDBC URL of the target database (e.g. `jdbc:postgresql://host/dbname`) |
| `DB_USERNAME` | Database user |
| `DB_PASSWORD` | Database password |

### Optional

| Variable | Default |
|---|---|
| `DOT_IMPORT_DROP_DB` | `false` |
| `DOT_IMPORT_MAX_ASSET_SIZE` | `100mb` |
| `DOT_IMPORT_ALL_ASSETS` | `false` |
| `DOT_IMPORT_IGNORE_ASSET_ERRORS` | `true` |

## Usage Examples
### Docker Standalone
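A minimal sketch of running the image with the documented import variables; the image tag, database host, volume name, and credential values below are illustrative placeholders, not values from the PR:

```bash
# Illustrative sketch: run the dotCMS image so the import script clones
# https://demo.dotcms.com on startup. The tag, DB host, and credentials are
# placeholders; substitute your own values.
docker run --rm \
  -e DOT_IMPORT_ENVIRONMENT="https://demo.dotcms.com" \
  -e DOT_IMPORT_API_TOKEN="<api-token-from-source-env>" \
  -e DB_BASE_URL="jdbc:postgresql://db.example.internal/dotcms" \
  -e DB_USERNAME="dotcms" \
  -e DB_PASSWORD="<db-password>" \
  -v dotcms-shared:/data/shared \
  dotcms/dotcms:<tag>
```

The same command applies with `DOT_IMPORT_USERNAME_PASSWORD` in place of the token; add `DOT_IMPORT_DROP_DB=true` only when an existing database should be wiped and reimported.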
### Kubernetes Init Container
## Behavior Details

- **Idempotency:** the script writes an `import_complete.txt` marker file after a successful import; when the marker is present, subsequent runs skip the import (see the sketch after this list)
- **Locking:** a `lock.txt` file prevents concurrent imports (important for Kubernetes)
- **Database Safety:** an existing database is left untouched unless `DOT_IMPORT_DROP_DB=true`; set `DOT_IMPORT_DROP_DB=true` to wipe and reimport
- **Downloaded Files:** downloaded files are stored in `$IMPORT_DATA_DIR`
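The marker-file behaviour described in the list above amounts to a guard like the following. This is a paraphrase for illustration only, not the actual contents of `10-import-env.sh`; the paths come from the troubleshooting section below:

```bash
# Illustrative paraphrase of the idempotency guard, not the actual script.
IMPORT_DATA_DIR="/data/shared/import"
IMPORT_COMPLETE="$IMPORT_DATA_DIR/import_complete.txt"

if [ -f "$IMPORT_COMPLETE" ]; then
    echo "Previous import already completed; skipping and starting dotCMS"
    exit 0
fi

# ... download the database and assets from the source environment,
#     import them, then record completion so later runs skip the import ...
date -u > "$IMPORT_COMPLETE"
```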
## Exit Codes

The script exits with code `0`, `1`, or `13`. The exit-13 path is treated as a success by the updated entrypoint (see the commit message above).

## Troubleshooting

**Import stuck or failed**

- Check for a stale lock: `ls -la /data/shared/import/lock.txt`
- Remove the lock to allow a retry: `rm /data/shared/import/lock.txt`

**Force reimport**

- Remove the completion marker: `rm /data/shared/import/import_complete.txt`

**Authentication failures**

- Verify that `DOT_IMPORT_API_TOKEN` or `DOT_IMPORT_USERNAME_PASSWORD` is correct
- Confirm the credentials can reach `/api/v1/maintenance/_downloadAssets` and `/api/v1/maintenance/_downloadDb` on the source environment (see the check below)
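One way to verify the credentials from outside the container is to probe a maintenance endpoint directly. The snippet below is a sketch, not part of the PR; it assumes the API token is sent as a `Bearer` header (use `-u "$DOT_IMPORT_USERNAME_PASSWORD"` for basic auth instead), and a successful request will stream the full database dump:

```bash
# Sketch: check that the token can reach the database download endpoint.
# Prints the HTTP status code; 200 means the credentials and endpoint work.
# Note: a successful request streams the entire dump to /dev/null.
curl -s -o /dev/null -w '%{http_code}\n' \
  -H "Authorization: Bearer $DOT_IMPORT_API_TOKEN" \
  "$DOT_IMPORT_ENVIRONMENT/api/v1/maintenance/_downloadDb"
```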
## Source Environment Requirements

The source dotCMS environment must:

- Be reachable at the URL configured in `DOT_IMPORT_ENVIRONMENT`
- Expose the maintenance download endpoints (`/api/v1/maintenance/_downloadDb` and `/api/v1/maintenance/_downloadAssets`)
- Accept the configured `DOT_IMPORT_API_TOKEN` or `DOT_IMPORT_USERNAME_PASSWORD`