Alembic Production Deployment Workflow¶
Date: 2025-10-16 Author: Claude Code Status: Preview Database Stamped, Production Workflow Documented
Executive Summary¶
This document describes the complete workflow for syncing database schema changes from the preview environment to production using Alembic migrations. It covers the development-to-production pipeline, including PR creation, deployment automation, and safety checks.
Key Achievement: Preview database successfully stamped with Alembic. Configuration issues resolved. Production workflow documented and ready for implementation.
Current State¶
Preview Environment¶
- ✅ Database:
slidefactory-previewon Azure PostgreSQL - ✅ Alembic Status: Stamped with baseline version
ae98aa32e0d2 (head) - ✅ Configuration:
.envfile fixed to use direct DATABASE_URL (no shell variable expansion) - ✅ Alembic Config:
alembic/env.pyloads.envviadotenv
Production Environment¶
- ⏳ Database:
slidefactoryon Azure PostgreSQL (needs stamping) - ⏳ Alembic Status: Not yet stamped (one-time operation required)
- ⏳ Deployment: Manual migration execution (automation pending)
Configuration Fixes Applied¶
Issue 1: Shell Variable Expansion in .env¶
Problem: The .env file used shell variable syntax:
Python's os.environ.get() doesn't expand ${...} variables, causing fallback to default postgres:5432.
Solution: Replaced with direct URL:
DATABASE_URL="postgresql+psycopg2://dbadmin:esLVoH6UVweWRVU9yC*u@slidefactory-db.postgres.database.azure.com:5432/slidefactory-preview?sslmode=require&connect_timeout=30"
Issue 2: Alembic Not Loading .env¶
Problem: alembic/env.py imported app.config.settings but app/config.py doesn't load .env automatically—it only reads os.environ.
Solution: Added to alembic/env.py:
# Load .env file for local development (Docker Compose loads it automatically)
from dotenv import load_dotenv
load_dotenv()
Complete Workflow: Preview → Production¶
Step 1: Develop Schema Changes (Preview Branch)¶
On preview branch:
# 1. Modify SQLAlchemy models
# Example: app/context/models.py
class Document(Base):
# ... existing fields ...
new_field = Column(String, nullable=True) # NEW
# 2. Generate migration (auto-detects changes)
.venv/bin/alembic revision --autogenerate -m "add new_field to documents"
# 3. Review generated migration
cat alembic/versions/<revision>_add_new_field_to_documents.py
# 4. Test migration on preview database
.venv/bin/alembic upgrade head
# 5. Verify preview database
.venv/bin/alembic current
# Expected: <new_revision> (head)
# 6. Test rollback (optional but recommended)
.venv/bin/alembic downgrade -1
.venv/bin/alembic upgrade head
# 7. Commit migration file to git
git add alembic/versions/<revision>_*.py
git add app/context/models.py
git commit -m "feat: add new_field to documents table"
git push origin preview
Result: - Preview database has new schema - Migration tested and verified - Migration file committed to git
Step 2: Create Pull Request (preview → main)¶
Create PR via GitHub:
gh pr create \
--base main \
--head preview \
--title "feat: add new_field to documents" \
--body "## Changes
- Added new_field to Document model
- Alembic migration included: <revision>_add_new_field
## Testing
- [x] Migration tested on preview database
- [x] Rollback tested successfully
- [x] Application verified working with new schema
## Database Changes
- Table: context_documents
- Action: Add column new_field (String, nullable)"
PR Review Checklist: - [ ] Migration files included in PR - [ ] Migration tested on preview database - [ ] Rollback tested (alembic downgrade -1) - [ ] No data loss in migration - [ ] Application code compatible with both old and new schema (for zero-downtime) - [ ] Migration is idempotent (safe to run multiple times)
Step 3: One-Time Production Database Setup¶
⚠️ IMPORTANT: Before merging any PRs with migrations, production must be stamped once.
Check if production is stamped:
# Connect to production database
az containerapp exec \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001 \
--command "alembic current"
# If output is "ae98aa32e0d2 (head)" → Already stamped ✅
# If output is empty or error → Needs stamping ❌
Stamp production (ONE TIME):
# This only creates alembic_version table and sets baseline
# NO schema changes, NO data changes
az containerapp exec \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001 \
--command "alembic stamp head"
# Verify
az containerapp exec \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001 \
--command "alembic current"
# Expected: ae98aa32e0d2 (head)
What this does:
-- Only creates tracking table
CREATE TABLE alembic_version (
version_num VARCHAR(32) NOT NULL,
PRIMARY KEY (version_num)
);
-- Sets baseline version
INSERT INTO alembic_version (version_num) VALUES ('ae98aa32e0d2');
Safety: - ✅ No schema changes - ✅ No data changes - ✅ Takes <1 second - ✅ Reversible (just drop alembic_version table) - ✅ Non-blocking (app keeps running)
Step 4: Merge PR and Deploy¶
Merge PR:
# Via GitHub UI or CLI
gh pr merge <pr-number> --squash --delete-branch
# Or manually
git checkout main
git merge preview
git push origin main
GitHub Actions Workflow:
This triggers .github/workflows/production.yml which: 1. Builds Docker image 2. Pushes to Azure Container Registry 3. Deploys to Azure Container Apps
Step 5: Run Database Migrations on Production¶
Current State: Manual execution required (automation pending)
Option A: Manual Migration Execution (Current, Safest)¶
# After code deploys to production
az containerapp exec \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001 \
--command "alembic upgrade head"
# Verify migration applied
az containerapp exec \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001 \
--command "alembic current"
# Expected: <new_revision> (head)
When to run: - After GitHub Actions completes deployment - Before announcing new features to users - During maintenance window if needed
Option B: Automated via GitHub Actions Job (Recommended)¶
Add to .github/workflows/production.yml:
name: Deploy to Production
on:
push:
branches:
- main
jobs:
deploy:
# ... existing build and push steps ...
migrate-database:
name: Run Database Migrations
needs: deploy
runs-on: ubuntu-latest
steps:
- name: Azure Login
uses: azure/login@v1
with:
creds: ${{ secrets.AZURE_CREDENTIALS }}
- name: Run Alembic Migrations
run: |
az containerapp exec \
--name slidefactory-prod \
--resource-group ${{ secrets.AZURE_RESOURCE_GROUP }} \
--command "alembic upgrade head"
timeout-minutes: 5
- name: Verify Migration
run: |
VERSION=$(az containerapp exec \
--name slidefactory-prod \
--resource-group ${{ secrets.AZURE_RESOURCE_GROUP }} \
--command "alembic current" | grep -oE '^[a-f0-9]{12}')
echo "Production database is at version: $VERSION"
if [ -z "$VERSION" ]; then
echo "❌ Migration verification failed"
exit 1
fi
echo "✅ Migration successful"
Pros: - ✅ Automated (no manual steps) - ✅ Runs after deployment - ✅ Includes verification step - ✅ Fails pipeline if migration fails
Cons: - ⚠️ Requires Azure CLI access in GitHub Actions - ⚠️ Less control over timing
Option C: Init Container (Best for Azure Container Apps)¶
Modify Azure Container Apps deployment config:
# In your containerapp definition (IaC or Azure Portal)
properties:
template:
initContainers:
- name: migration-init
image: slidefactoryacr.azurecr.io/slidefactory:latest
command: ["alembic", "upgrade", "head"]
env:
- name: DATABASE_URL
secretRef: database-url
- name: REDIS_HOST
secretRef: redis-host
# ... other required env vars ...
containers:
- name: web
image: slidefactoryacr.azurecr.io/slidefactory:latest
# ... main container config ...
How it works: 1. Container Apps starts init container first 2. Init container runs alembic upgrade head 3. If migration succeeds → main container starts 4. If migration fails → deployment fails (rollback)
Pros: - ✅ Runs before app starts (safest) - ✅ Fully automated - ✅ Fails deployment if migration fails (prevents broken state) - ✅ No manual intervention needed - ✅ Built into Azure Container Apps lifecycle
Cons: - ⚠️ Requires IaC or manual Azure Portal config update - ⚠️ Migration failures block entire deployment
Recommendation: Use Option C (init container) for production-grade automation.
Workflow Diagram¶
┌─────────────────────────────────────────────────────────────┐
│ Development: Preview Branch │
├─────────────────────────────────────────────────────────────┤
│ 1. Modify model: app/context/models.py │
│ 2. Generate migration: alembic revision --autogenerate │
│ 3. Test locally: alembic upgrade head │
│ 4. Commit & push to preview │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ GitHub Actions: Deploy to Preview │
├─────────────────────────────────────────────────────────────┤
│ 1. Build Docker image │
│ 2. Push to Azure Container Registry │
│ 3. Deploy to preview Container App │
│ 4. Init container runs: alembic upgrade head │
│ 5. Main container starts with new schema │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Verification: Test on Preview │
├─────────────────────────────────────────────────────────────┤
│ 1. Manual testing of new feature │
│ 2. Check alembic current (verify version) │
│ 3. Monitor logs for errors │
│ 4. Verify rollback works (optional) │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Pull Request: preview → main │
├─────────────────────────────────────────────────────────────┤
│ 1. Create PR via GitHub │
│ 2. Code review │
│ 3. Migration review │
│ 4. Approve & merge │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ GitHub Actions: Deploy to Production │
├─────────────────────────────────────────────────────────────┤
│ 1. Build Docker image (from main branch) │
│ 2. Push to Azure Container Registry │
│ 3. Deploy to production Container App │
│ 4. Init container runs: alembic upgrade head │
│ 5. Main container starts with new schema │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Production: Schema Synced │
├─────────────────────────────────────────────────────────────┤
│ ✅ Schema matches preview │
│ ✅ Alembic version table updated │
│ ✅ Application runs with new schema │
│ ✅ Zero downtime (if migration is additive) │
└─────────────────────────────────────────────────────────────┘
Safety Checks and Best Practices¶
Before Creating PR¶
Development checks:
# Verify migration on preview
alembic current
# Should show: <new_revision> (head)
# Check migration history
alembic history --verbose
# Test rollback
alembic downgrade -1
alembic upgrade head
# Verify data integrity
psql $DATABASE_URL -c "SELECT COUNT(*) FROM context_documents;"
During PR Review¶
Code review checklist: - [ ] Migration file reviewed line by line - [ ] No DROP COLUMN without data migration - [ ] No DROP TABLE without confirmation - [ ] Default values provided for NOT NULL columns - [ ] Indexes added for new columns (if needed) - [ ] Foreign keys have proper ON DELETE behavior - [ ] Migration is reversible (downgrade() implemented) - [ ] No hardcoded values (use dynamic queries)
After Deployment¶
Production verification:
# Check Alembic version
az containerapp exec ... --command "alembic current"
# Check database for new columns
az containerapp exec ... --command \
"psql \$DATABASE_URL -c '\d+ context_documents'"
# Monitor application logs
az containerapp logs tail \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001 \
--follow
# Test application endpoints
curl https://slidefactory-prod.azurecontainerapps.io/health
Rollback Procedures¶
If Migration Fails During Deployment¶
Automatic rollback (with init container): - Init container failure prevents main container from starting - Previous container revision keeps running - No broken state
Manual rollback:
# Rollback database
az containerapp exec \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001 \
--command "alembic downgrade -1"
# Rollback container app
az containerapp revision list \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001
az containerapp ingress traffic set \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001 \
--revision-weight <previous-revision>=100
If Migration Causes Data Issues¶
Emergency rollback process:
# 1. Stop traffic to application
az containerapp ingress disable \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001
# 2. Rollback database migration
az containerapp exec ... --command "alembic downgrade -1"
# 3. Restore from backup (if needed)
# Follow Azure PostgreSQL backup restoration process
# 4. Rollback application code
az containerapp revision copy \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001 \
--from-revision <previous-stable-revision>
# 5. Re-enable traffic
az containerapp ingress enable \
--name slidefactory-prod \
--resource-group rg-slidefactory-prod-001
Zero-Downtime Migration Strategies¶
Additive Changes (Safe)¶
Examples: - Adding nullable columns - Adding new tables - Adding indexes
Strategy:
# Migration: Add nullable column
def upgrade():
op.add_column('documents', sa.Column('new_field', sa.String(), nullable=True))
def downgrade():
op.drop_column('documents', 'new_field')
Deployment: 1. Deploy migration (adds nullable column) 2. Deploy new code (uses new column) 3. ✅ Zero downtime
Non-Nullable Changes (Requires Strategy)¶
Example: Adding NOT NULL column
Strategy: 3-Phase Deployment
Phase 1: Add nullable column
Deploy → All new records will have NULL (old code doesn't set it)Phase 2: Backfill data
def upgrade():
# Backfill existing records
op.execute("UPDATE documents SET new_field = 'default' WHERE new_field IS NULL")
Phase 3: Make NOT NULL
Deploy → Column is now NOT NULLBreaking Changes (Requires Maintenance)¶
Examples: - Dropping columns used by old code - Renaming columns - Changing column types
Strategy: Blue-Green Deployment
- Schedule maintenance window
- Deploy new schema
- Deploy new code
- Verify functionality
- End maintenance window
Or: Expand-Contract Pattern
- Expand: Add new column, keep old column
- Migrate: Dual-write to both columns
- Contract: Remove old column after all code updated
Monitoring and Alerting¶
Key Metrics to Monitor¶
During migration: - Migration execution time - Database connection count - Lock wait times - Transaction log size
After migration: - Application error rates - Response time degradation - Database query performance - Failed requests
Recommended Alerts¶
alerts:
- name: "Migration Timeout"
condition: migration_duration > 5m
action: Notify ops team
- name: "Migration Failed"
condition: init_container_exit_code != 0
action: Block deployment, notify ops
- name: "Schema Mismatch"
condition: alembic_version != expected_version
action: Notify ops team
- name: "High Error Rate Post-Deployment"
condition: error_rate > 5% for 5m
action: Trigger rollback procedure
Common Issues and Solutions¶
Issue 1: Migration Timeout¶
Symptom: Init container times out during migration
Causes: - Large data migration - Missing indexes - Lock contention
Solution:
# Break large migrations into batches
def upgrade():
connection = op.get_bind()
# Process in batches
batch_size = 1000
offset = 0
while True:
result = connection.execute(
f"UPDATE documents SET new_field = 'value' "
f"WHERE new_field IS NULL "
f"LIMIT {batch_size} OFFSET {offset}"
)
if result.rowcount == 0:
break
offset += batch_size
Issue 2: Duplicate Migration Version¶
Symptom: alembic current shows wrong version after manual SQL changes
Solution:
# Check actual database state
psql $DATABASE_URL -c "SELECT * FROM alembic_version;"
# Fix version mismatch
alembic stamp <correct_revision>
Issue 3: Merge Conflicts in Migrations¶
Symptom: Two PRs create migrations with same parent revision
Solution:
# After merging first PR, rebase second PR
git checkout feature-branch-2
git rebase main
# Regenerate migration with correct parent
alembic revision --autogenerate -m "feature 2"
# Manually merge any conflicts in migration files
Checklist for Production Migration¶
Pre-Deployment¶
- Production database is stamped with Alembic
- Migration tested on preview database
- Rollback tested successfully
- Code review completed
- PR merged to main
- Database backup created
- Maintenance window scheduled (if needed)
- Team notified of deployment
During Deployment¶
- GitHub Actions workflow starts
- Docker image builds successfully
- Image pushed to container registry
- Container App deployment starts
- Init container runs migration
- Migration completes successfully
- Main container starts
- Health checks pass
Post-Deployment¶
- Verify Alembic version:
alembic current - Check application logs for errors
- Test critical user flows
- Monitor database performance
- Monitor application metrics
- Verify rollback plan is ready
- Document any issues encountered
Next Steps¶
Immediate Actions¶
- ✅ Preview database: Already stamped (
ae98aa32e0d2) - ⏳ Production database: Needs stamping (one-time operation)
- ⏳ CI/CD automation: Implement init container or GitHub Actions job
- ⏳ Documentation: Update deployment runbook
Future Improvements¶
- Add monitoring: Azure Monitor alerts for migration failures
- Automated testing: Integration tests for migrations
- Blue-green deployments: Zero-downtime for breaking changes
- Migration review process: Template for PR reviews
- Disaster recovery: Automated backup before migrations
Related Documentation¶
- Initial Alembic Setup Report
- Database Migration Strategy
- CLAUDE.md - Database Migrations Section
- Alembic Official Docs
- Azure Container Apps - Init Containers
Conclusion¶
This workflow provides a safe, automated path for syncing database schema changes from preview to production. The key principles are:
- Test first: Always test migrations on preview before production
- Automate safely: Use init containers to prevent broken deployments
- Monitor actively: Watch for issues during and after migrations
- Rollback ready: Have a plan to revert if something goes wrong
- Document everything: Keep a record of migrations and decisions
Status: Ready for production stamping and automation implementation.
Last Updated: 2025-10-16 Next Review: After first production migration