# N8N Scaling Assessment and Improvement Plan

**Date:** 2025-11-10 | **Branch:** preview | **Author:** Claude Code Assessment
## Executive Summary
The current n8n setup runs in regular (non-scalable) mode with a single instance handling all workloads. This creates a bottleneck for workflow execution and limits horizontal scaling. This assessment provides a comprehensive plan to migrate both Docker (local) and Azure (production/preview) environments to n8n Queue Mode with dedicated worker instances for improved scalability and performance.
## Current State Analysis

### Docker Environment (Local Development)

**Location:** `docker-compose.override.yml` (lines 44-80)

**Current Configuration:**
```yaml
n8n:
  image: n8nio/n8n:latest
  environment:
    - EXECUTIONS_MODE=regular   # ⚠️ Problem: Single-process mode
    - DB_TYPE=postgresdb
    - DB_POSTGRESDB_HOST=postgres
    # No Redis queue configuration
```
**Issues:**

- Single n8n instance handles both UI/API and workflow execution
- `EXECUTIONS_MODE=regular` means no horizontal scaling capability
- No worker processes for parallel workflow execution
- Redis is available in the stack but not used by n8n
- Cannot scale to handle multiple simultaneous workflows efficiently
### Azure Environment (Production/Preview)

**Location:** `.github/workflows/preview.yml`, `.github/workflows/production.yml`

**Current Configuration:**

- n8n deployed separately (not in GitHub Actions workflows)
- S5 Slidefactory connects via `N8N_API_URL` and `N8N_API_KEY` environment variables
- Deployment details unknown, but likely single instance based on architecture

**Issues:**

- No visibility into current n8n deployment configuration
- Likely same single-instance limitation as Docker setup
- Cannot scale workers independently from main instance
- No documented infrastructure-as-code for n8n deployment
## N8N Queue Mode Architecture

### Overview

N8N supports horizontal scaling through Queue Mode, which separates concerns:

```
┌─────────────────────────────────────────────────────────────┐
│ N8N QUEUE MODE ARCHITECTURE │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Main Instance │ │ Redis │ │
│ │ │ │ (Message Queue) │ │
│ │ - UI/Editor │────────▶│ │ │
│ │ - API │ │ - Job Queue │ │
│ │ - Webhooks │ │ - Job Results │ │
│ │ - Scheduling │ │ │ │
│ └──────────────────┘ └──────────────────┘ │
│ │ │
│ │ Jobs │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ │ │
│ ┌────────────┼──────────────┬──────────────┬──────────┼─┐ │
│ │ Worker 1 │ Worker 2 │ Worker 3 │ Worker N│ │ │
│ │ │ │ │ │ │ │
│ │ Executes │ Executes │ Executes │ Executes│ │ │
│ │ Workflows │ Workflows │ Workflows │ Workflows│ │ │
│ └────────────┴──────────────┴──────────────┴──────────┴─┘ │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ PostgreSQL Database │ │
│ │ (Shared by all instances) │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```

### Key Components

1. **Main Instance** (1 replica)
   - Handles UI, API, webhooks, and cron triggers
   - Writes workflow execution jobs to the Redis queue
   - Does NOT execute workflows
   - Environment: `EXECUTIONS_MODE=queue`

2. **Worker Instances** (N replicas, scalable)
   - Pull jobs from the Redis queue
   - Execute workflows in parallel
   - Write results back to Redis and PostgreSQL
   - Environment: `EXECUTIONS_MODE=queue` plus worker-specific config

3. **Redis** (Message Broker)
   - Stores pending workflow jobs
   - Manages job distribution to workers
   - Stores execution results temporarily
   - Uses the Bull queue library under the hood

4. **PostgreSQL** (Shared Database)
   - Workflow definitions
   - Execution history and logs
   - Credentials and settings
   - Must be accessible by all instances
### Benefits
- Horizontal Scaling: Add/remove workers based on workload
- High Availability: Workers can fail without affecting the UI
- Performance: Parallel execution of multiple workflows
- Resource Isolation: Separate resources for UI and execution
- Cost Optimization: Scale workers independently
## Improvement Plan

### Phase 1: Docker Environment (Local Development)

#### 1.1 Update docker-compose.override.yml

**Changes Required:**
```yaml
services:
  # N8N Main Instance - UI, API, Webhooks
  n8n:
    image: n8nio/n8n:latest
    ports:
      - "5678:5678"
    environment:
      # Queue Mode Configuration
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_BULL_REDIS_DB=1 # Separate from Celery (uses DB 0)
      # Authentication
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=admin
      # Server Configuration
      - N8N_HOST=localhost
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
      - WEBHOOK_URL=http://localhost:5678/
      - GENERIC_TIMEZONE=Europe/Berlin
      # Database
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=postgres
      - DB_POSTGRESDB_PASSWORD=postgres
      # Encryption (must be the same across all instances)
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY:-change-me-in-production}
      # Metrics
      - N8N_METRICS=true
      # User Management
      - N8N_USER_MANAGEMENT_DISABLED=false
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:5678/healthz"]
      interval: 30s
      timeout: 10s
      retries: 3

  # N8N Worker Instances - Workflow Execution
  n8n-worker:
    image: n8nio/n8n:latest
    command: worker
    environment:
      # Queue Mode Configuration
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_BULL_REDIS_DB=1
      # Database (shared with main)
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=postgres
      - DB_POSTGRESDB_PASSWORD=postgres
      # Encryption (MUST match the main instance)
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY:-change-me-in-production}
      # Timezone
      - GENERIC_TIMEZONE=Europe/Berlin
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      n8n:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "pgrep", "-f", "n8n worker"]
      interval: 30s
      timeout: 10s
      retries: 3
    # Resource limits (adjust based on workflow complexity)
    deploy:
      resources:
        limits:
          memory: 2G
        reservations:
          memory: 1G
      # Scale workers: docker compose up -d --scale n8n-worker=3
      replicas: 2

volumes:
  n8n_data:
```
**Key Changes:**

1. Set `EXECUTIONS_MODE=queue` on the main instance
2. Add `QUEUE_BULL_REDIS_*` configuration pointing to the existing Redis
3. Create an `n8n-worker` service with `command: worker`
4. Use Redis DB 1 for n8n (Celery uses DB 0)
5. Set `replicas: 2` for workers (easily scalable with `--scale`)
6. Ensure `N8N_ENCRYPTION_KEY` is identical across all instances
#### 1.2 Update .env.local Template

Add to `.env.local`:

```bash
# N8N Configuration
N8N_ENCRYPTION_KEY=your-secure-encryption-key-here-32-chars-min
N8N_API_KEY=your-n8n-api-key-from-ui
N8N_API_URL=http://n8n:5678
```
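A sufficiently long random key can be generated with OpenSSL (an illustrative command, not existing project tooling; keep the value identical across every n8n instance):

```bash
# Generate a 64-character hex key and append it to .env.local
# (assumes openssl is available on the host)
echo "N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)" >> .env.local
```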
#### 1.3 Testing Procedure

```bash
# 1. Stop existing services
docker compose down

# 2. Start with queue mode
docker compose up -d

# 3. Verify main instance is running
docker compose logs n8n | grep "queue mode"

# 4. Verify workers are running
docker compose logs n8n-worker | grep "worker started"

# 5. Check Redis connection
docker compose exec redis redis-cli ping

# 6. Scale workers up
docker compose up -d --scale n8n-worker=3

# 7. Check worker count
docker compose ps | grep n8n-worker
```
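To confirm that executions are actually distributed across workers, a quick load test can fire several concurrent webhook calls. This is a minimal sketch, assuming a test workflow with a Webhook trigger at the hypothetical path `smoke-test` has been created and activated in the n8n editor first:

```bash
# Fire 20 concurrent requests at the test webhook and wait for them to return.
# Watch the job distribution in another terminal: docker compose logs -f n8n-worker
for i in $(seq 1 20); do
  curl -s -o /dev/null -w "request $i -> HTTP %{http_code}\n" \
    "http://localhost:5678/webhook/smoke-test" &
done
wait
```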
### Phase 2: Azure Environment (Production/Preview)

#### 2.1 Architecture Decision

**Option A: Azure Container Apps (Recommended)**

- Separate container apps for n8n main and workers
- Built-in autoscaling for workers based on CPU/memory
- Easier to manage within existing infrastructure
- Cost-effective with consumption-based pricing

**Option B: Azure Kubernetes Service (AKS)**

- More complex but more control
- Better for very large scale (100+ workers)
- Higher operational overhead
- Requires Helm chart management

**Recommendation:** Use Option A (Azure Container Apps) for consistency with the existing S5 Slidefactory deployment.
#### 2.2 Infrastructure as Code

Create a new deployment workflow: `.github/workflows/deploy-n8n.yml`

```yaml
name: Deploy N8N to Azure

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to deploy to'
        required: true
        type: choice
        options:
          - preview
          - production

jobs:
  deploy-n8n:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set environment variables
        id: env
        run: |
          if [ "${{ github.event.inputs.environment }}" == "production" ]; then
            echo "ENV_SUFFIX=" >> $GITHUB_OUTPUT
            echo "DATABASE_URL=${{ secrets.PROD_N8N_DATABASE_URL }}" >> $GITHUB_OUTPUT
            echo "REDIS_HOST=${{ secrets.PROD_REDIS_HOST }}" >> $GITHUB_OUTPUT
            echo "REDIS_PASSWORD=${{ secrets.PROD_REDIS_PASSWORD }}" >> $GITHUB_OUTPUT
          else
            echo "ENV_SUFFIX=-preview" >> $GITHUB_OUTPUT
            echo "DATABASE_URL=${{ secrets.PREVIEW_N8N_DATABASE_URL }}" >> $GITHUB_OUTPUT
            echo "REDIS_HOST=${{ secrets.PREVIEW_REDIS_HOST }}" >> $GITHUB_OUTPUT
            echo "REDIS_PASSWORD=${{ secrets.PREVIEW_REDIS_PASSWORD }}" >> $GITHUB_OUTPUT
          fi

      - name: Login to Azure
        uses: azure/login@v2
        with:
          auth-type: SERVICE_PRINCIPAL
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      # Deploy N8N Main Instance
      # Note: the secretRef values below assume the referenced container app secrets
      # (redis-password, db-host, db-user, db-password, encryption-key, n8n-password)
      # already exist on the app, e.g. seeded via --secrets or `az containerapp secret set`.
      - name: Deploy N8N Main Instance
        run: |
          az containerapp create \
            --name n8n-main${{ steps.env.outputs.ENV_SUFFIX }} \
            --resource-group ${{ secrets.AZURE_RESOURCE_GROUP }} \
            --image n8nio/n8n:latest \
            --environment ${{ secrets.AZURE_CONTAINER_APP_ENVIRONMENT }} \
            --ingress external \
            --target-port 5678 \
            --min-replicas 1 \
            --max-replicas 1 \
            --cpu 1 \
            --memory 2.0Gi \
            --env-vars \
              EXECUTIONS_MODE=queue \
              QUEUE_BULL_REDIS_HOST=${{ steps.env.outputs.REDIS_HOST }} \
              QUEUE_BULL_REDIS_PORT=6380 \
              QUEUE_BULL_REDIS_PASSWORD=secretRef:redis-password \
              QUEUE_BULL_REDIS_DB=1 \
              QUEUE_BULL_REDIS_TLS=true \
              DB_TYPE=postgresdb \
              DB_POSTGRESDB_HOST=secretRef:db-host \
              DB_POSTGRESDB_DATABASE=n8n \
              DB_POSTGRESDB_USER=secretRef:db-user \
              DB_POSTGRESDB_PASSWORD=secretRef:db-password \
              N8N_ENCRYPTION_KEY=secretRef:encryption-key \
              N8N_BASIC_AUTH_ACTIVE=true \
              N8N_BASIC_AUTH_USER=admin \
              N8N_BASIC_AUTH_PASSWORD=secretRef:n8n-password \
              N8N_METRICS=true \
              N8N_PROTOCOL=https \
              GENERIC_TIMEZONE=Europe/Berlin

      # Deploy N8N Workers
      # Note: only one scale rule can be supplied per `az containerapp create` call
      # (a second --scale-rule-name would simply override the first), so only the CPU
      # rule is defined here. Add a memory-based rule afterwards via
      # `az containerapp update` or a YAML app definition if needed.
      - name: Deploy N8N Workers
        run: |
          az containerapp create \
            --name n8n-worker${{ steps.env.outputs.ENV_SUFFIX }} \
            --resource-group ${{ secrets.AZURE_RESOURCE_GROUP }} \
            --image n8nio/n8n:latest \
            --environment ${{ secrets.AZURE_CONTAINER_APP_ENVIRONMENT }} \
            --ingress internal \
            --target-port 5678 \
            --min-replicas 2 \
            --max-replicas 10 \
            --cpu 1 \
            --memory 2.0Gi \
            --command "n8n" "worker" \
            --scale-rule-name cpu-scaling \
            --scale-rule-type cpu \
            --scale-rule-metadata "type=Utilization" "value=70" \
            --env-vars \
              EXECUTIONS_MODE=queue \
              QUEUE_BULL_REDIS_HOST=${{ steps.env.outputs.REDIS_HOST }} \
              QUEUE_BULL_REDIS_PORT=6380 \
              QUEUE_BULL_REDIS_PASSWORD=secretRef:redis-password \
              QUEUE_BULL_REDIS_DB=1 \
              QUEUE_BULL_REDIS_TLS=true \
              DB_TYPE=postgresdb \
              DB_POSTGRESDB_HOST=secretRef:db-host \
              DB_POSTGRESDB_DATABASE=n8n \
              DB_POSTGRESDB_USER=secretRef:db-user \
              DB_POSTGRESDB_PASSWORD=secretRef:db-password \
              N8N_ENCRYPTION_KEY=secretRef:encryption-key \
              GENERIC_TIMEZONE=Europe/Berlin
```
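The `secretRef:` values in the workflow only resolve if the referenced secrets exist on each container app, so in practice they are either passed with `--secrets` on the create call or seeded separately. A hedged sketch of the standalone variant (secret names mirror the references above; values are placeholders):

```bash
# Seed the container app secrets referenced by the env vars above (illustrative values).
az containerapp secret set \
  --name n8n-worker-preview \
  --resource-group "$AZURE_RESOURCE_GROUP" \
  --secrets \
    redis-password="$REDIS_PASSWORD" \
    db-host="$DB_HOST" \
    db-user="$DB_USER" \
    db-password="$DB_PASSWORD" \
    encryption-key="$N8N_ENCRYPTION_KEY"
```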
#### 2.3 Required Secrets

Add to GitHub Actions Secrets:

**Preview Environment:**

- `PREVIEW_N8N_DATABASE_URL`: PostgreSQL connection string for the n8n database
- `PREVIEW_REDIS_HOST`: Azure Redis hostname
- `PREVIEW_REDIS_PASSWORD`: Azure Redis password
- `N8N_ENCRYPTION_KEY`: 32+ character encryption key (CRITICAL: must be the same across all instances)
- `N8N_ADMIN_PASSWORD`: Admin password for the n8n UI

**Production Environment:**

- `PROD_N8N_DATABASE_URL`: PostgreSQL connection string for the n8n database
- `PROD_REDIS_HOST`: Azure Redis hostname
- `PROD_REDIS_PASSWORD`: Azure Redis password
- Same `N8N_ENCRYPTION_KEY` as preview
- `N8N_ADMIN_PASSWORD`: Admin password for the n8n UI
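The secrets can be added through the repository settings UI or, as a sketch, with the GitHub CLI (assumes `gh` is authenticated against this repository; the values shown are placeholders):

```bash
# Add the preview secrets via the GitHub CLI; repeat with the PROD_* names for production.
gh secret set PREVIEW_REDIS_HOST --body "my-redis.redis.cache.windows.net"
gh secret set PREVIEW_REDIS_PASSWORD --body "<redis-access-key>"
gh secret set PREVIEW_N8N_DATABASE_URL --body "postgresql://user:pass@host:5432/n8n"
gh secret set N8N_ENCRYPTION_KEY --body "$(openssl rand -hex 32)"
```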
#### 2.4 Database Setup

N8N requires its own database schema. Options:

- **Option 1:** Separate database on the same PostgreSQL server
- **Option 2:** Use the same database with a different schema

Update the `DB_POSTGRESDB_SCHEMA` environment variable accordingly.
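A minimal sketch for Option 1, run against the Azure PostgreSQL server with an admin account (`n8n_user` and the password are placeholders):

```bash
# Create a dedicated database and user for n8n (Option 1).
psql "host=<postgres-host> port=5432 dbname=postgres user=<admin-user> sslmode=require" <<'SQL'
CREATE USER n8n_user WITH PASSWORD 'change-me';
CREATE DATABASE n8n OWNER n8n_user;
GRANT ALL PRIVILEGES ON DATABASE n8n TO n8n_user;
SQL
```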
#### 2.5 Azure Redis Configuration

**Current Setup:** S5 Slidefactory already uses Azure Redis for Celery.

**N8N Redis Requirements:**

- Use a separate Redis DB number (DB 1 for n8n, DB 0 for Celery)
- The same Azure Redis instance can be shared
- Or create a dedicated Azure Redis instance for n8n (recommended for production)

**Redis Configuration Check:**

```bash
# List Redis instances and their SKUs (Azure Redis provides 16 logical databases by default)
az redis list --resource-group <resource-group> --query "[].{name:name,sku:sku}"

# Update the Redis eviction policy if needed
az redis update \
  --resource-group <resource-group> \
  --name <redis-name> \
  --set redisConfiguration.maxmemory-policy=allkeys-lru
```

Note that Bull's own guidance favors `noeviction` for queue data, since `allkeys-lru` can evict pending jobs; this is another argument for a dedicated n8n Redis instance in production.
#### 2.6 Monitoring and Observability

**Azure Monitor Integration:**

```bash
# Enable diagnostic settings for the n8n container apps
az monitor diagnostic-settings create \
  --name n8n-diagnostics \
  --resource /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.App/containerApps/n8n-main \
  --logs '[{"category": "ContainerAppConsoleLogs", "enabled": true}]' \
  --metrics '[{"category": "AllMetrics", "enabled": true}]' \
  --workspace <log-analytics-workspace-id>
```

**Key Metrics to Monitor:**

- Worker CPU/memory utilization
- Redis queue length (`bull:n8n:*` keys)
- Workflow execution time
- Failed workflow count
- Worker autoscaling events

**Alerts:**

- Queue length > 100 jobs
- Worker CPU > 80% for 5 minutes
- Failed workflow rate > 10%
- Redis connection failures
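Azure Monitor does not see inside Redis, so the "queue length > 100 jobs" alert needs a small external check. A sketch that could run from a scheduled job (the `bull:n8n:waiting` key name follows the assumption used elsewhere in this document; verify the real key names with `KEYS "bull:*"` first):

```bash
# Warn and exit non-zero when the waiting queue exceeds the threshold.
# Assumes redis-cli was built with TLS support (Azure Redis requires TLS on port 6380).
THRESHOLD=100
WAITING=$(redis-cli -h "$REDIS_HOST" -p 6380 -a "$REDIS_PASSWORD" --tls -n 1 LLEN "bull:n8n:waiting")
if [ "${WAITING:-0}" -gt "$THRESHOLD" ]; then
  echo "WARNING: n8n queue depth is $WAITING (threshold $THRESHOLD)"
  exit 1
fi
echo "Queue depth OK: ${WAITING:-0}"
```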
## Implementation Roadmap

### Week 1: Local Development Environment

**Day 1-2: Docker Compose Changes**

- [ ] Update `docker-compose.override.yml` with queue mode configuration
- [ ] Test with a single worker locally
- [ ] Verify workflows execute correctly

**Day 3-4: Multi-Worker Testing**

- [ ] Scale to 2-3 workers
- [ ] Run load tests with multiple concurrent workflows
- [ ] Monitor Redis queue behavior
- [ ] Document any issues

**Day 5: Documentation**

- [ ] Update CLAUDE.md with the new n8n architecture
- [ ] Update the local development setup guide
- [ ] Create a troubleshooting guide
### Week 2: Azure Preview Environment

**Day 1-2: Infrastructure Preparation**

- [ ] Create the n8n database schema in Azure PostgreSQL
- [ ] Configure Redis DB 1 for n8n
- [ ] Test Redis connectivity
- [ ] Generate and store `N8N_ENCRYPTION_KEY`

**Day 3-4: Deployment**

- [ ] Create `.github/workflows/deploy-n8n.yml`
- [ ] Add the required GitHub secrets
- [ ] Deploy the n8n main instance to preview
- [ ] Deploy n8n workers to preview (start with 2)
- [ ] Update the S5 Slidefactory `N8N_API_URL` to the new endpoint

**Day 5: Testing and Validation**

- [ ] Migrate existing workflows to the new n8n instance
- [ ] Test workflow execution via S5 Slidefactory
- [ ] Verify autoscaling triggers
- [ ] Load testing with multiple presentations
### Week 3: Production Deployment

**Day 1-2: Production Preparation**

- [ ] Create the production n8n database
- [ ] Configure production Redis
- [ ] Deploy n8n to the production environment
- [ ] Migrate production workflows

**Day 3-4: Monitoring and Optimization**

- [ ] Set up Azure Monitor alerts
- [ ] Configure autoscaling rules
- [ ] Tune worker min/max replicas
- [ ] Optimize Redis memory settings

**Day 5: Documentation and Handoff**

- [ ] Complete the deployment documentation
- [ ] Create a runbook for scaling operations
- [ ] Train the team on the new architecture
- [ ] Post-deployment review
## Cost Implications

### Azure Container Apps Pricing (Estimated)

**Current Setup (single n8n instance):**

- 1 instance × 1 vCPU × 2 GB RAM × 730 hours/month
- Estimated: $50-70/month

**Queue Mode Setup (main + workers):**

- Main: 1 instance × 1 vCPU × 2 GB RAM × 730 hours = $50-70/month
- Workers: 2-10 instances (autoscaling)
  - Minimum (2 workers): $100-140/month
  - Average (4 workers): $200-280/month
  - Peak (10 workers): $500-700/month

**Redis:**

- Already provisioned and paid for
- Additional cost: negligible (uses a separate DB number)

**Total Additional Cost:**

- Minimum: +$100-140/month (2 workers)
- Typical: +$150-210/month (2-4 workers average)
- Peak: +$450-630/month (10 workers during high load)

**Cost Optimization Strategies** (see the scaling sketch below):

1. Aggressive autoscaling down (scale to 1 worker during off-hours)
2. Use Azure Reserved Instances for predictable workloads (-30% cost)
3. Monitor and tune worker resource limits
4. Set budget alerts at a $300/month threshold
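For strategy 1, a sketch of the off-hours scale-down and morning scale-up using the Azure CLI (app and resource-group names follow the conventions used above and could be driven by a scheduled job):

```bash
# Evening: keep a single warm worker; autoscaling can still burst up to the max.
az containerapp update \
  --name n8n-worker-preview \
  --resource-group "$AZURE_RESOURCE_GROUP" \
  --min-replicas 1 \
  --max-replicas 10

# Morning: restore the working-hours floor of two workers.
az containerapp update \
  --name n8n-worker-preview \
  --resource-group "$AZURE_RESOURCE_GROUP" \
  --min-replicas 2 \
  --max-replicas 10
```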
## Risk Assessment

### Technical Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Data loss during migration | Low | High | Test migration on preview first, backup workflows |
| Encryption key mismatch | Medium | High | Document key management, validate before deployment |
| Redis capacity exceeded | Low | Medium | Monitor queue size, alert on high depth |
| Worker autoscaling too slow | Medium | Medium | Tune scaling rules, set appropriate thresholds |
| Azure quota limits hit | Low | Medium | Request quota increase proactively |
| Database connection pool exhaustion | Medium | Medium | Configure max connections per instance |
### Operational Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Increased complexity | High | Low | Document architecture, create runbooks |
| Higher costs than expected | Medium | Medium | Set budget alerts, monitor usage |
| Debugging difficulty | Medium | Low | Centralized logging, distributed tracing |
| Team training required | High | Low | Documentation, knowledge sharing sessions |
## Success Metrics

### Performance Metrics
- Workflow Throughput: 3-5x improvement in concurrent workflow capacity
- Execution Time: No degradation in individual workflow execution time
- Queue Latency: Jobs picked up by workers within 5 seconds
- Autoscaling Time: Workers scale up within 2 minutes of high load
### Reliability Metrics
- Uptime: 99.9% uptime for main instance
- Worker Availability: 99.5% uptime for at least 2 workers
- Failed Workflow Rate: <1% due to infrastructure issues
- Recovery Time: <5 minutes to recover from worker failure
### Business Metrics
- Cost Efficiency: <30% increase in n8n infrastructure costs
- User Satisfaction: No complaints about workflow execution delays
- Development Velocity: Reduced time to add new workflows
- Scalability Headroom: Ability to handle 10x current workflow volume
## Rollback Plan

### If Issues Occur in Docker (Local)

1. Stop all services: `docker compose down`
2. Revert `docker-compose.override.yml` to the previous version
3. Remove the `n8n-worker` service
4. Set `EXECUTIONS_MODE=regular` on the main n8n instance
5. Restart: `docker compose up -d`
### If Issues Occur in Azure

**Immediate Rollback (< 1 hour):**

1. Update the S5 Slidefactory `N8N_API_URL` to the old n8n instance
2. Redeploy S5 Slidefactory with the old configuration
3. Keep the new n8n instances running for investigation

**Full Rollback (if needed):**

1. Export all workflows from the new n8n instance (see the export sketch below)
2. Import the workflows back into the old n8n instance
3. Update the S5 Slidefactory environment variables
4. Delete the new n8n container apps
5. Document issues for a future retry

**Data Recovery:**

- All workflow definitions are in PostgreSQL (no data loss risk)
- Execution history is preserved in the database
- The Redis queue is ephemeral (jobs can be re-queued)
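For the export/import step, n8n ships CLI commands that can be run inside the running containers (locally via `docker compose exec n8n ...`, in Azure via `az containerapp exec`). A sketch with illustrative file paths:

```bash
# On the new instance: export all workflows and credentials to JSON files.
n8n export:workflow --all --output=/tmp/workflows.json
n8n export:credentials --all --decrypted --output=/tmp/credentials.json

# On the old instance: import them again.
n8n import:workflow --input=/tmp/workflows.json
n8n import:credentials --input=/tmp/credentials.json
```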
## Appendix

### A. Environment Variable Reference

**Required for All Instances:**

```bash
EXECUTIONS_MODE=queue
N8N_ENCRYPTION_KEY=<same-32-char-key-for-all>
DB_TYPE=postgresdb
DB_POSTGRESDB_HOST=<postgres-host>
DB_POSTGRESDB_DATABASE=n8n
DB_POSTGRESDB_USER=<db-user>
DB_POSTGRESDB_PASSWORD=<db-password>
QUEUE_BULL_REDIS_HOST=<redis-host>
QUEUE_BULL_REDIS_PORT=6379
QUEUE_BULL_REDIS_DB=1
GENERIC_TIMEZONE=Europe/Berlin
```

**Main Instance Only:**

```bash
N8N_HOST=<domain-or-localhost>
N8N_PORT=5678
N8N_PROTOCOL=http # or https for production
WEBHOOK_URL=<public-webhook-url>
N8N_BASIC_AUTH_ACTIVE=true
N8N_BASIC_AUTH_USER=admin
N8N_BASIC_AUTH_PASSWORD=<secure-password>
N8N_METRICS=true
N8N_USER_MANAGEMENT_DISABLED=false
```

**Workers Only:**
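No additional variables are strictly required beyond the shared set above. Two optional, worker-specific settings worth considering (based on n8n's documented queue-mode options; verify against the n8n version in use):

```bash
# Optional: expose /healthz on workers so container health checks can probe them
QUEUE_HEALTH_CHECK_ACTIVE=true
# Optional: parallel job capacity per worker is set on the start command,
# not via an env var, e.g. `n8n worker --concurrency=10`
```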
### B. Useful Commands

**Docker:**

```bash
# Scale workers dynamically
docker compose up -d --scale n8n-worker=5

# View worker logs
docker compose logs -f n8n-worker

# Check the Redis queue
docker compose exec redis redis-cli -n 1 KEYS "bull:*"
docker compose exec redis redis-cli -n 1 LLEN "bull:n8n:waiting"

# Monitor worker processes
docker compose exec n8n-worker ps aux | grep n8n
```
**Azure:**

```bash
# Scale workers manually
az containerapp update \
  --name n8n-worker-preview \
  --resource-group <rg> \
  --min-replicas 5 \
  --max-replicas 15

# View logs
az containerapp logs show \
  --name n8n-worker-preview \
  --resource-group <rg> \
  --follow

# Check the worker count
az containerapp revision list \
  --name n8n-worker-preview \
  --resource-group <rg> \
  --query "[].properties.replicas"
```
### C. Troubleshooting Guide

**Problem: Workers not picking up jobs**

Diagnosis:

```bash
# Check the Redis connection
redis-cli -h <host> -p <port> -a <password> -n 1 PING

# Check the queue length
redis-cli -h <host> -p <port> -a <password> -n 1 LLEN "bull:n8n:waiting"

# Check worker logs for errors
docker compose logs n8n-worker | grep -i error
```

Solution:

- Verify the `QUEUE_BULL_REDIS_*` environment variables match
- Ensure `N8N_ENCRYPTION_KEY` is identical across all instances
- Check Redis network connectivity
- Restart workers: `docker compose restart n8n-worker`
**Problem: High Redis memory usage**

Diagnosis:

```bash
# Check Redis memory
redis-cli -h <host> -p <port> -a <password> INFO memory

# Count queue keys
redis-cli -h <host> -p <port> -a <password> -n 1 KEYS "bull:*" | wc -l
```

Solution:

- Review the Redis `maxmemory` policy (Bull's guidance favors `noeviction` for queue data; `allkeys-lru` can evict pending jobs)
- Increase the worker count to process jobs faster
- Adjust job retention settings in n8n
- Consider a dedicated Redis instance for n8n
**Problem: Database connection pool exhausted**

Diagnosis:

```bash
# Check PostgreSQL connections
psql -h <host> -U <user> -d n8n -c "SELECT count(*) FROM pg_stat_activity;"

# View n8n logs
docker compose logs n8n | grep -i "connection pool"
```

Solution:

- Increase PostgreSQL `max_connections`
- Reduce the per-instance connection pool size in n8n
- Use a connection pooler (PgBouncer) in front of PostgreSQL
- Scale the database instance if CPU is saturated
## Conclusion

This plan provides a comprehensive roadmap for migrating S5 Slidefactory's n8n deployment from single-instance regular mode to scalable queue mode with dedicated workers. The phased approach (Docker → Preview → Production) minimizes risk while delivering immediate scalability benefits.

**Key Takeaways:**

1. **Architecture:** Main instance + N workers + Redis queue
2. **Benefits:** 3-5x throughput, horizontal scaling, better resource isolation
3. **Timeline:** 3 weeks from local dev to production deployment
4. **Cost:** +$150-300/month for a typical workload
5. **Risk:** Low-medium with proper testing and a rollback plan

**Next Steps:**

1. Review and approve this plan
2. Schedule implementation windows
3. Provision Azure resources (database, Redis configuration)
4. Begin Week 1 implementation (Docker environment)
5. Monitor and optimize continuously post-deployment

**Document Version:** 1.0 | **Last Updated:** 2025-11-10 | **Review Date:** 2025-11-24 (2 weeks post-implementation)