N8N Distributed Queue Mode Setup for Azure

Date: 2025-11-25
Branch: preview
Status: Implementation Ready

Overview

This document provides the implementation guide for setting up N8N in distributed queue mode on Azure Container Apps, enabling parallel workflow execution and horizontal scaling.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                   AZURE N8N QUEUE MODE                       │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────────────┐         ┌──────────────────┐         │
│  │  N8N Main        │         │   Azure Redis    │         │
│  │  Container App   │         │                  │         │
│  │                  │         │   Database: /6   │         │
│  │  - UI/Editor     │────────▶│   (Queue)        │         │
│  │  - API           │         │                  │         │
│  │  - Webhooks      │         │   Note: /2 used  │         │
│  │  - Scheduling    │         │   by Slidefactory│         │
│  └──────────────────┘         └──────────────────┘         │
│   Replicas: 1                          │                    │
│                                         │ Jobs               │
│                                         ▼                    │
│               ┌─────────────────────────────────────────┐   │
│               │                                          │   │
│  ┌────────────┼──────────────┬──────────────┬──────────┼─┐ │
│  │ Worker 1   │  Worker 2    │  Worker 3    │  Worker N│ │ │
│  │ Container  │  Container   │  Container   │  Container│ │ │
│  │            │              │              │          │ │ │
│  │ Executes   │  Executes    │  Executes    │  Executes│ │ │
│  │ Workflows  │  Workflows   │  Workflows   │  Workflows│ │ │
│  └────────────┴──────────────┴──────────────┴──────────┴─┘ │
│   Replicas: 2-10 (auto-scaling)                             │
│                                                              │
│  ┌──────────────────────────────────────────────────────┐  │
│  │      PostgreSQL Flexible Server (10.0.2.4)           │  │
│  │      Database: n8n, Schema: public                    │  │
│  └──────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘

Current Configuration

Existing N8N Instance: slidefactory-n8n
- Image: n8nio/n8n:latest
- Mode: Regular (single instance, no queue)
- Replicas: 1 (min=1, max=1)
- Database: PostgreSQL at 10.0.2.4:5432/n8n
- No Redis configuration

Existing Redis: slidefactory-redis.redis.cache.windows.net
- Currently using DB /2 for Slidefactory Celery
- Will use DB /6 for N8N queue mode

Deployment Steps

Step 1: Create N8N Main Instance (Queue Mode)

Switch the existing slidefactory-n8n app to queue mode by updating its environment variables:

# Update existing N8N instance to queue mode.
# Export REDIS_PASSWORD, N8N_ENCRYPTION_KEY, POSTGRES_PASSWORD and
# N8N_BASIC_AUTH_PASSWORD in the shell first so that no credential is stored in this document.
# N8N_ENCRYPTION_KEY must match the key the instance already uses, otherwise
# saved credentials can no longer be decrypted.
# N8N_BASIC_AUTH_* is only honored by n8n releases before 1.0; newer images rely on
# built-in user management instead.
az containerapp update \
  --name slidefactory-n8n \
  --resource-group rg-slidefactory-prod-001 \
  --set-env-vars \
    EXECUTIONS_MODE=queue \
    QUEUE_BULL_REDIS_HOST=slidefactory-redis.redis.cache.windows.net \
    QUEUE_BULL_REDIS_PORT=6380 \
    QUEUE_BULL_REDIS_PASSWORD="$REDIS_PASSWORD" \
    QUEUE_BULL_REDIS_DB=6 \
    QUEUE_BULL_REDIS_TLS=true \
    N8N_ENCRYPTION_KEY="$N8N_ENCRYPTION_KEY" \
    DB_TYPE=postgresdb \
    DB_POSTGRESDB_HOST=10.0.2.4 \
    DB_POSTGRESDB_PORT=5432 \
    DB_POSTGRESDB_DATABASE=n8n \
    DB_POSTGRESDB_USER=dbadmin \
    DB_POSTGRESDB_PASSWORD="$POSTGRES_PASSWORD" \
    DB_POSTGRESDB_SCHEMA=public \
    DB_POSTGRESDB_SSL=false \
    N8N_BASIC_AUTH_ACTIVE=true \
    N8N_BASIC_AUTH_USER=admin \
    N8N_BASIC_AUTH_PASSWORD="$N8N_BASIC_AUTH_PASSWORD" \
    N8N_HOST=slidefactory-n8n.thankfulsmoke-fef50a06.westeurope.azurecontainerapps.io \
    N8N_PROTOCOL=https \
    N8N_PORT=80 \
    WEBHOOK_URL=https://slidefactory-n8n.thankfulsmoke-fef50a06.westeurope.azurecontainerapps.io \
    N8N_EDITOR_BASE_URL=https://slidefactory-n8n.thankfulsmoke-fef50a06.westeurope.azurecontainerapps.io \
    N8N_SECURE_COOKIE=true \
    N8N_COOKIES_SAME_SITE=lax \
    GENERIC_TIMEZONE=Europe/Berlin \
    N8N_METRICS=true \
    TRUST_PROXY=true
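
Values passed through --set-env-vars are stored as plain environment variables on the app. A minimal sketch of an alternative, using Container Apps secrets referenced with the secretref: prefix, could look like the following. The secret names (redis-password, n8n-encryption-key, postgres-password), the helper function names, and the shell variables holding the values are assumptions made for illustration; they are not part of the existing deployment.

```shell
#!/bin/sh
# Sketch: store credentials as Container Apps secrets and reference them
# from env vars. Secret names and the exported shell variables are
# illustrative assumptions.

APP="slidefactory-n8n"
RG="rg-slidefactory-prod-001"

# Convert "ENV_VAR=secret-name" pairs into "ENV_VAR=secretref:secret-name".
to_secretref() {
  for pair in "$@"; do
    printf '%s=secretref:%s\n' "${pair%%=*}" "${pair#*=}"
  done
}

# Create or update the secrets, then point the env vars at them.
# Expects REDIS_PASSWORD, N8N_ENCRYPTION_KEY and POSTGRES_PASSWORD to be exported.
apply_secret_refs() {
  az containerapp secret set \
    --name "$APP" \
    --resource-group "$RG" \
    --secrets \
      "redis-password=$REDIS_PASSWORD" \
      "n8n-encryption-key=$N8N_ENCRYPTION_KEY" \
      "postgres-password=$POSTGRES_PASSWORD"

  # Word splitting is intended here: each mapping becomes one argument.
  az containerapp update \
    --name "$APP" \
    --resource-group "$RG" \
    --set-env-vars $(to_secretref \
      "QUEUE_BULL_REDIS_PASSWORD=redis-password" \
      "N8N_ENCRYPTION_KEY=n8n-encryption-key" \
      "DB_POSTGRESDB_PASSWORD=postgres-password")
}

# Preview a mapping without touching Azure:
to_secretref "QUEUE_BULL_REDIS_PASSWORD=redis-password"
```

Calling apply_secret_refs after exporting the three values applies the change to the main app; setting APP=slidefactory-n8n-worker and calling it again would do the same for the workers once that app exists.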

Step 2: Create N8N Worker Container App

Create new worker instances with auto-scaling:

# Create N8N worker container app.
# Export REDIS_PASSWORD, N8N_ENCRYPTION_KEY and POSTGRES_PASSWORD first;
# N8N_ENCRYPTION_KEY must be the same value the main instance uses.
# No ingress is configured: workers pull jobs from Redis and do not serve
# HTTP unless the worker health check endpoint is explicitly enabled.
az containerapp create \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --image n8nio/n8n:latest \
  --environment <CONTAINER_APP_ENVIRONMENT_ID> \
  --min-replicas 2 \
  --max-replicas 10 \
  --cpu 1.0 \
  --memory 2.0Gi \
  --command "n8n" "worker" \
  --env-vars \
    EXECUTIONS_MODE=queue \
    QUEUE_BULL_REDIS_HOST=slidefactory-redis.redis.cache.windows.net \
    QUEUE_BULL_REDIS_PORT=6380 \
    QUEUE_BULL_REDIS_PASSWORD="$REDIS_PASSWORD" \
    QUEUE_BULL_REDIS_DB=6 \
    QUEUE_BULL_REDIS_TLS=true \
    N8N_ENCRYPTION_KEY="$N8N_ENCRYPTION_KEY" \
    DB_TYPE=postgresdb \
    DB_POSTGRESDB_HOST=10.0.2.4 \
    DB_POSTGRESDB_PORT=5432 \
    DB_POSTGRESDB_DATABASE=n8n \
    DB_POSTGRESDB_USER=dbadmin \
    DB_POSTGRESDB_PASSWORD="$POSTGRES_PASSWORD" \
    DB_POSTGRESDB_SCHEMA=public \
    DB_POSTGRESDB_SSL=false \
    GENERIC_TIMEZONE=Europe/Berlin
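
Because both apps reference n8nio/n8n:latest, the main instance and the workers can end up on different n8n versions whenever a restart pulls a newer image. The sketch below checks whether an image reference is pinned and compares the images configured on both apps. The function names are illustrative, and the premise that main and workers should run the same n8n version is an assumption to confirm against the n8n queue mode documentation.

```shell
#!/bin/sh
# Sketch: detect unpinned or mismatched images between main and workers.

RG="rg-slidefactory-prod-001"

# Succeeds when the image reference has a digest or an explicit tag other
# than "latest".
is_pinned_image() {
  case "$1" in
    *@sha256:*) return 0 ;;      # digest references are always pinned
    *:latest)   return 1 ;;
    *:*)
      tag="${1##*:}"
      case "$tag" in
        */*) return 1 ;;         # the colon was a registry port, e.g. host:5000/img
        *)   return 0 ;;
      esac ;;
    *) return 1 ;;               # no tag means an implicit "latest"
  esac
}

# Print the image of the first container of a Container App.
app_image() {
  az containerapp show \
    --name "$1" \
    --resource-group "$RG" \
    --query "properties.template.containers[0].image" \
    --output tsv
}

# Compare main and worker images; fails on a mismatch or an unpinned tag.
check_images() {
  main_image="$(app_image slidefactory-n8n)"
  worker_image="$(app_image slidefactory-n8n-worker)"
  echo "main:   $main_image"
  echo "worker: $worker_image"
  [ "$main_image" = "$worker_image" ] && is_pinned_image "$main_image"
}
```

Running check_images after a deployment returns a non-zero status when the two apps differ or still use a floating tag, which makes it usable as a gate in a deployment script.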

Step 3: Configure Auto-Scaling for Workers

Add CPU and memory-based scaling rules:

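# Add CPU-based scaling rule
az containerapp update \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --scale-rule-name cpu-scaling \
  --scale-rule-type cpu \
  --scale-rule-metadata "type=Utilization" "value=70"

# Add memory-based scaling rule
az containerapp update \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --scale-rule-name memory-scaling \
  --scale-rule-type memory \
  --scale-rule-metadata "type=Utilization" "value=80"

# Confirm that both rules are present. Some CLI versions may replace the
# existing rule set on each update instead of appending to it; if only one
# rule is listed, define both rules together in a YAML template instead.
az containerapp show \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --query "properties.template.scale"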
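
CPU and memory rules react only after workers are already busy. As a hedged alternative, the sketch below scales on the number of waiting jobs through a KEDA Redis list rule. It rests on several assumptions that need checking before use: that n8n's waiting list is named bull:jobs:wait (Bull's default prefix plus a queue called jobs), that a secret named redis-password exists on the worker app, and that 10 waiting jobs per worker is a sensible target. The helper desired_replicas only illustrates the ceiling-division arithmetic such a rule applies; it is not called by Azure.

```shell
#!/bin/sh
# Sketch: queue-depth based scaling for the worker app.

# Replicas needed for a given queue depth: ceil(depth / per_worker),
# clamped to the [min, max] replica range.
desired_replicas() {
  depth="$1"; per_worker="$2"; min="$3"; max="$4"
  want=$(( (depth + per_worker - 1) / per_worker ))
  if [ "$want" -lt "$min" ]; then want="$min"; fi
  if [ "$want" -gt "$max" ]; then want="$max"; fi
  echo "$want"
}

# Register a Redis list-length scale rule (list name, target length and
# secret name are assumptions, see the text above).
add_queue_depth_rule() {
  az containerapp update \
    --name slidefactory-n8n-worker \
    --resource-group rg-slidefactory-prod-001 \
    --scale-rule-name queue-depth \
    --scale-rule-type redis \
    --scale-rule-metadata \
      "host=slidefactory-redis.redis.cache.windows.net" \
      "port=6380" \
      "enableTLS=true" \
      "databaseIndex=6" \
      "listName=bull:jobs:wait" \
      "listLength=10" \
    --scale-rule-auth "password=redis-password"
}

# Example: 35 waiting jobs, 10 per worker, limits 2..10
desired_replicas 35 10 2 10
```

With the limits from Step 2 (min 2, max 10), an empty queue keeps 2 workers and any backlog above 100 jobs saturates at 10.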

Environment Variables Reference

Required for Both Main and Workers

Variable	Value	Description
`EXECUTIONS_MODE`	`queue`	Enables queue mode
`QUEUE_BULL_REDIS_HOST`	`slidefactory-redis.redis.cache.windows.net`	Azure Redis hostname
`QUEUE_BULL_REDIS_PORT`	`6380`	Azure Redis SSL port
`QUEUE_BULL_REDIS_PASSWORD`	`<from secrets>`	Redis password
`QUEUE_BULL_REDIS_DB`	`6`	Redis database (separate from Slidefactory DB 2)
`QUEUE_BULL_REDIS_TLS`	`true`	Enable TLS for Azure Redis
`N8N_ENCRYPTION_KEY`	`<same for all>`	CRITICAL: Must be identical
`DB_TYPE`	`postgresdb`	Database type
`DB_POSTGRESDB_HOST`	`10.0.2.4`	PostgreSQL host
`DB_POSTGRESDB_DATABASE`	`n8n`	Database name
`GENERIC_TIMEZONE`	`Europe/Berlin`	Timezone

Main Instance Only

Variable	Value	Description
`N8N_HOST`	`slidefactory-n8n.thankfulsmoke...`	Public hostname
`N8N_PROTOCOL`	`https`	Protocol
`N8N_PORT`	`80`	Internal port
`WEBHOOK_URL`	`https://slidefactory-n8n...`	Webhook URL
`N8N_BASIC_AUTH_ACTIVE`	`true`	Enable auth
`N8N_BASIC_AUTH_USER`	`admin`	Username
`N8N_METRICS`	`true`	Enable metrics

Verification Steps

1. Check Main Instance Status

# View main instance logs
az containerapp logs show \
  --name slidefactory-n8n \
  --resource-group rg-slidefactory-prod-001 \
  --follow

# Look for messages such as "Queue mode enabled" or "Connected to Redis"
# (the exact wording varies by n8n version)

2. Check Worker Status

# View worker logs
az containerapp logs show \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --follow

# Look for messages such as "Worker started" or "Listening for jobs"
# (the exact wording varies by n8n version)

# Check worker replica count
az containerapp revision list \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --query "[].{name:name, replicas:properties.replicas, active:properties.active}"

3. Verify Redis Queue

Connect to Redis and check queue status:

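# Connect to Azure Redis (requires redis-cli with TLS support).
# REDISCLI_AUTH supplies the password without putting it on the command line.
export REDISCLI_AUTH="$REDIS_PASSWORD"
redis-cli -h slidefactory-redis.redis.cache.windows.net \
  -p 6380 \
  --tls \
  -n 6

# List queue keys. SCAN iterates incrementally instead of blocking the server
# like KEYS; repeat with the returned cursor until it is 0.
SCAN 0 MATCH "bull:*" COUNT 100

# The key names below assume Bull's default "bull" prefix and a queue named
# "jobs"; adjust them to whatever the SCAN output shows.

# Check waiting jobs count (list)
LLEN "bull:jobs:wait"

# Check active jobs (list)
LLEN "bull:jobs:active"

# Check completed jobs (sorted set, so ZCARD rather than LLEN)
ZCARD "bull:jobs:completed"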

4. Test Workflow Execution

1. Open the N8N UI: https://slidefactory-n8n.thankfulsmoke-fef50a06.westeurope.azurecontainerapps.io
2. Create a test workflow that includes a delay (for example, a Wait node).
3. Execute the workflow.
4. Check the worker logs to see which worker picked up the job.
5. Execute multiple workflows simultaneously to verify parallel processing.
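
Starting several executions at the same moment is easier from a script than from the editor. The sketch below runs any command N times in parallel and reports how many runs succeeded. It assumes the test workflow is activated and exposed through a Webhook node; the webhook path in the usage example is a hypothetical placeholder. Triggering through a webhook is suggested because, depending on the n8n version, manual runs started from the editor may execute on the main instance rather than on a worker.

```shell
#!/bin/sh
# Sketch: run a command N times concurrently and count the successes.

fire_parallel() {
  count="$1"; shift
  tmp="$(mktemp -d)"
  i=1
  while [ "$i" -le "$count" ]; do
    # Each run leaves a marker file only if the command exits successfully.
    ( if "$@" >/dev/null 2>&1; then : > "$tmp/ok.$i"; fi ) &
    i=$((i + 1))
  done
  wait                                   # block until every run has finished
  ok=$(find "$tmp" -name 'ok.*' | wc -l | tr -d ' ')
  rm -rf "$tmp"
  echo "$ok/$count succeeded"
}

# Usage (hypothetical webhook path, replace with the one from your test workflow):
#   fire_parallel 10 curl -fsS -X POST \
#     "https://slidefactory-n8n.thankfulsmoke-fef50a06.westeurope.azurecontainerapps.io/webhook/<test-path>"
```

While the requests are in flight, the worker logs from step 2 should show the jobs spread across more than one replica if parallel processing works.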

Scaling Operations

Manual Scaling

# Scale workers up
az containerapp update \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --min-replicas 5 \
  --max-replicas 20

# Scale workers down
az containerapp update \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --min-replicas 1 \
  --max-replicas 5

Auto-Scaling Triggers

Workers will auto-scale based on:
- CPU utilization: >70% triggers scale up
- Memory utilization: >80% triggers scale up
- Cool-down period: 5 minutes between scale operations

Monitoring

Key Metrics to Monitor

Worker Count: Number of active worker replicas
Queue Depth: Number of pending jobs in Redis
Execution Time: Average workflow execution time
Success Rate: Percentage of successful executions
Redis Memory: Memory usage of Redis DB 6
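
Since the main instance is started with N8N_METRICS=true, it should expose Prometheus-format metrics at /metrics. The sketch below pulls a single value out of that output. The metric names in the endpoint depend on the n8n version, so the name passed in is whatever the endpoint actually lists; the helper names are illustrative, and the parser only handles metrics without labels.

```shell
#!/bin/sh
# Sketch: read one unlabeled metric from Prometheus text output.

N8N_URL="https://slidefactory-n8n.thankfulsmoke-fef50a06.westeurope.azurecontainerapps.io"

# Read Prometheus text format on stdin and print the value of the metric
# whose name matches $1 exactly. Exits non-zero if the metric is absent.
metric_value() {
  awk -v name="$1" '
    $1 == name { print $2; found = 1; exit }
    END { if (!found) exit 1 }
  '
}

# Fetch the metrics endpoint of the main instance and extract one metric.
fetch_metric() {
  curl -fsS "$N8N_URL/metrics" | metric_value "$1"
}

# Usage: list the available names first, then query one of them.
#   curl -fsS "$N8N_URL/metrics" | grep -v '^#' | cut -d' ' -f1 | sort -u
#   fetch_metric <metric-name-from-the-list>
```

Comment lines (starting with #) never match because their first field is the hash sign, so HELP and TYPE lines are skipped without extra filtering.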

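Azure Monitor Queries

// Worker CPU utilization
// Note: the console log table normally holds log text only. This query assumes
// a custom CpuPercent_d column is being populated; otherwise read CPU usage
// from the Container Apps platform metrics in Azure Monitor instead.
ContainerAppConsoleLogs_CL
| where ContainerAppName_s == "slidefactory-n8n-worker"
| summarize avg(CpuPercent_d) by bin(TimeGenerated, 5m)

// Workflow execution count
// The "Workflow executed" match string is a placeholder; adjust it to the
// message the workers actually log.
ContainerAppConsoleLogs_CL
| where ContainerAppName_s == "slidefactory-n8n-worker"
| where Log_s contains "Workflow executed"
| summarize count() by bin(TimeGenerated, 1h)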

Cost Implications

Current Setup

Main: 1 instance × 0.5 vCPU × 1GB RAM = ~$30/month

Queue Mode Setup

Main: 1 instance × 0.5 vCPU × 1GB RAM = ~$30/month
Workers (average 3-4, costed at 4): 4 × 1 vCPU × 2GB RAM = ~$200-280/month
Total: ~$230-310/month
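
To see how the total moves with the worker count, the figures above can be turned into a small calculation. The per-worker range of $50-70/month is an assumption derived by dividing the $200-280 worker estimate by 4; it is not an Azure price quote.

```shell
#!/bin/sh
# Sketch: monthly cost range for a given average worker count, using the
# estimates from this section (USD per month, rounded).

MAIN_COST=30          # main instance, from the estimate above
WORKER_COST_LOW=50    # assumed: 200 / 4 workers
WORKER_COST_HIGH=70   # assumed: 280 / 4 workers

# Print "low-high" for the given number of workers.
estimate_monthly_cost() {
  workers="$1"
  low=$(( MAIN_COST + workers * WORKER_COST_LOW ))
  high=$(( MAIN_COST + workers * WORKER_COST_HIGH ))
  echo "$low-$high"
}

estimate_monthly_cost 4    # prints 230-310, matching the total above
estimate_monthly_cost 2    # prints 130-170 (minimum replica count)
estimate_monthly_cost 10   # prints 530-730 (maximum replica count)
```

The spread between the minimum and maximum replica counts shows why the average worker count, rather than the configured maximum, drives the bill.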

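Cost Optimization

Lower the minimum replica count during off-hours (min=1)
Monitor and tune CPU and memory limits per worker
Consider a committed-use discount such as an Azure savings plan for compute; the 30% figure is an estimate to verify against current pricing, since Reserved Instances are a virtual machine offering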

Rollback Plan

If issues occur:

# Rollback main instance to regular mode
az containerapp update \
  --name slidefactory-n8n \
  --resource-group rg-slidefactory-prod-001 \
  --remove-env-vars \
    EXECUTIONS_MODE \
    QUEUE_BULL_REDIS_HOST \
    QUEUE_BULL_REDIS_PORT \
    QUEUE_BULL_REDIS_PASSWORD \
    QUEUE_BULL_REDIS_DB \
    QUEUE_BULL_REDIS_TLS

# Delete worker instances
az containerapp delete \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --yes

Troubleshooting

Problem: Workers Not Picking Up Jobs

Diagnosis:

# Check Redis connection from worker
az containerapp logs show \
  --name slidefactory-n8n-worker \
  --resource-group rg-slidefactory-prod-001 \
  --tail 100 | grep -i redis

Solution:
- Verify QUEUE_BULL_REDIS_* variables match between the main instance and the workers
- Ensure N8N_ENCRYPTION_KEY is identical
- Check Redis network connectivity

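Problem: High Queue Depth

Diagnosis:

# Check queue length in Redis.
# The key name assumes Bull's default "bull" prefix and a queue named "jobs";
# confirm it against the keys that actually exist in DB 6.
redis-cli -h <host> -p 6380 -a <password> --tls -n 6 LLEN "bull:jobs:wait"

Solution:
- Increase worker min/max replicas
- Check worker logs for errors
- Verify workflows are not failing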

Problem: Database Connection Pool Exhausted

Solution:
- Increase PostgreSQL max_connections
- Add PgBouncer connection pooler
- Reduce worker count temporarily
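
Whether the pool can be exhausted at all is a matter of simple arithmetic: every n8n process (main plus each worker) opens up to its pool size in connections. The sketch below estimates the remaining headroom. All numbers in the example are illustrative assumptions; the real pool size per process (in recent n8n versions configurable through DB_POSTGRESDB_POOL_SIZE, to be confirmed for the deployed version) and the server's max_connections need to be looked up.

```shell
#!/bin/sh
# Sketch: estimate free PostgreSQL connections for a given worker count.

# Arguments: worker count, pool size per n8n process, server max_connections,
# connections reserved for other clients (for example Slidefactory itself).
# Prints the remaining headroom; a negative number means the pool can be exhausted.
connection_headroom() {
  workers="$1"; pool_size="$2"; max_connections="$3"; reserved="$4"
  needed=$(( (workers + 1) * pool_size ))   # +1 accounts for the main instance
  echo $(( max_connections - reserved - needed ))
}

# Example with assumed values: 10 workers, pool of 2, max_connections 50, 5 reserved
connection_headroom 10 2 50 5    # prints 23
# Doubling the workers and the pool size overshoots the limit
connection_headroom 20 4 50 5    # prints -39
```

A negative result at the configured maximum replica count is a signal to lower --max-replicas, raise max_connections, or put a pooler in front of the database before scaling out.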

Next Steps

Review and approve this plan
Test on preview environment first
Schedule production deployment window
Set up monitoring and alerts
Document runbook for team

Implementation Script: /home/cgast/Github/slidefactory/s5-slidefactory/scripts/deploy-n8n-queue-mode.sh
Last Updated: 2025-11-25