Load Balancing
Load balancer configuration, health checks, session affinity, and scaling patterns for SysManage distributed deployments.
Load Balancing Overview
SysManage is designed to support horizontal scaling through load balancing. The architecture enables multiple server instances to share the load while maintaining session continuity and data consistency.
Core Principles
- Stateless Design: Application servers maintain no session state
- Database Centralization: Shared PostgreSQL cluster for consistency
- WebSocket Affinity: Sticky sessions for real-time connections
- Health-Based Routing: Automatic failure detection and recovery
Load Balancing Architecture
Multi-Tier Load Balancing
┌─────────────────────────────────────────────────────────────────┐
│ External Load Balancer │
│ (Layer 4/7) │
│ ┌─────────────────┐ │
│ │ Cloud LB / F5 │ │
│ │ AWS ALB / GCP │ │
│ └─────────┬───────┘ │
└─────────────────────────────┼───────────────────────────────────┘
│
┌─────────────────────────────┼───────────────────────────────────┐
│ Internal Load Balancer │
│ (Application Layer) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Nginx │ │ HAProxy │ │ Traefik │ │
│ │ (Preferred) │ │ (Alternative)│ │ (Container) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
└───────────┼─────────────────────┼─────────────────────┼─────────┘
│ │ │
┌───────┼───────┬───────────┼───────┬───────────┼───────┐
│ │ │ │
┌───▼────┐ ┌───▼────┐ ┌───▼────┐ ┌───▼────┐
│ App #1 │ │ App #2 │ │ App #3 │ │ App #4 │
│ │ │ │ │ │ │ │
│FastAPI │ │FastAPI │ │FastAPI │ │FastAPI │
│Server │ │Server │ │Server │ │Server │
└────────┘ └────────┘ └────────┘ └────────┘
│ │ │ │
└───────┬───────┴───────────┬───────┴───────────┬───────┘
│ │ │
┌───────────┼─────────────────────┼─────────────────────┼─────────┐
│ PostgreSQL Cluster │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Primary │ │ Replica │ │ Replica │ │
│ │ (Write) │ │ (Read) │ │ (Read) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Nginx Configuration
Primary Load Balancer Setup
nginx.conf - Main Configuration
upstream sysmanage_backend {
    # Load balancing method
    least_conn;

    # Backend server pool
    server 10.0.1.10:8000 weight=3 max_fails=3 fail_timeout=30s;
    server 10.0.1.11:8000 weight=3 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:8000 weight=2 max_fails=3 fail_timeout=30s;
    server 10.0.1.13:8000 backup;   # Backup server

    # Upstream keepalive connection pool
    keepalive 32;
    keepalive_requests 100;
    keepalive_timeout 60s;
}

upstream sysmanage_websocket {
    # WebSocket connections need IP hash for stickiness
    ip_hash;

    server 10.0.1.10:8000;
    server 10.0.1.11:8000;
    server 10.0.1.12:8000;

    keepalive 16;
}

# Rate limiting zones (limit_req_zone must be declared in the http
# context, alongside the upstreams — not inside a server block)
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/s;

server {
    listen 80;
    listen 443 ssl http2;
    server_name sysmanage.example.com;

    # SSL Configuration
    ssl_certificate /etc/nginx/ssl/sysmanage.crt;
    ssl_certificate_key /etc/nginx/ssl/sysmanage.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384;
    ssl_prefer_server_ciphers off;

    # Security headers
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

    # API endpoints
    location /api/ {
        limit_req zone=api burst=20 nodelay;

        proxy_pass http://sysmanage_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";   # required for upstream keepalive
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Fail over to the next upstream on errors
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
        proxy_connect_timeout 5s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }

    # Authentication endpoints (stricter rate limiting)
    location /api/auth/ {
        limit_req zone=auth burst=10 nodelay;

        proxy_pass http://sysmanage_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # WebSocket connections (sticky sessions)
    location /ws/ {
        proxy_pass http://sysmanage_websocket;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket-specific timeouts for long-lived connections
        # (connect stays short; only send/read are extended)
        proxy_connect_timeout 5s;
        proxy_send_timeout 7d;
        proxy_read_timeout 7d;
    }

    # Static files
    location /static/ {
        alias /var/www/sysmanage/static/;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }

    # Health check endpoint
    location /health {
        access_log off;
        default_type text/plain;
        return 200 "healthy\n";
    }
}
Advanced Health Checks
Custom Health Check Configuration
# Alternative /api/ location that gates requests on a backend health probe.
# NOTE: this replaces the /api/ location shown above — a server block
# cannot contain two identical location blocks.

# Internal health check subrequest target
location /backend-health {
    internal;
    proxy_pass http://sysmanage_backend/api/health/detailed;
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
    proxy_set_header X-Original-URI $request_uri;

    proxy_connect_timeout 1s;
    proxy_send_timeout 1s;
    proxy_read_timeout 1s;
}

# API location with failover logic
location /api/ {
    # Fall back to the backup pool on gateway errors
    error_page 502 503 504 = @fallback;
    proxy_pass http://sysmanage_backend;

    # Validate backend health via an auth_request subrequest
    auth_request /backend-health;

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
}

# Fallback to backup servers (requires an upstream named sysmanage_backup)
location @fallback {
    proxy_pass http://sysmanage_backup;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
}
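The @fallback block proxies to an upstream named sysmanage_backup that the configuration above does not define. A minimal definition might look like the following (the addresses are illustrative):

```nginx
# Backup pool used only when the primary pool returns 502/503/504
upstream sysmanage_backup {
    server 10.0.1.13:8000;   # illustrative — a standby instance
    server 10.0.1.14:8000;   # illustrative
}
```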
HAProxy Configuration
Enterprise-Grade Load Balancing
haproxy.cfg - Complete Configuration
global
    daemon
    user haproxy
    group haproxy
    pidfile /var/run/haproxy.pid

    # SSL/TLS configuration
    ssl-default-bind-ciphers ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11

    # Logging
    log 127.0.0.1:514 local0

    # Performance tuning
    tune.ssl.default-dh-param 2048
    tune.bufsize 32768
    tune.maxrewrite 8192

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option log-health-checks
    option forwardfor
    option http-server-close

    # Timeouts
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    timeout http-request 10s
    timeout http-keep-alive 10s
    timeout check 3000ms

    # Retry configuration
    retries 3
    option redispatch

frontend sysmanage_frontend
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/sysmanage.pem

    # Redirect HTTP to HTTPS
    redirect scheme https if !{ ssl_fc }

    # Security headers
    http-response set-header X-Frame-Options DENY
    http-response set-header X-Content-Type-Options nosniff
    http-response set-header X-XSS-Protection "1; mode=block"
    http-response set-header Strict-Transport-Security "max-age=31536000; includeSubDomains"

    # Rate limiting using stick tables
    stick-table type ip size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny if { sc_http_req_rate(0) gt 20 }

    # ACL definitions
    acl is_websocket hdr(Upgrade) -i websocket
    acl is_api path_beg /api/
    acl is_auth path_beg /api/auth/
    acl is_static path_beg /static/
    acl is_health path /health

    # Routing logic
    use_backend sysmanage_websocket if is_websocket
    use_backend sysmanage_api if is_api
    use_backend sysmanage_static if is_static
    use_backend sysmanage_health if is_health
    default_backend sysmanage_web

backend sysmanage_api
    balance leastconn
    option httpchk GET /api/health

    # Server pool with health checks
    server app1 10.0.1.10:8000 check inter 5s fall 3 rise 2 weight 100
    server app2 10.0.1.11:8000 check inter 5s fall 3 rise 2 weight 100
    server app3 10.0.1.12:8000 check inter 5s fall 3 rise 2 weight 80
    server app4 10.0.1.13:8000 check inter 5s fall 3 rise 2 weight 80 backup

    option forwardfor

    # Retry logic (HAProxy 2.0+)
    retry-on all-retryable-errors
    retries 3

backend sysmanage_websocket
    balance source                  # Sticky sessions for WebSocket
    option httpchk GET /api/health/websocket

    server app1 10.0.1.10:8000 check inter 10s fall 2 rise 1
    server app2 10.0.1.11:8000 check inter 10s fall 2 rise 1
    server app3 10.0.1.12:8000 check inter 10s fall 2 rise 1

    # WebSocket-specific settings
    timeout tunnel 1h
    timeout server 1h

backend sysmanage_web
    balance roundrobin
    option httpchk GET /health

    server web1 10.0.1.20:3000 check inter 5s
    server web2 10.0.1.21:3000 check inter 5s
    server web3 10.0.1.22:3000 check inter 5s backup

backend sysmanage_static
    balance roundrobin
    option httpchk GET /health

    server static1 10.0.1.30:80 check inter 10s
    server static2 10.0.1.31:80 check inter 10s

backend sysmanage_health
    # Answer health probes directly from the load balancer (HAProxy 2.2+)
    http-request return status 200 content-type text/plain string "OK"

# Statistics and monitoring
listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s
    stats admin if TRUE             # restrict access to this port in production
Health Check Implementation
Application Health Endpoints
FastAPI Health Check Implementation
from datetime import datetime
from typing import Any, Dict
import asyncio
import time

from fastapi import APIRouter, HTTPException

# `db`, `redis_client`, and the agent/WebSocket helper functions used below
# are assumed to be provided elsewhere in the application (database session,
# Redis client, and agent-registry queries).

health_router = APIRouter(prefix="/api/health")


class HealthChecker:
    def __init__(self):
        self.start_time = time.time()
        self.last_db_check = None
        self.last_redis_check = None

    async def check_database(self) -> Dict[str, Any]:
        """Check PostgreSQL database connectivity"""
        try:
            start = time.time()
            # Simple connectivity check
            await db.execute("SELECT 1")
            latency = (time.time() - start) * 1000
            self.last_db_check = datetime.utcnow()
            return {
                "status": "healthy",
                "latency_ms": round(latency, 2),
                "last_check": self.last_db_check.isoformat(),
            }
        except Exception as e:
            return {
                "status": "unhealthy",
                "error": str(e),
                "last_check": datetime.utcnow().isoformat(),
            }

    async def check_redis(self) -> Dict[str, Any]:
        """Check Redis connectivity"""
        try:
            start = time.time()
            await redis_client.ping()
            latency = (time.time() - start) * 1000
            self.last_redis_check = datetime.utcnow()
            return {
                "status": "healthy",
                "latency_ms": round(latency, 2),
                "last_check": self.last_redis_check.isoformat(),
            }
        except Exception as e:
            return {
                "status": "unhealthy",
                "error": str(e),
                "last_check": datetime.utcnow().isoformat(),
            }

    async def check_agent_connectivity(self) -> Dict[str, Any]:
        """Check agent connectivity health"""
        try:
            # Get agent connectivity stats
            active_agents = await get_active_agent_count()
            total_agents = await get_total_agent_count()
            disconnected_agents = total_agents - active_agents
            connectivity_ratio = active_agents / max(total_agents, 1)
            return {
                "status": "healthy" if connectivity_ratio > 0.8 else "degraded",
                "active_agents": active_agents,
                "total_agents": total_agents,
                "disconnected_agents": disconnected_agents,
                "connectivity_ratio": round(connectivity_ratio, 3),
            }
        except Exception as e:
            return {
                "status": "unhealthy",
                "error": str(e),
            }


health_checker = HealthChecker()


@health_router.get("/")
async def basic_health():
    """Basic health check for load balancer"""
    return {"status": "ok", "timestamp": datetime.utcnow().isoformat()}


@health_router.get("/detailed")
async def detailed_health():
    """Detailed health check with dependency status"""
    # Run all checks concurrently
    db_check, redis_check, agent_check = await asyncio.gather(
        health_checker.check_database(),
        health_checker.check_redis(),
        health_checker.check_agent_connectivity(),
        return_exceptions=True,
    )

    # Calculate overall health
    checks = [db_check, redis_check, agent_check]
    healthy_checks = sum(
        1 for check in checks
        if isinstance(check, dict) and check.get("status") in ("healthy", "degraded")
    )

    overall_status = "healthy"
    if healthy_checks < len(checks):
        overall_status = "unhealthy"
    elif any(check.get("status") == "degraded"
             for check in checks if isinstance(check, dict)):
        overall_status = "degraded"

    response = {
        "status": overall_status,
        "timestamp": datetime.utcnow().isoformat(),
        "uptime_seconds": round(time.time() - health_checker.start_time, 2),
        "checks": {
            "database": db_check if isinstance(db_check, dict) else {"status": "error", "error": str(db_check)},
            "redis": redis_check if isinstance(redis_check, dict) else {"status": "error", "error": str(redis_check)},
            "agents": agent_check if isinstance(agent_check, dict) else {"status": "error", "error": str(agent_check)},
        },
    }

    # Return 503 so load balancers mark this instance down
    if overall_status == "unhealthy":
        raise HTTPException(status_code=503, detail=response)
    return response


@health_router.get("/websocket")
async def websocket_health():
    """WebSocket-specific health check"""
    try:
        # Check WebSocket server health
        active_connections = await get_active_websocket_connections()
        max_connections = 1000  # Configuration-based limit
        connection_ratio = active_connections / max_connections
        status = "degraded" if connection_ratio > 0.9 else "healthy"
        return {
            "status": status,
            "active_connections": active_connections,
            "max_connections": max_connections,
            "connection_ratio": round(connection_ratio, 3),
            "timestamp": datetime.utcnow().isoformat(),
        }
    except Exception as e:
        raise HTTPException(status_code=503, detail={
            "status": "unhealthy",
            "error": str(e),
            "timestamp": datetime.utcnow().isoformat(),
        })
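The aggregation rule used by the detailed endpoint (any failed dependency makes the service unhealthy; otherwise any degraded dependency makes it degraded) can be exercised in isolation. A minimal, dependency-free sketch:

```python
def overall_status(checks):
    """Aggregate dependency check results into one service status.

    Mirrors the rule in /api/health/detailed above: any failed check
    makes the service unhealthy; otherwise any degraded check makes
    it degraded; otherwise it is healthy.
    """
    statuses = [c.get("status") for c in checks]
    if any(s not in ("healthy", "degraded") for s in statuses):
        return "unhealthy"
    if "degraded" in statuses:
        return "degraded"
    return "healthy"

print(overall_status([{"status": "healthy"}, {"status": "degraded"}]))   # degraded
print(overall_status([{"status": "healthy"}, {"status": "unhealthy"}]))  # unhealthy
```

Note that "degraded" still counts as up: the load balancer keeps routing to the instance, but the status is surfaced for monitoring.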
Custom Health Check Script
Advanced Health Verification
#!/bin/bash
# Health check script for external monitoring

ENDPOINT="https://sysmanage.example.com/api/health/detailed"
TIMEOUT=10
RETRY_COUNT=3

check_health() {
    local attempt=1

    while [ $attempt -le $RETRY_COUNT ]; do
        echo "Health check attempt $attempt..."

        response=$(curl -s -w "%{http_code}" -m $TIMEOUT "$ENDPOINT")
        http_code="${response: -3}"
        body="${response%???}"

        case $http_code in
            200)
                echo "✓ Service healthy"
                echo "$body" | jq .
                return 0
                ;;
            503)
                echo "⚠ Service degraded"
                echo "$body" | jq .
                return 1
                ;;
            *)
                echo "✗ Service unhealthy (HTTP $http_code)"
                echo "$body"
                ;;
        esac

        attempt=$((attempt + 1))
        sleep 2
    done

    return 2
}

# Run health check
if check_health; then
    echo "All systems operational"
    exit 0
else
    echo "Health check failed"
    exit 1
fi
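A script like this is typically run on a schedule from a monitoring host outside the load-balanced pool. One way to wire it up, with illustrative (hypothetical) paths:

```
# /etc/cron.d/sysmanage-healthcheck — paths and user are illustrative
*/5 * * * * monitor /usr/local/bin/sysmanage-healthcheck.sh >> /var/log/sysmanage-healthcheck.log 2>&1
```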
Session Affinity
WebSocket Sticky Sessions
WebSocket connections require session affinity to maintain real-time communication state:
Redis-Based Session Store
import logging
from typing import Optional

# Async Redis client (redis-py >= 4.2); a plain `import redis` would give a
# synchronous client whose methods cannot be awaited.
import redis.asyncio as redis

logger = logging.getLogger(__name__)


class SessionAffinityManager:
    def __init__(self, redis_client: redis.Redis):
        self.redis = redis_client
        self.session_ttl = 3600  # 1 hour

    async def get_server_for_session(self, session_id: str) -> Optional[str]:
        """Get assigned server for WebSocket session"""
        try:
            server = await self.redis.get(f"session:{session_id}")
            return server.decode() if server else None
        except Exception as e:
            logger.error(f"Failed to get session affinity: {e}")
            return None

    async def assign_server_to_session(self, session_id: str, server_id: str):
        """Assign server to WebSocket session"""
        try:
            await self.redis.setex(
                f"session:{session_id}",
                self.session_ttl,
                server_id,
            )
            # Track active sessions per server
            await self.redis.sadd(f"server:{server_id}:sessions", session_id)
            await self.redis.expire(f"server:{server_id}:sessions", self.session_ttl)
        except Exception as e:
            logger.error(f"Failed to assign session affinity: {e}")

    async def remove_session(self, session_id: str):
        """Remove session affinity mapping"""
        try:
            server_id = await self.get_server_for_session(session_id)
            if server_id:
                await self.redis.srem(f"server:{server_id}:sessions", session_id)
            await self.redis.delete(f"session:{session_id}")
        except Exception as e:
            logger.error(f"Failed to remove session affinity: {e}")

    async def get_server_session_count(self, server_id: str) -> int:
        """Get active session count for server"""
        try:
            return await self.redis.scard(f"server:{server_id}:sessions")
        except Exception as e:
            logger.error(f"Failed to get session count: {e}")
            return 0
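The data structure the manager maintains — a session-to-server map plus a per-server session set for load accounting — is easy to see with a dependency-free, in-memory stand-in (a simplified sketch, not the Redis-backed implementation above; it omits TTLs and error handling):

```python
class InMemoryAffinity:
    """Dependency-free sketch of the session -> server mapping above,
    plus the per-server session sets used for load accounting."""

    def __init__(self):
        self.session_to_server = {}
        self.server_sessions = {}

    def assign(self, session_id, server_id):
        self.session_to_server[session_id] = server_id
        self.server_sessions.setdefault(server_id, set()).add(session_id)

    def lookup(self, session_id):
        return self.session_to_server.get(session_id)

    def remove(self, session_id):
        server_id = self.session_to_server.pop(session_id, None)
        if server_id:
            self.server_sessions[server_id].discard(session_id)

    def count(self, server_id):
        return len(self.server_sessions.get(server_id, set()))


affinity = InMemoryAffinity()
affinity.assign("sess-1", "app1")
affinity.assign("sess-2", "app1")
print(affinity.lookup("sess-1"))   # app1
print(affinity.count("app1"))      # 2
affinity.remove("sess-1")
print(affinity.count("app1"))      # 1
```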
Load Balancer Integration
Nginx Lua Script for Dynamic Routing
-- Dynamic server selection for WebSocket connections
-- (requires OpenResty / lua-nginx-module with lua-resty-redis)
local redis = require "resty.redis"

local function get_redis_connection()
    local red = redis:new()
    red:set_timeouts(1000, 1000, 1000)  -- connect, send, read timeouts (ms)

    local ok, err = red:connect("127.0.0.1", 6379)
    if not ok then
        ngx.log(ngx.ERR, "Failed to connect to Redis: ", err)
        return nil
    end
    return red
end

local function get_session_server(session_id)
    local red = get_redis_connection()
    if not red then
        return nil
    end

    local res, err = red:get("session:" .. session_id)
    red:set_keepalive(10000, 100)  -- return the connection to the pool

    if not res or res == ngx.null then
        return nil
    end
    return res
end

local function assign_least_loaded_server(session_id)
    local servers = {"10.0.1.10:8000", "10.0.1.11:8000", "10.0.1.12:8000"}
    local min_load = math.huge
    local selected_server = servers[1]

    local red = get_redis_connection()
    if not red then
        return selected_server
    end

    -- Find the server with the fewest tracked sessions
    for _, server in ipairs(servers) do
        local count, err = red:scard("server:" .. server .. ":sessions")
        if count and count < min_load then
            min_load = count
            selected_server = server
        end
    end

    -- Record the assignment
    red:setex("session:" .. session_id, 3600, selected_server)
    red:sadd("server:" .. selected_server .. ":sessions", session_id)
    red:expire("server:" .. selected_server .. ":sessions", 3600)
    red:set_keepalive(10000, 100)

    return selected_server
end

-- Main routing logic
local session_id = ngx.var.cookie_sessionid or ngx.var.arg_session
if not session_id then
    ngx.status = 400
    ngx.say("Missing session ID")
    return
end

local assigned_server = get_session_server(session_id)
if not assigned_server then
    assigned_server = assign_least_loaded_server(session_id)
end

-- Expose the choice to nginx (e.g. proxy_pass http://$backend_server)
ngx.var.backend_server = assigned_server
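The selection step in the Lua script reduces to "pick the backend whose tracked session count is smallest". A minimal sketch of that logic, in Python for clarity (`session_counts` stands in for what the Lua code reads via SCARD):

```python
def pick_least_loaded(servers, session_counts):
    """Pick the backend with the fewest active WebSocket sessions.

    Mirrors the selection loop in the Lua script above; servers with
    no recorded sessions count as zero, and ties go to the first
    server in the list.
    """
    return min(servers, key=lambda s: session_counts.get(s, 0))

servers = ["10.0.1.10:8000", "10.0.1.11:8000", "10.0.1.12:8000"]
counts = {"10.0.1.10:8000": 12, "10.0.1.11:8000": 4, "10.0.1.12:8000": 9}
print(pick_least_loaded(servers, counts))  # 10.0.1.11:8000
```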
Auto-Scaling Configuration
Kubernetes Horizontal Pod Autoscaler
HPA Configuration
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: sysmanage-hpa
  namespace: sysmanage
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sysmanage-backend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: active_connections
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 2
        periodSeconds: 60
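Two prerequisites are easy to miss: the Utilization targets are computed against the pods' resource requests, so the target Deployment must declare them, and the Pods metric (active_connections) assumes a custom-metrics pipeline such as prometheus-adapter is installed in the cluster. A minimal resources stanza for the Deployment's pod template, with illustrative values:

```yaml
# Fragment of the sysmanage-backend Deployment pod template (values illustrative)
spec:
  containers:
  - name: sysmanage-backend
    image: sysmanage/backend:latest   # hypothetical image name
    resources:
      requests:
        cpu: "500m"       # basis for the 70% CPU utilization target
        memory: "512Mi"   # basis for the 80% memory utilization target
      limits:
        cpu: "1"
        memory: "1Gi"
```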
Custom Metrics for Scaling
Prometheus Metrics Export
import time

from fastapi import FastAPI, Response
from prometheus_client import Counter, Gauge, Histogram, generate_latest

# Metrics for autoscaling
active_connections = Gauge('sysmanage_active_connections', 'Active WebSocket connections')
request_duration = Histogram('sysmanage_request_duration_seconds', 'Request duration')
requests_total = Counter('sysmanage_requests_total', 'Total HTTP requests', ['method', 'endpoint'])
cpu_usage = Gauge('sysmanage_cpu_usage_percent', 'CPU usage percentage')
memory_usage = Gauge('sysmanage_memory_usage_bytes', 'Memory usage in bytes')


class MetricsMiddleware:
    """ASGI middleware that records request counts and durations."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            start_time = time.time()

            # Track request
            method = scope["method"]
            path = scope["path"]
            requests_total.labels(method=method, endpoint=path).inc()

            async def send_wrapper(message):
                if message["type"] == "http.response.start":
                    # Record request duration
                    duration = time.time() - start_time
                    request_duration.observe(duration)
                await send(message)

            await self.app(scope, receive, send_wrapper)
        else:
            await self.app(scope, receive, send)


# `app` is the application's FastAPI instance, defined elsewhere
@app.get("/metrics")
async def get_metrics():
    """Prometheus metrics endpoint"""
    return Response(generate_latest(), media_type="text/plain")
Load Balancer Monitoring
Key Metrics
Traffic Distribution
- Requests per backend server
- Connection distribution ratios
- Session affinity effectiveness
- Load balancing algorithm performance
Health & Availability
- Backend server health status
- Failed health check counts
- Server failover frequency
- Recovery time metrics
Performance Metrics
- Response time distribution
- Connection establishment time
- SSL handshake duration
- Throughput per server
Monitoring Dashboard
Grafana Dashboard Query Examples
# Request rate per backend
rate(nginx_http_requests_total[5m])
# Response time percentiles
histogram_quantile(0.95, rate(sysmanage_request_duration_seconds_bucket[5m]))
# Active connections per server
sysmanage_active_connections
# Health check success rate
rate(nginx_upstream_checks_total{status="up"}[5m]) / rate(nginx_upstream_checks_total[5m])
# Load distribution fairness (coefficient of variation)
stddev_over_time(rate(nginx_http_requests_total[5m])) / avg_over_time(rate(nginx_http_requests_total[5m]))
# SSL handshake duration
histogram_quantile(0.90, rate(nginx_ssl_handshake_time_bucket[5m]))
Troubleshooting
Common Issues
Uneven Load Distribution
Symptoms: Some servers overloaded while others idle
Solutions:
- Check load balancing algorithm configuration
- Verify server weights are appropriate
- Monitor session affinity impact
- Review health check intervals
WebSocket Connection Drops
Symptoms: Frequent WebSocket disconnections
Solutions:
- Verify session affinity configuration
- Check WebSocket timeout settings
- Monitor server failover behavior
- Review proxy buffer sizes
Health Check Failures
Symptoms: False positive health check failures
Solutions:
- Adjust health check timeouts
- Implement gradual health checks
- Review application startup times
- Check resource constraints
Best Practices
Configuration Guidelines
- Health Check Tuning: Set appropriate intervals and timeouts
- Session Affinity: Use sticky sessions only when necessary
- Graceful Shutdowns: Implement proper connection draining
- SSL Termination: Terminate SSL at load balancer for performance
Operational Guidelines
- Monitoring: Monitor load distribution and health status
- Capacity Planning: Plan for peak traffic scenarios
- Disaster Recovery: Test failover scenarios regularly
- Security: Implement rate limiting and DDoS protection