Types of Elastic Load Balancers and Their Characteristics
AWS offers three primary load balancer types. Each is optimized for different scenarios and workload requirements.
Application Load Balancer (ALB)
Application Load Balancers operate at Layer 7 (Application layer) and are ideal for modern web applications. They excel with microservices and container deployments.
ALBs support advanced routing based on hostnames, paths, and protocols. This makes them perfect for complex architectures serving multiple applications behind a single load balancer.
Network Load Balancer (NLB)
Network Load Balancers operate at Layer 4 (Transport layer) and handle extreme performance requirements. They deliver ultra-high throughput, sub-millisecond latency, and millions of requests per second.
NLBs are best for:
- Non-HTTP protocols
- Gaming applications
- IoT deployments
- Financial trading platforms
Classic Load Balancer (CLB)
Classic Load Balancers are the original ELB service. They operate across both Layer 4 and Layer 7, provide only basic load balancing, and are the only type that supported the legacy EC2-Classic network (they can also run in a VPC).
For exam purposes, understand that ALB dominates modern AWS deployments due to its content-based routing. NLB handles extreme scale scenarios. Each type requires different configuration approaches and integrates differently with Auto Scaling groups, target groups, and health checks.
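The type choice above shows up as a single parameter in the ELBv2 API. The sketch below builds parameter sets for boto3's `elbv2` `create_load_balancer` without making any AWS call; the subnet and security group IDs are placeholders, and the helper `lb_params` is illustrative rather than an AWS API.

```python
# Sketch (no AWS call is made): parameter sets for boto3's
# elbv2.create_load_balancer; IDs are placeholders.

def lb_params(name, lb_type, subnets):
    """Build a create_load_balancer parameter set for the given type."""
    params = {
        "Name": name,
        "Type": lb_type,              # "application" or "network"
        "Scheme": "internet-facing",
        "Subnets": subnets,
    }
    if lb_type == "application":
        # ALBs require security groups; NLBs were historically created
        # without them (newer NLBs can optionally attach one).
        params["SecurityGroups"] = ["sg-EXAMPLE"]
    return params

alb = lb_params("web-alb", "application", ["subnet-a", "subnet-b"])
nlb = lb_params("edge-nlb", "network", ["subnet-a", "subnet-b"])
```

Everything else (listeners, rules, target groups) is configured separately, which is why the two types feel so different despite sharing one create call.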
Load Balancer Configuration and Health Checks
Proper configuration directly impacts application availability and performance. Health checks are critical components that monitor target health automatically.
Health Check Configuration
You configure health checks by specifying:
- Protocol (HTTP, HTTPS, or TCP)
- Port number
- Path (for HTTP/HTTPS)
- Timing intervals
The healthy threshold defines how many consecutive successful checks a target needs before it is considered healthy, typically 2-3.
The unhealthy threshold specifies how many consecutive failures mark a target unhealthy and stop traffic to it, also typically 2-3.
Interval settings determine how frequently health checks occur. Shorter intervals provide faster failure detection but use more resources.
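The trade-off above can be made concrete with a back-of-the-envelope formula: the worst-case time to eject a failing target is roughly the interval multiplied by the unhealthy threshold. This is an approximation (it ignores the per-check timeout, and exact behavior varies slightly by load balancer type):

```python
def failure_detection_seconds(interval, unhealthy_threshold):
    """Worst-case time to mark a target unhealthy: every check in the
    window must fail, so detection takes about
    unhealthy_threshold * interval seconds."""
    return interval * unhealthy_threshold

# Common settings: a 30 s interval with a threshold of 3 means a dead
# target keeps receiving traffic for up to ~90 s.
assert failure_detection_seconds(30, 3) == 90
# Aggressive settings detect faster but probe targets more often.
assert failure_detection_seconds(10, 2) == 20
```

This is the calculation exam scenarios tend to probe: "how long until traffic stops flowing to a failed instance?"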
Target Groups and Deregistration
Target groups are logical sets of resources that receive traffic. They can include EC2 instances, containers, Lambda functions, or on-premises servers.
You can create multiple target groups with different health check configurations for different applications.
Connection draining (for CLB) or deregistration delay (for ALB/NLB) ensures graceful shutdown. This allows in-flight requests to complete before closing connections.
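For ALB/NLB, the deregistration delay is set as a target group attribute. A minimal sketch of the payload, with the attribute key as it appears in the ELBv2 `ModifyTargetGroupAttributes` API and a placeholder ARN:

```python
# Sketch: deregistration delay as a target group attribute
# (key name per the ELBv2 ModifyTargetGroupAttributes API;
# the ARN below is a placeholder).
draining_config = {
    "TargetGroupArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/web-tg/EXAMPLE",
    "Attributes": [
        # Give in-flight requests up to 120 s to finish before the
        # target is fully deregistered (the default is 300 s).
        {"Key": "deregistration_delay.timeout_seconds", "Value": "120"},
    ],
}
```

Shortening the delay speeds up deployments; lengthening it protects long-running requests during scale-in.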
Understanding these parameters is essential for designing architectures that recover quickly from failures and maintain consistent performance under varying load.
Advanced Routing and Traffic Management
Application Load Balancers provide sophisticated routing capabilities essential for modern distributed architectures.
Content-Based Routing Options
Path-based routing directs traffic to different target groups based on URL paths. For example, requests to example.com/api route to one target group while requests to example.com/images route to another.
Hostname-based routing enables a single ALB to serve multiple domains and subdomains. This eliminates the need for separate load balancers for different websites.
Other routing options include:
- HTTP header-based routing
- Query parameter routing
- HTTP method routing
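Each of the options above corresponds to a condition type in the ELBv2 `CreateRule` API. The sketch below lists one condition of each kind; field names follow the API, while the values are illustrative. A rule matches only when all of its conditions match:

```python
# Sketch: one ALB rule condition per type (Field names per the ELBv2
# CreateRule API; values are illustrative).
conditions = [
    {"Field": "path-pattern", "PathPatternConfig": {"Values": ["/api/*"]}},
    {"Field": "host-header", "HostHeaderConfig": {"Values": ["shop.example.com"]}},
    {"Field": "http-header",
     "HttpHeaderConfig": {"HttpHeaderName": "X-Canary", "Values": ["true"]}},
    {"Field": "query-string",
     "QueryStringConfig": {"Values": [{"Key": "version", "Value": "beta"}]}},
    {"Field": "http-request-method",
     "HttpRequestMethodConfig": {"Values": ["POST"]}},
]
```

Combining conditions in one rule (for example, host plus path) is how a single ALB fans traffic out to many services.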
Priority and Sticky Sessions
Priority rules determine the order in which routing rules are evaluated. Lower numbers are processed first, allowing you to create complex traffic patterns.
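First-match-wins evaluation is the key consequence of priorities. The toy simulation below (plain Python, not an AWS API) shows why a more specific rule must carry the lower priority number:

```python
def route(rules, request_path):
    """First-match-wins: evaluate rules in ascending priority order and
    return the target group of the first rule whose prefix matches."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if request_path.startswith(rule["path_prefix"]):
            return rule["target_group"]
    return "default-tg"   # the listener's default action

rules = [
    {"priority": 10, "path_prefix": "/api/v2", "target_group": "api-v2-tg"},
    {"priority": 20, "path_prefix": "/api", "target_group": "api-tg"},
]
# If /api had the lower number, it would swallow all /api/v2 traffic.
assert route(rules, "/api/v2/users") == "api-v2-tg"
assert route(rules, "/api/health") == "api-tg"
assert route(rules, "/index.html") == "default-tg"
```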
Sticky sessions maintain user sessions by routing subsequent requests from the same client to the same target. This is useful for applications maintaining in-memory state.
However, sticky sessions reduce load balancing effectiveness. They're typically avoided in modern stateless architectures.
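When stickiness is needed anyway, it is configured as target group attributes. A sketch with the attribute keys as they appear in the ELBv2 API:

```python
# Sketch: sticky-session settings as ELBv2 target group attributes.
# With "lb_cookie" the ALB issues its own AWSALB cookie; "app_cookie"
# would instead track a cookie your application sets.
stickiness = {
    "Attributes": [
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},
        # Route a given client to the same target for up to one hour.
        {"Key": "stickiness.lb_cookie.duration_seconds", "Value": "3600"},
    ],
}
```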
Performance Features
SSL/TLS termination at the load balancer offloads cryptographic operations from application servers. This improves performance significantly.
Cross-zone load balancing distributes traffic evenly across all registered targets in every enabled Availability Zone, rather than only among targets in the zone that received the request. This prevents uneven load when instance counts differ between zones.
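The effect of cross-zone balancing is easy to quantify. The simplified model below assumes client traffic splits evenly between the load balancer nodes in each zone:

```python
def per_target_share(targets_per_az, cross_zone):
    """Each target's share of total traffic.

    Without cross-zone balancing, the load balancer node in each AZ
    splits that AZ's share (1 / number_of_AZs) among local targets
    only; with it, every target gets an equal share overall.
    """
    azs = len(targets_per_az)
    total = sum(targets_per_az)
    shares = []
    for n in targets_per_az:
        for _ in range(n):
            shares.append(1 / total if cross_zone else 1 / azs / n)
    return shares

# Two targets in AZ-a, one in AZ-b: the lone target carries double.
assert per_target_share([2, 1], cross_zone=False) == [0.25, 0.25, 0.5]
# With cross-zone enabled, every target carries an equal load.
assert per_target_share([2, 1], cross_zone=True) == [1/3, 1/3, 1/3]
```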
These routing mechanisms let you architect solutions that distribute traffic efficiently, support microservices deployments, and implement the sophisticated traffic management that complex applications require.
Integration with Auto Scaling and High Availability Patterns
Load balancers work in tandem with Auto Scaling to create resilient, self-healing architectures.
Automatic Instance Management
When you attach a load balancer to an Auto Scaling group, new instances automatically register as targets and begin receiving traffic as soon as they pass health checks, with no manual configuration.
The load balancer's health checks integrate with Auto Scaling to replace unhealthy instances. This provides automatic recovery without manual intervention.
Target groups enable seamless scaling where instances are added or removed based on metrics like CPU utilization or request count.
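Request-count-based scaling ties the two services together via a target-tracking policy. A sketch of the payload, with shapes following the EC2 Auto Scaling `PutScalingPolicy` API and a placeholder resource label (the required `<alb-suffix>/targetgroup/<tg-suffix>` format):

```python
# Sketch: a target-tracking scaling policy keyed to per-target request
# count (shapes per the EC2 Auto Scaling PutScalingPolicy API;
# the ResourceLabel is a placeholder).
policy = {
    "PolicyName": "hold-1000-requests-per-target",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ALBRequestCountPerTarget",
            "ResourceLabel": "app/web-alb/EXAMPLE/targetgroup/web-tg/EXAMPLE",
        },
        # The group scales out or in to keep roughly this many
        # requests per target per minute.
        "TargetValue": 1000.0,
    },
}
```

Scaling on requests per target rather than CPU keeps capacity aligned with what the load balancer actually sees.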
Multi-Availability Zone Deployment
For maximum availability, deploy load balancers and targets across multiple availability zones. Each AZ hosts its own instances.
The load balancer distributes traffic proportionally across zones. If an entire AZ fails, the load balancer automatically routes all traffic to remaining zones.
Multi-Region Architectures
Multi-region architectures use Route 53 or Global Accelerator for load balancing across regions. Each region contains its own ELB and Auto Scaling group.
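One common pattern is a Route 53 failover pair where each record is an alias to a regional load balancer. The sketch below uses record shapes from the Route 53 `ChangeResourceRecordSets` API; the zone ID and DNS names are placeholders:

```python
# Sketch: Route 53 failover records aliasing regional ALBs
# (shapes per the ChangeResourceRecordSets API; IDs and DNS names
# are placeholders).
primary = {
    "Name": "app.example.com",
    "Type": "A",
    "SetIdentifier": "us-east-1",
    "Failover": "PRIMARY",
    "AliasTarget": {
        "HostedZoneId": "ZEXAMPLE1",
        "DNSName": "web-alb-use1.elb.amazonaws.com",
        # Route 53 checks the alias target's health before answering.
        "EvaluateTargetHealth": True,
    },
}
secondary = dict(
    primary,
    SetIdentifier="eu-west-1",
    Failover="SECONDARY",
    AliasTarget={**primary["AliasTarget"],
                 "DNSName": "web-alb-euw1.elb.amazonaws.com"},
)
```

If the primary region's ALB becomes unhealthy, DNS answers shift to the secondary region automatically.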
Connection pooling at load balancers improves performance by reusing connections to targets. This avoids establishing new connections for each request.
These integration patterns create self-scaling, self-healing architectures. They maintain availability and performance during failures, traffic spikes, and maintenance operations. Exam questions frequently test your understanding of how load balancers, Auto Scaling, and health checks work together.
Performance Optimization and Cost Considerations
Choosing the right load balancer type significantly impacts both performance and cost.
Performance by Load Balancer Type
Network Load Balancers handle extreme-throughput scenarios at the lowest latency, supporting millions of requests per second with microsecond-level latency. Pricing is usage-based: NLBs are metered in NLCUs while ALBs use LCUs, so the cheaper option depends on your traffic profile.
Application Load Balancers provide excellent performance for most web applications. They handle millions of requests per second while offering Layer 7 intelligence required for modern architectures.
Classic Load Balancers lack the advanced routing features of ALB. AWS has deprecated them (the EC2-Classic platform they were built for was retired in 2022), and new applications should use ALB or NLB.
Optimization Techniques
Connection multiplexing at the ALB reuses a pool of backend connections across many client requests, reducing the number of connections each target must handle and improving target efficiency. (NLBs operate as Layer 4 pass-through and do not multiplex connections.)
Keep-alive connections maintain TCP connections between load balancer and targets. This reduces latency by eliminating connection establishment overhead.
AWS scales load balancer capacity automatically with traffic, but understanding how usage is metered helps optimize costs: ALB usage is billed in Load Balancer Capacity Units (LCUs), which track new connections, active connections, processed bytes, and rule evaluations, and you pay for whichever dimension is highest.
Cost Management Strategies
For NLB and Gateway Load Balancer, cross-zone load balancing incurs inter-AZ data transfer charges; for ALB it is free and enabled by default. Consider disabling it on NLB if your architecture keeps targets evenly distributed across zones.
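A rough estimate of the billable cross-zone traffic helps size this trade-off. The model below assumes targets are spread evenly across zones, so about (N - 1)/N of each node's traffic lands in a different zone:

```python
def cross_zone_transfer_gb(total_gb, az_count):
    """Rough estimate of inter-AZ data moved by cross-zone balancing,
    assuming targets are spread evenly over az_count zones: about
    (az_count - 1) / az_count of the traffic crosses a zone boundary."""
    return total_gb * (az_count - 1) / az_count

# 900 GB processed across 3 AZs: roughly 600 GB crosses zone
# boundaries (billable for NLB/GWLB; not billed for ALB).
assert cross_zone_transfer_gb(900, 3) == 600
```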
For predictable traffic patterns, cost savings come mainly from consolidation: a single ALB with host- and path-based routing can replace several per-application load balancers, cutting hourly charges.
Monitoring metrics like active connections, processed bytes, and request count helps identify optimization opportunities. This ensures you're using the most cost-effective load balancer type for your workload.
