AWS Load Balancing ELB: Solutions Architect Study Guide

Elastic Load Balancing (ELB) is a fundamental AWS service that distributes incoming traffic across multiple targets. This ensures high availability and fault tolerance for your applications.

For AWS Solutions Architect certification candidates, mastering ELB concepts is essential. Load balancing appears frequently in exam questions covering architectural design, scalability, and performance optimization.

This guide covers the three ELB types: Application Load Balancers (ALB), Network Load Balancers (NLB), and Classic Load Balancers (CLB). You'll learn their use cases, configuration options, and practical applications.

Understanding load balancing enables you to design resilient architectures. These architectures handle traffic spikes, maintain availability during instance failures, and optimize costs. Flashcards work particularly well for ELB topics because they help you quickly recall specific features and differences between load balancer types.

Types of Elastic Load Balancers and Their Characteristics

AWS offers three primary load balancer types. Each is optimized for different scenarios and workload requirements.

Application Load Balancer (ALB)

Application Load Balancers operate at Layer 7 (Application layer) and are ideal for modern web applications. They excel with microservices and container deployments.

ALBs support advanced routing based on hostnames, paths, and protocols. This makes them perfect for complex architectures serving multiple applications behind a single load balancer.

Network Load Balancer (NLB)

Network Load Balancers operate at Layer 4 (Transport layer) and handle extreme performance requirements. They deliver ultra-high throughput, sub-millisecond latency, and millions of requests per second.

NLBs are best for:

  • Non-HTTP protocols
  • Gaming applications
  • IoT deployments
  • Financial trading platforms

Classic Load Balancer (CLB)

Classic Load Balancers are the original ELB offering. They operate at both Layer 4 and Layer 7 and were the only option on the legacy EC2-Classic network; AWS now recommends ALB or NLB for new workloads.

For exam purposes, understand that ALB dominates modern AWS deployments due to its content-based routing. NLB handles extreme scale scenarios. Each type requires different configuration approaches and integrates differently with Auto Scaling groups, target groups, and health checks.

Load Balancer Configuration and Health Checks

Proper configuration directly impacts application availability and performance. Health checks are critical components that monitor target health automatically.

Health Check Configuration

You configure health checks by specifying:

  • Protocol (HTTP, HTTPS, or TCP)
  • Port number
  • Path (for HTTP/HTTPS)
  • Timing intervals
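
The parameters above map directly onto target group settings. A minimal sketch using the keyword names from boto3's `elbv2` client; the `/healthz` path and the specific values are illustrative, not recommendations:

```python
# Health check settings for a target group, mirroring the parameters
# listed above. Values are illustrative defaults, not recommendations.
health_check_params = {
    "HealthCheckProtocol": "HTTP",         # HTTP, HTTPS, or TCP
    "HealthCheckPort": "traffic-port",     # check on the target's traffic port
    "HealthCheckPath": "/healthz",         # HTTP/HTTPS only; path is hypothetical
    "HealthCheckIntervalSeconds": 30,      # how often checks run
    "HealthCheckTimeoutSeconds": 5,        # how long to wait for a response
    "HealthyThresholdCount": 3,            # consecutive successes -> healthy
    "UnhealthyThresholdCount": 3,          # consecutive failures -> unhealthy
}

# These keys match boto3's elbv2.create_target_group keyword arguments,
# so the dict can be merged into a create call:
#   client.create_target_group(Name=..., Protocol="HTTP", Port=80,
#                              VpcId=..., **health_check_params)
```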

The healthy threshold defines how many consecutive successful checks (typically 2-3) a target must pass before it is considered healthy and receives traffic.

The unhealthy threshold specifies how many consecutive failures (also typically 2-3) mark a target unhealthy and remove it from rotation.

Interval settings determine how frequently health checks occur. Shorter intervals provide faster failure detection but use more resources.
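
The threshold behavior described above can be sketched as a small state machine. This is a pure simulation with no AWS calls; thresholds of 2 are illustrative:

```python
# Simulate how consecutive-check thresholds drive a target's state.
def track_health(results, healthy_threshold=2, unhealthy_threshold=2):
    """Return the target's state after a sequence of check results
    (True = success, False = failure), starting from 'healthy'."""
    state = "healthy"
    successes = failures = 0
    for ok in results:
        if ok:
            successes += 1
            failures = 0  # a success resets the failure streak
            if state == "unhealthy" and successes >= healthy_threshold:
                state = "healthy"
        else:
            failures += 1
            successes = 0  # a failure resets the success streak
            if state == "healthy" and failures >= unhealthy_threshold:
                state = "unhealthy"
    return state
```

For example, two consecutive failures remove a target (`track_health([True, False, False])` returns `"unhealthy"`), while an intermittent single failure never does, because each success resets the streak.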

Target Groups and Deregistration

Target groups are logical sets of resources that receive traffic. They can include EC2 instances, containers, Lambda functions, or on-premises servers.

You can create multiple target groups with different health check configurations for different applications.

Connection draining (for CLB) or deregistration delay (for ALB/NLB) ensures graceful shutdown. This allows in-flight requests to complete before closing connections.
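
A minimal sketch of the draining window, assuming we know how long each in-flight request still needs; the 300-second default mirrors the deregistration delay described above:

```python
# When a target begins deregistering, new requests stop arriving, but
# in-flight requests may finish within the delay window.
def requests_completed(in_flight_durations, delay_seconds=300):
    """Given the remaining durations (seconds) of in-flight requests at
    the moment deregistration starts, return how many complete before
    the target is removed."""
    return sum(1 for d in in_flight_durations if d <= delay_seconds)
```

With the default window, `requests_completed([10, 120, 400])` returns 2: the 400-second request is cut off, which is why the delay should be set above your longest expected request duration.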

Understanding these parameters is essential for designing architectures that recover quickly from failures. Your system will maintain consistent performance under varying load conditions.

Advanced Routing and Traffic Management

Application Load Balancers provide sophisticated routing capabilities essential for modern distributed architectures.

Content-Based Routing Options

Path-based routing directs traffic to different target groups based on URL paths. For example, requests to example.com/api route to one target group while requests to example.com/images route to another.

Hostname-based routing enables a single ALB to serve multiple domains and subdomains. This eliminates the need for separate load balancers for different websites.

Other routing options include:

  • HTTP header-based routing
  • Query parameter routing
  • HTTP method routing

Priority and Sticky Sessions

Priority rules determine the order in which routing rules are evaluated. Lower numbers are processed first, allowing you to create complex traffic patterns.
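
The path-based, hostname-based, and priority behavior described above can be sketched as a first-match rule evaluator. The rule shapes and target group names here are simplified illustrations, not the ALB API:

```python
# Evaluate rules in ascending priority order; the first match wins,
# and unmatched requests fall through to a default target group.
def route(rules, default_group, host, path):
    for rule in sorted(rules, key=lambda r: r["priority"]):
        host_ok = rule.get("host") in (None, host)
        path_ok = rule.get("path_prefix") is None or path.startswith(rule["path_prefix"])
        if host_ok and path_ok:
            return rule["target_group"]
    return default_group

rules = [
    {"priority": 10, "path_prefix": "/api", "target_group": "api-servers"},
    {"priority": 20, "host": "images.example.com", "target_group": "image-servers"},
]
```

Here `route(rules, "web", "example.com", "/api/v1/users")` hits the priority-10 path rule first, while requests to `images.example.com` that miss it fall through to the priority-20 host rule.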

Sticky sessions maintain user sessions by routing subsequent requests from the same client to the same target. This is useful for applications maintaining in-memory state.

However, sticky sessions reduce load balancing effectiveness. They're typically avoided in modern stateless architectures.
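
A sketch of the affinity property: a stable function of the client identity always picks the same target, so a busy client concentrates load on one instance. Real ALB stickiness uses a load-balancer-generated cookie rather than hashing; this only models the effect:

```python
import hashlib

# Deterministically map a client to a target: the same client id always
# yields the same target, which is the affinity sticky sessions provide
# (and the reason one heavy client can overload one instance).
def pick_target(client_id, targets):
    digest = hashlib.sha256(client_id.encode()).hexdigest()
    return targets[int(digest, 16) % len(targets)]

targets = ["i-aaa", "i-bbb", "i-ccc"]  # instance ids are illustrative
first = pick_target("user-42", targets)
# Every later request from the same client lands on the same target:
assert all(pick_target("user-42", targets) == first for _ in range(10))
```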

Performance Features

SSL/TLS termination at the load balancer offloads cryptographic operations from application servers. This improves performance significantly.

Cross-zone load balancing distributes traffic evenly across all registered targets regardless of availability zone. This prevents hot spots when targets are unevenly spread across zones.
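
A quick arithmetic sketch of why this matters with an uneven fleet (4 targets in one AZ, 1 in another), assuming traffic first splits evenly across zones:

```python
total = 1000                      # requests, illustrative
az_targets = {"az-a": 4, "az-b": 1}

# Cross-zone OFF: each zone's 50% share is divided only among its own targets.
per_zone = total / len(az_targets)                 # 500 per AZ
off = {az: per_zone / n for az, n in az_targets.items()}
# az-a targets get 125 requests each; the lone az-b target gets 500.

# Cross-zone ON: every target shares the total evenly.
on = total / sum(az_targets.values())              # 200 per target
```

The lone target in az-b carries 4x the load of its peers with cross-zone off, and exactly its fair share with it on.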

These routing mechanisms allow you to architect solutions that efficiently distribute traffic. They handle microservices deployments and implement sophisticated traffic management required by complex applications.

Integration with Auto Scaling and High Availability Patterns

Load balancers work in tandem with Auto Scaling to create resilient, self-healing architectures.

Automatic Instance Management

When you attach a load balancer to an Auto Scaling group, new instances automatically register as targets. They begin receiving traffic as soon as they pass health checks, with no manual configuration.

The load balancer's health checks integrate with Auto Scaling to replace unhealthy instances. This provides automatic recovery without manual intervention.

Target groups enable seamless scaling where instances are added or removed based on metrics like CPU utilization or request count.

Multi-Availability Zone Deployment

For maximum availability, deploy load balancers and targets across multiple availability zones. Each AZ hosts its own instances.

The load balancer distributes traffic proportionally across zones. If an entire AZ fails, the load balancer automatically routes all traffic to remaining zones.
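
The failover behavior can be sketched as a simple redistribution; zone names are illustrative:

```python
# Traffic that would go to a failed AZ is redistributed evenly across
# the remaining healthy zones.
def zone_shares(zones, failed=frozenset()):
    """Fraction of total traffic each zone receives, given failed zones."""
    healthy = [z for z in zones if z not in failed]
    return {z: (1 / len(healthy) if z in healthy else 0.0) for z in zones}
```

With three zones each takes a third of the traffic; after az-a fails, az-b and az-c each absorb half.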

Multi-Region Architectures

Multi-region architectures use Route 53 or Global Accelerator for load balancing across regions. Each region contains its own ELB and Auto Scaling group.

Connection pooling at load balancers improves performance by reusing connections to targets. This avoids establishing new connections for each request.

These integration patterns create self-scaling, self-healing architectures. They maintain availability and performance during failures, traffic spikes, and maintenance operations. Exam questions frequently test your understanding of how load balancers, Auto Scaling, and health checks work together.

Performance Optimization and Cost Considerations

Choosing the right load balancer type significantly impacts both performance and cost.

Performance by Load Balancer Type

Network Load Balancers handle extreme throughput scenarios at the lowest latency. They support millions of requests per second with ultra-low, sub-millisecond latency. This performance comes at higher cost.

Application Load Balancers provide excellent performance for most web applications. They handle millions of requests per second while offering Layer 7 intelligence required for modern architectures.

Classic Load Balancers are less expensive but lack advanced routing features. They're being phased out for new applications.

Optimization Techniques

Connection multiplexing at ALBs reuses backend connections across client requests, reducing the number of connections to targets and improving target efficiency. NLBs, as Layer 4 pass-throughs, do not multiplex connections.

Keep-alive connections maintain TCP connections between load balancer and targets. This reduces latency by eliminating connection establishment overhead.

Load balancer capacity is measured in load balancer capacity units (LCUs), driven by dimensions such as new connections, active connections, processed bytes, and rule evaluations. AWS scales capacity automatically, but understanding these dimensions helps optimize costs.

Cost Management Strategies

Cross-zone load balancing incurs additional data transfer charges. Consider disabling it if your architecture maintains balanced instance distribution across zones.

ELB pricing is usage-based (an hourly charge plus capacity units) rather than reserved, so savings for predictable traffic come from right-sizing and choosing the most economical load balancer type.

Monitoring metrics like active connections, processed bytes, and request count helps identify optimization opportunities. This ensures you're using the most cost-effective load balancer type for your workload.

Master AWS Load Balancing for Solutions Architect Certification

Create flashcards for ELB concepts, load balancer types, health check configurations, and routing strategies. Flashcards help you memorize specific features, quickly recall differences between ALB/NLB/CLB, and practice scenario-based questions. Study efficiently with spaced repetition and targeted practice on your weakest concepts.

Frequently Asked Questions

What is the main difference between ALB, NLB, and CLB?

Application Load Balancers (ALB) operate at Layer 7 and excel at content-based routing. They use hostnames, paths, and HTTP headers to make routing decisions.

ALBs are ideal for microservices and modern web applications.

Network Load Balancers (NLB) operate at Layer 4 and provide extreme performance. They deliver sub-millisecond latency and millions of requests per second.

NLBs are suited for non-HTTP protocols and ultra-high-performance applications.

Classic Load Balancers (CLB) are legacy load balancers supporting basic Layer 4 and 7 functionality.

For new deployments, AWS recommends ALB for most web applications and NLB for extreme performance scenarios. The exam heavily emphasizes understanding when to choose each type based on application requirements.

How do health checks improve application availability?

Health checks continuously monitor target instance health by sending periodic requests and analyzing responses.

If an instance fails to respond successfully within the unhealthy threshold (typically 2-3 consecutive failures), the load balancer automatically removes it from rotation. This prevents user traffic from reaching failed instances.

This automatic failure detection and recovery occurs without manual intervention. This enables self-healing architectures.

Healthy threshold settings (typically 2-3 successful checks) ensure instances are fully operational before receiving traffic.

Proper health check configuration is critical for maintaining availability during instance failures, deployments, and infrastructure issues. For exam questions, remember that health checks are separate from Auto Scaling replacement, though they often work together.

What are target groups and why are they important?

Target groups are logical collections of resources that receive traffic from load balancers. Resources can include EC2 instances, containers, Lambda functions, or on-premises servers.

Each target group has its own health check configuration, routing rules, and sticky session settings.

For ALBs, you can create multiple target groups and route different traffic patterns to different groups. You can use hostnames, paths, or headers to determine routing.

This enables complex architectures where a single load balancer serves multiple applications.

Target groups integrate seamlessly with Auto Scaling, which automatically registers new instances and deregisters instances as they terminate.

When designing solutions, target groups enable decoupling between load balancing logic and target management. This allows flexible architecture patterns.

How do load balancers integrate with Auto Scaling groups?

When you attach a load balancer to an Auto Scaling group, new instances launched during scale-out automatically register with the target group. They begin receiving traffic once they pass health checks.

The load balancer's health checks inform Auto Scaling about instance health. This triggers replacement of unhealthy instances.

Auto Scaling uses CloudWatch metrics to trigger scaling decisions. The load balancer ensures traffic distribution remains balanced during scaling activities.

This integration creates self-healing, self-scaling architectures requiring minimal manual intervention.

During scale-in, connection draining settings ensure existing requests complete before instances are terminated. This tight integration between load balancers and Auto Scaling is fundamental to AWS's automatic scaling philosophy.

What is connection draining and why does it matter?

Connection draining (called deregistration delay on ALB/NLB) allows graceful instance shutdown. It waits for existing connections to complete before removing instances from the load balancer.

When an instance is marked for deregistration, the load balancer stops sending new requests to it. However, it allows in-flight requests to finish within a timeout period (default 300 seconds).

This prevents abrupt connection termination that could disrupt user sessions and cause poor user experience.

During Auto Scaling scale-in events or instance termination, connection draining ensures active requests complete before servers shut down.

You can adjust the timeout based on your longest expected request duration. For exam scenarios, understand that connection draining is essential for maintaining data consistency and user experience during planned maintenance and scaling events.