How does Google load balance requests?
--
The answer is Maglev.
Maglev is an L3/L4 software load balancer. A load balancer, such as Nginx or HAProxy, distributes requests among multiple backend services.
Goals
It should scale seamlessly: load balancers can be added or removed without users noticing any dropped connections, and backend services can come and go without affecting a large set of users.
Design
- Backend service: each service has an assigned VIP (virtual IP address).
- A VIP could be a subnet IP or any arbitrary IP.
- Maglev is configured with services and their VIPs.
- Maglev announces all VIPs through BGP to the Internet.
- A client can connect to a VIP.
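As a rough illustration of the setup above, the sketch below models services, their VIPs, and the set of VIPs a Maglev would announce. The names `Service`, `MaglevConfig`, `vips_to_announce`, and `service_for`, as well as the example addresses, are hypothetical and are not Maglev's actual configuration format.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address


@dataclass
class Service:
    name: str
    vip: str                                      # virtual IP that clients connect to
    backends: list = field(default_factory=list)  # IPs of backend instances


@dataclass
class MaglevConfig:
    services: list

    def vips_to_announce(self):
        """VIPs this Maglev would advertise upstream (for example over BGP)."""
        return sorted({s.vip for s in self.services}, key=ip_address)

    def service_for(self, vip):
        """Return the service that owns a VIP, or None if it is not configured."""
        return next((s for s in self.services if s.vip == vip), None)


# Example addresses come from documentation ranges and are made up.
config = MaglevConfig([
    Service("web", "203.0.113.10", ["10.0.0.11", "10.0.0.12"]),
    Service("dns", "203.0.113.2", ["10.0.1.21"]),
])
```

Every Maglev carrying the same configuration would announce the same VIPs, so the router can treat them as interchangeable next hops.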
Detailed Design
- The router can choose any Maglev to forward the packet to. Because this choice is stateless, Maglevs can be added or removed without affecting user connections. The router uses ECMP (equal-cost multi-path routing) to pick a Maglev.
- Packets of a connection always go to the same backend. A Maglev uses a consistent hash of the source IP, destination IP, source port, destination port, and protocol to pick a backend instance of the service. With consistent hashing, when a backend crashes, most connections still map to the same backends as before.
- Maglev needs a way to announce VIPs. A Maglev can become unhealthy, in which case the router should not send it any requests. A controller component handles this.
- Maglev needs a way to forward request packets to a backend. The forwarder receives packets from the NIC and has to send them to the appropriate backend. To do so, it encapsulates each packet using GRE (Generic Routing Encapsulation), which adds an outer header with a new source and destination IP. The source IP is Maglev’s IP and the destination IP is that of the chosen backend. The packet is then handed back to the NIC and sent out on the network. It is eventually routed to the backend, which decapsulates it and processes the request.
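The forwarding path described above can be sketched in a few lines: hash the flow's addresses, ports, and protocol to pick a backend, then wrap the packet in an outer IPv4 + GRE header addressed to that backend. This is a minimal sketch under stated assumptions, not Maglev's implementation. In particular, `pick_backend` uses rendezvous (highest-random-weight) hashing as a simple stand-in for whatever consistent hash Maglev actually uses, and all function names and addresses are hypothetical.

```python
import hashlib
import socket
import struct

GRE_PROTO = 47           # IP protocol number for GRE
ETHERTYPE_IPV4 = 0x0800  # GRE protocol type for an IPv4 payload


def pick_backend(five_tuple, backends):
    """Rendezvous hashing (a stand-in): score each backend, keep the highest."""
    key = "|".join(str(part) for part in five_tuple)

    def score(backend):
        digest = hashlib.sha256(f"{key}#{backend}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(backends, key=score)


def ipv4_checksum(header):
    """Internet checksum: one's complement of the folded 16-bit word sum."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def gre_encap(inner_packet, maglev_ip, backend_ip):
    """Prepend a 20-byte outer IPv4 header and a basic 4-byte GRE header."""
    gre_header = struct.pack("!HH", 0, ETHERTYPE_IPV4)  # no flags, version 0
    total_len = 20 + len(gre_header) + len(inner_packet)  # assumed <= 65535

    def outer(checksum):
        return struct.pack(
            "!BBHHHBBH4s4s",
            0x45, 0, total_len,       # version 4 / IHL 5, TOS, total length
            0, 0,                     # identification, flags + fragment offset
            64, GRE_PROTO, checksum,  # TTL, protocol, header checksum
            socket.inet_aton(maglev_ip), socket.inet_aton(backend_ip),
        )

    # Compute the checksum over a header whose checksum field is zero.
    return outer(ipv4_checksum(outer(0))) + gre_header + inner_packet


def gre_decap(outer_packet):
    """Backend side: strip the outer IPv4 header and the 4-byte GRE header."""
    ihl = (outer_packet[0] & 0x0F) * 4
    return outer_packet[ihl + 4:]


# Made-up example flow: (src IP, dst IP/VIP, src port, dst port, protocol).
backends = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
flow = ("198.51.100.7", "203.0.113.10", 51234, 443, "tcp")
backend = pick_backend(flow, backends)
packet = gre_encap(b"inner-ip-packet-bytes", "10.0.0.1", backend)
```

Because the choice depends only on the flow and the backend list, any Maglev with the same list picks the same backend, which is what lets the router spread packets across Maglevs statelessly. With rendezvous hashing, removing a backend remaps only the flows that were assigned to it.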