Description
Rationale
The HTTP/2 standard says a client should open no more than one connection per server, handling concurrent requests via streams. This constraint is aimed at increasing network usage efficiency in the most common browser-to-service scenario, where many clients talk to a single server. However, it can become a bottleneck for service-to-service communication in the cloud, where a few services talk to each other: each service typically uses a single HttpClient instance, and therefore a single HTTP/2 connection, for the many concurrent requests it makes on behalf of its users. If that connection has SETTINGS_MAX_CONCURRENT_STREAMS set to 100 (a common default), it won't allow more than 100 parallel requests, nor the same number of concurrent gRPC streams.
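For illustration, a minimal sketch of the bottleneck, assuming a hypothetical HTTP/2 backend at backend.example.com: all 500 requests below share one HttpClient and therefore one connection, so at most SETTINGS_MAX_CONCURRENT_STREAMS of them are in flight at any moment.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

class StreamLimitDemo
{
    // One shared client, as is typical in service-to-service code.
    static readonly HttpClient Client = new HttpClient();

    static async Task Main()
    {
        var tasks = Enumerable.Range(0, 500).Select(async _ =>
        {
            var request = new HttpRequestMessage(HttpMethod.Get, "https://backend.example.com/data")
            {
                Version = new Version(2, 0) // ask for HTTP/2
            };
            // Requests beyond the server's stream limit wait in a FIFO
            // queue inside the handler before they are even sent.
            using var response = await Client.SendAsync(request);
        });
        await Task.WhenAll(tasks);
    }
}
```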
API proposal
It's proposed to add a new API to SocketsHttpHandler and WinHttpHandler enabling opening multiple HTTP/2 connections per server.
SocketsHttpHandler
Native WinHTTP has only a boolean option that disables HTTP/2 stream queueing (WINHTTP_OPTION_DISABLE_STREAM_QUEUE); it doesn't allow setting a limit on the number of open HTTP/2 connections per server. Leaving the connection count unbounded seems a bit risky, so for SocketsHttpHandler it's proposed to add an integer property MaxHttp2ConnectionsPerServer controlling the maximum number of HTTP/2 connections established to the same server. If this property is set to a value greater than 1, SocketsHttpHandler will open new HTTP/2 connections whenever all existing connections have reached the maximum number of open streams. Once the number of open connections equals MaxHttp2ConnectionsPerServer, stream queueing is enabled again.
```csharp
public sealed class SocketsHttpHandler : HttpMessageHandler
{
    public int MaxHttp2ConnectionsPerServer { get; set; }
}
```
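A usage sketch of the proposed property (it does not exist today; the name and semantics are as described above):

```csharp
// Hypothetical usage of the *proposed* MaxHttp2ConnectionsPerServer.
var handler = new SocketsHttpHandler
{
    // Open up to 8 HTTP/2 connections to the same server before
    // falling back to stream queueing.
    MaxHttp2ConnectionsPerServer = 8
};
var client = new HttpClient(handler);
```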
WinHttpHandler
It's proposed to add only the boolean property EnableMultipleHttp2Connections, without any way to limit the number of connections, to mirror the behavior of the underlying native implementation.
```csharp
public class WinHttpHandler : HttpMessageHandler
{
    public bool EnableMultipleHttp2Connections { get; set; }
}
```
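And a usage sketch for the proposed WinHttpHandler property (WinHttpHandler ships in the System.Net.Http.WinHttpHandler package and runs only on Windows):

```csharp
// Hypothetical usage of the proposed EnableMultipleHttp2Connections,
// mirroring WINHTTP_OPTION_DISABLE_STREAM_QUEUE in native WinHTTP.
var handler = new WinHttpHandler
{
    EnableMultipleHttp2Connections = true
};
var client = new HttpClient(handler);
```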
Problem:
HTTP/2 has a SETTINGS_MAX_CONCURRENT_STREAMS setting that is configured by the server. This is the upper limit of active streams on a single connection. The limit exists to prevent a caller from using up resources on the server by starting an unbounded number of streams on one connection. The recommended lower bound for SETTINGS_MAX_CONCURRENT_STREAMS is 100, which is the value that Kestrel uses. Some HTTP/2 servers have a slightly higher limit, but 100-200 appears to be the normal default.
Today HttpClient with HTTP/2 will open a single connection to a host, and all HTTP/2 requests open a new stream on that single connection. If there are already 100 active requests in progress, then SendAsync will await, and additional requests will form a FIFO queue, waiting for in-progress requests to complete. You can see this behavior discussed on issue #30596. FYI, if the client didn't queue and attempted to call the server anyway, the server would reject the request.
While the limit and queue behavior can make sense for many client applications calling a server, it is problematic for server-to-server communication. It is a common pattern in server applications to create a single HttpClient (either manually, or using the HttpClientFactory) and then use that client for all calls to another server.
In server-to-server communication, requests will be limited to 100 at a time, decreasing throughput and increasing latency as requests pile up in a queue. Additionally, technologies like gRPC support the concept of long-lived streaming calls. A server app that is using them for real-time communication with another server will hang after the 100th long-lived stream is started.
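To make the hang concrete, here is a minimal sketch using plain HttpClient against a hypothetical streaming endpoint (a long-lived gRPC call occupies an HTTP/2 stream in exactly the same way): with a limit of 100, the 101st SendAsync never completes and produces no error.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class LongLivedStreamHang
{
    static readonly HttpClient Client = new HttpClient();

    static async Task Main()
    {
        for (int i = 0; i < 101; i++)
        {
            var request = new HttpRequestMessage(HttpMethod.Get, "https://backend.example.com/events")
            {
                Version = new Version(2, 0)
            };
            // ResponseHeadersRead returns as soon as headers arrive; the
            // HTTP/2 stream stays open while the body is being consumed.
            var response = await Client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
            _ = ConsumeAsync(response); // keep the stream busy indefinitely
        }
    }

    static async Task ConsumeAsync(HttpResponseMessage response)
    {
        var body = await response.Content.ReadAsStreamAsync();
        var buffer = new byte[4096];
        while (await body.ReadAsync(buffer, 0, buffer.Length) > 0) { }
    }
}
```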
Two additional issues that worsen this situation:
- This problem will only show up under load. Best-case scenario, it will be picked up in load testing. Worst-case scenario, the app will be deployed into production and the problem will show up intermittently when the app is under heavy use.
- There is no feedback about what is going on. Why are HTTP/2 calls hanging? Is it an issue with the client, server, network, or environment? Good knowledge of the HTTP/2 spec is required to understand what is happening.
It is hard for customers to figure out what has gone wrong and how to fix it.
Solution:
Two broad solutions:
- Increase or remove the default SETTINGS_MAX_CONCURRENT_STREAMS limit on the server (a Kestrel sketch follows this list).
- Support HttpClient opening additional connections when it reaches the SETTINGS_MAX_CONCURRENT_STREAMS limit.
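For reference, option 1 looks like this on Kestrel, whose Http2Limits already expose the setting (a sketch only, since this option is argued against below; Startup is assumed to exist in the app):

```csharp
// Kestrel sketch for option 1: raise SETTINGS_MAX_CONCURRENT_STREAMS
// on the server side.
public static IHostBuilder CreateHostBuilder(string[] args) =>
    Host.CreateDefaultBuilder(args)
        .ConfigureWebHostDefaults(webBuilder =>
        {
            webBuilder.ConfigureKestrel(options =>
            {
                // Default is 100; this raises the per-connection cap.
                options.Limits.Http2.MaxStreamsPerConnection = 1000;
            });
            webBuilder.UseStartup<Startup>();
        });
```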
In my opinion, increasing/removing SETTINGS_MAX_CONCURRENT_STREAMS on the server isn't a good solution. Hundreds or thousands of streams multiplexed on one connection will likely degrade performance. TCP-level head-of-line blocking is a thing in HTTP/2, and one dropped packet will hold up every request on the connection.
A better solution is for HttpClient to support opening an additional connection to the server when SETTINGS_MAX_CONCURRENT_STREAMS is reached. This will allow a high throughput of requests or many active gRPC streams without hanging on the client.
New setting on SocketsHttpHandler:
```csharp
public class SocketsHttpHandler
{
    public bool StrictMaxConcurrentStreams { get; set; }
}
```
When StrictMaxConcurrentStreams is false, an additional connection to the server is created if all existing connections are at the SETTINGS_MAX_CONCURRENT_STREAMS limit.
The maximum number of HTTP/2 connections to a server will be limited by MaxConnectionsPerServer. When it is reached (max streams on max connections), the existing behavior resumes: additional requests await in a FIFO queue.
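A usage sketch combining the proposed setting with the existing MaxConnectionsPerServer cap:

```csharp
var handler = new SocketsHttpHandler
{
    // Proposed API (does not exist yet): open extra connections instead
    // of queueing when every connection is at the server's stream limit.
    StrictMaxConcurrentStreams = false,
    // Existing property: hard cap on connections to one server, so at
    // most 4 * SETTINGS_MAX_CONCURRENT_STREAMS requests run at once.
    MaxConnectionsPerServer = 4
};
var client = new HttpClient(handler);
```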
Note that opening a new connection like this to get around SETTINGS_MAX_CONCURRENT_STREAMS is discouraged by the HTTP/2 spec. I think that guidance is aimed at browsers and doesn't fit server-to-server communication in a microservice environment.
I will leave the decision of whether StrictMaxConcurrentStreams defaults to true or false up to the networking team.
gRPC usage
Because gRPC is commonly used in microservice scenarios, and gRPC streaming is a popular feature, it makes sense for gRPC to not queue in the client when the limit is reached.
The .NET gRPC client creates its own HttpClient. It can configure the underlying handler so that StrictMaxConcurrentStreams = false.
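Something like the following sketch, assuming a Grpc.Net.Client version whose GrpcChannelOptions exposes an HttpHandler property (GreeterClient is a hypothetical generated client, and StrictMaxConcurrentStreams is the proposed API):

```csharp
using Grpc.Net.Client;

var channel = GrpcChannel.ForAddress("https://backend.example.com", new GrpcChannelOptions
{
    HttpHandler = new SocketsHttpHandler
    {
        // Proposed API: don't queue in the client when the server's
        // stream limit is reached; open another connection instead.
        StrictMaxConcurrentStreams = false
    }
});
var client = new Greeter.GreeterClient(channel); // hypothetical generated client
```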
Prior art
golang has StrictMaxConcurrentStreams - https://godoc.org/golang.org/x/net/http2#Transport. In the latest version of golang StrictMaxConcurrentStreams defaults to false and MaxConnsPerHost has no limit. The golang client will "just work".
WinHttp has WINHTTP_OPTION_DISABLE_STREAM_QUEUE - https://docs.microsoft.com/en-us/windows/win32/winhttp/option-flags#WINHTTP_OPTION_DISABLE_STREAM_QUEUE. I believe this is false by default so queuing is the default behavior.
@davidfowl @karelz @scalablecory @Tratcher @halter73 @stephentoub @shirhatti