Description
Rationale
The HTTP/2 standard says a client should open no more than one connection per server, handling concurrent requests via streams. This constraint is aimed at increasing network usage efficiency in the most common browser-to-service scenario, where many clients talk to a single server. However, it can become a bottleneck for service-to-service communication in the cloud, where a few services talk to each other: each service typically uses a single HttpClient instance, and therefore a single HTTP/2 connection, for the many concurrent requests it makes on behalf of its users. If that connection has SETTINGS_MAX_CONCURRENT_STREAMS set to 100 (a common default), it won't allow more than 100 parallel requests, nor the same number of concurrent gRPC streams.
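For illustration, a minimal sketch of the bottleneck, assuming a hypothetical HTTP/2 backend at backend.example.com: all 500 requests below share one HttpClient and therefore one connection, so at most SETTINGS_MAX_CONCURRENT_STREAMS of them are in flight at any moment.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

class StreamLimitDemo
{
    // One shared client, as is typical in service-to-service code.
    static readonly HttpClient Client = new HttpClient();

    static async Task Main()
    {
        var tasks = Enumerable.Range(0, 500).Select(async _ =>
        {
            var request = new HttpRequestMessage(HttpMethod.Get, "https://backend.example.com/data")
            {
                Version = new Version(2, 0) // ask for HTTP/2
            };
            // Requests beyond the server's stream limit wait in a FIFO
            // queue inside the handler before they are even sent.
            using var response = await Client.SendAsync(request);
        });
        await Task.WhenAll(tasks);
    }
}
```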
API proposal
It's proposed to add a new API to SocketsHttpHandler and WinHttpHandler enabling opening multiple HTTP/2 connections per server.
SocketsHttpHandler
Native WinHTTP has only a boolean option that disables HTTP/2 stream queueing (WINHTTP_OPTION_DISABLE_STREAM_QUEUE); it doesn't allow setting a limit on the number of open HTTP/2 connections per server. Leaving the connection count unbounded seems a bit risky, so for SocketsHttpHandler it's proposed to add an integer property MaxHttp2ConnectionsPerServer controlling the maximum number of HTTP/2 connections established to the same server. If this property is set to a value greater than 1, SocketsHttpHandler will open new HTTP/2 connections whenever all existing connections have reached the maximum number of open streams. Once the number of open connections equals MaxHttp2ConnectionsPerServer, stream queueing is enabled again.
```csharp
public sealed class SocketsHttpHandler : HttpMessageHandler
{
    public int MaxHttp2ConnectionsPerServer { get; set; }
}
```
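A usage sketch of the proposed property (it does not exist today; the name and semantics are as described above):

```csharp
// Hypothetical usage of the *proposed* MaxHttp2ConnectionsPerServer.
var handler = new SocketsHttpHandler
{
    // Open up to 8 HTTP/2 connections to the same server before
    // falling back to stream queueing.
    MaxHttp2ConnectionsPerServer = 8
};
var client = new HttpClient(handler);
```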
WinHttpHandler
It's proposed to add only the boolean property EnableMultipleHttp2Connections, without any way to limit the number of connections, to mirror the behavior of the underlying native implementation.
```csharp
public class WinHttpHandler : HttpMessageHandler
{
    public bool EnableMultipleHttp2Connections { get; set; }
}
```
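And a usage sketch for the proposed WinHttpHandler property (WinHttpHandler ships in the System.Net.Http.WinHttpHandler package and runs only on Windows):

```csharp
// Hypothetical usage of the proposed EnableMultipleHttp2Connections,
// mirroring WINHTTP_OPTION_DISABLE_STREAM_QUEUE in native WinHTTP.
var handler = new WinHttpHandler
{
    EnableMultipleHttp2Connections = true
};
var client = new HttpClient(handler);
```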
Problem:
HTTP/2 has a SETTINGS_MAX_CONCURRENT_STREAMS setting that is configured by the server. This is the upper limit of active streams on a single connection. The limit exists to prevent a caller from using up resources on the server by starting an unbounded number of streams on one connection. The recommended lower bound for SETTINGS_MAX_CONCURRENT_STREAMS is 100, which is the value that Kestrel uses. Some HTTP/2 servers have a slightly higher limit, but 100-200 appears to be the normal default.
Today HttpClient with HTTP/2 will open a single connection to a host, and all HTTP/2 requests open a new stream on that single connection. If there are already 100 active requests in progress, then SendAsync will await, and additional requests will form a FIFO queue, waiting for in-progress requests to complete. You can see this behavior discussed on issue #30596. FYI, if the client didn't queue and attempted to call the server anyway, the server would reject the request.
While the limit and queue behavior can make sense for many client applications calling a server, it is problematic for server-to-server communication. It is a common pattern in server applications to create a single HttpClient (either manually, or using the HttpClientFactory) and then use that client for all calls to another server.
In server-to-server communication, requests will be limited to 100 at a time, decreasing throughput and increasing latency as requests pile up in a queue. Additionally, technologies like gRPC support the concept of long-lived streaming calls. A server app that is using them for real-time communication with another server will hang after the 100th long-lived stream is started.
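To make the hang concrete, here is a minimal sketch using plain HttpClient against a hypothetical streaming endpoint (a long-lived gRPC call occupies an HTTP/2 stream in exactly the same way): with a limit of 100, the 101st SendAsync never completes and produces no error.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class LongLivedStreamHang
{
    static readonly HttpClient Client = new HttpClient();

    static async Task Main()
    {
        for (int i = 0; i < 101; i++)
        {
            var request = new HttpRequestMessage(HttpMethod.Get, "https://backend.example.com/events")
            {
                Version = new Version(2, 0)
            };
            // ResponseHeadersRead returns as soon as headers arrive; the
            // HTTP/2 stream stays open while the body is being consumed.
            var response = await Client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
            _ = ConsumeAsync(response); // keep the stream busy indefinitely
        }
    }

    static async Task ConsumeAsync(HttpResponseMessage response)
    {
        var body = await response.Content.ReadAsStreamAsync();
        var buffer = new byte[4096];
        while (await body.ReadAsync(buffer, 0, buffer.Length) > 0) { }
    }
}
```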
Two additional issues that worsen this situation:
- This problem will only show up under load. Best-case scenario, it will be picked up in load testing. Worst-case scenario, the app will be deployed into production and the problem will show up intermittently when the app is under heavy use.
- There is no feedback about what is going on. Why are HTTP/2 calls hanging? Is it an issue with the client, server, network, or environment? Good knowledge of the HTTP/2 spec is required to understand what is happening.
It is hard for customers to figure out what has gone wrong and how to fix it.
Solution:
Two broad solutions:
- Increase or remove the default SETTINGS_MAX_CONCURRENT_STREAMS limit on the server (a Kestrel sketch follows this list).
- Support HttpClient opening additional connections when it reaches the SETTINGS_MAX_CONCURRENT_STREAMS limit.
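For reference, option 1 looks like this on Kestrel, whose Http2Limits already expose the setting (a sketch only, since this option is argued against below; Startup is assumed to exist in the app):

```csharp
// Kestrel sketch for option 1: raise SETTINGS_MAX_CONCURRENT_STREAMS
// on the server side.
public static IHostBuilder CreateHostBuilder(string[] args) =>
    Host.CreateDefaultBuilder(args)
        .ConfigureWebHostDefaults(webBuilder =>
        {
            webBuilder.ConfigureKestrel(options =>
            {
                // Default is 100; this raises the per-connection cap.
                options.Limits.Http2.MaxStreamsPerConnection = 1000;
            });
            webBuilder.UseStartup<Startup>();
        });
```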
In my opinion, increasing/removing SETTINGS_MAX_CONCURRENT_STREAMS on the server isn't a good solution. Hundreds or thousands of streams multiplexed on one connection will likely degrade performance. TCP-level head-of-line blocking is a thing in HTTP/2, and one dropped packet will hold up every request on the connection.
A better solution is for HttpClient to support opening an additional connection to the server when SETTINGS_MAX_CONCURRENT_STREAMS is reached. This will allow a high throughput of requests or many active gRPC streams without hanging on the client.
New setting on SocketsHttpHandler:
```csharp
public class SocketsHttpHandler
{
    public bool StrictMaxConcurrentStreams { get; set; }
}
```
When StrictMaxConcurrentStreams is false, an additional connection to the server is created if all existing connections are at the SETTINGS_MAX_CONCURRENT_STREAMS limit.
The maximum number of HTTP/2 connections to a server will be limited by MaxConnectionsPerServer. When it is reached (max streams on max connections), the existing behavior resumes: additional requests await in a FIFO queue.
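A usage sketch combining the proposed setting with the existing MaxConnectionsPerServer cap:

```csharp
var handler = new SocketsHttpHandler
{
    // Proposed API (does not exist yet): open extra connections instead
    // of queueing when every connection is at the server's stream limit.
    StrictMaxConcurrentStreams = false,
    // Existing property: hard cap on connections to one server, so at
    // most 4 * SETTINGS_MAX_CONCURRENT_STREAMS requests run at once.
    MaxConnectionsPerServer = 4
};
var client = new HttpClient(handler);
```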
Note that opening a new connection like this to get around SETTINGS_MAX_CONCURRENT_STREAMS is discouraged by the HTTP/2 spec. I think that guidance is aimed at browsers and doesn't fit server-to-server communication in a microservice environment.
I will leave the decision of whether StrictMaxConcurrentStreams defaults to true or false up to the networking team.
gRPC usage
Because gRPC is commonly used in microservice scenarios, and gRPC streaming is a popular feature, it makes sense for gRPC to not queue in the client when the limit is reached.
The .NET gRPC client creates its own HttpClient. It can configure the underlying handler so that StrictMaxConcurrentStreams = false.
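Something like the following sketch, assuming a Grpc.Net.Client version whose GrpcChannelOptions exposes an HttpHandler property (GreeterClient is a hypothetical generated client, and StrictMaxConcurrentStreams is the proposed API):

```csharp
using Grpc.Net.Client;

var channel = GrpcChannel.ForAddress("https://backend.example.com", new GrpcChannelOptions
{
    HttpHandler = new SocketsHttpHandler
    {
        // Proposed API: don't queue in the client when the server's
        // stream limit is reached; open another connection instead.
        StrictMaxConcurrentStreams = false
    }
});
var client = new Greeter.GreeterClient(channel); // hypothetical generated client
```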
Prior art
golang has StrictMaxConcurrentStreams - https://godoc.org/golang.org/x/net/http2#Transport. In the latest version of golang StrictMaxConcurrentStreams defaults to false and MaxConnsPerHost has no limit. The golang client will "just work".
WinHttp has WINHTTP_OPTION_DISABLE_STREAM_QUEUE - https://docs.microsoft.com/en-us/windows/win32/winhttp/option-flags#WINHTTP_OPTION_DISABLE_STREAM_QUEUE. I believe this is false by default so queuing is the default behavior.
@davidfowl @karelz @scalablecory @Tratcher @halter73 @stephentoub @shirhatti