Choosing A Proxy Server
ApacheCon 2014
Bryan Call
ATS Committer / Yahoo
About Me
• Yahoo! Employee
– WebRing, GeoCities, Personals, Tiger Team, Platform Architect, Edge Team, Research, ATS and HTTP (HTTP/2 and TLS at IETF)
• Working on Traffic Server for 7 years
– Since 2007
• Part of the team that open sourced it in 2009
• ATS Committer
Overview
• Types of Proxies
• Features
• Architecture
• Cache Architecture
• Performance
• Pros and Cons
How are you going to use a proxy server?
Reverse Proxy
• Proxy in front of your own web servers
• Caching?
• Geographic location?
• Connection handling?
• SSL termination?
• SPDY support?
• Adding business logic?
Forward Proxy
Intercepting Proxy
Forward / Intercepting Proxy
• Proxy in front of the Internet
• Configure clients to use proxy?
• Caching?
• SSL - CONNECT?
• SSL - termination?
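One way to see the difference between these modes is the request line the proxy receives. The sketch below is illustrative only: the function name `request_target_form` is hypothetical, and the form names follow the HTTP/1.1 request-target terminology (origin-form, absolute-form, authority-form, asterisk-form). A client configured for a forward proxy sends the full URL, a reverse or intercepting proxy sees a path, and CONNECT asks for a tunnel.

```python
from urllib.parse import urlsplit


def request_target_form(method: str, target: str) -> str:
    """Classify an HTTP/1.1 request target as a proxy would see it (hypothetical helper)."""
    if method.upper() == "CONNECT":
        # e.g. "CONNECT example.com:443": the client asks the proxy for a tunnel
        return "authority-form"
    if target == "*":
        return "asterisk-form"
    if target.startswith("/"):
        # What a reverse proxy or an intercepting proxy receives
        return "origin-form"
    if urlsplit(target).scheme in ("http", "https"):
        # What a client explicitly configured to use a forward proxy sends
        return "absolute-form"
    return "unknown"
```

For example, `request_target_form("GET", "http://example.com/index.html")` classifies the request as one sent to a configured forward proxy.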
Choices
Plenty of Proxy Servers
PerlBal
Features And Options
Features
Feature             ATS   NGiNX   Squid     Varnish   Apache httpd mod_proxy
Reverse Proxy       Y     Y       Y         Y         Y
Forward Proxy       Y     N       Y         N         Y
Transparent Proxy   Y     N       Y         N         Y
Plugin APIs         Y     Y       partial   Y         Y
Cache               Y     Y       Y         Y         Y
ESI                 Y     N       Y         partial   N
ICP                 Y     N       Y         N         N
SSL                 Y     Y       Y         N         Y
SPDY                Y*    Y       N         N         partial
* ATS 5.0.0 (May 2014)
SSL Features
Source: https://istlsfastyet.com/ - Ilya Grigorik
What type of proxy do you need?
• Of our candidates, only three fully support all proxy modes
HTTP/1.1 Compliance
• Accept-Encoding - gzip
• Vary
• Age
• If-None-Match
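To make two of these headers concrete, here is a minimal sketch of how a cache might use Age and If-None-Match. The function names are hypothetical and the logic is simplified: the freshness check only compares Age against max-age, and the ETag list is split on commas, which would break for entity tags that contain commas. If-None-Match uses weak comparison, so a `W/` prefix is ignored when matching.

```python
from typing import Optional


def is_fresh(age_seconds: int, max_age_seconds: int) -> bool:
    """A stored response is fresh while its Age is below Cache-Control: max-age."""
    return age_seconds < max_age_seconds


def _opaque(tag: str) -> str:
    """Strip the weak-validator prefix so W/"x" and "x" compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def revalidation_status(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 if the client's validator still matches, otherwise 200."""
    if if_none_match is None:
        return 200
    candidates = [t.strip() for t in if_none_match.split(",")]
    if "*" in candidates:
        return 304
    matches = _opaque(current_etag) in {_opaque(t) for t in candidates}
    return 304 if matches else 200
```

A proxy that ignores If-None-Match still works, but it sends full responses where a short 304 would have been enough.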
How things can go wrong: Vary
$ curl -D - -o /dev/null -s --compressed http://10.118.73.168/
HTTP/1.1 200 OK
Server: nginx/1.3.9
Date: Wed, 12 Dec 2012 18:00:48 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 8051
Connection: keep-alive
Cache-Control: public, max-age=900
Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie,Accept-Encoding
Content-Encoding: gzip
How things can go wrong: Vary
$ curl -D - -o /dev/null -s http://10.118.73.168/
HTTP/1.1 200 OK
Server: nginx/1.3.9
Date: Wed, 12 Dec 2012 18:00:57 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 8051
Connection: keep-alive
Cache-Control: public, max-age=900
Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie,Accept-Encoding
Content-Encoding: gzip
EPIC FAIL!
Note: this request did not send Accept-Encoding: gzip, yet the response is still gzip-encoded
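The failure above is what happens when a cache stores one variant per URL and ignores Vary. A minimal sketch of the idea, assuming a hypothetical `cache_key` helper (real caches also normalize header values, which is skipped here): the key includes every request header named in the response's Vary header, so a gzip-capable client and a plain client never share an entry.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> str:
    """Build a cache key from the URL plus the request headers listed in Vary."""
    # Header names are case-insensitive, so compare them in lower case
    headers = {name.lower(): value for name, value in request_headers.items()}
    parts = [url]
    for name in sorted(h.strip().lower() for h in vary.split(",") if h.strip()):
        parts.append(f"{name}={headers.get(name, '')}")
    return "|".join(parts)


# The two requests from the slides: one asks for gzip, one does not
gzip_client = cache_key("/", {"Accept-Encoding": "gzip"}, "Cookie,Accept-Encoding")
plain_client = cache_key("/", {}, "Cookie,Accept-Encoding")
```

With distinct keys, the second request would miss the gzip entry and be served an uncompressed variant instead.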
CoAdvisor HTTP protocol quality tests for reverse proxies
[Charts: stacked bars of Failures, Violations and Success for each proxy, axis 0 to 600]
First slide, percentage labels in listed order: ATS 3.3.1 49%, Nginx 1.3.9 81%, Squid 3.2.5 51%, Varnish 3.0.3 68%
Second slide, percentage labels in listed order: ATS 3.3.1 25%, Nginx 1.3.9 6%, Squid 3.2.5 27%, Varnish 3.0.3 15%
Architecture
Architecture And Process Models
• Multithreading
• Events
• Process
• Fibers
– Co-operative multitasking, getcontext/setcontext
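Co-operative multitasking can be sketched without getcontext/setcontext. The example below uses Python generators as a stand-in for fibers, which is an assumption made for illustration and not how any of the proxies discussed here implement it. Each task runs until it voluntarily yields, and a round-robin scheduler (the hypothetical `run_round_robin`) picks the next one.

```python
from collections import deque


def worker(name, steps, log):
    """A task that does one unit of work, then hands control back."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntary switch point; nothing preempts the task before this


def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, do not reschedule it
        ready.append(task)


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
```

Because a task only gives up the CPU at `yield`, a task that blocks or loops without yielding stalls every other task.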
Threads
[Diagram: Threads 1, 2 and 3 scheduled over time, first on a single CPU, then on a dual CPU]
Threads
• Pros
– Easy to share memory
– Lightweight context switching
• Cons
– Easy to (accidentally) share memory
• Overwriting another thread's memory
– Locking
• Deadlocks, race conditions, starvation
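A small sketch of the locking trade-off, using Python's `threading` module (the function name and the counts are made up for illustration). Several threads update one shared counter; the lock makes each read-modify-write step atomic, at the cost of serializing the threads on that section.

```python
import threading


def count_with_lock(num_threads: int = 4, increments: int = 10_000) -> int:
    """Increment a shared counter from several threads, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def work():
        nonlocal total
        for _ in range(increments):
            # Without the lock, two threads could read the same old value
            # and one of the increments would be lost (a race condition).
            with lock:
                total += 1

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Taking a second lock inside the critical section, in a different order in different threads, is the classic way to turn this into a deadlock.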
Event Processing
[Diagram: an event loop takes scheduled, network and disk I/O events from a queue and dispatches them to the disk handler, HTTP state machine and accept handler; handlers can generate new events]
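The event loop idea can be sketched in a few lines. This is a toy model with hypothetical handler names and event kinds, not the design of any particular server: events sit in a queue, the loop dispatches each one to its handler, and a handler may return new events that are queued in turn.

```python
from collections import deque


def run_event_loop(initial_events, handlers):
    """Dispatch queued (kind, payload) events until the queue is empty."""
    queue = deque(initial_events)
    handled = []
    while queue:
        kind, payload = queue.popleft()
        handled.append(kind)
        # A handler returns a list of follow-up events (possibly empty)
        queue.extend(handlers[kind](payload))
    return handled


def on_accept(conn):
    return [("network", conn)]          # new connection: wait for its request


def on_network(conn):
    return [("disk", f"/cache/{conn}")]  # request read: look up the cache


def on_disk(path):
    return []                            # cache read finished, nothing more to do


handlers = {"accept": on_accept, "network": on_network, "disk": on_disk}
order = run_event_loop([("accept", "client-1")], handlers)
```

Only one handler runs at a time, so handlers need no locks, but every handler must return quickly.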
Problems with Event Processing
• Doesn’t work well with blocking APIs
– open(), locking
• It doesn’t scale on SMP by itself
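A common workaround for blocking APIs is to hand them to a thread pool so the event loop keeps running. The sketch below uses Python's `asyncio` and `loop.run_in_executor`; the `blocking_read` and `ticker` functions and the sleep durations are stand-ins chosen for illustration, and this is not a description of how the proxies above solve the problem.

```python
import asyncio
import time


def blocking_read() -> str:
    """Stand-in for a blocking call such as open() or a slow disk read."""
    time.sleep(0.05)
    return "data"


async def ticker(ticks: list) -> None:
    """Other work the event loop should keep servicing in the meantime."""
    for _ in range(3):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    # The blocking call runs in the default thread pool, not on the loop thread
    result, _ = await asyncio.gather(
        loop.run_in_executor(None, blocking_read),
        ticker(ticks),
    )
    return result, len(ticks)


result, tick_count = asyncio.run(main())
```

Calling `blocking_read()` directly inside a coroutine would freeze the loop, and every other connection with it, for the duration of the call.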
Process Model And Architecture
Model       ATS   NGiNX   Squid   Varnish   Apache httpd mod_proxy
Threads     X     -       -       X         X
Events      X     X       X       partial   X
Processes   -     X       X       -         X
Caching Architecture
Cache
• Mainly two types
– File system
– Database like
• In memory index
– Bytes per object
• Minimize disk seeks and system calls
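The "bytes per object" point is easy to quantify. The sketch below uses the per-object figures quoted in the editor's notes (72 or 104 bytes for Squid, 10 bytes for ATS); the object count of 100 million is a hypothetical value chosen only to make the arithmetic concrete.

```python
def index_memory_mib(num_objects: int, bytes_per_object: int) -> float:
    """Memory needed for the in-memory cache index, in MiB."""
    return num_objects * bytes_per_object / (1024 * 1024)


objects = 100_000_000  # hypothetical cache size, in objects

ats_mib = index_memory_mib(objects, 10)     # ATS: 10 bytes per object
squid_mib = index_memory_mib(objects, 104)  # Squid: upper figure of 72 or 104 bytes
```

Under these assumptions the ATS index needs just under 1 GiB while the Squid index needs close to 10 GiB, which is the roughly tenfold gap listed in the Squid cons.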
Cache
ATS NGiNX Squid Varnish Apache httpd
mod_cache
File system X X X
mmap X
Raw disk/direct IO X X
Ram cache X X
Memory index X X X*
Persistent cache X X X X
Performance Testing
ATS Configuration
etc/trafficserver/remap.config:
map / http://origin.example.com
etc/trafficserver/records.config:
CONFIG proxy.config.http.server_ports STRING 80
CONFIG proxy.config.accept_threads INT 3
NGiNX Configuration
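worker_processes 24;
access_log logs/access.log main;
proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m max_size=16384m inactive=600m;
proxy_temp_path /mnt/nginx_temp;
server {
    # Normalize Accept-Encoding to "gzip" or "" so it can be part of the cache key
    set $ae "";
    if ($http_accept_encoding ~* gzip) {
        set $ae "gzip";
    }
    location / {
        proxy_pass http://origin.example.com;
        proxy_cache my-cache;
        # Strip conditional headers so the origin always returns a full response
        proxy_set_header If-None-Match "";
        proxy_set_header If-Modified-Since "";
        proxy_set_header Accept-Encoding $ae;
        proxy_cache_key $uri$is_args$args$ae;
    }
    # proxy_cache_purge comes from the third-party ngx_cache_purge module;
    # the zone name and key are assumed to mirror the cache settings above
    location ~ /purge_it(/.*) {
        proxy_cache_purge my-cache $1$is_args$args$ae;
    }
}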
Squid Configuration
http_access allow all
http_port 80 accel
workers 24
cache_mem 4096 MB
memory_cache_shared on
cache_dir rock /usr/local/squid/cache 1000 max-size=32768
cache_peer origin.example.com parent 80 0 no-query originserver
Varnish Configuration
backend default {
.host = "origin.example.com";
.port = "80";
}
Varnish Configuration (Cont)
sudo /usr/local/sbin/varnishd -f /usr/local/etc/varnish/default.vcl -p thread_pool_max=4000
sudo /usr/local/sbin/varnishd -f /usr/local/etc/varnish/default.vcl -p thread_pool_max=2000 -p thread_pool_add_delay=2 -p thread_pool_min=200
sudo /usr/local/sbin/varnishd -f /usr/local/etc/varnish/default.vcl -p thread_pool_max=2000 -p thread_pool_add_delay=2 -p thread_pool_min=1000 -p session_linger=0
sudo /usr/local/sbin/varnishd -f /usr/local/etc/varnish/default.vcl -p thread_pool_max=2000 -p thread_pool_add_delay=2 -p thread_pool_min=1000 -p session_linger=10
Apache httpd Configuration
LoadModule cache_module modules/mod_cache.so
LoadModule cache_disk_module modules/mod_cache_disk.so
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
Include conf/extra/httpd-mpm.conf
ProxyPass / http://origin.example.com/
<IfModule mod_cache_disk.c>
CacheRoot /usr/local/apache2/cache
CacheEnable disk /
CacheDirLevels 5
CacheDirLength 3
</IfModule>
MaxKeepAliveRequests 10000
Benchmark 1
• 1,000 clients
• 8KB response
• 100% cache hit
• Keep-alive on
• 100K rps rate limited
• Squid used the most CPU and had the worst median latency
• 95th percentile latency with NGiNX, Squid and httpd
[Charts for ATS, NGiNX, Squid, Varnish and httpd: RPS / CPU Usage; Requests Per Second; Latency (median and 95th percentile)]
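For readers reproducing the latency charts, the two statistics can be computed from raw per-request timings with Python's `statistics` module. This is a sketch of the calculation only: the helper name `latency_summary` is hypothetical, the sample data is synthetic, and the "inclusive" interpolation method is one of several reasonable choices, not necessarily the one used for the slides.

```python
import statistics


def latency_summary(latencies_ms):
    """Return (median, 95th percentile) for a list of request latencies."""
    median = statistics.median(latencies_ms)
    # quantiles(n=100) returns the 99 percentile cut points; index 94 is the 95th
    p95 = statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]
    return median, p95


# Synthetic sample: latencies of 1 ms through 100 ms, one request each
sample = list(range(1, 101))
median_ms, p95_ms = latency_summary(sample)
```

The median describes the typical request, while the 95th percentile exposes the slow tail that the median hides, which is why the benchmarks report both.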
Benchmark 2
• 1,000 clients
• 8KB response
• 100% cache hit
• Keep-alive off
• Squid used the most CPU again
• NGiNX had latency issues
• ATS had the most throughput
[Charts for ATS, NGiNX, Squid, Varnish and httpd: RPS / CPU Usage; Requests Per Second; Latency (median and 95th percentile)]
ATS
• Pros
– Scales well automatically, little config needed
– Best cache implementation
• Cons
– Too many config files
– Too many options in the default config files
NGiNX
• Pros
– Lots of plugins
– FastCGI support
• Cons
– Weak HTTP/1.1 compliance
– Latency issues around accepting new connections
– Rebuild server for new plugins
Squid
• Pros
– Best HTTP/1.1 compliance
• Cons
– Memory index for cache using 10x that of ATS
– Least efficient with CPU
– Worst median latency for keep-alive benchmarks
Varnish
• Pros
– VCL (Varnish Configuration Language)
• Can do a lot without writing plugins
• Cons
– Thread per connection
– mmap for cache
• Persistence is experimental
– No SSL or SPDY support
Apache httpd
• Pros
– Lots of plugins
– Most-used HTTP server
– Best 95th percentile latency for non-keep-alive
• Cons
– Only partial SPDY support (mod_spdy)
Why ATS?
• Scales well
– CPU Usage, auto config
• Cache scales well
– Efficient memory index, minimizes seeks
• Apache Community
• Plugin support
– Easy to port existing plugins over
References
• ATS - http://trafficserver.apache.org/
• NGiNX - http://nginx.org/
• Squid - http://www.squid-cache.org/
• Varnish - https://www.varnish-cache.org/
• Apache httpd - http://httpd.apache.org/

Editor's Notes

  • #6  A reverse proxy, aka a web accelerator, does not require the browser to cooperate in any special way. As far as the user (browser) is concerned, it looks like it’s talking to any other HTTP web server on the internet. The reverse proxy server on the other hand must be explicitly configured for what traffic it should handle, and how such requests are properly routed to the backend servers (aka. Origin Servers). Just as with a forward proxy, many reverse proxies are configured to cache content locally. It can also help load balancing and redundancy on the Origin Servers, and help solve difficult problems like Ajax routing.
  • #8 Before we go into details of what drives Traffic Server, and how we use it, let me briefly discuss the three most common proxy server configurations. In a forward proxy, the web browser has to be configured manually (or via auto-PAC files etc.) to use a proxy server for all (or some) requests. The browser typically sends the “full” URL as part of the GET request. The forward proxy typically does not need to be configured with “allowed” destination addresses, but it can be configured with Access Control Lists or blacklists controlling what requests are allowed, and by whom. A forward proxy is typically allowed to cache content, and a common use case is inside corporate firewalls.
  • #9 An intercepting proxy, also commonly called a transparent proxy, is very similar to a forward proxy, except the client (browser) does not require any special configuration. As far as the user is concerned, the proxying happens completely transparently. A transparent proxy will intercept the HTTP requests, modify them accordingly, and typically “forge” the source IP before forwarding the request to the final destination. Transparent proxies usually also implement traffic filters and monitoring, allowing for strict control of what HTTP traffic passes through the mandatory proxy layer. Typical use cases include ISPs and very strictly controlled corporate firewalls. I’m very excited to announce that as of a few days ago, code for transparent proxy is available in the subversion tree.
  • #15 Squid: SPDY is not on the roadmap (http://wiki.squid-cache.org/Squid-3.5) or in the bugs for 3.5, and there is no progress on HTTP/2 (http://wiki.squid-cache.org/Features/HTTP2). ESI: Edge Side Includes (http://en.wikipedia.org/wiki/Edge_Side_Includes). ICP: Internet Cache Protocol (http://www.ietf.org/rfc/rfc2186.txt). httpd: mod_spdy uses Chromium's SpdyFramer class to encode and decode SPDY frames.
  • #16 https://istlsfastyet.com/ - Ilya Grigorik
  • #19 NGiNX – doesn’t handle accept-encoding or vary at all
  • #26 Multithreading allows a process to split itself, and run multiple tasks in “parallel”. There is significantly less overhead running threads compared to individual processes, but threads are still not free. They need memory resources, and incur context switches. It’s a known methodology for solving the concurrency problem, and many, many server implementations rely heavily on threads. Modern OSes have good support for threads, and standard libraries are widely available.
  • #28 Events are scheduled by the event loop, and event handlers execute specific code for specific events. This makes it easier to code for: there’s no risk of deadlock or race condition. It can handle a good number of connections (but not unlimited). Squid is a good example of an event-driven server.
  • #32 Squid: 72 or 104 bytes of metadata in memory for every object in your cache (http://wiki.squid-cache.org/SquidFaq/SquidMemory#Why_does_Squid_use_so_much_memory.21.3F). ATS: 10 bytes.
  • #33 Squid: ufs (file system) and rock store (database style). Varnish: since it is an mmap cache and the index is part of the mmap, it has an in-memory index. ATS: uses a “cyclone cache”, similar to a log-based file system; it merges writes, so there is less seeking.
  • #35 ATS – should auto config accept threads
  • #44 NGiNX: uses the least CPU, but has really bad latencies. ATS: most throughput, and uses a lot less CPU than Squid, Varnish and httpd.
  • #48 VCL - https://www.varnish-cache.org/trac/wiki/VCLExamples
  • #49 httpd - mod_spdy uses Chromium's SpdyFramer class to encode and decode SPDY frames.