SNI-based reverse proxying with Go(lang)
Short version for readers who know all about this kind of stuff: we built a simple reverse-proxy server in Go that load-balances HTTP requests using the Host header and HTTPS using the SNI from the client handshake. Backends are selected per-host from sets stored in a redis database. It works pretty well, but we won't be using it because it can't send the originating client IP to the backends when it's handling HTTPS. Code here.
We've been looking at options to load-balance our users' web applications at PythonAnywhere. This post is about something we considered but eventually abandoned; I'm posting it because the code might turn out to be useful to other people.
A bit of background first; if you already know what a reverse proxy is and how load-balancing and virtual hosting work, you can skip forward a bit.
Imagine an old-fashioned shared hosting environment. You're able to run a web
application on a machine that's being used by lots of other people, and you're
given that machine's IP address. You set up your DNS configuration so that your
domain points to that IP address, and it all works. When a connection comes in
from a browser to access your site, the web server on the machine needs to work
out which person's web app it should route it to. It does this by looking at
the HTTP request and finding a Host
header in it. So, by using the Host
header, the shared hosting provider can keep costs down by sharing an IP address
and a machine between multiple clients. This is called virtual hosting.
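To make that concrete, here's a rough sketch of Host-header dispatch using Go's standard library. It isn't code from our proxy, and the hostnames and handlers are invented purely for the example; a real server would also strip any port from r.Host before looking it up.

package main

import (
    "fmt"
    "log"
    "net/http"
)

func main() {
    // Map each hosted domain to the handler for that customer's app.
    // These hostnames are made up purely for illustration.
    vhosts := map[string]http.Handler{
        "alice.example.com": http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "Alice's app")
        }),
        "bob.example.com": http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "Bob's app")
        }),
    }

    log.Fatal(http.ListenAndServe(":80", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // r.Host comes from the request's Host header.
        if handler, ok := vhosts[r.Host]; ok {
            handler.ServeHTTP(w, r)
            return
        }
        http.NotFound(w, r)
    })))
}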
Now consider the opposite case -- a high-traffic website, where one machine isn't enough to handle all of the traffic. Processing a request for a page on a website can take a certain amount of machine resources -- database lookups, generating dynamic pages from templates, and so on. So a single web server might not be enough to cope with lots of traffic. In this case, people use what's called a reverse proxy, or load-balancer. In the simplest case, this is just a machine running on a single IP. When a request comes in, it selects a backend -- that is, one of a number of web servers, each of which is running the full website's code. It then just sends the request down to that backend, and copies all data that comes back from it up to the browser that made the request. Because just copying data around from backend to browser and vice versa is much easier work than processing the actual request, a single load-balancer can handle many more requests than any of the backend web servers could, and if it's configured to select backends appropriately it can spread the load smoothly across them. Additionally, this kind of setup can handle outages gracefully -- if one backend stops responding, it can stop routing to it and use the others as backups.
Now let's combine those two ideas. Imagine a platform-as-a-service, where each outward-facing IP might be responsible for handling large numbers of websites. But for reliability and performance, it might make sense to have each website backed by multiple backends. So, for example, a PaaS might have a thousand websites backed by one hundred different webservers, where website one is handled by backends one, two and three, website two by backends two, three and four, and so on. This means that the PaaS can keep costs down (each backend server handling around thirty websites) and reliability and performance up (each website having three independent backends).
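Stripped of detail, the load-balancing itself is just that copying. Here's a hedged sketch of the idea in Go -- not the real project's code: the host-to-backends mapping here is a hard-coded in-memory map rather than the redis sets the actual proxy uses, backend selection is plain random choice, and it ignores the need to replay any bytes the proxy had to read to work out which host the connection was for.

package main

import (
    "io"
    "log"
    "math/rand"
    "net"
)

// Hypothetical in-memory version of the host -> backends mapping; the real
// proxy keeps these sets in redis so they can change without a restart.
var backendsForHost = map[string][]string{
    "www.site-one.example": {"10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"},
    "www.site-two.example": {"10.0.0.2:80", "10.0.0.3:80", "10.0.0.4:80"},
}

// proxyConnection picks a backend for the given hostname and then just
// shovels bytes in both directions until one side closes the connection.
func proxyConnection(client net.Conn, host string) {
    defer client.Close()

    backends, ok := backendsForHost[host]
    if !ok || len(backends) == 0 {
        return // no backends known for this host
    }
    backend, err := net.Dial("tcp", backends[rand.Intn(len(backends))])
    if err != nil {
        return
    }
    defer backend.Close()

    // Copy client -> backend in the background, backend -> client here.
    go io.Copy(backend, client)
    io.Copy(client, backend)
}

func main() {
    ln, err := net.Listen("tcp", ":80")
    if err != nil {
        log.Fatal(err)
    }
    for {
        conn, err := ln.Accept()
        if err != nil {
            continue
        }
        // In the real proxy the hostname comes from the Host header (HTTP)
        // or the SNI (HTTPS); it's hard-coded here to keep the sketch short.
        go proxyConnection(conn, "www.site-one.example")
    }
}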
So, that's the basics. There are a number of great tools that can act as super-efficient proxies handling this kind of many-hostnames-to-many-backends mapping. nginx is the most popular, but there are also haproxy and hipache. We are planning to choose one of these for PythonAnywhere (more about that later), but we did identify one slight problem with all of them. The code I'm shortly going to show was our attempt at working around that problem.
The description above of how virtual hosting works is fine when we're talking about HTTP. But increasingly, people want to use HTTPS for secure connections.
When an HTTPS connection comes in, the server has a problem. Before it can
decode what's in the request and get the Host
header, it needs to establish a
secure link. Its first step to establish that link is to send a certificate to
the client to prove it is who it says it is. But each of the different virtual
hosts on the machine will need a different certificate, because they're all on
different domains. So there's a chicken-and-egg problem; it needs to know which
host it is meant to be in order to send the right certificate, but it needs to
have sent the certificate in order to establish a secure connection to find out
which host it is meant to be. This was a serious problem until relatively
recently; basically, it meant that every HTTPS-secured site had to have its own
dedicated IP address, so that the server could tell which certificate to serve
when a client connected by looking at the IP address the connection came in on.
This problem was solved by an extension to the TLS protocol (TLS being the
latest protocol to underlie HTTPS) called "Server Name Indication". Basically,
it takes the idea of the HTTP Host header and moves it down the stack a bit.
The initial handshake message a client sends when it connects to a server used
to just say "here I am and here's the kind of SSL protocol I can handle -- now
what's your certificate?" With SNI, the handshake also says "here's the
hostname I expect you to have".
So with SNI, a browser connects to a server, and the server looks at the
handshake to find out which certificate to use. The browser and server
establish a secure link, and then the browser sends the normal HTTP request,
which has a Host header, which the server then uses to route the request to
the appropriate web app.
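As an aside for the curious: a normal (non-proxying) server written in Go can do the certificate-picking side of this with the standard crypto/tls package, which hands you the SNI hostname during the handshake. This is just an illustrative sketch, nothing to do with the proxy itself, and the certs map is assumed to be populated from somewhere:

package main

import (
    "crypto/tls"
    "fmt"
    "log"
    "net/http"
)

func main() {
    // Hypothetical pre-loaded certificates, one per hosted domain;
    // in real life these would be read from disk or a database.
    certs := map[string]*tls.Certificate{}

    server := &http.Server{
        Addr: ":443",
        TLSConfig: &tls.Config{
            // GetCertificate is called during the handshake; hello.ServerName
            // is the hostname the client sent in the SNI extension.
            GetCertificate: func(hello *tls.ClientHelloInfo) (*tls.Certificate, error) {
                if cert, ok := certs[hello.ServerName]; ok {
                    return cert, nil
                }
                return nil, fmt.Errorf("no certificate for %q", hello.ServerName)
            },
        },
    }
    // Empty filenames are fine here because GetCertificate supplies the certs.
    log.Fatal(server.ListenAndServeTLS("", ""))
}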
Let's get back to the proxy server that's handling incoming requests for lots of
different websites and routing them to lots of different backends. With all of
the proxies mentioned above -- nginx, hipache and haproxy -- a browser makes a
connection, the proxy does all of the SNI stuff to pick the right certificate,
it decodes the data from the client, works out which backend to send it to
using the Host
header in the decoded data, and then forwards everything on.
There's an obvious inefficiency here. The proxy shouldn't have to decode the
secure connection to get the Host
header -- after all, it already knows that
from the information in the SNI. And it gets worse. Decoding the secure
connection uses up CPU cycles on the proxy. And either the connection between
the proxy and the backends is non-secure, which could be an issue if a hacker
got onto the network, or it's secure, in which case the proxy is decoding and
then encoding everything that goes through it -- even more CPU load. Finally,
all of the certificates for every site that the proxy's handling -- and their
associated private keys -- have to be available to the proxy. Which is another
security risk if it gets hacked.
So, probably like many people before us, we thought "why not just route HTTPS based on the SNI? It can't be that hard!" And actually, it isn't. Here's a GitHub project with a simple Go application that routes HTTP requests using the Host header, and HTTPS using the SNI. It never needs to know anything about the certificates for the sites it's proxying for, and all data is passed through without any decryption.
So why did we decide against using it? Access logs and spam filters. The thing is,
people who are running websites like to know who's been looking at their stuff --
for their website metrics, for filtering out spammy people (using tools like
Akismet), and so on. If you're using a proxy, then the backend sees every
request as coming from the proxy's IP, which isn't all that useful. So normally
a proxy will add an extra header to HTTP requests it passes through --
X-Forwarded-For is the usual one.
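On the plain-HTTP side, adding that header is trivial once the proxy has a parsed request in hand. Here's a hedged sketch of the kind of helper involved (addForwardedFor and its clientAddr parameter are invented for this example; they're not from the project):

package proxy

import (
    "net"
    "net/http"
)

// addForwardedFor records the real client IP on a request before the proxy
// forwards it to a backend; clientAddr would typically come from
// req.RemoteAddr or net.Conn.RemoteAddr().
func addForwardedFor(req *http.Request, clientAddr string) {
    host, _, err := net.SplitHostPort(clientAddr)
    if err != nil {
        host = clientAddr // the address had no port; use it unchanged
    }
    // If the request already came through another proxy, append rather than
    // overwrite -- that's the conventional X-Forwarded-For behaviour.
    if prior := req.Header.Get("X-Forwarded-For"); prior != "" {
        host = prior + ", " + host
    }
    req.Header.Set("X-Forwarded-For", host)
}

Go's standard net/http/httputil.ReverseProxy does much the same thing automatically, which is why this is a non-problem for plain HTTP.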
And the problem with an SNI proxy is the same as its biggest advantage. Because it's not decoding the secure stream from the browser, it can't change it, so it can't insert any extra headers. So all HTTPS requests going over any kind of SNI-based reverse proxy will appear to come from the proxy itself. Which breaks things.
So we're not going to use this. And TBH it's not really production-level code -- it was a spike and is also the first Go code I've ever written, so it's probably full of warts (comments very much welcomed!). Luckily we realised the problem with the backends not knowing about the client's IP before we started work on rewriting it test-first.
On the other hand, it might be interesting for anyone who wants to do stuff like
this. The interesting stuff is mostly in handleHTTPSConnection, which decodes
the TLS handshake sent by the client to extract the SNI.
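To give a flavour of what that decoding involves, here's a freshly written sketch of pulling the server name out of a TLS ClientHello. It is not the handleHTTPSConnection code from the repo, the offsets should be double-checked against the TLS RFC before being trusted, and it assumes the caller has already read the whole handshake record into a byte slice. One important detail either way: because the proxy never re-encodes anything, whatever bytes it reads to do this parsing have to be forwarded to the chosen backend untouched.

package proxy

// extractSNI pulls the server name out of a raw TLS ClientHello, returning
// false if the data doesn't look like one or carries no SNI extension.
func extractSNI(data []byte) (string, bool) {
    // TLS record header: content type (1) + version (2) + length (2);
    // 0x16 means "handshake".
    if len(data) < 5 || data[0] != 0x16 {
        return "", false
    }
    recLen := int(data[3])<<8 | int(data[4])
    if len(data) < 5+recLen {
        return "", false
    }
    hs := data[5 : 5+recLen]

    // Handshake header: message type (1) + length (3); 0x01 is ClientHello.
    if len(hs) < 4 || hs[0] != 0x01 {
        return "", false
    }
    p := hs[4:]

    // Skip client version (2 bytes) and random (32 bytes).
    if len(p) < 34 {
        return "", false
    }
    p = p[34:]

    // Skip the variable-length session ID.
    if len(p) < 1 || len(p) < 1+int(p[0]) {
        return "", false
    }
    p = p[1+int(p[0]):]

    // Skip the cipher suites list (2-byte length prefix).
    if len(p) < 2 {
        return "", false
    }
    csLen := int(p[0])<<8 | int(p[1])
    if len(p) < 2+csLen {
        return "", false
    }
    p = p[2+csLen:]

    // Skip the compression methods list (1-byte length prefix).
    if len(p) < 1 || len(p) < 1+int(p[0]) {
        return "", false
    }
    p = p[1+int(p[0]):]

    // Extensions block: 2-byte total length, then type/length/data triples.
    if len(p) < 2 {
        return "", false
    }
    extLen := int(p[0])<<8 | int(p[1])
    p = p[2:]
    if len(p) < extLen {
        return "", false
    }
    p = p[:extLen]

    for len(p) >= 4 {
        extType := int(p[0])<<8 | int(p[1])
        l := int(p[2])<<8 | int(p[3])
        if len(p) < 4+l {
            return "", false
        }
        body := p[4 : 4+l]
        p = p[4+l:]
        if extType != 0 { // extension type 0 is server_name
            continue
        }
        // server_name extension: 2-byte list length, then entries of
        // name type (1) + name length (2) + name bytes.
        if len(body) < 2 {
            return "", false
        }
        body = body[2:]
        for len(body) >= 3 {
            nameLen := int(body[1])<<8 | int(body[2])
            if len(body) < 3+nameLen {
                return "", false
            }
            if body[0] == 0 { // name type 0 is host_name
                return string(body[3 : 3+nameLen]), true
            }
            body = body[3+nameLen:]
        }
    }
    return "", false
}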
I did a bit of very non-scientific testing just to make sure it all works. I started three backend servers with simple Flask apps that did a sleep on every request to simulate processing:
from flask import Flask
import time
from socket import gethostname

app = Flask(__name__)

@app.route("/")
def index():
    time.sleep(0.05)
    return "Hello from " + gethostname()

if __name__ == "__main__":
    app.run("0.0.0.0", 80, processes=4)
Then I ran the Apache ab tool to see what the performance characteristics were for one of them:
root@abclient:~# ab -n1000 -c100 http://198.199.83.71/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 198.199.83.71 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software: Werkzeug/0.9.2
Server Hostname: 198.199.83.71
Server Port: 80
Document Path: /
Document Length: 19 bytes
Concurrency Level: 100
Time taken for tests: 21.229 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 172000 bytes
HTML transferred: 19000 bytes
Requests per second: 47.10 [#/sec] (mean)
Time per request: 2122.938 [ms] (mean)
Time per request: 21.229 [ms] (mean, across all concurrent requests)
Transfer rate: 7.91 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 3 7.4 0 37
Processing: 73 2025 368.7 2129 2387
Waiting: 73 2023 368.4 2128 2386
Total: 103 2028 363.7 2133 2387
Percentage of the requests served within a certain time (ms)
50% 2133
66% 2202
75% 2232
80% 2244
90% 2286
95% 2317
98% 2344
99% 2361
100% 2387 (longest request)
root@abclient:~#
Then, after adding records to the proxy's redis instance to tell it to route requests with the hostname proxy to any of the backends, and hacking the hosts file on the ab client machine to make the hostname proxy point to the proxy machine:
root@abclient:~# ab -n1000 -c100 http://proxy/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking proxy (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software: Werkzeug/0.9.2
Server Hostname: proxy
Server Port: 80
Document Path: /
Document Length: 19 bytes
Concurrency Level: 100
Time taken for tests: 7.668 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 172000 bytes
HTML transferred: 19000 bytes
Requests per second: 130.41 [#/sec] (mean)
Time per request: 766.803 [ms] (mean)
Time per request: 7.668 [ms] (mean, across all concurrent requests)
Transfer rate: 21.91 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 1.7 0 9
Processing: 93 695 275.4 617 1228
Waiting: 93 693 275.4 614 1227
Total: 99 696 274.9 618 1228
Percentage of the requests served within a certain time (ms)
50% 618
66% 799
75% 948
80% 995
90% 1116
95% 1162
98% 1185
99% 1204
100% 1228 (longest request)
root@abclient:~#
So, it works. I've not done ab testing with the HTTPS side of things, but I have hacked my own hosts file and spent a day accessing Google and PythonAnywhere itself via the proxy. It works :-)
As to what we're actually going to use for load-balancing PythonAnywhere:
- nginx is great but stores its routing config in files, which doesn't easily scale to large numbers of hosts/backends. It's doable, but it's just a nightmare to manage, especially if things go wrong.
- haproxy is the same -- worse, it needs to be fully restarted (interrupting ongoing connections) if you change the config.
- hipache stores data in redis (which is what inspired me to do something similar for this proxy) so it can gracefully handle rapidly-changing routing setups. But it's written in Node.js, so while it's pretty damn fast, it's not as fast as nginx.
But... as the dotcloud people who wrote hipache recently pointed out (bottom of the post), nginx's Lua scripting support is now at a level where you can store your routing config in redis -- so with a bit of work, you can get the speed of nginx with the ease of configuration of hipache. So that's where we're heading. We'll just have to make sure the proxy and its certificates are super-secure, and live with the extra CPU load.