Writing a reverse proxy/loadbalancer from the ground up in C, part 1: a trivial single-threaded proxy
This is the first step along my road to building a simple C-based reverse proxy/loadbalancer so that I can understand how nginx / OpenResty works -- more explanation here. It's called rsp, for Really Simple Proxy. This version listens for connections on a particular port, specified on the command line; when one is made it sends the request down to a backend -- another server with an associated port, also specified on the command line -- and sends whatever comes back from the backend back to the person who made the original connection. It can only handle one connection at a time -- while it's handling one, it just queues up others, and it handles them in turn. This will, of course, change later.
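To make that flow concrete, here is a minimal sketch of the listen/accept/forward loop in C. It is not the actual rsp source: the function names (`write_all`, `relay_until_eof`, `connect_to_backend`, `handle_client`, `serve_forever`) are hypothetical, and it assumes that the whole request arrives in a single `read` and that the backend closes its connection once it has sent the full response.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <netdb.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define BUFFER_SIZE 4096

/* Write exactly len bytes, retrying after short writes. Returns 0 or -1. */
int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0) return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy from from_fd to to_fd until EOF. Returns bytes copied, or -1. */
long relay_until_eof(int from_fd, int to_fd) {
    char buf[BUFFER_SIZE];
    long total = 0;
    ssize_t n;
    while ((n = read(from_fd, buf, sizeof buf)) > 0) {
        if (write_all(to_fd, buf, (size_t)n) < 0) return -1;
        total += n;
    }
    return n < 0 ? -1 : total;
}

/* Open a TCP connection to host:port. Returns a socket fd, or -1. */
int connect_to_backend(const char *host, const char *port) {
    struct addrinfo hints, *addrs, *a;
    int fd = -1;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &addrs) != 0) return -1;
    for (a = addrs; a != NULL; a = a->ai_next) {
        fd = socket(a->ai_family, a->ai_socktype, a->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, a->ai_addr, a->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(addrs);
    return fd;
}

/* Forward one request to the backend and stream the reply to the client.
   Assumption: the request fits in a single read(). */
void handle_client(int client_fd, const char *host, const char *port) {
    char buf[BUFFER_SIZE];
    int backend_fd = connect_to_backend(host, port);
    if (backend_fd >= 0) {
        ssize_t n = read(client_fd, buf, sizeof buf);
        if (n > 0 && write_all(backend_fd, buf, (size_t)n) == 0)
            relay_until_eof(backend_fd, client_fd);
        close(backend_fd);
    }
    close(client_fd);
}

/* Accept clients one at a time; pending connections wait in the kernel's
   listen backlog until the current one is finished. Returns -1 on setup
   failure and otherwise never returns. */
int serve_forever(const char *listen_port, const char *host, const char *port) {
    struct addrinfo hints, *addrs;
    int listen_fd, client_fd, yes = 1;
    signal(SIGPIPE, SIG_IGN); /* a vanished client should not kill us */
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;
    if (getaddrinfo(NULL, listen_port, &hints, &addrs) != 0) return -1;
    listen_fd = socket(addrs->ai_family, addrs->ai_socktype, addrs->ai_protocol);
    if (listen_fd < 0) { freeaddrinfo(addrs); return -1; }
    setsockopt(listen_fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);
    if (bind(listen_fd, addrs->ai_addr, addrs->ai_addrlen) < 0 ||
        listen(listen_fd, 10) < 0) {
        freeaddrinfo(addrs);
        close(listen_fd);
        return -1;
    }
    freeaddrinfo(addrs);
    for (;;) {
        client_fd = accept(listen_fd, NULL, NULL);
        if (client_fd >= 0) handle_client(client_fd, host, port);
    }
}
```

A `main` for this sketch would only need to check that three arguments were given (listen port, backend host, backend port) and pass them to `serve_forever`. Because `handle_client` does not return until the backend has closed its side, a second client simply sits in the listen backlog in the meantime, which is the "one connection at a time" behaviour described above.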
I'm posting this in the hope that it might help people who know Python, and some basic C, but want to learn more about how the OS-level networking stuff works. I'm also vaguely hoping that any readers who code in C day to day might take a look and tell me what I'm doing wrong :-)
Writing a reverse proxy/loadbalancer from the ground up in C, part 0: introduction
We're spending a lot of time on nginx configuration at PythonAnywhere. We're a platform-as-a-service, and a lot of people host their websites with us, so it's important that we have a reliable load-balancer to receive all of the incoming web traffic and appropriately distribute it around backend web-server nodes.
nginx is a fantastic, possibly unbeatable tool for this. It's fast, reliable, and lightweight in terms of CPU resources. We're using the OpenResty variant of it, which adds a number of useful modules -- most importantly for us, one for Lua scripting, which means that we can dynamically work out where to send traffic as the hits come in.
It's also quite simple to configure at a basic level. You want all incoming requests for site X to go to backend Y? Just write something like this:
server {
    server_name X;
    listen 80;
    location / {
        proxy_set_header Host $host;
        proxy_pass Y;
    }
}
Simple enough. Lua scripting is pretty easy to add -- you just put an extra directive before the proxy_pass that provides some Lua code to run, and then variables you set in the code can be accessed from the proxy_pass.
But there are many more complicated options. worker_connections, tcp_nopush, sendfile, types_hash_max_size... Some are reasonably easy to understand with a certain amount of reading, some are harder.
I'm a big believer that the best way to understand something complex is to try to build your own simple version of it. So, in my copious free time, I'm going to start putting together a simple loadbalancer in C. The aim isn't to rewrite nginx or OpenResty; it's to write enough equivalent functionality that I can better understand what they are really doing under the hood, in the same way as writing a compiler for a toy language gives you a better understanding of how proper compilers work. I'll get a good grasp on some underlying OS concepts that I have only a vague appreciation of now. It's also going to be quite fun coding in C again. I've not really written any since 1997.
Anyway, I'll document the steps I take here on this blog; partly because there's a faint chance that it might be interesting to other experienced Python programmers whose C is rusty or nonexistent and who want to get a view under the hood, but mostly because the best way to be sure you really understand something is to try to explain it to other people.
I hope it'll be interesting!
Here's a link to the first post in the series: Writing a reverse proxy/loadbalancer from the ground up in C, part 1: a trivial single-threaded proxy.