I write on this blog mostly to share what I learn in my career and also to improve my writing. I had very low expectations when I posted a link on reddit to a recent article I wrote. I never expected it to shoot to number one in the technology subreddit in a matter of minutes. According to my Google Analytics dashboard, the peak was over 1,700 people reading my blog at the same time, and that was plenty to overwhelm my server.
The article in question is titled "The PC is not dead, we just don't need new ones". If you haven't read it already, give it a go. You won't regret it.
I host this site on DigitalOcean. I never thought I would need more than the lowest plan they offer, which is a 2.3GHz core with 512MB of RAM and a 20GB SSD. Oh, and it only costs $5 a month. I run Apache and MySQL with my own PHP framework, which is very lightweight but still experimental. So before I tell you how I managed to handle this large number of requests, let me tell you how everything went wrong at first.
Everything breaking apart
Low memory means dropped requests
The first thing I noticed was that my server was running very low on memory. The Linux top command showed that I only had 6 MB of RAM available, which was weird because the website was still very much responsive. It was only after reading "Linux ate my RAM" that I realized I still had more resources available: Linux counts memory used for disk caching as used, even though it can be reclaimed when applications need it. MySQL was the main application eating all the memory. I restarted the service a few times, only to watch it grab all the memory again instantly.
By the way, the correct command for checking the memory available is:
$ free -m
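To make the difference between "free" and "available" memory concrete, here is a minimal Python sketch. It is not something I ran on the server. It reads text in the format of /proc/meminfo and prefers the kernel's own MemAvailable figure, falling back to MemFree + Buffers + Cached on older kernels that don't report it. The sample numbers are made up for illustration, apart from the 6 MB of free memory.

```python
def available_kb(meminfo_text):
    """Estimate reclaimable-plus-free memory from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # values are reported in kB
    # Newer kernels report MemAvailable directly; prefer it when present.
    if "MemAvailable" in fields:
        return fields["MemAvailable"]
    # Rough fallback: free memory plus buffers and page cache.
    return fields["MemFree"] + fields.get("Buffers", 0) + fields.get("Cached", 0)


# Illustrative sample only: these are not real readings from my server.
SAMPLE = """MemTotal:         501760 kB
MemFree:            6144 kB
Buffers:           20480 kB
Cached:           102400 kB
"""

sample_available_mb = available_kb(SAMPLE) // 1024
print(sample_available_mb)
```

On a real Linux machine you could pass in the contents of /proc/meminfo instead of the sample string.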
My first solution was to set up Memcached directly on my server. Memcached is a caching service that stores and retrieves data directly from memory, which can be used to limit the number of connections to the database. Writing code on a production server can be very stressful: any little mistake can break the site. To limit the risk of failure, I decided to implement Memcached only on the article pages. After a successful implementation, the memory consumption decreased considerably.
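The idea of caching article pages can be sketched roughly like this. This is not my actual PHP code. It is a small Python illustration in which TinyCache is an in-memory stand-in for a Memcached client, and load_article_from_db, get_article, the key format and the one-hour lifetime are all hypothetical names and values chosen for the example.

```python
import time


class TinyCache:
    """In-memory stand-in for a Memcached client (get/set with expiry)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries behave like a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


db_queries = 0


def load_article_from_db(slug):
    """Hypothetical database read; counts how often it is called."""
    global db_queries
    db_queries += 1
    return {"slug": slug, "body": "article text for " + slug}


def get_article(cache, slug, ttl_seconds=3600):
    """Serve from the cache when possible, hit the database only on a miss."""
    key = "article:" + slug
    article = cache.get(key)
    if article is None:
        article = load_article_from_db(slug)
        cache.set(key, article, ttl_seconds)
    return article


cache = TinyCache()
for _ in range(1000):
    get_article(cache, "pc-is-not-dead")
print(db_queries)  # the database is queried once for 1000 page views
```

The point is simply that a thousand page views turn into a single database query while the cached copy is still fresh.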
CPU was at 100%
Memory was only part of the problem. At that point I started getting even more traffic, and this time Apache was getting overwhelmed. I watched helplessly as the server failed requests. At times I couldn't reach it at all. I ended up restarting the server every 10 minutes because the CPU was at 100%.
My solution this time was to move the static files to a different server or a CDN. With all the stress of coding directly on my server, I couldn't get the DNS to register my new CDN domain name. I struggled for a while and then gave up. I went on reddit to see what people were saying about my article and, as they put it so well, "We gave him the good old reddit hug." The good news was that people started posting links to the Google cached version of my page, which according to Google Analytics accounted for more than 5,000 page views. (Un)fortunately, after 4 hours of being number one on the technology subreddit, my article was marked as SPAM and yanked off the page. The only explanation I got from the moderators was that the post was an opinion, not news. In some ways I was relieved, as I could see the traffic going down. It was still a lot, but my server could handle it.
Why was the server overwhelmed in the first place?
After the fact, I went through my error logs, access logs, database schema, and the PHP framework to see what the problem was. My blog receives very little traffic on a daily basis. Although I have been on the Hacker News front page before, it was nothing compared to this experience. Inefficient code is fine on a small scale, but on a larger scale the little optimizations you ignore can come back to bite you. My tables were indexed, but there was a lot of redundant code marked "// TODO: improve this function" that caused bottlenecks in my framework. Test code was still present on my server, but like I said, this framework is experimental.
The one thing that hit my server the hardest was a buggy experimental RSS service I was testing. For each request a user made, 2 more requests were made in the background to my RSS page, and 2 more were made just to double check the data. So each request to an article page cost 5 requests on my server.
Another big issue was using Apache to serve my static files. It is an unspoken truth on the Internet that Nginx is better suited than Apache to serve static files. On this page, I was serving 17 files directly from Apache, some of them large PNGs.
Optimizing the code
First of all, this blog is an experiment. This is the place where I get to test my code in a real environment. If I screw up and my website is down for 24 hours, it is not a big deal. It's just another challenge to surmount.
The first thing I did was take my time (no pressure this time) to move all my static assets to a different server running Nginx. I didn't use Amazon or any of those fancy services because I am still in a learning process. This server has a little more resources: 1GB of RAM, which is more than enough for my needs.
I added a nice integration of Memcached directly in my framework, which gives me much better control. Articles are cached for one hour at a time, and that duration can easily be increased if I get a burst of traffic. Complex queries to get related articles, the latest articles being read, and many more have also been cached, and the expiration time is adjusted automatically when I receive more visits.
Although this is just a simple blog, I do a lot of things in the background. Database inserts can become a bottleneck if they are done too often. Instead of inserting on each request, I now add to a queue which is saved in Memcached and periodically inserted into the database in a batch.
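Here is one way the "longer expiration under load" idea could look. This is a minimal Python sketch and not the rule my framework uses. The function name choose_ttl, the threshold of 600 requests per minute and the six-hour cap are assumptions made up for the example. Only the one-hour baseline comes from the setup described above.

```python
def choose_ttl(requests_per_minute, base_ttl=3600, max_ttl=6 * 3600, busy_threshold=600):
    """Return a cache lifetime in seconds that grows with traffic.

    Below the (assumed) busy threshold, entries live for the base hour.
    Above it, the lifetime scales with the load, up to a fixed cap.
    """
    if requests_per_minute <= busy_threshold:
        return base_ttl
    scaled = base_ttl * requests_per_minute // busy_threshold
    return min(scaled, max_ttl)


quiet_ttl = choose_ttl(100)     # normal day: keep the one-hour lifetime
busy_ttl = choose_ttl(1200)     # twice the threshold: twice the lifetime
spike_ttl = choose_ttl(60000)   # huge spike: clamped to the cap
print(quiet_ttl, busy_ttl, spike_ttl)
```

Capping the lifetime is a deliberate choice in this sketch, so that a traffic spike never leaves stale pages cached indefinitely.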
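The batching idea can be sketched as follows. Again, this is a Python illustration under stated assumptions and not my framework's code. A plain list stands in for the queue I keep in Memcached, an in-memory SQLite database stands in for MySQL, and the HitQueue class, the hits table and the batch size of 100 are hypothetical. The sketch also flushes when the queue reaches a certain size, whereas a periodic timer would work just as well.

```python
import sqlite3


class HitQueue:
    """Collects rows in memory and writes them to the database in batches."""

    def __init__(self, conn, batch_size=100):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = []  # stand-in for a queue stored in Memcached

    def record(self, path, ts):
        self.pending.append((path, ts))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Insert all pending rows in one transaction; return how many."""
        if not self.pending:
            return 0
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(
                "INSERT INTO hits (path, ts) VALUES (?, ?)", self.pending
            )
        count = len(self.pending)
        self.pending = []
        return count


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (path TEXT, ts INTEGER)")

queue = HitQueue(conn, batch_size=100)
for i in range(250):
    queue.record("/blog/article", i)  # two automatic flushes happen here
queue.flush()                         # write the remaining 50 rows

total = conn.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
print(total)
```

In this run, 250 page views result in three batched writes instead of 250 individual inserts. The trade-off is that anything still sitting in the queue is lost if the cache is restarted before a flush.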
Did these changes make a difference?
Due to the popularity of the post on reddit (I assume), it was also posted on Hacker News shortly after. It didn't take very long to rise to number one. This time, the server was doing just fine. My article was on the front page for a little more than 24 hours. The traffic was less than from reddit, but nevertheless everything functioned properly with no complaints whatsoever.
You might ask why I didn't just use a static blog engine like Jekyll and the like. Yes, I could have, but that wouldn't be much of a challenge, would it? I have been working on my framework for quite a while and my blog is the perfect place for me to stress test it.
$ ab -n 1000 -c 5 http://idiallo.com/
It would be nice to use a command like this (ApacheBench sending 1,000 requests, 5 at a time) to stress test, but to me it wouldn't be very realistic. The lesson I learned from watching my server fail right before my eyes is much more valuable.
I wouldn't say these were very big improvements to my code, but they sure did make a difference. Using a CDN (just Nginx), caching (Memcached), and small optimizations to the code were all that was necessary to have $15.00 a month of hosting handle a total of 2.9 million web requests: $5.00 for the server hosting the code and $10.00 for the static file server (which also runs other services). I hope to have the opportunity to watch my current setup fail as well and face the challenge of optimizing it.
You can give loader.io a try for load testing your site. Disclaimer: the company I work for built it in our labs department.