For those who don't know, Node.js does not come with static file serving built in (for good reason), so users need to look to external modules to serve static files.
Lightnode
I've just released lightnode, a static file server for Node.js, which aims to replace my usage of lighttpd for serving static files for web applications. It's called lightnode for several reasons. The fact that it sounds like 'lightning' kind of tipped it: I really wanted to take advantage of node's speed, and it's designed to be a replacement for lighttpd, which is itself very fast at serving static files. The other important reason for its name is that it's meant to be lightweight in terms of concepts, so that it's easy to understand and manipulate, which I believe is a very important aspect of using node as a file server. In this post I'll go over the simple means through which lightnode attains its speed, and give some pointers about writing efficient code for node.
Benchmarking
So far I've seen very few benchmarks for Node.js based static file servers. Many of these servers are sub-500-line modules, so they have likely been created fairly quickly by their authors. Lightnode is no exception, except that I've created it to try to take advantage of node's potential in direct comparison to traditional servers such as Apache and lighttpd. I intend to keep it around, so it's important for me to know how efficient a node based static file server can be in production environments.
Here I've benchmarked my server against the super fast lighttpd, as well as against another node option, which is the connect framework's static file server module (I'm not a fan of the framework, but their static server made by one of the learnboost guys seems to have been the best of the bunch).
I created the first version of lightnode with all the features I wanted such as hierarchical servers and http caching - leaving request efficiency for later - then started the benchmarking.
I've used the Apache benchmarking program (ab) and set it to send 10,000 requests over 100 concurrent connections. I've included the raw results in my lightnode repo. The most important number that ab churns out is the milliseconds per request, which I'll focus on.
Pre-Optimization Comparison
At this first phase it performed quite poorly, but that was to be expected. I wanted to get a feeling for what optimizations were important and how important they were.
Lighttpd was outperforming my pre-optimized server almost threefold: it took 19ms per request and served 5267 requests per second, whereas lightnode took 56ms per request and served 1769 requests per second. What was most concerning about this result, however, is that all but one of the static file server modules published for node so far were doing exactly the same thing as my server, so they probably have the same performance.
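As a quick sanity check on those figures, here is a small sketch of how the two numbers relate. It assumes the quoted times are ab's mean time per request at the concurrency level used for the run; the function name is only for illustration.

```javascript
// At a fixed concurrency level, ab's mean "Time per request" and
// "Requests per second" are two views of the same measurement:
//   requestsPerSecond = concurrency / (msPerRequest / 1000)
function requestsPerSecond(msPerRequest, concurrency) {
  return concurrency / (msPerRequest / 1000);
}

const concurrency = 100; // concurrent connections used in these runs

// The millisecond figures are rounded, so the results only roughly
// match the measured request rates.
console.log(Math.round(requestsPerSecond(19, concurrency))); // 5263 (lighttpd, measured 5267)
console.log(Math.round(requestsPerSecond(56, concurrency))); // 1786 (lightnode, measured 1769)
```

The small gaps come from rounding the per-request times to whole milliseconds.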
I wanted to make sure that this result wasn't down to something I was doing terribly wrong, so I benchmarked another server. I decided to try the connect server because I hadn't looked at its source yet.
I was quite surprised to see that the connect static server was actually as fast as lighttpd. When I checked out the source I realized that it was the only node server that was doing in-memory caching of the file contents. Switching this caching off caused the connect server to perform the same as my server.
It turns out that in-memory caching of static file contents can increase performance threefold, and give you efficiency equal to the fastest servers out there. The memory usage for most sites would be very small, provided you choose not to cache the big files.
In general you want to stream the larger files and dynamic content, and you want to cache smaller files completely, especially in Node applications. The operating system does keep a cache of file access, however, it still needs to make a user land copy of that memory for each request a process makes for it. As far as I know, in the case of node, copying that memory into the javascript application involves an expensive data encoding operation, so you really want to cache node's data buffers in your code if your application logic will allow it.
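A minimal sketch of that split, under stated assumptions: the 256 KB cut-off, the `send` function and the `contentCache` map are all made up for illustration, and `res` is expected to behave like an `http.ServerResponse`. Note that this version never invalidates its cache, so later changes to a cached file are not picked up.

```javascript
const fs = require('fs');
const { pipeline } = require('stream');

// Assumed cut-off: anything larger is streamed rather than cached.
const MAX_CACHED_BYTES = 256 * 1024;

// filename -> Buffer with the complete file contents
const contentCache = new Map();

async function send(filename, res) {
  let body = contentCache.get(filename);

  if (!body) {
    const { size } = await fs.promises.stat(filename);

    if (size > MAX_CACHED_BYTES) {
      // Large file: stream it from disk so it is never held in memory whole.
      res.writeHead(200, { 'Content-Length': size });
      // pipeline() destroys both streams if either one fails.
      pipeline(fs.createReadStream(filename), res, () => {});
      return;
    }

    // Small file: read it once and keep the Buffer for later requests.
    body = await fs.promises.readFile(filename);
    contentCache.set(filename, body);
  }

  res.writeHead(200, { 'Content-Length': body.length });
  res.end(body);
}
```

Keeping the Buffer itself, rather than a decoded string, means a cache hit can hand the same bytes straight to the response.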
However, all is not well in the land of efficient static servers for node. In order to cache the contents, the connect server has made the rather undesirable tradeoff that any changes made to a file during the lifespan of the server are completely ignored. That means you need to restart your production server every time you want even a small change to your static files to be reflected, potentially cutting off clients, and it is obviously an inconvenience during development. This is not how lighttpd achieves its speed. What I've done for lightnode is create a simple caching system that still reflects changes immediately while achieving the same speed.
Lightnode's Serving Process
Every time lightnode, like all the other servers, tries to serve a file, it calls stat() on the filename to see whether it exists and when it was last modified. The file is then read in completely with readFile(). In the case of connect, that file content is cached, and the server then avoids doing a stat() on any subsequent requests.
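The uncached process described above can be sketched roughly like this. It is an illustration of the stat-then-readFile pattern, not lightnode's actual source; `serveFile` is a hypothetical name and `res` is assumed to behave like an `http.ServerResponse`.

```javascript
const fs = require('fs').promises;

// Naive serving: every request costs one stat() plus one full readFile().
async function serveFile(filename, res) {
  let stats;
  let data;
  try {
    stats = await fs.stat(filename); // does it exist, when was it last modified?
    if (!stats.isFile()) throw new Error('not a regular file');
    data = await fs.readFile(filename); // read the whole file into a Buffer
  } catch (err) {
    // Simplification: any failure is reported as "not found".
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not found');
    return;
  }

  res.writeHead(200, {
    'Content-Length': data.length,
    'Last-Modified': stats.mtime.toUTCString(),
  });
  res.end(data);
}
```

A real server would also map the request URL to a safe path under its document root before calling something like this.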
Caching Properly
One of the main efficiency concerns I had for lightnode was doing that stat() on every request. It turns out that the stat adds about 10ms per request, and reading the file contents instead of caching them adds about 25ms, so both are things we want to avoid. However, seeing as we are dealing in milliseconds and aiming to serve some 5000 requests per second, we shouldn't really be doing the stat on every request anyway. Instead we should do it once and only repeat it a second or so later, at least for this application of serving 'static' files.
An Optimized Lightnode
It turns out that we don't need to avoid the stat on every request to achieve the same level of efficiency. Lightnode's caching mechanism skips the stat if the last one was done within a window that can be as short as 0.5 seconds. That is short enough for a development environment, where files are changed and viewed immediately, and it still achieves the same efficiency. The file content itself is cached and only re-read, on request, when the stat shows a change. This technique of stat caching plus file contents caching carries over to almost any re-use of file contents, and it's something every node developer should know about to achieve better efficiency.
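Here is a rough sketch of that kind of cache, assuming a 0.5 second window and using the modification time to detect changes. It is not lightnode's actual code: `getFile`, `cache` and `STAT_TTL_MS` are hypothetical names, and the injectable clock exists only to make the behaviour easy to try out.

```javascript
const fs = require('fs').promises;

const STAT_TTL_MS = 500; // re-stat a given file at most once per half second

// filename -> { checkedAt, mtimeMs, data }
const cache = new Map();

// Resolves with a Buffer of the file contents. `now` defaults to the real clock.
async function getFile(filename, now = Date.now) {
  const entry = cache.get(filename);

  // Checked recently: skip both the stat() and the read.
  if (entry && now() - entry.checkedAt < STAT_TTL_MS) {
    return entry.data;
  }

  let stats;
  try {
    stats = await fs.stat(filename);
  } catch (err) {
    cache.delete(filename); // file vanished or is unreadable: forget it
    throw err;
  }

  // Unchanged on disk: keep the cached contents and restart the window.
  if (entry && entry.mtimeMs === stats.mtimeMs) {
    entry.checkedAt = now();
    return entry.data;
  }

  // New or modified file: read it again and replace the cache entry.
  const data = await fs.readFile(filename);
  cache.set(filename, { checkedAt: now(), mtimeMs: stats.mtimeMs, data });
  return data;
}
```

With this shape, a burst of requests inside the window touches the file system only once, while an edited file is picked up on the first request after the window expires.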
The philosophy of node opens up a lot of potential that wasn't previously available to us, so we should get used to taking advantage of it. In a PHP application, for example, it was much more difficult to cache something for the lifespan of the server (and doing so is still conceptually at odds with the programming paradigm).
The optimized lightnode server is now performing almost equally with lighttpd under this workload. It's processing a request in 23ms (was 56ms) while lighttpd does it in 19ms, and it's serving 4348 requests per second (was only 1769) while lighttpd serves 5267 requests per second.
Node.js static servers can be just as fast as other servers, and offer the potential for much greater flexibility and ease of use in comparison to traditional servers. With proper caching mechanisms there is no need for unnecessary tradeoffs, and the server functionality is more suitable for production.
Check out the new lightnode server for Node.js. Besides lightning fast static file serving, it offers a lightweight framework for control handling even in non-file-server (dynamic) applications, which helps tame the immense flexibility that node provides.