We really need to migrate off of NGINX. The news of their commercial offering earlier this year wasn’t well received, but the magnitude of the problem wasn’t clear to me until earlier today. We’re working on centralizing logging (what a concept), not just as a best practice for issue resolution and health monitoring, but to reduce the write load on our beleaguered old NetApp at One Wilshire.
You see, Black Friday is coming up and we’re already observing latency spikes during the 8am sales event. Cyber Monday is one flash sale event every hour; I don’t have to do the math there. We’re virtually guaranteed to crush the NetApp with this onslaught. Even though our CEO is a great guy who understands the technical debt we’re saddled with, I can’t help but get a sinking feeling when the website takes a fat dump and he moseys on down from the 3rd floor to ask “how we’re doing” and “how we can mitigate future issues.”
So back to NGINX. Our noble HTTP server will generate, in aggregate, around 50M events a day. These all get flushed to disk from hundreds of VMs, whose hosts in turn flush their images to the NetApp. Getting these NGINX log events onto a remote server is easy with rsyslog. But preventing the log events from being written to disk in the first place? Fucking impossible, unless you a) don’t want logs or b) pony up for the commercial offering that patches direct syslog into the NGINX logging engine. Yep, they took a feature that Apache httpd has had literally forever and put it behind a paywall.
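For a rough sense of scale, here is a back-of-the-envelope sketch of what 50M events a day means for the disks. The 50M figure is from above; the 250-byte average log line is purely an assumption for illustration, so measure a real access log before trusting the volume number.

```python
# Back-of-the-envelope numbers for the log volume described above.
# EVENTS_PER_DAY comes from the post; AVG_LINE_BYTES is an assumption.

SECONDS_PER_DAY = 24 * 60 * 60

EVENTS_PER_DAY = 50_000_000
AVG_LINE_BYTES = 250  # assumed average access-log line size, not measured


def average_events_per_second(events_per_day: int) -> float:
    """Average event rate across the whole day, ignoring flash-sale peaks."""
    return events_per_day / SECONDS_PER_DAY


def daily_write_volume_gb(events_per_day: int, avg_line_bytes: int) -> float:
    """Raw log bytes written per day, in decimal gigabytes."""
    return events_per_day * avg_line_bytes / 1e9


if __name__ == "__main__":
    rate = average_events_per_second(EVENTS_PER_DAY)
    volume = daily_write_volume_gb(EVENTS_PER_DAY, AVG_LINE_BYTES)
    print(f"{rate:.0f} events/s on average")
    print(f"{volume:.1f} GB/day of raw log lines")
```

Under that assumed line size, this works out to roughly 579 events a second on average and 12.5 GB of log lines a day. The average hides the real problem: the writes bunch up around the sales events, which is exactly when the NetApp is already hurting.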
Oh sure, there are patches that’ll add in that ignominious feature. It’s trivial to pull down the NGINX source, pull down the patch, patch the source, test the build, create a spec file, test the RPM, add it into the yum repository, and stage an untested major update to every web server just before the holiday rush. It’s just that I shouldn’t have to do any of this shit. Tested, reliable packages are the entire point of upstream package sources like EPEL. Patching NGINX in-house is a huge step back.
So what next? In the short term we’ll probably do something hackish like mounting /var/log/nginx on a RAM disk and getting some serious logrotate action going. It’ll eliminate those dastardly writes and get us out of the woods. As a proper solution, httpd is sounding quite nice all of a sudden. We’re using php-fpm anyway; the performance difference between NGINX and httpd in this scenario is negligible.
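The RAM disk hack only works if the disk is big enough to hold whatever accumulates between logrotate runs. A minimal sizing sketch follows. The VM count, line size, rotation interval, and headroom multiplier are all hypothetical placeholders (the post only says "hundreds" of VMs), and it assumes traffic is spread evenly across VMs, which flash sales will happily violate.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def ramdisk_bytes_needed(
    events_per_day: int,
    vm_count: int,
    avg_line_bytes: int,
    rotate_interval_s: int,
    headroom: float = 2.0,
) -> int:
    """Estimate per-VM RAM disk size for /var/log/nginx.

    Sizes the disk to hold one rotation interval's worth of log lines at the
    average rate, multiplied by a headroom factor to absorb traffic spikes.
    """
    per_vm_events_per_s = events_per_day / vm_count / SECONDS_PER_DAY
    raw_bytes = per_vm_events_per_s * avg_line_bytes * rotate_interval_s
    return math.ceil(raw_bytes * headroom)


if __name__ == "__main__":
    # All values except events_per_day are assumptions for illustration.
    size = ramdisk_bytes_needed(
        events_per_day=50_000_000,
        vm_count=300,            # hypothetical; the post says "hundreds"
        avg_line_bytes=250,      # hypothetical average line size
        rotate_interval_s=3600,  # hypothetical hourly logrotate run
    )
    print(f"{size / 1e6:.1f} MB per VM")
```

With those placeholder numbers the answer is only a few megabytes per VM, so the memory cost is small. The headroom factor is the knob to turn up for the hourly Cyber Monday sales, since a full RAM disk means lost log lines.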
I don’t blame NGINX for wanting money in exchange for what is admittedly amazing software. I just don’t want any part of it. This is just the first feature of presumably many that the FOSS NGINX distribution will lack compared to the commercial offering. In short, this is Oracle and MySQL all over again. I wonder if we’ll see the same apparent exodus from NGINX as we have from MySQL? Who will pick up the mantle and actively develop the next game-changing FOSS web server?