My image management system for work is really cranking. It is now hosting 22 million images on a single server. Mind you, it’s a beefy server, but that’s still a lot of images. What still surprises me is the performance of Plack + Starman + Inline::C.
Let’s say that you need to write a piece of software that stores millions upon millions of images. Well, you can do that pretty easily with a standard web service backed by something like Amazon::S3. But in my experience, that is rarely enough. Usually you need to be able to modify the images somehow. People want a PNG when you are storing a JPG, they want a thumbnail, they want it cropped, they want it resized, etc…
You might reach for Image::Magick or some other standard tool. Please don’t: it is very, very slow. How slow? Let me give you an example. For a sample image, just a nice normal book cover, this is how long it takes for convert on the command line:
-bash-3.2$ time convert 1156067855.jpg -resize 50% output.jpeg
That’s not terrible, but it’s still quite a while. convert (one of the ImageMagick tools) is a very generalized tool. It’s not just a resizer, and it’s going to do lots of work to get ready to resize, when really, you just need a freaking resize. This is where you can use a specialized image processing library to get bloody stinking fast performance. I use the awesome library leptonica. Using some of its sample code (prog/scaletest1.c), we can see that the same resize takes considerably less time:
-bash-3.2$ time ./scaletest1 1156067855.jpg .5 .5 output.jpeg
So, how can we use this in Perl? Well, through the awesomeness that is the Inline::C module. It allows us to throw together some nice code that uses leptonica’s low-level image manipulation primitives right from a Moose object, and lets you take advantage of all the performance you could ever want.
So how fast is it? Well, here’s an example of using curl to retrieve a resized image from a Plack/Starman web service:
-bash-3.2$ time curl -O https://SECRET_HOST/2011-04-01/Images/front_cover/x200/sku/1156067855.jpg
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5774  100  5774    0     0  25600      0 --:--:-- --:--:-- --:--:-- 5638k
So the entire web stack added 1/10th of a second. Not bad at all. This web service, hosting 22+ million images on a single server, has been serving tens of millions of requests per day, every single day, in production for over a year, with lots of 9s of reliability. Even 10 million requests a day averages out to roughly 115 per second. So yeah, using a little bit of creativity and some fun Perl modules, you can make some ridiculously fast services.