Search engine crawlers have a miserable cache hitrate

I’m looking to migrate some load off my main webserver/database, so I was investigating which of our pages render the slowest. While doing that, I discovered that more than half of our rendering time is spent on just a few client IP addresses, which turned out to be search indexers. If I group all requests by user agent (“robots”, which is only googlebot and bingbot, and “humans”, which is everybody else), I get:

Robot reqs:8062 total_sz:85MB avg_sz:10kB avg_upstr_time: 1925ms total_upstr_time: 15521s cache_hitrate: 51.3%
Human reqs:414898 total_sz:5132MB avg_sz:12kB avg_upstr_time: 32ms total_upstr_time: 13520s cache_hitrate: 98.7%

So the average robot request is some 60 times slower to render than the average human one. This is because the crawlers spend most of their time loading old pages that nobody else cares about, which are never in cache and incur heavy disk IO times from our database.
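For anyone wanting to produce similar figures, here is a minimal sketch of the grouping step. It is not necessarily how the numbers above were generated: it assumes a hypothetical tab-separated access log with four fields (user agent, bytes sent, upstream time in seconds, cache status), and the function name summarise_log is made up for illustration.

```shell
# Sketch only. Assumed (hypothetical) log format, one request per line:
#   user_agent<TAB>bytes_sent<TAB>upstream_seconds<TAB>cache_status
summarise_log() {
  awk -F'\t' '
    {
      # googlebot and bingbot count as "Robot", everybody else as "Human"
      group = (tolower($1) ~ /googlebot|bingbot/) ? "Robot" : "Human"
      reqs[group]++
      bytes[group] += $2
      upstr[group] += $3
      if ($4 == "HIT") hits[group]++
    }
    END {
      for (g in reqs)
        printf "%s reqs:%d avg_sz:%dB avg_upstr_time:%.0fms cache_hitrate:%.1f%%\n",
          g, reqs[g], bytes[g] / reqs[g], 1000 * upstr[g] / reqs[g], 100 * hits[g] / reqs[g]
    }
  ' "$@" | sort   # awk iteration order is unspecified, so sort the two lines
}
```

Usage would be something like summarise_log access.log, where access.log is whatever file your webserver writes in the assumed format.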

I plan to create a readonly database replica and second webserver which will be dedicated to handling requests from these search crawlers. That will stop the caches on our primary server from being wasted on old content that no humans want to see.

Installing Ubuntu Raring’s ImageMagick 6.7.7-10 on Precise

Ubuntu Precise’s package for ImageMagick is currently at version 6.6.9.7. In my PHP application, I take a transparent PNG, scale it down, then write it back out. I found that this version had a bug that caused the transparent edges of the PNG to get a dirty black background applied to them, and this problem was fixed by the time Ubuntu Raring’s version 6.7.7.10 of ImageMagick was released. So this post is about how to install Raring’s ImageMagick package into Precise.

Start by installing build prerequisites:

sudo apt-get update
sudo apt-get install build-essential fftw3-dev liblcms2-dev liblzma-dev fakeroot perlmagick
sudo apt-get build-dep imagemagick

Download the source code for Raring’s ImageMagick package from Launchpad:

wget https://launchpad.net/ubuntu/raring/+source/imagemagick/8:6.7.7.10-5ubuntu2.1/+files/imagemagick_6.7.7.10.orig.tar.bz2
wget https://launchpad.net/ubuntu/raring/+source/imagemagick/8:6.7.7.10-5ubuntu2.1/+files/imagemagick_6.7.7.10-5ubuntu2.1.debian.tar.bz2
wget https://launchpad.net/ubuntu/raring/+source/imagemagick/8:6.7.7.10-5ubuntu2.1/+files/imagemagick_6.7.7.10-5ubuntu2.1.dsc

Now use dpkg-source to unpack that source code and apply the Ubuntu patches:

dpkg-source -x imagemagick*.dsc

Enter that unpacked directory:

cd imagemagick-6.7.*

Edit debian/rules in your favourite text editor, find the build-stamp section, and add two parameters to the “./configure” command line: one that changes the quantum depth to Q8 (meaning that ImageMagick will use the faster 8-bit pipeline internally instead of the slower Q16 one), and one that disables TIFF support. TIFF support requires libtiff5, which I couldn’t be bothered porting to Precise. The edited line should look like this:

MagickDocumentPath="/usr/share/doc/imagemagick" ./configure --with-quantum-depth=8 --without-tiff \
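If you would rather script that edit than make it by hand, something like the following might work. This is a sketch under the assumption that debian/rules contains the text "./configure " followed by a space exactly where the flags should go; the helper name add_configure_flags is made up, and you should check the file afterwards.

```shell
# Sketch only: insert the two flags after "./configure " in the given rules file.
# Usage: add_configure_flags debian/rules
add_configure_flags() {
  rules_file=$1
  # Do nothing if the flags are already present, so re-running is harmless.
  if grep -q -- '--with-quantum-depth=8' "$rules_file"; then
    return 0
  fi
  tmp=$(mktemp) || return 1
  # Write the result back with cat so the file keeps its executable bit.
  sed 's|\./configure |./configure --with-quantum-depth=8 --without-tiff |' \
    "$rules_file" > "$tmp" && cat "$tmp" > "$rules_file"
  rm -f "$tmp"
}
```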

Edit the install scripts to replace references to Q16 modules with Q8:

sed -i 's/Q16/Q8/' debian/libmagickcore5.install
sed -i 's/Q16/Q8/' debian/libmagickcore5-extra.install

Remove the dependency upon libtiff5-dev:

sed -i 's/libtiff5-dev//' debian/control

Disable running the test-suite (which takes forever):

export DEB_BUILD_OPTIONS=nocheck

Now, from the imagemagick-6.7.7.10 folder, build the package. The options mean: don’t clean the source tree before building (-nc), don’t sign the source package (-us), don’t sign the changes file (-uc), and build only binary packages, not source packages (-b):

dpkg-buildpackage -rfakeroot -nc -us -uc -b

Now you should have a full set of .deb files for ImageMagick in the parent directory! Change into the parent directory and install those:

cd ..
sudo dpkg -i *.deb
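At this point it can be worth checking that the installed binary really is the new Q8 build. The helper below is a small sketch that pulls the version and quantum depth out of the output of "convert -version". It assumes the version line has the form "Version: ImageMagick 6.7.7-10 ... Q8 ...", and the function name magick_version_info is made up.

```shell
# Sketch only: read `convert -version` output on stdin and print
# "<version> <quantum>", for example "6.7.7-10 Q8".
magick_version_info() {
  awk '/^Version: ImageMagick/ {
    # The quantum depth appears as a field starting with Q followed by digits.
    for (i = 4; i <= NF; i++)
      if ($i ~ /^Q[0-9]+/) { print $3, $i; exit }
  }'
}
```

Run it as: convert -version | magick_version_info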

Since I wanted to be able to use ImageMagick from PHP, I also needed to install the newer version of php5-imagick to go along with it. Similar process:

sudo apt-get install cdbs
sudo apt-get build-dep php5-imagick

wget http://archive.ubuntu.com/ubuntu/pool/universe/p/php-imagick/php-imagick_3.1.0~rc1-1build2.dsc
wget http://archive.ubuntu.com/ubuntu/pool/universe/p/php-imagick/php-imagick_3.1.0~rc1.orig.tar.gz
wget http://archive.ubuntu.com/ubuntu/pool/universe/p/php-imagick/php-imagick_3.1.0~rc1-1build2.debian.tar.gz

dpkg-source -x php-imagick_3.1.*.dsc
cd php-imagick-3.1.*

dpkg-buildpackage -rfakeroot -nc -uc -b
cd ../
sudo dpkg -i php5-imagick*.deb

Then I restarted Apache so PHP could reload and pick up the new extension:

sudo service apache2 restart

And finally, according to PHP’s phpinfo(), I had the newer version of ImageMagick running!

My macro shooting rig

I thought I would do a little write-up on the equipment I’m shooting with for macro, since my rig is pretty unusual. I’m using a lens built for old Pentax M42-mount film cameras, along with a double cable release that allows the lens’ aperture to be closed down to my shooting aperture, and the camera’s shutter to be triggered, with one smooth motion.

My macro setup

Read on for details of these components!

Carl Zeiss Jena Sonnar 135mm f/3.5 test on full-frame

This is a nice lens for 35mm cameras, with a built-in hood. The version I tested was M42 (Pentax screw) mount.

Briefly:

  • Vignetting is only an issue wide open, improves a lot at f/5.6 and is basically gone by f/8
  • The center of the frame is already sharp at f/3.5, and only improves a little bit at f/5
  • The corners are pretty good at f/5.6 and sharpest at f/8

Please read the full review with sample images here.