Friday, July 13, 2007

Digg: Technologies Used & Challenges Faced

This is a very interesting presentation (see below) on the technologies used to build Digg, the challenges faced along the way, and how the team overcame them. I would recommend it to any Web 2.0 startup architect. In short, Digg uses a single MySQL master with multiple slaves, plus multiple load-balanced PHP servers, each of which connects to a random MySQL slave (for load balancing, obviously). They also use Memcached with multiple specialized pools, such as a separate pool for search.
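To make the master/slave split concrete, here is a minimal sketch of the routing idea in Python. It is not Digg's code (their stack is PHP), and the class name, host names, and the "only SELECT is a read" rule are all assumptions made for illustration.

```python
import random


class ReplicatedRouter:
    """Hypothetical router: writes go to the master, reads to a random slave."""

    def __init__(self, master, slaves, rng=None):
        if not slaves:
            raise ValueError("at least one slave is required")
        self.master = master
        self.slaves = list(slaves)
        # An injectable random source keeps the choice testable.
        self.rng = rng or random.Random()

    def pick(self, sql):
        """Return the server name that should run this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        # Assumption: only plain SELECTs are safe to send to a slave.
        if verb == "SELECT":
            return self.rng.choice(self.slaves)
        return self.master


# Host names below are placeholders, not real servers.
router = ReplicatedRouter("db-master", ["db-slave-1", "db-slave-2", "db-slave-3"])
print(router.pick("SELECT * FROM stories"))        # one of the slaves
print(router.pick("INSERT INTO diggs VALUES (1)"))  # the master
```

Picking a slave at random is about the simplest way to spread read load, but note that every write still lands on the one master.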

Digg is the poster boy of PHP-driven, high-volume sites.

It wasn’t clear from the slides whether they actually use sharding (breaking your database into smaller segments based on, say, data ranges or tables) in any form. Looking at their architecture, I don’t think they can easily adopt sharding in the future either. Can you guess why?
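For readers new to the term, range-based sharding can be sketched in a few lines of Python. The ID boundaries and shard names here are made up for illustration and have nothing to do with Digg's actual data.

```python
import bisect

# Hypothetical split points: each value is the first ID of the next shard.
BOUNDS = [1_000_000, 2_000_000]
# One more shard than there are split points.
SHARDS = ["shard_a", "shard_b", "shard_c"]


def shard_for(user_id):
    """Map a user ID to the shard that owns its range."""
    # bisect_right counts how many split points are <= user_id,
    # which is exactly the index of the owning shard.
    return SHARDS[bisect.bisect_right(BOUNDS, user_id)]


print(shard_for(42))         # shard_a
print(shard_for(1_000_000))  # shard_b
print(shard_for(5_000_000))  # shard_c
```

Every query now needs to know which shard to ask, so the application has to be built around that lookup.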