Yesterday, Brad Fitzpatrick and SixApart hosted the second MogileFS Summit at SixApart’s offices in San Francisco. The initial response on the mailing list suggested a handful of local users would attend, but in the end more than twenty people showed up from a wide range of companies. In addition to a few folks from Danga / SixApart, there was a small group from Guba, two guys from Bloglines, Matt from WordPress, and developers from lots of other sites both large (in some cases, massive) and small.
MogileFS, for the uninitiated, is a specialized distributed filesystem originally built to power LiveJournal. Like its siblings memcached and perlbal, Mogile is open source software. True distributed filesystems are unwieldy and complex; Mogile makes a number of assumptions and simplifications that make it easy to deploy, fast, and developer-friendly. Mogile doesn’t mount like a traditional UNIX filesystem (though at the summit, we saw a demo of a FUSE+WebDAV mount hack), and it stores files in a flat domain / key structure. It’s up to the application using Mogile to add files via a simple API, enforce permissions, map keys to filenames if needed, and query for and cache the locations of stored files. Where Mogile shines is in replicating files across pools of cheap, usually non-RAID disk arrays, handling drive and device failures, providing some level of capacity balancing, and accommodating future growth. Mogile really nails a sweet spot for the kind of storage problems many websites face. At Wikispaces, we’ve got millions of files in Mogile and it has been rock-solid since day one.
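To make the flat domain / key idea concrete, here is a minimal sketch in Python. It is not the MogileFS client API: `ToyKeyStore`, `key_for`, and the domain and site names are all made up for illustration, and the "storage" is just an in-memory dictionary. The point is only to show how the application, not the store, owns the mapping from filenames to keys.

```python
import hashlib


class ToyKeyStore:
    """Hypothetical in-memory stand-in for a flat domain/key store.

    This is NOT the MogileFS client API, only an illustration of the
    namespace: no directories, just (domain, key) -> file contents.
    """

    def __init__(self):
        self._files = {}

    def store(self, domain, key, data):
        self._files[(domain, key)] = data

    def fetch(self, domain, key):
        return self._files[(domain, key)]


def key_for(site, filename):
    """Application-side mapping from a user-visible filename to a flat key.

    The scheme (site name plus a SHA-1 of the filename) is an assumption
    chosen for the example; any stable mapping would do.
    """
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    return f"{site}:{digest}"


store = ToyKeyStore()
key = key_for("example-wiki", "images/logo.png")
store.store("wiki-files", key, b"fake image bytes")
```

The filename looks hierarchical to the user, but the store only ever sees an opaque key inside a domain. Permissions, listings, and renames would all live in the application's own database.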
We spent the majority of the summit talking about what’s coming in Mogile 2, which is already partially running in production for LiveJournal, and what’s on people’s minds for future releases. Aside from a number of code cleanups and performance enhancements, the big changes in Mogile 2 include a new plugin architecture and pluggable replication rules. In Mogile 1.x, replication was controlled by a “mindevcount” setting, the minimum number of devices that a file had to be stored on. In Mogile 2, you can write a replication ruleset that requires a file’s copies to span racks or datacenters, to land on a certain number of fast systems, and so on. Best of all, Mogile 2 is API-compatible with Mogile 1.x, so we can drop it in on the fly.
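The difference between a device count and a placement rule is easy to see in a small sketch. The Python below is not Mogile's plugin interface, which this post doesn't describe; `Device`, `satisfies_mindevcount`, and `satisfies_rack_span` are hypothetical names, and the rack labels are invented. It only contrasts the two kinds of check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical description of a storage device holding one copy."""
    dev_id: int
    rack: str


def satisfies_mindevcount(devices, mindevcount):
    """1.x-style check: is the file on at least `mindevcount` distinct devices?"""
    return len({d.dev_id for d in devices}) >= mindevcount


def satisfies_rack_span(devices, min_racks):
    """Example of a richer rule: do the copies span at least `min_racks` racks?"""
    return len({d.rack for d in devices}) >= min_racks


# Two copies on two devices that happen to share a rack.
copies = [Device(1, "rack-a"), Device(2, "rack-a")]
# The same file after a third copy lands in a different rack.
spread = copies + [Device(3, "rack-b")]
```

With these definitions, `copies` passes a mindevcount of 2 but fails a two-rack rule, which is exactly the failure mode (losing a whole rack) that a count alone can't guard against; `spread` passes both.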
To Brad, Junior, and everyone from SixApart who’s hacking on Mogile, memcached, and friends - thanks! It was an awesome summit.