Let the microblogs bloom - RussellBeattie.com - The Diigo Meta page

www.russellbeattie.com/...let-the-microblogs-bloom

This link has been bookmarked by 2 people. It was first bookmarked on 04 Jul 2008, by EricSchlissel.

  • 07 Jul 08
    newsmaven
    Brent Sordyl

    Evan did a great job creating a nice little project with some cool features like OpenID, Jabber support and the beginnings of a federation system. Looking at the code, however, it's doomed.

    twitter scalability rubyonrails

  • 04 Jul 08
    ericschlissel
    EricSchlissel

    Let the microblogs bloom
    Posted Thursday, July 3, 2008 12:35 pm

    I was just about to embark on a post yesterday about my latest obsession, which is web-based forums (actually, it's the return of an old obsession), when identi.ca launched with their open source PHP-based Twitter clone, so I just had to try it out. I threw it up on foozik.com if you want to see. It took me a while to get the dependencies working, but it seems pretty cool.

    It's a great effort, looks good, and has been promoted in all the right ways. Evan (the guy behind identi.ca and the laconi.ca code base) did a great job creating a nice little project with some cool features like OpenID, Jabber support and the beginnings of a federation system.

    Looking at the code, however, it's doomed.

    The core architecture just isn't made to scale, and a day after it launched identi.ca already seems to be paying the price, even after adding a bunch more servers. Here's the problem in a few lines of code:


    $notice = DB_DataObject::factory('notice');
    # XXX: chokety and bad
    $notice->whereAdd('EXISTS (SELECT subscribed from subscription where subscriber = '.$profile->id.' and subscribed = notice.profile_id)', 'OR');
    $notice->whereAdd('profile_id = ' . $profile->id, 'OR');
    $notice->orderBy('created DESC');

    Even the comment in the code admits this is "chokety and bad". Ignoring the use of the PEAR::DB data object stuff (that's adding abstractions on top of your database that you can't afford to have), this code shows that the design of the system is fundamentally flawed. The core problem is the query itself - it's expensive as hell: "Get all the notices (messages) where I am subscribed to the publisher." Oh, man. As the database grows, the indexes will have to get huge, and as there are more subscribers and more subscriptions between subscribers, it's going to be impossible for that query to keep up.
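
    To make the cost concrete, here is a minimal sketch of roughly the SQL those two whereAdd() calls assemble, run against an in-memory SQLite database through PDO. The table and column names come from the snippet; the trimmed-down schema, the sample rows, the timeline() helper and the use of SQLite are all assumptions for illustration, not identi.ca's actual setup, and the exact SQL that DB_DataObject generates may differ in detail.

```php
<?php
// Illustrative sketch only: hypothetical schema and data, SQLite via PDO.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE notice (id INTEGER PRIMARY KEY, profile_id INTEGER, content TEXT, created INTEGER)');
$db->exec('CREATE TABLE subscription (subscriber INTEGER, subscribed INTEGER)');

// Profile 1 follows profile 2, but not profile 3.
$db->exec('INSERT INTO subscription (subscriber, subscribed) VALUES (1, 2)');

$insert = $db->prepare('INSERT INTO notice (profile_id, content, created) VALUES (?, ?, ?)');
$rows = [
    [1, 'my own notice', 100],
    [2, 'from someone I follow', 200],
    [3, 'from a stranger', 300],
];
foreach ($rows as $row) {
    $insert->execute($row);
}

// timeline() is a hypothetical helper. Its WHERE clause mirrors the two
// whereAdd() conditions joined by OR: notices from profiles I subscribe to,
// plus my own notices, newest first.
function timeline(PDO $db, int $profileId): array
{
    $sql = 'SELECT content FROM notice
            WHERE EXISTS (SELECT subscribed FROM subscription
                          WHERE subscriber = ? AND subscribed = notice.profile_id)
               OR profile_id = ?
            ORDER BY created DESC';
    $stmt = $db->prepare($sql);
    $stmt->execute([$profileId, $profileId]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Profile 1 sees the notice from profile 2 and their own, but not profile 3's.
print_r(timeline($db, 1));
```

    Conceptually, every timeline view repeats that subscription check against the candidate notices, so all of the work happens at read time, once per reader per page load, and it grows with both tables.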

    The lesson from Twitter is that microblogs aren't Content Management Systems at all, but are instead Messaging systems, and have to be architected as such. SMTP or EDI are

    social