Brian "Krow" Aker (krow) wrote,

Scaling, Systems Required

People keep asking me about the components that make up large-scale systems. Below is a dump of the systems I typically see/build. I am sure people draw pretty pictures, but to me it is more of a checklist :)

Asset Management

  • Relational

    This operation is typically split between two different
    groups. One group uses data for presentation layers and for
    the feeding of live requests. The other group does data
    analytics for traffic, etc. A third group will also exist in
    some cases to do work for "near time" responses. That data is
    used to handle DOS attacks and other security related

  • Unstructured (Images, Sound, etc) Serving
  • Geographical
  • Fulltext
  • Graph (Social Network information)
  • Identity
  • System Image Backup System (this serves backups and possibly deployment for Jumpstarts)


    The trick to scaling is to create asynchronous actions. Typically
    queues are used to set up jobs like the sending of email, the
    transformation of images, and the harvesting of text. These fall
    into jobs that "must be done" and jobs where "if we lose it,
    it does not matter". Queuing is also used for incoming data and
    serves as a governor for most systems (i.e. to prevent
    self-inflicted DOS).

  • Image/Video Converters/Transcoding
  • XML Builders (RSS/etc)
  • Graph Rebuilds
  • Stats building
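
As a rough illustration of the queue paragraph above, here is a minimal in-process sketch in Python. The `JobQueue` class, its `must_do` flag, and the drop-when-full policy are assumptions made for the example, not something the post specifies; a real deployment would use a networked queue with persistence for the "must be done" jobs. The bounded size is what acts as the governor: once the backlog is full, best-effort work is dropped and must-do work pushes back on the caller.

```python
import queue


class JobQueue:
    """Bounded job queue; the names and policy here are illustrative assumptions."""

    def __init__(self, max_pending):
        # The size limit is the governor: the backlog can never grow unbounded.
        self._q = queue.Queue(maxsize=max_pending)
        self.dropped = 0

    def submit(self, job, must_do):
        """Queue a callable. Returns True if it was accepted."""
        try:
            self._q.put_nowait(job)
            return True
        except queue.Full:
            if must_do:
                # "Must be done" jobs are never silently discarded;
                # the caller sees queue.Full and has to retry or persist.
                raise
            # "If we lose it, it does not matter" jobs are simply dropped.
            self.dropped += 1
            return False

    def drain(self):
        """Run every pending job in order and return the results.

        A real worker would loop in its own thread or process instead.
        """
        results = []
        while True:
            try:
                job = self._q.get_nowait()
            except queue.Empty:
                return results
            results.append(job())
```

Sending email would be submitted with `must_do=True`, while something like a stats update could go in with `must_do=False` and be allowed to vanish under load.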

  • DNS
  • Email (typically bulk sending/requeuing)

Load Balancing

    Traffic routing to the correct software nodes (or static content
    nodes). Will also typically handle the shuttling of SSL data to
    different backends (see Pound server as an example). Either
    Cisco or Linux style routing. The big key with this is

  • Sharding Infrastructure
  • HA Solution routing.


  • Page Caches
  • Object Caches

Software/Jumpstart Systems

Asset software for deployment (Debian/RHEL package repositories).
Puppet/CFEngine deployment systems.

Monitoring System

Geographical Messaging

  • IM

    XMPP is the favored solution at this point.

  • Replication

    Typically this is the reason why MySQL has gotten used (and
    why systems like MogileFS which do replication are also
    commonly used). This should not be confused with local
    replication, which is more of an HA/Scale out issue (see
    Facebook's example using MySQL + Memcached as a geographical
