Brian "Krow" Aker (krow) wrote,

Social Networks, Databases, Brad's post

Finally got around to reading Brad's article on Social Networks.

I am not going to comment too much on his ideas. I thought he presented them in a very straightforward manner, and I cannot find much of a reason to disagree with what he has said. I have concerns about how to control the information, but part of me just thinks that we shouldn't be too concerned with the privacy issue. Sign up and use it if you want to; otherwise, re-enter your information into each system you want to use.

Ease of use wins over privacy for a lot of people.

When I implemented Zoo for Slashdot back in 2003 (or was it 2002?) I made a point of making all of the social graph public:

You can even "pull" the data via rss:

I had hoped that people would find that interesting enough to build applications around, but Slashdot doesn't have the sort of leverage in the consumer market to make that happen. One thing you can note about Slashdot, though, is that it exposes the entire range of relationships:

Friends -> People you like
Foes -> People you dislike

And then the perception of the relationship:
Fans -> People who like you (sometimes known as stalkers)
Freaks -> People who dislike you

The "Freaks" part is still unusual for social systems, since almost none of the systems understand the concept of perception. I find that I still have to explain this to people :)
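To make the perception idea concrete, here is a minimal sketch in Python. It is not how Zoo is implemented; the class and method names are made up for illustration. The only assumption is that a user explicitly declares friends and foes, and that fans and freaks are simply the same edges read from the other end.

```python
from collections import defaultdict


class RelationshipGraph:
    """Hypothetical in-memory model: directed, labelled edges between users."""

    KINDS = ("friend", "foe")

    def __init__(self):
        # _outgoing[user][kind] -> users that `user` marked as `kind`
        self._outgoing = defaultdict(lambda: defaultdict(set))
        # _incoming[user][kind] -> users who marked `user` as `kind`
        self._incoming = defaultdict(lambda: defaultdict(set))

    def declare(self, source, kind, target):
        """Record that `source` considers `target` a friend or a foe."""
        if kind not in self.KINDS:
            raise ValueError(f"unknown relationship kind: {kind!r}")
        self._outgoing[source][kind].add(target)
        self._incoming[target][kind].add(source)

    def friends(self, user):
        """People `user` likes."""
        return set(self._outgoing[user]["friend"])

    def foes(self, user):
        """People `user` dislikes."""
        return set(self._outgoing[user]["foe"])

    def fans(self, user):
        """People who like `user` (the perception side of 'friend')."""
        return set(self._incoming[user]["friend"])

    def freaks(self, user):
        """People who dislike `user` (the perception side of 'foe')."""
        return set(self._incoming[user]["foe"])


graph = RelationshipGraph()
graph.declare("alice", "friend", "bob")
graph.declare("carol", "foe", "bob")
print(graph.fans("bob"), graph.freaks("bob"))
```

Nothing extra is stored for the perception side. It is the same edge, indexed by its target instead of its source.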

I have been thinking more about the concept of mapping these relationships. Brad's paper talks about the why, but the how-to is more my thing. A system that answers "give me all of user A" is pretty simple. Any database that can handle a lot of rows can handle this. Mapping the relationships, though, is a bit different.
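A rough sketch of the difference, in Python. The dictionary and the function name are made up for illustration. "Give me all of user A" is a single key lookup, while asking how two users are connected means walking the graph, here with a plain breadth-first search.

```python
from collections import deque


def degrees_of_separation(friends, start, goal):
    """Return the number of hops from start to goal, or None if unreachable.

    `friends` maps each user to the set of users they list as friends.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        user, depth = queue.popleft()
        for other in friends.get(user, ()):
            if other == goal:
                return depth + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return None


# Toy data, purely illustrative.
friends = {
    "a": {"b"},
    "b": {"c"},
    "c": {"d"},
    "d": set(),
}

# The simple case: everything about one user is one lookup.
print(friends["a"])

# The mapping case: the answer depends on rows belonging to other users.
print(degrees_of_separation(friends, "a", "d"))
```

The lookup touches one row. The search may touch a large part of the data set, and how large is not known ahead of time.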

Graphs are not really native to relational databases. They can be made to work in them, but it is far from perfect (and some kudos to Oracle for their CONNECT BY implementation).
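For a feel of what "made to work" looks like, here is a small sketch using Python's sqlite3 module. SQLite has no CONNECT BY, so this uses a recursive common table expression, the standard SQL way of expressing the same kind of walk. The table and column names are invented for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical friend edge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE friend (source TEXT, target TEXT)")
    conn.executemany(
        "INSERT INTO friend (source, target) VALUES (?, ?)",
        [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")],
    )
    return conn


def reachable_within(conn, start, max_depth):
    """Return (person, shortest_depth) for everyone within max_depth hops."""
    return conn.execute(
        """
        WITH RECURSIVE walk(person, depth) AS (
            SELECT ?, 0
            UNION
            SELECT f.target, walk.depth + 1
            FROM walk
            JOIN friend AS f ON f.source = walk.person
            WHERE walk.depth < ?
        )
        SELECT person, MIN(depth) AS depth
        FROM walk
        GROUP BY person
        ORDER BY depth, person
        """,
        (start, max_depth),
    ).fetchall()


conn = build_demo_db()
for person, depth in reachable_within(conn, "a", 3):
    print(person, depth)
```

The depth limit is what keeps the cycle (a, b, c, back to a) from running forever. Each extra hop is another join against the edge table, which is part of why this approach is far from perfect at scale.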

Friendster, during its days as a six-degrees site, developed a graph engine for storing and retrieving this sort of data (and they eventually made this a storage engine under MySQL). Whether an engine like this needs to be a part of MySQL is open for debate, but what is needed is an open source graph engine that can handle the data.

It should be queryable for searches on the relationships. It will also need to be designed to be:

  • Distributed (multiple nodes)
  • Highly available
  • Able to replicate globally

    The last piece, I believe, is important, since I can see sites that build tools wanting to be able to efficiently suck up the content from the system as it comes in. Sure, it could be done via "web pings" or polling, but that won't work for the largest systems. I also like the idea of many sites having the "central data". One thing which has bothered me about Wikipedia and other "shared" sorts of sites is the ability to replicate and fork the site. While forking is understood in the Open Source world, I think the concept is not understood just yet in the collaborative web projects.

    Distributed is a given. We don't live in a day and age where we can build "super" single computers to handle everything. Shared-nothing architectures have worked very well for the web, and this sort of methodology is the clear winner.