Amazon's Dynamo
Oct. 3rd, 2007 | 02:06 pm
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
"Most of these services only store and retrieve data by primary key
and do not require the complex querying and management functionality
offered by an RDBMS. This excess functionality requires expensive
hardware and highly skilled personnel for its operation, making it a
very inefficient solution. In addition, the available replication
technologies are limited and typically choose consistency over
availability."
1) Most web work is primary-key lookups.
2) It's not transactional.
3) Availability is more important than lost data.
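To make those three points concrete, here is a toy sketch in Python. It is not Dynamo's actual design (the paper describes far more machinery); the names Replica and KeyValueStore and the single integer version counter are my own assumptions, purely for illustration. It only shows the shape of the trade: access by primary key, no transactions, and writes that are accepted as long as any replica is up, at the price of possibly stale reads.

```python
class Replica:
    """One storage node (hypothetical); holds key -> (version, value)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class KeyValueStore:
    """Primary-key get/put over several replicas, favoring availability."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.clock = 0  # assumption: one global counter stands in for real versioning

    def put(self, key, value):
        # Accept the write if at least one replica is reachable.
        self.clock += 1
        written = 0
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = (self.clock, value)
                written += 1
        if written == 0:
            raise RuntimeError("no replica available")
        return written

    def get(self, key):
        # Return the newest version among reachable replicas; it may be stale.
        found = [r.data[key] for r in self.replicas if r.up and key in r.data]
        if not found:
            return None
        return max(found, key=lambda versioned: versioned[0])[1]


a, b = Replica("a"), Replica("b")
store = KeyValueStore([a, b])
first = store.put("cart:42", ["book"])          # both replicas take the write
b.up = False
second = store.put("cart:42", ["book", "cd"])   # still accepted with b down
b.up, a.up = True, False
stale = store.get("cart:42")                    # b never saw the second write
a.up = True
fresh = store.get("cart:42")                    # newest version wins once a is back
```

The second put succeeds even though a replica is down, and the read served by b alone returns the older cart. That is the "availability over consistency" choice in miniature; an RDBMS-style replication setup would typically refuse one of those operations instead.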
(no subject)
from:
itman
date: Oct. 3rd, 2007 10:10 pm (UTC)
BTW, why haven't they mentioned Bigtable, which seems to be the only (or one of the few) distributed storage engines described so far?
(no subject)
from:
krow
date: Oct. 3rd, 2007 10:25 pm (UTC)
I talk about it all the time at conferences :)
These large scale systems are of keen interest to me right now.
I'd argue that there are a few of these systems available right now, and I think more will be coming online (especially open source ones). Anyone running a website right now that is scaling should be figuring out how to make one of these.
Put it another way, what is memcached? It's just a big one of these that happens to not be durable (yet).
(no subject)
from:
itman
date: Oct. 3rd, 2007 10:45 pm (UTC)
Not you; I meant the authors of the Amazon article. As far as I understand, you are not among them. Please correct me if I am wrong.
>I'd argue that there are a few of these systems available right now, and I think more will be coming online (especially open source ones). Anyone running a website right now that is scaling should be figuring out how to make one of these.
So would I. But I am talking about publications. There are fewer publications than available systems :-)
>Put it another way, what is memcached? It's just a big one of these that happens to not be durable (yet).
Durability is an enormous challenge.
allmydata.org
from:
zooko
date: Oct. 22nd, 2007 11:00 pm (UTC)
http://allmydata.org
Regards,
Zooko
(no subject)
from:
krow
date: Oct. 3rd, 2007 10:39 pm (UTC)
(no subject)
from:
itman
date: Oct. 3rd, 2007 10:46 pm (UTC)
(no subject)
from:
krow
date: Oct. 4th, 2007 01:17 am (UTC)
Some sites do, but the ones that do limit themselves to particular queries where it is crucial.
(no subject)
from:
awfief
date: Oct. 10th, 2007 03:39 am (UTC)
Not require.
The great thing about humans is that we're fault tolerant. Machines are very inflexible. Humans will "understand what you mean" whereas machines choke if a semicolon isn't in the right place.
Meaning that if Amazon.com messes up one order in 100, which is millions of orders per year (!!!), it's actually OK: it is cheaper all around to hire someone to apologize (customer service) and fix the problem than it is to have extreme database ACIDity.