
Drop ACID and Think About Data | High Scalability - The Diigo Meta page

highscalability.com/drop-acid-and-think-about-data


Bookmarking History
Comments (1)

This link has been bookmarked by 49 people. It was first bookmarked on 06 May 2009, by feng bo.

22 Jun 13

Giacomo Cariello
database ACID architecture scalability programming
30 Sep 12

Marc-Alexandre Gagnon
database scalability ACID architecture toread base drop distributed
22 May 12

benyblack
29 May 11

Dear Xanitia
- Everyone who builds big applications builds them on CAP and BASE
- compression - great gains in throughput, can store more, reduces IO bottleneck
- single master - one node knows everything about all the other nodes (backed up and cached).
- row database - store objects together
  
  column database - store attributes of objects together. Makes sequential retrieval very fast, allows very efficient compression, reduces disk seeks and random IO.
- eventually consistent - append-only system using a row timestamp. When a client queries, it gets several versions and is in charge of picking the most recent.
- Uses consistent hashing to distribute data to one or more nodes for redundancy and performance.
- Consistency between nodes is based on vector clocks and read repair.
- Read repair - When a client does a read and the nodes disagree on the data, it's up to the client to select the correct data and tell the nodes the new correct state.
- Highly Available for Write
- Clients have to be smart to handle read-repair
- Not suitable for column-like workloads, it's just a key-value store
- Distributed databases are the new web framework.
- Pick one and start submitting patches. Don't start another half-baked clone.
- Similar replication strategy to MySQL. Not useful for scalability as it limits the write throughput to one node.
- It understands your values so you can operate on them
- Can match on key spaces. You can look for all keys that match an expression.
- Understands lists and sets.
- It requires that the full data store fit in RAM
- Documents can be nested, unlike CouchDB, which requires applications to keep relationships.
- Advantage is that the whole object doesn't have to be written and read because the system knows about the relationship.
- Each column is stored separately, so IO is efficient because only the columns of interest are scanned. When using a column database you are almost always scanning the entire column.
- Bitmap indexes for fast sequential scans.
- No query language; generally need to iterate over each row using MapReduce to do queries
- Only has an index for the row key
- Even though both Yahoo (Pig) and Facebook (Hive) have their own analytics apps based on Hadoop, neither uses HBase for storage
- And don't forget Yahoo Everest, which is basically an MPP column store for PostgreSQL
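
The consistent-hashing annotation above ("distribute data to one or more nodes for redundancy and performance") can be illustrated with a small sketch. This is not taken from the talk or from any particular store: the `HashRing` class, the `vnodes` parameter, and the choice of MD5 as the hash are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # MD5 is used only to spread keys evenly; it is not a security choice.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def nodes_for(self, key, replicas=2):
        """Return up to `replicas` distinct nodes, walking clockwise from the key."""
        start = bisect.bisect(self._points, _hash(key))
        owners = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == replicas:
                    break
        return owners
```

With a ring built as `HashRing(["a", "b", "c"])`, calling `nodes_for("user:42")` returns two distinct node names. Adding a fourth node only moves the keys that now fall on the new node's points; every other key keeps its previous first owner.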
24 more annotations...
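
The eventual-consistency and read-repair annotations above describe a client that receives several versions, picks the most recent, and tells the nodes the new correct state. A minimal sketch of that idea follows, assuming a simple integer row timestamp rather than the vector clocks the annotations also mention. The `Version` type and `read_with_repair` function are hypothetical names, and plain dicts stand in for storage nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: int  # hypothetical per-row timestamp; larger means newer


def read_with_repair(replicas, key):
    """Read `key` from every replica, pick the newest version, fix stale replicas.

    `replicas` is a list of plain dicts standing in for storage nodes.
    """
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None
    newest = max(found, key=lambda version: version.timestamp)
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # read repair: push the winning version back
    return newest
```

A timestamp-only rule like this silently discards concurrent writes; vector clocks exist to detect that case, at the cost of making the client resolve the conflict itself.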
10 Mar 11

pshah2k
Partition Tolerance - if one or more nodes fail, the system still works and becomes consistent when the system comes back online.

nosql options Comparison
23 Jan 11

Marcus Zander
Internet Explorer Imported from Del.icio.us user hamish40 (6 6 2009 )
04 Nov 10

Grant Slade
ict4005 rdbms future acid scalability
28 Mar 10

Yasutaka Nishimura
database scalability architecture performance distributed computing
10 Nov 09

menbom
database nosql
15 Oct 09

Ryan Baldwin
scalability
05 Aug 09

Mark Alexander
acid data Tags
16 Jul 09

ashish chawla
01 Jun 09

database acid cap base architecture scalability performance
flaviomori
acid base cap
25 May 09

Jaap Steinvoorte
data performance
Dave
database programming architecture
Dan O'Neill
scalability database
24 May 09

FirstN@me L@stN@me
database database_optimization
justin pitts
database scalability toread importfromdelicious
dinh cuong vu
database scalability architecture mysql
Francesco Uliana
sql scalability database
22 May 09

Daniele Mazzini
toread
18 May 09

Joseph Elliott
scalability
14 May 09

Daniel Quirino Oliveira
12 May 09

Mike Bosch
Database
10 May 09

Joel Liu
database scalability Architecture
Yushi H
This talk explores the landscape of new technologies available today to augment your data layer to improve performance and reliability.

read scalability database
07 May 09

Alex MIkhalev
Drop ACID and Think About Data

ACID database distributed
alfred westerveld
drop acid scalability presentation video
06 May 09

Georgene
database architecture
Sérgio Lopes
acid database architecture performance storage scalability cloud bigtable
zhen ma
architecture base
Terry Jones
drop ACID ippolito
Sven Duzont
key value bigtable benchmark video
database scalability storage non-relational to-read delicious
sgravitz
database scalability
retro one
database performance scalability couchdb caching storage
feng bo
- I've read something similar before and wrote up a short summary:
    http://web2.0coder.me/archives/630252

Public Sticky Notes

feng bo on 2009-05-06

I've read something similar before and wrote up a short summary:
http://web2.0coder.me/archives/630252
