Beating the CAP Theorem Checklist

 3 years ago
source link: https://ferd.ca/beating-the-cap-theorem-checklist.html

Your ( ) tweet ( ) blog post ( ) marketing material ( ) online comment
advocates a way to beat the CAP theorem. Your idea will not work. Here is why
it won't work:

( ) you are assuming that software/network/hardware failures will not happen
( ) you pushed the actual problem to another layer of the system
( ) your solution is equivalent to an existing one that doesn't beat CAP
( ) you're actually building an AP system
( ) you're actually building a CP system
( ) you are not, in fact, designing a distributed system

Specifically, your plan fails to account for:

( ) latency is a thing that exists
( ) high latency is indistinguishable from splits or unavailability
( ) network topology changes over time
( ) there might be more than 1 partition at the same time
( ) split nodes can vanish forever
( ) a split node cannot be differentiated from a crashed one by its peers
( ) clients are also part of the distributed system
( ) stable storage may become corrupt
( ) network failures will actually happen
( ) hardware failures will actually happen
( ) operator errors will actually happen
( ) deleted items will come back after synchronization with other nodes
( ) clocks drift across multiple parts of the system, forwards and backwards in time
( ) things can happen at the same time on different machines
( ) side effects cannot be rolled back the way transactions can
( ) failures can occur while in a critical part of your algorithm
( ) designing distributed systems is actually hard
( ) implementing them is harder still
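
The "deleted items will come back" objection is easy to reproduce. Below is a minimal sketch, assuming two replicas that synchronize by exchanging their full contents; `naive_merge` and `TombstoneSet` are hypothetical names made up for this illustration, not part of any real library. A plain union cannot tell "never saw the item" apart from "deleted the item", so the delete is lost. Recording deletions as tombstones (a two-phase set) is one known way around it.

```python
def naive_merge(a: set, b: set) -> set:
    """Union-based sync: a delete on one replica is silently undone."""
    return a | b


class TombstoneSet:
    """Hypothetical two-phase set: removals are remembered, not forgotten."""

    def __init__(self):
        self.added = set()
        self.removed = set()  # tombstones, kept forever in this sketch

    def add(self, item):
        self.added.add(item)

    def remove(self, item):
        self.removed.add(item)

    def merge(self, other: "TombstoneSet") -> "TombstoneSet":
        merged = TombstoneSet()
        merged.added = self.added | other.added
        merged.removed = self.removed | other.removed
        return merged

    def items(self) -> set:
        # An item is visible only if it was added and never removed.
        return self.added - self.removed


# Replica 1 deletes "x" while replica 2 still holds it.
replica_1 = {"y"}
replica_2 = {"x", "y"}
resurrected = naive_merge(replica_1, replica_2)  # "x" is back

r1, r2 = TombstoneSet(), TombstoneSet()
for r in (r1, r2):
    r.add("x")
    r.add("y")
r1.remove("x")
survivors = r1.merge(r2).items()  # "x" stays deleted
```

The fix has its own costs: in this sketch a removed item can never be re-added, and the tombstones are never garbage-collected, which runs straight into the objection about accumulating data forever.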

And the following technical objections may apply:

( ) your solution requires a central authority that cannot be unavailable
( ) read-only mode is still unavailability for writes
( ) your quorum size cannot be changed over time
( ) your cluster size cannot be changed over time
( ) using 'infinite timeouts' is not an acceptable solution to lost messages
( ) your system accumulates data forever and assumes infinite storage
( ) re-synchronizing data will require more bandwidth than everything else put together
( ) acknowledging reception is not the same as confirming consumption of messages
( ) you don't even wait for messages to be written to disk
( ) you assume short periods of unavailability are insignificant
( ) you are relying on a paper or theory that has not yet been proven

Furthermore, this is what I think about you:

( ) nice try, but blatantly false advertising
( ) you are badly reinventing existing concepts and should do some research
( ) in particular, you should read the definition of the word 'theorem'
( ) also you should read the definition of 'distributed system'
( ) you have no idea what you are doing
( ) do you even know what a logical clock is?
( ) you shouldn't be in charge of people's data

Also thanks to tef for some editing.

