20200423
Plan
60min: GFS
Notes
Why hard?
- performance => sharding
- faults => tolerance by replication
- inconsistency because of replication
- low performance because of consistency
What?
Consitency
strong consistency ensures client interactes with cluster(many machines) just like with single matchine.
Replication
- double write: update disorder(inconsistency).
How is GFS built?
Features
- Automatic Recovery
- Single data center
- Storing Large file
- For sequential access
- no gurantee for correct data
Archtecture
- Single master keeps meta data for all chunks located in chunk servers.
- Multiple chunk servers take responsibility for storing chunks whose size is 64MB based on linux filesystem.
Read/Write procedure
check these procedure in paper or the course vedio.
Shortcomings
- single master => no ha of master
- no strong onsistency gurantee
Interesting Q&A
(check answers in next note)
- why do all replica need a primary?
- why to keep primary role by leasing?
- why to use heartbeat to grant a lease of primary?
Conclusion
(check them in next note)