Vitess
- Horizontally scaled MySQL
- Vitess is a CNCF project, expected to graduate soon
- Areas for improvement as of 2019/07/11
- Getting started is still too difficult
- Documentation
- Largest known cluster is 10000 tablets
Rafael Chacon (Slack)
- Architecture
- Slack specific details
- Consul used to define local and global topology
- this approach didn’t work very well because Consul was breaking configuration isolation
- distributed systems problems eventually took over
- switched over to Consul with a global topology, with configuration broken down per availability zone
- vtgate
- Routes traffic to vttablet
- By default it is not possible to route traffic across DCs
- vttablet
- vtctl
- The key to resiliency is isolation
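
The default vtgate behaviour noted above (no routing across DCs) can be pictured with a toy model. This is only a sketch, not Vitess code: `Tablet`, `pick_tablets`, the `allow_cross_cell` switch, and the cell names are all hypothetical and exist purely to illustrate isolation by cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tablet:
    alias: str            # hypothetical tablet name
    cell: str             # data center / availability zone of the tablet
    healthy: bool = True


def pick_tablets(tablets, gate_cell, allow_cross_cell=False):
    """Return the healthy tablets a gate running in `gate_cell` may route to."""
    candidates = [t for t in tablets if t.healthy]
    if not allow_cross_cell:
        # Isolation: a gate only sees tablets in its own cell.
        candidates = [t for t in candidates if t.cell == gate_cell]
    return candidates


tablets = [
    Tablet("zone1-100", "zone1"),
    Tablet("zone2-200", "zone2"),
    Tablet("zone1-101", "zone1", healthy=False),
]
local = pick_tablets(tablets, "zone1")
everywhere = pick_tablets(tablets, "zone1", allow_cross_cell=True)
```

With the default, a failure in `zone2` cannot affect traffic served from `zone1`, which is the isolation property the notes emphasise.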
David Weitzman (Pinterest)
- Migrating from vanilla MySQL
- Started with a single MySQL shard
- Existing shards would keep growing, so the setup had to evolve over time
- Gate discovery - Amazon ELB
- Configuration - Zookeeper
- Tablets - existing MySQL hosts
- Observability - OpenTSDB
- Rollout (Phase 1)
- Made some incremental changes up front
- Validated the changes
- unit + integration tests
- application-level dark reads
- replay writes
- Incremental rollout (% traffic)
- Rollout took 2 months
- Afterwards (Phase 1)
- improved visibility
- improved db protection
- found bugs along the way
- Rollout (Phase 2)
- unit + integration tests
- remove auto-inc from new shards
- moratorium on schema changes
- migrate master w/ reverse replication
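
The validation and rollout steps above (application-level dark reads, then an incremental percentage of traffic) can be sketched roughly as follows. This is an illustration under assumptions, not Pinterest's implementation: `in_rollout`, `dark_read`, the CRC32 bucketing, and the sample keys are all hypothetical.

```python
import zlib


def in_rollout(key: str, percent: int) -> bool:
    """Deterministically place `key` in one of 100 buckets.

    The same key always lands in the same bucket, so raising `percent`
    only ever adds keys to the rollout.
    """
    return zlib.crc32(key.encode("utf-8")) % 100 < percent


def dark_read(key, read_old, read_new, mismatches):
    """Serve the read from the old path and compare the new path on the side."""
    expected = read_old(key)
    try:
        actual = read_new(key)
        if actual != expected:
            mismatches.append((key, expected, actual))
    except Exception as exc:
        # The dark path must never affect the caller; just record the failure.
        mismatches.append((key, expected, exc))
    return expected


old_store = {"pin:1": "a", "pin:2": "b"}
new_store = {"pin:1": "a", "pin:2": "stale"}
mismatches = []
results = [dark_read(k, old_store.get, new_store.get, mismatches) for k in old_store]
```

Callers always receive the old path's answer, so mismatches can be investigated before any real traffic is shifted with `in_rollout`.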
Sugu Sougoumarane
- vreplication
- let’s say you’ve got orders with foreign key relationships to products and people
- you have to choose where to put the orders, and you usually put them closer to the highest throughput table
- cross shard queries are more expensive
- vreplication addresses slow cross-shard queries by copying the source table into the target and keeping it in sync
- with vreplication, certain queries break because there are multiple copies of the same table
- this is solved by routing rules that define which table to use
- during the resharding, writes are stopped, shards are moved over and then writes are resumed
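
The idea of routing rules described above, choosing one copy when the same table exists in several places, can be sketched as a simple lookup. This is a toy model and not the Vitess routing-rule format: the dictionary shape, `resolve_table`, and the keyspace names are assumptions made for illustration.

```python
def resolve_table(table, routing_rules, default_keyspace):
    """Return the keyspace-qualified table a query on `table` should use.

    `routing_rules` maps an unqualified table name to a list of qualified
    targets; the first target wins. Tables without a rule stay in the
    default keyspace.
    """
    targets = routing_rules.get(table)
    if targets:
        return targets[0]
    return f"{default_keyspace}.{table}"


# Hypothetical setup: `products` lives in the `catalog` keyspace and is
# copied into `commerce` next to `orders`, so joins can stay local.
routing_rules = {"products": ["commerce.products"]}

products_target = resolve_table("products", routing_rules, "catalog")
orders_target = resolve_table("orders", routing_rules, "commerce")
```

With an explicit rule, a query that names `products` is no longer ambiguous between the source table and its replicated copy.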