Data Architecture

Data is the glue that holds distributed systems together. Data has its own lifecycle: origination, flow, transformation, modification, and finally deletion. At a high level, these are some of the data design considerations:



Database Type 

  • Relational, KV store, column-family, document, graph, etc.


Indexing and Querying

  • LSM, B-Tree, B+-Tree
  • R/W Access patterns
  • Range queries
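
A toy sketch of the LSM idea above (all names hypothetical, no compaction or bloom filters): writes go to a sorted in-memory memtable, which is periodically flushed to immutable sorted runs; reads check the memtable first, then the runs from newest to oldest. Sorted runs are what make range queries cheap in real LSM stores.

```python
class TinyLSM:
    """Toy LSM-tree: in-memory memtable, flushed to immutable sorted runs."""
    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.runs = []            # newest first; each run is a sorted list of (key, value)
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            # Flush as an immutable, sorted "SSTable".
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:     # newest run wins (no compaction in this sketch)
            for k, v in run:
                if k == key:
                    return v
        return None
```

A real store would binary-search each run and merge runs in the background; this sketch only shows the write path and read precedence.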


Schema, Metadata, Encoding and Evolution

  • Versioning
    • Forward and Backward compatibility
  • Format
    • Avro, Thrift, Protobuf
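
A minimal sketch of the compatibility rules above, using plain dicts rather than a real Avro/Protobuf schema (field names are hypothetical): backward compatibility means new code can read old records (new fields need defaults); forward compatibility means old code can read new records (unknown fields are ignored).

```python
def decode_user_v2(record):
    """v2 reader: tolerates v1 records by defaulting the new 'email' field
    (backward compatibility) and silently ignores fields it doesn't know,
    which is the mirror image old readers rely on (forward compatibility)."""
    return {
        "name": record["name"],
        "email": record.get("email", ""),   # field added in v2, with a default
    }

old_record = {"name": "alice"}                              # written by v1 code
new_record = {"name": "bob", "email": "b@x", "phone": "1"}  # written by v3 code
```

Real encodings (Avro with reader/writer schemas, Protobuf with field tags) enforce these rules mechanically; the dict version is only to show the two directions.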


Transaction

  • Atomicity
  • Isolation level
    • Lost update
    • Read committed
    • Repeatable read (snapshot isolation: readers never block writers, and writers never block readers)
    • Prevent lost updates (read-modify-write cycles)
      • Compare-and-set (CAS), atomic writes, conflict resolution
    • Write skew and phantoms (a generalisation of lost update)
      • Materialising conflicts
    • Serialisability
      • Actual serial execution
      • Two-phase locking (2PL)
        • Writers don’t just block other writers; they also block readers and vice versa. 
        • Deadlocks
        • Predicate lock, index range lock
      • Serializable snapshot isolation (SSI) - optimistic concurrency control
        • All reads within a transaction are made from a consistent snapshot of the database
        • Contrast: pessimistic concurrency control (2PL, actual serial execution) vs optimistic (SSI, built on MVCC snapshots)
  • Data durability 
    • WAL Logs
    • Backup and Restore
    • Error handling, RRD, Save points
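
A sketch of preventing lost updates with a compare-and-set on a version column (table and column names are made up for illustration): the read-modify-write only commits if nobody else changed the row in between, otherwise it retries.

```python
import sqlite3

# In-memory DB with a version column used for compare-and-set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (id INTEGER PRIMARY KEY, value INTEGER, version INTEGER)")
conn.execute("INSERT INTO counters VALUES (1, 0, 0)")
conn.commit()

def increment_with_cas(conn, counter_id):
    """Read-modify-write guarded by a version check (optimistic)."""
    while True:
        value, version = conn.execute(
            "SELECT value, version FROM counters WHERE id = ?", (counter_id,)
        ).fetchone()
        cur = conn.execute(
            "UPDATE counters SET value = ?, version = ? WHERE id = ? AND version = ?",
            (value + 1, version + 1, counter_id, version),
        )
        conn.commit()
        if cur.rowcount == 1:   # CAS succeeded; otherwise a concurrent writer won, so retry
            return value + 1
```

Many databases offer the same effect more directly (atomic `UPDATE ... SET value = value + 1`); the explicit version check is the pattern to reach for when the modification logic lives in application code.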


Replication

  • Single-leader (master/slave)
    • Read your writes, monotonic read, consistent prefix reads
  • Multi-leader (CouchDB)
    • Conflict resolution / avoidance
  • Leaderless (Dynamo style)
    • Quorum, Sloppy quorum, LWW, Concurrent writes, Version vectors
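
The quorum condition above can be written down directly: with n replicas, w write acks, and r read responses, every read is guaranteed to overlap at least one up-to-date replica when w + r > n. A one-function sketch:

```python
def quorum_ok(n: int, w: int, r: int) -> bool:
    """Strict quorum condition: every read set intersects every write set."""
    return w + r > n and 0 < w <= n and 0 < r <= n

# Typical Dynamo-style configuration: n=3, w=2, r=2 tolerates one unavailable node.
print(quorum_ok(3, 2, 2))  # True
print(quorum_ok(3, 1, 1))  # False: a read may miss the latest write
```

Sloppy quorums deliberately relax this (writes land on non-home nodes and are hinted back later), which is why they trade consistency for write availability.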


Data partitioning

  • Hotspot
  • Partition by Key range, consistent hashing, Compound primary key
  • Partition index - Local index (scatter-gather), Global index
  • Rebalancing - Fixed, Dynamic
  • Request routing 
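
A minimal consistent-hashing sketch for the partitioning bullets above (class and node names are made up): keys and virtual nodes are hashed onto a ring, and each key belongs to the next node clockwise. Virtual nodes smooth out hotspots, and adding a node only moves roughly 1/n of the keys during rebalancing.

```python
import bisect
import hashlib

def _h(key: str) -> int:
    """Stable hash onto the ring (md5 chosen only for determinism here)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Minimal hash ring with virtual nodes."""
    def __init__(self, nodes, vnodes=100):
        self._ring = sorted((_h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._keys = [h for h, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, _h(key)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")
```

Note that consistent hashing gives up efficient range queries (adjacent keys scatter across nodes), which is the trade-off against key-range partitioning.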


Trouble with Distributed systems

  • Faults and partial failures
  • Unreliable networks - synchronous vs asynchronous networks
  • Unreliable clocks
    • Don’t rely on the accuracy of the clock
    • NTP sync is not accurate
    • Confidence interval in time - Spanner
  • Process pause
  • Knowledge, truth, and lies
    • Truth is defined by majority
    • Fencing tokens are required for distributed locks
    • Safety and liveness
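
A sketch of why fencing tokens are needed: a lock holder can be paused (GC, network) past its lease expiry and then issue a stale write. If the lock service hands out a monotonically increasing token and the storage service rejects anything older than the highest token seen, the stale write is fenced off. Class and token values here are illustrative only.

```python
class FencedStorage:
    """Storage that rejects writes carrying a stale fencing token."""
    def __init__(self):
        self.highest_token = -1
        self.data = {}

    def write(self, token: int, key: str, value: str):
        if token < self.highest_token:
            raise PermissionError(f"stale token {token} (already saw {self.highest_token})")
        self.highest_token = token
        self.data[key] = value

storage = FencedStorage()
storage.write(33, "file", "A")      # client holding the lock with token 33
storage.write(34, "file", "B")      # new holder after the old lease expired
try:
    storage.write(33, "file", "C")  # delayed write from the old holder is rejected
except PermissionError as e:
    print(e)
```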


Linearisability

  • Serializability is an isolation property of transactions
  • Linearisability is a recency guarantee
  • It makes the system appear as if there were only a single copy of the data
  • Single leader replication is potentially linearisable. Multi-leader and leaderless are not linearisable
  • A network interruption forces a choice between linearisability and availability. 
  • CAP - either Consistent (linearisable) or Available when Partitioned
  • The reason for dropping linearisability is performance, not fault tolerance.
  • Most DB are neither CAP-linearisable nor CAP-available 
    • MVCC is intentionally non-linearisable
    • Single leader replication (async) is non-linearisable
    • Partition makes it non-available
  • ZooKeeper by default is neither CAP-consistent (CP) nor CAP-available (AP) – it’s really just “P”. 
    • You can optionally make it CP by calling sync if you want, 
    • and for reads (but not for writes) it’s actually AP, if you turn on the right option.


Error handling and recovery

  • Replay logs
  • Discover partitions
  • Replication catchup
  • Restores


Source of truth 

  • Primary DB
  • Derived data
    • Caches
    • Materialised Views
    • CQRS


Batch processing

  • Putting the computation near the data
  • Mapping and Chaining
  • Sort merge joins
  • GroupBy
  • Handling Skew
  • Map-side joins
    • Broadcast hash joins - joining a large dataset with a small dataset
    • Partitioned hash joins - partition and reduce the dataset
    • Map-side merge joins - if input is partitioned and sorted appropriately
  • Use cases - Search index, Key Value stores
  • The output of a reduce-side join is partitioned and sorted by the join key, whereas the output of a map-side join is partitioned and sorted in the same way as the large input 
  • Input is immutable and no side effects
  • In an environment where tasks are not so often terminated, the design decisions of MapReduce make less sense.
  • Materialising intermediate state
    • Makes fault tolerance easy
    • Sometimes they are not needed
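
A toy broadcast hash join, as in the map-side joins bullet above (dataset shapes are invented): the small dataset is loaded into an in-memory hash table on every mapper, and the large dataset is streamed past it, so no shuffle or sort is needed.

```python
# Small dataset: fits in memory, "broadcast" to every mapper.
users = {1: "alice", 2: "bob"}

# Large dataset: streamed record by record.
events = [
    {"user_id": 1, "url": "/home"},
    {"user_id": 2, "url": "/cart"},
    {"user_id": 1, "url": "/pay"},
]

def broadcast_hash_join(events, users):
    """Each mapper probes the in-memory table; unmatched events are dropped (inner join)."""
    for e in events:
        name = users.get(e["user_id"])
        if name is not None:
            yield {**e, "user": name}

joined = list(broadcast_hash_join(events, users))
```

This is why the output of a map-side join keeps the partitioning and ordering of the large input, unlike a reduce-side join, which is partitioned and sorted by the join key.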


Stream processing

  • Messaging
    • Brokers
      • Routing: Fanout, load balancing
      • Acknowledgement and redelivery
      • Message ordering 
  • Partitioned logs - Kafka
    • Unified log
    • Partitioning to scale
    • Replayable
    • Consumer offsets
    • Durability
    • Immutable input and idempotent operations
  • CDC - Change Data Capture
    • Implemented using log based broker
    • Connectors - Debezium etc
  • Event sourcing
    • Immutable events written to an event log (as in accounting, where each delta is captured)
    • CQRS - separate forms for read and write and allowing several different read views
  • Stream processing
    • Produce a new stream, drive a real-time dashboard, or write to a downstream database
    • Has operators similar to batch, but the input is unbounded, so sorting doesn’t make sense
    • Uses: CEP, Stream analytics, Maintaining materialised views
    • Event time vs Processing time
    • Stream Joins
      • Stream-Stream joins (click through rate)
      • Stream-Table join (enrichment)
        • Similar to map side hash join
        • Table can be kept up to date using CDC
      • Table-Table join (materialised view maintenance)
        • Twitter
      • Time dependence of joins
        • Slowly changing dimension
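
A sketch of the stream-table join above (event shapes and keys are hypothetical): the processor keeps a local copy of the table, applies CDC change events to it, and enriches each incoming click against the current state, much like a map-side hash join whose hash table is continuously updated.

```python
# Local table replica, maintained by applying CDC events.
profile_table = {}

def apply_cdc(change: dict):
    """Apply an upsert/delete change event to the local table copy."""
    if change["op"] == "delete":
        profile_table.pop(change["key"], None)
    else:
        profile_table[change["key"]] = change["value"]

def enrich(click: dict) -> dict:
    """Stream-table join: look up the user's current profile for each click."""
    return {**click, "profile": profile_table.get(click["user_id"])}

apply_cdc({"op": "upsert", "key": "u1", "value": {"plan": "pro"}})
first = enrich({"user_id": "u1", "url": "/home"})
apply_cdc({"op": "upsert", "key": "u1", "value": {"plan": "free"}})  # CDC keeps the table fresh
second = enrich({"user_id": "u1", "url": "/pay"})
```

The time-dependence bullet shows up here too: the same click enriched before and after the CDC update joins against different profile versions (the slowly changing dimension problem).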


Big Data

  • Data Analytics
  • ETL
  • Data warehouse 

Data Security

  • Encryption at Rest
  • Data Retention
  • Data classification
  • (Customer) Data isolation
  • Data Integrity
  • Data Availability
  • Data anonymisation
  • Backup and recovery
  • Database Security
  • Access Control




Overall Security

 

Microservices

Microservices, though useful, come with a lot of baggage. Discretion is needed to decide whether they're really needed. Monoliths are not bad; most likely what is needed is a clean separation of interfaces between the components of the monolith.

Monoliths can serve well for a long time, until you hit issues such as:
  • Release velocity is affected because of dependencies between components. This hampers development, testing and deployment time
  • Different components have different scaling characteristics, so differing traffic patterns lead to unreliable, uneven use of the underlying hardware
    • Capacity planning becomes hard
    • Performance becomes unpredictable
    • Resource exhaustion happens frequently and randomly
  • A component needs to be developed and scaled independently, and made available as a service

Microservices take a heavy toll on SRE; without the required automation and SRE firepower, it is really hard to maintain the sanity of the entire system. As microservices proliferate, the problem multiplies with the web of inter-service traffic.

Microservices need to be adopted with discretion. Keep the following considerations in mind.

Architecture        




12 factor app

https://12factor.net


Availability

How is the service fault tolerant?


Scalability

What’s the horizontal and vertical scalability?


Statelessness

Is the service stateless?


Async

Can it use Lambda / Async services?


Security Considerations

2FA, HTTPS, Tokens, Encryption, GDPR, Penetration testing, App testing


API

Contracts, Versioning, Dependency


Network

• Proxy

• Sync, Async, Batch 

• Multithreaded, Event based, Coroutine


Load Handling

• Load balancer

• Circuit breaker

• Throttling
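
A minimal circuit breaker, sketching the bullet above (thresholds and names are arbitrary): after a run of consecutive failures the breaker opens and fails fast, then after a cooldown it lets one trial call through (half-open) to probe whether the downstream service has recovered.

```python
import time

class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; half-opens after `reset_after` seconds."""
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock               # injectable for testing
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None        # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()   # trip (or re-trip) the breaker
            raise
        self.failures = 0                # success closes the circuit
        return result
```

Failing fast while open is what protects a struggling downstream service from a retry storm; production libraries add per-endpoint state, metrics, and jittered cooldowns on top of this core.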


Replication

Consistency


Data

• Transactions across services 

• Partitioning

• Schema, Metadata, Evolution

• Indexing, Querying

• DB type


Caching

• Object caching

• Page Caching


Service Mesh

• Istio


Shutdown

Graceful shutdown
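
A sketch of graceful shutdown in a worker loop (function names are illustrative): catch SIGTERM, stop accepting new work, finish in-flight requests, then exit, instead of dying mid-request.

```python
import signal

shutting_down = False

def handle_sigterm(signum, frame):
    """Mark shutdown requested; the main loop drains in-flight work before exiting."""
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, handle_sigterm)

def serve(requests):
    """Process requests until a shutdown is requested, then stop taking new ones."""
    served = []
    for req in requests:
        if shutting_down:
            break               # stop accepting new work; caller can now clean up and exit
        served.append(req)      # stand-in for handling one request
    return served
```

Orchestrators such as Kubernetes send SIGTERM and wait a grace period before SIGKILL, so handling the signal is what makes rolling deployments lossless.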


i18n Considerations


SRE




Backup / Restore

• RPO - Recovery Point Objective

• RTO - Recovery Time Objective


Reliability

• MTTF - Mean time to failure

• MTTR - Mean time to Recovery

• MTBF - Mean time between failures

• Uptime

• Fault tolerance
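
The metrics above relate directly: steady-state availability is MTTF / (MTTF + MTTR), so uptime improves by either failing less often or recovering faster. A quick worked example (the numbers are invented):

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

# A service that fails about once every 30 days (720 h) and takes 1 h to recover:
a = availability(mttf_hours=720, mttr_hours=1)
print(f"{a:.5f}")   # 0.99861
```

Halving MTTR has the same effect as doubling MTTF, which is why investing in fast recovery often beats chasing ever-rarer failures.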


Performance / SLAs

• SLOs - Service Level Objectives

• Response time

• Latency

• Throughput

• Uptime


Release Management

Change Management

Config Management

• Zero-downtime upgrades

• Rolling deployments

• Automated deployments


Container and Orchestration

Docker / Docker Swarm or K8S


Dev / QA environment

Automated Dev / QA environments


CI/CD pipeline

Code Deploy, Circle CI, Codeship, Jenkins


Upgrades (Zero Downtime)

Zero downtime upgrade, Rolling upgrades, Canary rollout


Deployment

Ansible / Puppet



Service Monitoring & Alerting

Pingdom, Nagios, CloudWatch, Prometheus, DataDog


Logging

Logstash, Fluentd


Cost

Cost tags, Analytics, Cost structure, Reserved Instances, Projections, Cost Optimisations

(Tools like Botmetrics)


Capacity Planning



Security

IAM Roles, Encryption, HTTPS


Networking

Diagram, VPC


Fleet management

Tagging, AMI images, Versions, Upgrades, Consolidation, Pruning


Incident Management and Response

Outages, Load Management, Latency, Security Incidents


Process Management

Process group, Process monitoring 


OnCall

Pager Duty, VictorOps


Versioning and Packaging


Dev Process




Git Flow

Branching and Development process


API Docs

Swagger


Error Monitoring

Sentry


Metrics

Concurrency, System metrics, Engineering Metrics


Testing

• Automation

• API testing

• Integration, Load

• Unit testing

• Deployment testing

• Checklist

• Regression

General




Language Version

Eg: Python 3.x / Java 7


Framework Version

Eg: Django Version


Library Version

Eg: PyMongo Version


Licenses

Apache, MIT, GPL

Others




Metrics

Deployment Frequency

% of failed Deployments

Time from Checkin to Deployment