Data Architecture

Data is the glue that holds distributed systems together. Data has its own lifecycle: it originates, flows, gets transformed and modified, and is finally deleted. At a high level, these are some of the key data design considerations.

Database Type 

  • Relational, KV-store, Column Family, Document, Graph etc

Indexing and Querying

  • LSM, B-Tree, B+-Tree
  • R/W Access patterns
  • Range queries
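
The indexing trade-offs above hinge on keeping keys sorted. A toy sketch (a hypothetical `MemTable` class, nothing like a production engine) of why both LSM-trees (in their memtables/SSTables) and B-Trees maintain sort order: point reads and range queries both stay cheap.

```python
import bisect

class MemTable:
    """Toy sorted key-value table. LSM-trees buffer writes in a structure
    like this before flushing to sorted files (SSTables); B-Trees keep
    pages sorted on disk. Sorting is what makes range queries cheap."""

    def __init__(self):
        self._keys = []   # kept sorted at all times
        self._data = {}

    def put(self, key, value):
        if key not in self._data:
            # O(n) insert; real engines use skip lists or balanced trees
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def range(self, lo, hi):
        """All entries with lo <= key < hi, in key order."""
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_left(self._keys, hi)
        return [(k, self._data[k]) for k in self._keys[i:j]]

t = MemTable()
for k, v in [("b", 2), ("a", 1), ("c", 3)]:
    t.put(k, v)
print(t.range("a", "c"))  # [('a', 1), ('b', 2)]
```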

Schema, Metadata, Encoding and Evolution

  • Versioning
    • Forward and Backward compatibility
  • Format
    • Avro, Thrift, Protobuf
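
Forward and backward compatibility can be illustrated with plain dicts standing in for Avro/Protobuf decoding (the schema and field names here are made up):

```python
# v2 of a hypothetical user schema added "phone" with a default value.
V2_FIELDS = {"name": None, "email": None, "phone": "unknown"}

def decode_as_v2(record: dict) -> dict:
    """Backward compatible: v1 records lack 'phone', so the default fills
    it in. Forward compatible: fields added by future schema versions
    (e.g. 'fax') are simply ignored by this reader."""
    return {f: record.get(f, default) for f, default in V2_FIELDS.items()}

v1_record = {"name": "Ada", "email": "ada@example.com"}                  # old writer
v3_record = {"name": "Ada", "email": "ada@example.com", "phone": "1",
             "fax": "2"}                                                 # future writer
print(decode_as_v2(v1_record))  # 'phone' falls back to the default
print(decode_as_v2(v3_record))  # unknown 'fax' field is dropped
```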

Transactions

  • Atomicity
  • Isolation level
    • Lost update
    • Read committed
    • Repeatable read (snapshot isolation: readers never block writers and vice versa)
    • Prevent lost update (read-modify-write cycle)
      • Compare-and-set (CAS), atomic write operations, conflict resolution
    • Write skew and phantoms (generalisation of lost update)
      • Materialising conflicts
    • Serialisability
      • Actual serial execution
      • 2 phase locking
        • Writers don’t just block other writers; they also block readers and vice versa. 
        • Deadlocks
        • Predicate lock, index range lock
      • SSI (Serialisable Snapshot Isolation, optimistic concurrency control)
        • All reads within a transaction are made from a consistent MVCC snapshot of the database
        • Pessimistic concurrency control: 2PL, actual serial execution
        • Optimistic concurrency control: SSI
  • Data durability 
    • WAL Logs
    • Backup and Restore
    • Error handling, RRD, Save points
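
The compare-and-set approach to preventing lost updates can be sketched with an in-process register. Real databases expose CAS as a conditional write; this toy only illustrates the optimistic retry loop:

```python
import threading

class CASRegister:
    """Toy register with compare-and-set, the primitive used to prevent
    lost updates in a read-modify-write cycle."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        return self._value

    def compare_and_set(self, expected, new):
        with self._lock:  # the CAS itself must be atomic
            if self._value != expected:
                return False  # someone else wrote in between: caller retries
            self._value = new
            return True

def increment(reg):
    while True:  # optimistic retry loop: re-read and try again on conflict
        old = reg.read()
        if reg.compare_and_set(old, old + 1):
            return

reg = CASRegister(0)
threads = [threading.Thread(target=lambda: [increment(reg) for _ in range(1000)])
           for _ in range(4)]
for th in threads:
    th.start()
for th in threads:
    th.join()
print(reg.read())  # 4000: no increments were lost
```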

Replication

  • Master slave
    • Read your writes, monotonic read, consistent prefix reads
  • Multi master (CouchDB)
    • Conflict resolution / avoidance
  • Leaderless (Dynamo style)
    • Quorum, Sloppy quorum, LWW, Concurrent writes, Version vectors
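
The Dynamo-style quorum condition is simple arithmetic: with n replicas, w write acknowledgements and r read acknowledgements, every read quorum overlaps every write quorum exactly when w + r > n, so a read contacts at least one up-to-date replica. A sketch:

```python
def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True if any r replicas must intersect any w replicas out of n,
    i.e. a quorum read is guaranteed to see the latest quorum write."""
    return w + r > n

# Typical configuration: n=3, w=2, r=2
assert quorum_overlaps(3, 2, 2)      # overlap guaranteed
assert not quorum_overlaps(3, 1, 2)  # a read may miss the latest write
```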

Data partitioning

  • Hotspot
  • Partition by Key range, consistent hashing, Compound primary key
  • Partition index - Local index (scatter-gather), Global index
  • Rebalancing - Fixed, Dynamic
  • Request routing 
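
Consistent hashing, listed above as a partitioning scheme, can be sketched as a hash ring with virtual nodes (illustrative only; node names are made up):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring: a key maps to the first node at or
    after its hash position, wrapping around. Adding or removing a node
    only moves the keys adjacent to it, which is why rebalancing is far
    cheaper than with hash-mod-N partitioning."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):  # virtual nodes smooth out hotspots
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        h = self._hash(key)
        i = bisect.bisect(self._ring, (h,)) % len(self._ring)
        return self._ring[i][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))
```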

Trouble with Distributed systems

  • Faults and partial failures
  • Unreliable networks - synchronous vs asynchronous networks
  • Unreliable clocks
    • Don’t rely on the accuracy of the clock
    • NTP sync is not accurate
    • Confidence interval in time - Spanner
  • Process pause
  • Knowledge, truth and lies
    • Truth is defined by the majority
    • Fencing tokens are required for distributed locks
    • Safety and liveness
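
The fencing-token idea can be sketched as follows: the lock service hands out a monotonically increasing token with each grant, and the storage side tracks the highest token seen and rejects writes from stale lock holders (a toy in-memory version):

```python
class FencedStorage:
    """Storage service that rejects stale lock holders via fencing tokens.
    A client that was paused (e.g. a long GC) may still believe it holds
    the lock; its old token exposes that."""

    def __init__(self):
        self._highest_token = -1
        self.data = {}

    def write(self, token: int, key, value) -> bool:
        if token < self._highest_token:
            return False  # a newer lock holder exists: reject the write
        self._highest_token = token
        self.data[key] = value
        return True

s = FencedStorage()
assert s.write(33, "file", "a")       # client 1 holds token 33
assert s.write(34, "file", "b")       # lock expired; client 2 got token 34
assert not s.write(33, "file", "c")   # client 1 wakes from a pause: rejected
```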

Consistency and Linearisability

  • Serializability is an isolation property of transactions
  • Linearisability is a recency guarantee: it makes a system appear as if there is only a single copy of the data
  • Single leader replication is potentially linearisable. Multi-leader and leaderless are not linearisable
  • A network interruption forces a choice between linearisability and availability. 
  • CAP - Linearisable / Available when partitioned
  • The reason for dropping linearisability is performance, not fault tolerance.
  • Most DB are neither CAP-linearisable nor CAP-available 
    • MVCC is intentionally non-linearisable
    • Single leader replication (async) is non-linearisable
    • Partition makes it non-available
  • ZooKeeper by default is neither CAP-consistent (CP) nor CAP-available (AP) – it’s really just “P”. 
    • You can optionally make it CP by calling sync if you want
    • For reads (but not for writes) it’s actually AP, if you turn on the right option

Error handling , Recovery

  • Replay logs
  • Discover partitions
  • Replication catchup
  • Restores

Source of truth 

  • Primary DB
  • Derived data
    • Caches
    • Materialised Views
    • CQRS

Batch processing

  • Putting the computation near the data
  • Mapping and Chaining
  • Sort merge joins
  • GroupBy
  • Handling Skew
  • MapSide joins
    • Broadcast hash joins - joining a large dataset with a small dataset
    • Partitioned hash joins - partition and reduce the dataset
    • Mapside merge joins - if input is partitioned and sorted appropriately
  • Use cases - Search index, Key Value stores
  • The output of a reduce-side join is partitioned and sorted by the join key, whereas the output of a map-side join is partitioned and sorted in the same way as the large input 
  • Input is immutable and no side effects
  • In an environment where tasks are not so often terminated, the design decisions of MapReduce make less sense.
  • Materialising intermediate state
    • Makes fault tolerance easy
    • Sometimes they are not needed
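
The broadcast hash join above (joining a large dataset with a small one on the map side) can be sketched in a few lines; the datasets here are made up:

```python
from collections import defaultdict

def broadcast_hash_join(large, small, key):
    """Map-side broadcast hash join: the small dataset is loaded into an
    in-memory hash table ('broadcast' to every mapper), so the large
    dataset is joined in one streaming pass with no shuffle or sort."""
    table = defaultdict(list)
    for row in small:                      # small side must fit in memory
        table[row[key]].append(row)
    for row in large:                      # large side is streamed once
        for match in table.get(row[key], []):
            yield {**row, **match}

users = [{"uid": 1, "name": "Ada"}, {"uid": 2, "name": "Alan"}]   # small input
events = [{"uid": 1, "page": "/a"}, {"uid": 2, "page": "/b"},
          {"uid": 1, "page": "/c"}]                                # large input
print(list(broadcast_hash_join(events, users, "uid")))  # 3 enriched rows, one per event
```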

Streaming processing

  • Messaging
    • Brokers
      • Routing: Fanout, load balancing
      • Acknowledgement and redelivery
      • Message ordering 
  • Partitioned logs - Kafka
    • Unified log
    • Partitioning to scale
    • Replayable
    • Consumer offsets
    • Durability
    • Immutable input and idempotent operations
  • CDC - Change Data Capture
    • Implemented using log based broker
    • Connectors - Debezium etc
  • Event sourcing
    • Immutable events written to event log (this is used in accounting where delta is captured)
    • CQRS - separate forms for read and write and allowing several different read views
  • Stream processing
    • Produce a new stream, drive a real-time dashboard, or write to an end database
    • Has operators similar to batch processing, but sorting an unbounded stream doesn’t make sense
    • Uses: CEP, Stream analytics, Maintaining materialised views
    • Event time vs Processing time
    • Stream Joins
      • Stream-Stream joins (click through rate)
      • Stream-Table join (enrichment)
        • Similar to map side hash join
        • Table can be kept up to date using CDC
      • Table-Table join (materialised view maintenance)
        • Twitter
      • Time dependence of joins
        • Slowly changing dimension
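
The stream-table join for enrichment can be sketched as a processor that keeps a local copy of the table (like the hash table in a map-side join), updated from a CDC feed, and joins each incoming event against it. Event shapes here are hypothetical:

```python
class Enricher:
    """Stream-table join for enrichment: a local table is kept up to date
    by consuming a CDC stream (e.g. from Debezium), and each activity
    event is joined against it on arrival."""

    def __init__(self):
        self.profiles = {}  # local replica of the users table

    def on_cdc(self, change):
        """Apply a change event from the table's changelog."""
        self.profiles[change["uid"]] = change["row"]

    def on_event(self, event):
        """Enrich an activity event with the current profile, if any."""
        profile = self.profiles.get(event["uid"], {})
        return {**event, **profile}

e = Enricher()
e.on_cdc({"uid": 1, "row": {"name": "Ada", "tier": "gold"}})
print(e.on_event({"uid": 1, "page": "/home"}))  # event merged with profile
```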

Big Data

  • Data Analytics
  • ETL
  • Data warehouse 

Data Security

  • Encryption at Rest
  • Data Retention
  • Data classification
  • (Customer) Data isolation
  • Data Integrity
  • Data Availability
  • Data anonymisation
  • Backup and recovery
  • Database Security
  • Access Control

Overall Security

Microservices

Microservices, though useful, come with a lot of baggage. Discretion is needed to decide whether they are really warranted. Monoliths are not bad; most likely what is needed is a clean interface separation between the various components of the monolith.

Monoliths can serve well for a long time, until you hit issues such as:
  • Release velocity suffers because of dependencies between components, which slows development, testing and deployment
  • Components have different scaling characteristics, so differing traffic patterns lead to unreliable use of the underlying hardware
    • Capacity planning becomes hard
    • Performance becomes unpredictable
    • Resource exhaustion happens frequently and randomly
  • A component needs to be developed and scaled independently and made available as a service

Microservices take a heavy toll on SRE; without the required automation and SRE firepower, it is really hard to maintain the sanity of the entire system. As microservices proliferate, the problem multiplies with the growing web of inter-service traffic.

Microservices need to be adopted with discretion. Keep the following considerations in mind:


12 factor app


How is the service fault tolerant?


What’s the horizontal and vertical scalability?


Is the service stateless?


Can it use Lambda / Async services?

Security Considerations

2FA, HTTPS, Tokens, Encryption, GDPR, Penetration testing, App testing


Contracts, Versioning, Dependencies


• Proxy

• Sync, Async, Batch 

• Multithreaded, Event based, Coroutine

Load Handling

• Load balancer

• Circuit breaker

• Throttling
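
A circuit breaker can be sketched as a small state machine (closed, open, half-open). The thresholds and the flaky backend below are made up for illustration:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: after `threshold` consecutive failures the
    circuit opens and calls fail fast; after `reset_after` seconds one
    trial call is allowed through (half-open)."""

    def __init__(self, threshold=3, reset_after=30.0):
        self.threshold, self.reset_after = threshold, reset_after
        self.failures, self.opened_at = 0, None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success closes the circuit again
        return result

def flaky():
    raise ConnectionError("backend down")

cb = CircuitBreaker(threshold=2, reset_after=60)
for _ in range(2):
    try:
        cb.call(flaky)
    except ConnectionError:
        pass
# The circuit is now open: further calls fail fast without hitting the backend.
```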



Data

• Transactions across services 

• Partitioning

• Schema, Metadata, Evolution

• Indexing, Querying

• DB type

Caching

• Object caching

• Page Caching

Service Mesh

• Istio


Graceful shutdown

i18n Considerations


Backup / Restore

• RPO - Recovery Point Objective

• RTO - Recovery Time Objective

Reliability

• MTTF - Mean time to failure

• MTTR - Mean time to Recovery

• MTBF - Mean time between failures

• Uptime

• Fault tolerance
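
These metrics are related by simple arithmetic: MTBF = MTTF + MTTR, and steady-state availability is the fraction of time spent up. A quick sketch with made-up numbers:

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTTF / MTBF, where MTBF = MTTF + MTTR."""
    return mttf_hours / (mttf_hours + mttr_hours)

# A service that fails on average every 720 h (~30 days) and takes 1 h to recover:
print(f"{availability(720, 1):.5f}")  # 0.99861, roughly "two and a half nines"
```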

Performance / SLAs

• SLOs - Service Level Objectives

• Response time

• Latency

• Throughput

• Uptime
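
Response-time SLOs are usually stated as percentiles rather than means, because means hide tail latency. A nearest-rank percentile sketch (the sample latencies are made up):

```python
def percentile(latencies_ms, p):
    """Nearest-rank percentile, the usual form for response-time SLOs
    (e.g. "p99 < 200 ms")."""
    ranked = sorted(latencies_ms)
    k = max(0, int(round(p / 100 * len(ranked))) - 1)
    return ranked[k]

samples = [12, 15, 14, 120, 13, 16, 15, 900, 14, 13]
print("mean:", sum(samples) / len(samples))  # 113.2, skewed by two slow requests
print("p50:", percentile(samples, 50), "p99:", percentile(samples, 99))  # p50: 14 p99: 900
```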

Release Management

Change Management

Config Management

• Zero downtime upgrades

• Rolling deployments

• Automated deployments

Container and Orchestration

Docker / Docker Swarm or K8S

Dev / QA environment

Automated Dev / QA environments

CI/CD pipeline

Code Deploy, Circle CI, Codeship, Jenkins

Upgrades (Zero Downtime)

Zero downtime upgrade, Rolling upgrades, Canary rollout


Ansible / Puppet

Service Monitoring & Alerting

Pingdom, Nagios, CloudWatch, Prometheus, DataDog

Log Aggregation

Logstash, Fluentd

Cost Management

Cost tags, Analytics, Cost structure, Reserved Instances, Projections, Cost Optimisations

(Tools like Botmetrics)

Capacity Planning


IAM Roles, Encryption, HTTPS


Diagram, VPC

Fleet management

Tagging, AMI images, Versions, Upgrades, Consolidation, Pruning

Incident Management and Response

Outages, Load Management, Latency, Security Incidents

Process Management

Process group, Process monitoring 


PagerDuty, VictorOps

Versioning and Packaging

Dev Process

Git Flow

Branching and Development process

API Docs



Error monitoring


Concurrency, System metrics, Engineering Metrics

Testing

• Automation

• API testing

• Integration, Load

• Unit testing

• Deployment testing

• Checklist

• Regression


Language Version

Eg: Python 3.x/ Java 7

Framework Version

Eg: Django Version

Library Version

Eg: PyMongo Version

Licensing

Apache, MIT, GPL

Deployment Metrics

Deployment Frequency

% of failed Deployments

Time from Checkin to Deployment