Home: Microservices

Microservices, though useful, come with a lot of baggage, so it takes discretion to decide whether they are really needed. Monoliths are not bad. Most likely, what is needed is a clean interface separation between the various components in the monolith.
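
As a rough illustration of what a clean interface separation inside a monolith could look like, here is a minimal Python sketch. All names in it (BillingService, InProcessBilling, place_order) are hypothetical and chosen only for this example. The idea is that the ordering code depends on a narrow contract rather than on billing internals, so the billing component could later be moved behind a network call without touching its callers.

```python
from dataclasses import dataclass
from typing import Protocol


class BillingService(Protocol):
    """Contract the rest of the monolith depends on (hypothetical example)."""

    def charge(self, customer_id: str, amount_cents: int) -> str: ...


@dataclass
class InProcessBilling:
    """In-process implementation; a remote client could replace it later."""

    ledger: dict

    def charge(self, customer_id: str, amount_cents: int) -> str:
        if amount_cents <= 0:
            raise ValueError("amount_cents must be positive")
        # Accumulate the customer's running total in the in-memory ledger.
        self.ledger[customer_id] = self.ledger.get(customer_id, 0) + amount_cents
        return f"receipt-{customer_id}-{self.ledger[customer_id]}"


def place_order(billing: BillingService, customer_id: str, amount_cents: int) -> str:
    """Order code sees only the BillingService contract, not its internals."""
    return billing.charge(customer_id, amount_cents)
```

Calling place_order(InProcessBilling(ledger={}), "c1", 500) goes through the contract only, which is the property that matters if the component is ever extracted.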

A monolith can serve for a long time. Consider splitting it up only when you hit issues such as these:

• Release velocity suffers because of dependencies between components, which slows down development, testing and deployment.
• Components have different scaling characteristics and traffic patterns, so they use the resources of the underlying hardware unevenly. As a result:
	• Capacity planning becomes hard
	• Performance becomes unpredictable
	• Resource exhaustion happens frequently and at random
• A component needs to be developed and scaled independently and made available as a service.

Microservices take a heavy toll on SRE. Without the required automation and SRE firepower, it is really hard to keep the entire system sane. As microservices proliferate, the problem grows manifold because of the web of inter-service traffic.

Microservices need to be implemented with discretion. Keep the following considerations in mind.

Architecture
	12 factor app	https://12factor.net
	Availability	How is the service fault tolerant?
	Scalability	What’s the horizontal and vertical scalability?
	Statelessness	Is the service stateless?
	Async	Can it use Lambda / Async services?
	Security Considerations	2FA, HTTPS, Tokens, Encryption, GDPR, Penetration testing, App testing
	API	Contracts, Versioning, Dependency
	Network	• Proxy • Sync, Async, Batch • Multithreaded, Event based, Coroutine
	Load Handling	• Load balancer • Circuit breaker • Throttling
	Replication	Consistency
	Data	• Transactions across services • Partitioning • Schema, Metadata, Evolution • Indexing, Querying • DB type
	Caching	• Object caching • Page Caching
	Service Mesh	• Istio
	Shutdown	Graceful shutdown
	i18n Considerations
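
The circuit breaker listed under Load Handling can be sketched roughly as follows. This is a minimal, single-threaded illustration and not a production implementation. The class name, the max_failures and reset_after parameters, and the injectable clock are all assumptions made for this example. After a number of consecutive failures the breaker rejects calls for a cool-down period, then lets one trial call through.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each downstream request, for example breaker.call(fetch_profile, user_id), where fetch_profile is whatever function talks to the other service. Rejecting calls early keeps a struggling dependency from tying up threads and connections in the caller.
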
SRE
	Backup / Restore	• RPO - Recovery Point Objective • RTO - Recovery Time Objective
	Reliability	• MTTF - Mean time to failure • MTTR - Mean time to recovery • MTBF - Mean time between failures • Uptime • Fault tolerance
	Performance / SLAs	• SLOs - Service Level Objectives • Response time • Latency • Throughput • Uptime
	Release Management, Change Management, Config Management	• Zero-downtime upgrades • Rolling deployments • Automated deployments
	Container and Orchestration	Docker / Docker Swarm or K8S
	Dev / QA environment	Automated Dev / QA environments
	CI/CD pipeline	Code Deploy, Circle CI, Codeship, Jenkins
	Upgrades / (0 Downtime)	Zero downtime upgrade, Rolling upgrades, Canary rollout
	Deployment	Ansible / Puppet
	Service Monitoring & Alerting	Pingdom, Nagios, CloudWatch, Prometheus, DataDog
	Logging	Logstash, Fluentd
	Cost	Cost tags, Analytics, Cost structure, Reserved Instances, Projections, Cost Optimisations (Tools like Botmetrics)
	Capacity Planning
	Security	IAM Roles, Encryption, HTTPS
	Networking	Diagram, VPC
	Fleet management	Tagging, AMI images, Versions, Upgrades, Consolidation, Pruning
	Incident Management and Incident Response	Outages, Load Management, Latency, Security Incidents
	Process Management	Process group, Process monitoring
	OnCall	Pager Duty, VictorOps
	Versioning and Packaging
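
To make the reliability and SLO terms above a little more concrete, the sketch below relates them using the usual textbook definitions for a repairable system: MTBF = MTTF + MTTR, and steady-state availability = MTTF / (MTTF + MTTR). The function names and the 30-day default period are assumptions made for this example.

```python
def mtbf(mttf_hours, mttr_hours):
    """Mean time between failures: one full run-then-repair cycle."""
    return mttf_hours + mttr_hours


def availability(mttf_hours, mttr_hours):
    """Fraction of time the service is up, in the range 0..1."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mttf_hours must be > 0 and mttr_hours >= 0")
    return mttf_hours / (mttf_hours + mttr_hours)


def allowed_downtime_minutes(slo, period_days=30):
    """Downtime budget implied by an uptime SLO over the given period."""
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be in (0, 1]")
    return (1.0 - slo) * period_days * 24 * 60
```

For example, a service that runs 999 hours between failures and takes 1 hour to recover has an availability of 0.999, and a 99.9% uptime SLO over 30 days leaves roughly 43 minutes of downtime budget.
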
Dev Process
	Git Flow	Branching and Development process
	API Docs	Swagger
	Sentry	Error monitoring
	Metrics	Concurrency, System metrics, Engineering Metrics
	Testing	• Automation • API testing • Integration • Load • Unit testing • Deployment testing • Checklist • Regression
General
	Language Version	E.g. Python 3.x / Java 7
	Framework Version	E.g. Django version
	Library Version	E.g. PyMongo version
	Licenses	Apache, MIT, GPL
Others
	Metrics	• Deployment frequency • % of failed deployments • Time from check-in to deployment
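
The three deployment metrics above could be computed from a deployment log along these lines. This is only a sketch: the Deployment record, its field names, and the choice of the median for check-in-to-deployment time are assumptions made for this example, not something a particular CI/CD tool provides.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Deployment:
    committed_at: datetime  # time of check-in
    deployed_at: datetime   # time the change reached production
    succeeded: bool


def deployment_metrics(deployments, window_days):
    """Summarise a list of deployments observed over window_days."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failed = sum(1 for d in deployments if not d.succeeded)
    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deployments
    ]
    return {
        "deployments_per_day": len(deployments) / window_days,
        "failed_percent": 100.0 * failed / len(deployments),
        "median_lead_time_hours": median(lead_hours),
    }
```

The median is used here instead of the mean so that one unusually slow release does not dominate the number. Either choice is defensible.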
