Designing for Operability and Manageability
Gaurav Bahrani
CTO, MeTripping
Shanker Balan
Managing Consultant, sysCredence
Introduction
● Gaurav Bahrani, CTO, MeTripping
○ Building an intelligent search engine for travel
○ Expertise in building large scale distributed systems
■ SQL, NoSQL, Big Data
■ Database engines
■ Fault-tolerant systems
○ ex-VPE Cloud Lending Solutions (Fin-tech startup), ex-Yahoo, ex-MS, ex-HP
● Shanker Balan, Freelance DevOps Consultant
○ Infrastructure & Cloud
○ DevOps Consulting For Startups
■ Infibeam, Instamojo, Logistimo, Widas, Quintype, dAlchemy IOT
○ ex-InMobi, ex-Yahoo
Agenda
1. MeTripping - Introduction
2. Operability & Manageability Challenges
3. Design & Architecture Best Practices
4. Q & A
MeTripping - Introduction (1)
MeTripping - Introduction (2)
Architecture
Challenges
● Scale and performance
● Varying user traffic
● Data integration with tens of data providers - different formats and SLAs
[Architecture diagram labels: dynamic data, static data]
Operability / Manageability Challenges
● Infrastructure & Environment
● Build / Release Process
● Metrics & Availability
● Scaling & Cost Management
● Security & Compliance
● Team Structure
Infrastructure & Environment
● OS Standardisation
○ Latest LTS Releases / Minimal Container OS
○ Minimal Docker Images (Alpine / Atomic)
● Package Management
○ Tarball Installation vs. Package Repos
○ Adopt Docker
● Config Management
○ Hand Manage
○ Ansible vs. Chef vs. Puppet
● Service Management
○ Manual start / stop of services
○ Supervisor vs. Systemd
Build & Release Process
● Build on laptops
● Using IDE For Deployment
● Copying artifacts to remote servers by hand
● Version Management
Metrics & Availability
● Health Checks & External Service Availability
○ Site24x7 / Uptime Robot / Gomez
● Server Health Monitoring
○ CloudWatch, DataDog, Nagios, Sensu etc
● Application Performance Monitoring
○ Istio / Hystrix
○ New Relic, AppDynamics, Elastic APM, Stackdriver
○ CloudWatch, Sysdig
● Logs (ELK)
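A minimal sketch of the kind of health-check endpoint an external uptime service could poll, using only the Python standard library. The `/healthz` path, the port, and the `check_dependencies` helper are assumptions made for illustration; they are not taken from the talk.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_dependencies():
    """Return dependency name -> bool. Placeholder values (hypothetical)."""
    # Replace these with real probes, e.g. a database ping or a cache ping.
    return {"database": True, "cache": True}


def health_status(checks):
    """Map dependency results to an HTTP status code and a JSON body."""
    healthy = all(checks.values())
    status = 200 if healthy else 503
    body = json.dumps({"status": "ok" if healthy else "degraded", "checks": checks})
    return status, body


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":  # hypothetical endpoint name
            self.send_error(404)
            return
        status, body = health_status(check_dependencies())
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the health endpoint; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Returning 503 when any dependency fails lets a status-code-only monitor detect a degraded service without parsing the body.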
Security & Compliance
● Secure Coding Guidelines
○ OWASP Top 10
○ Follow Industry Best Practices (PCI, HIPAA)
● Access Controls
○ Central User Management
○ Do not use shared accounts
○ Follow least privilege model
● Restrict Network Access
○ Use both Public & Private Networks
○ Restrict login access only to trusted networks
○ Protect Admin Pages with Google SSO + .htaccess
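As a rough illustration of restricting login access to trusted networks, the check below uses Python's standard `ipaddress` module. The network ranges and the `is_trusted` helper are hypothetical examples; in practice this rule usually lives in a firewall or security group rather than in application code.

```python
import ipaddress

# Hypothetical trusted ranges: a private network and an example office subnet.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_trusted(client_ip, networks=TRUSTED_NETWORKS):
    """Return True if client_ip falls inside any of the trusted networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)
```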
Application Availability and Scalability
● Resource allocation issues
○ Compute
■ Using old generation servers
■ Using “burstable” instances for production
■ Using high CPU instances without looking at actual CPU utilisation
○ Storage
■ Using magnetic storage
■ Under-provisioning / over-provisioning of storage
■ Provisioned IOPS with Databases
■ Using ephemeral storage
○ Network
■ Ephemeral IPs for Internet facing servers
■ SSL Termination on Application (Apache / Nginx)
■ Nginx / Apache as Application Load Balancers
■ Serving static assets from application
■ Mapping domains to Load Balancer IPs
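The point about choosing instance types without looking at actual CPU utilisation can be sketched as a simple classification over utilisation samples. The `sizing_hint` function and its thresholds are illustrative assumptions, not recommendations from the talk.

```python
from statistics import mean


def sizing_hint(cpu_samples, low=20.0, high=70.0):
    """Classify an instance from CPU utilisation samples (percent).

    low and high are illustrative thresholds (assumptions).
    """
    if not cpu_samples:
        raise ValueError("need at least one sample")
    average = mean(cpu_samples)
    peak = max(cpu_samples)
    if peak < low:
        # Even the busiest sample is below the low-water mark.
        return "over-provisioned"
    if average > high:
        # Sustained load is above the high-water mark.
        return "under-provisioned"
    return "ok"
```

The samples would typically come from a monitoring system such as those listed under Server Health Monitoring.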
Managing Costs
● Use less SaaS & PaaS
○ Binpack with Docker
○ Run local MySQL, ElasticSearch, Kafka, ELK etc
● Separate Accounts For BUs & Environments
○ Non Prod Environments (staging, dev etc)
○ Prod Environments
● Shut down Non Prod Environments when not in use
● Housekeep regularly
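Shutting down non-production environments when they are not in use can be driven by a small schedule check like the one below. This is a sketch under assumptions: the environment names, the weekday rule, and the 08:00 to 20:00 window are hypothetical, and the actual start / stop calls depend on the cloud provider.

```python
from datetime import datetime


def should_be_running(environment, now, start_hour=8, end_hour=20):
    """Decide whether an environment should be up at the given local time.

    Production always runs; non-production runs only on weekdays between
    start_hour (inclusive) and end_hour (exclusive).
    """
    if environment == "prod":
        return True
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour
```

A scheduled job could call this periodically and stop or start the matching instances accordingly.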
Team Structure
● DevOps is hardest to hire (and retain)
● Training freshers in DevOps is time consuming
● What works well
○ Make Engineering Self Sufficient With Operations (Dev+Ops)
■ Make monitoring and deployment self-service
○ Use Infrastructure As Code tools (Terraform)
○ Rotate on-call within the Dev Team
● Have a shared team to manage Infra
○ Account management
○ IT Stuff
○ Backup / Restore etc
Design & Architecture Best Practices
● System instrumentation - Systems and application monitoring
● Web-services architecture
● System standardisation (Docker)
○ Consistent environments
○ Simplified builds / releases
○ Scalable architecture
● Data systems best practices
○ Design for scale and performance
System Instrumentation - Systems / application monitoring
● Application monitoring setup is a “must-have” requirement for all applications
○ Helps identify system and application deficiencies
○ Helps identify problems, proactively
○ Results in efficient (performance and cost effective) systems
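One lightweight way to start instrumenting application code is a timing decorator. The sketch below keeps durations in memory only; the `timed` decorator, the `TIMINGS` store, and the `search` handler are hypothetical names, and a real setup would forward the measurements to one of the monitoring tools listed earlier.

```python
import functools
import time
from collections import defaultdict

# In-memory store of call durations per operation name (illustrative only).
TIMINGS = defaultdict(list)


def timed(name):
    """Decorator that records the wall-clock duration of each call under name."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even when the call raises.
                TIMINGS[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("search")
def search(query):
    """Hypothetical handler standing in for real application work."""
    return [query.upper()]
```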
Web-services architecture
● Create web-services and not “spider-web” of services
● Create fewer “power packed” services vs. many, many “simplistic” services
○ Push down complex data relationships into application code / database
● Create separate services for different data response times
○ Web-services for data stored in Redis / Memcached / Elasticsearch should be kept separate from web-services for data from an RDBMS
● Use tools such as Postman and Swagger to author and document web-services
[Diagram labels: Middle Tier, Redis, Elasticsearch, Postgres / Mongo, Web Crawler, Hadoop / Spark]
System standardisation (1)
● Standard AMI for all systems
System standardisation (2)
● Minimal “CoreOS” hosts, with configuration managed as infrastructure code via Terraform
System standardisation (3)
● Standard base Docker image for all containers
○ OS: Ubuntu 16.04
○ Python: 3.4
○ Setup non-system user
System standardisation (4)
● Separate Git repository for build and configurations
○ MeTrippingDeloyment has docker compose ymls for build and deployment settings for dev / stage / prod environments
○ .env files contain environment settings (sourced in by docker-compose)
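To illustrate what per-environment `.env` settings look like once loaded, here is a minimal parser for simple `KEY=VALUE` lines. It is a sketch only: Docker Compose's own `.env` handling has additional rules (for example around quoting) that are not reproduced here, and the keys shown are hypothetical.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"expected KEY=VALUE, got: {raw_line!r}")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical contents of a staging .env file.
STAGING_ENV = """
# staging settings
APP_ENV=staging
LOG_LEVEL=debug
"""
```

Keeping one such file per environment means the compose definitions stay identical across dev, stage, and prod while only the settings differ.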
System standardisation (5)
● Build: docker-compose.sh -f docker-compose-common.yml -rv v1 -rt 2018.03.19 build mt-ranker-build
● Deploy: docker-compose.sh -f docker-compose-staging.yml -rv v1 -rt 2018.03.19 up -d mt-ranker
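The commands above pass a version label (`v1`) and a date-style value (`2018.03.19`). Assuming these identify a release, the sketch below shows one way to combine them into a single image tag. The tag format and both helper functions are hypothetical; the slides do not show how `docker-compose.sh` uses these values internally.

```python
from datetime import date


def release_tag(version, release_date):
    """Combine a version label and a date into one tag, e.g. v1-2018.03.19."""
    return f"{version}-{release_date:%Y.%m.%d}"


def image_reference(service, version, release_date):
    """Build a hypothetical image reference for a service release."""
    return f"{service}:{release_tag(version, release_date)}"
```

Deriving the tag from explicit inputs, rather than from whatever is on a developer's laptop, makes it possible to deploy the same artifact to staging and production.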
Data Systems Best Practices
● Embrace hybrid (SQL + NoSQL + Big Data) system design
○ Store transaction data in RDBMS
■ Consider data partitioning
■ Move archive data to Big Data systems with Long Term Storage Backend
○ Store dimension / non-transaction data in NoSQL
■ MongoDB vs. CouchDB vs. Elasticsearch / Solr
○ Move complex data joins to backend data pipelines
○ Simplify star schema
● System design considerations
○ Use “non-constrained” CPUs
○ Use SSDs for data
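The partitioning and archiving advice for transaction data can be sketched as two small routing helpers. Monthly partitions, the naming scheme, and the 12-month retention window are assumptions chosen for illustration, not figures from the talk.

```python
from datetime import date


def partition_name(table, day):
    """Name of the monthly partition that holds a row dated day."""
    return f"{table}_{day:%Y_%m}"


def storage_tier(day, today, hot_months=12):
    """Return "rdbms" for recent rows and "archive" for older ones.

    hot_months is an illustrative retention window (assumption).
    """
    age_in_months = (today.year - day.year) * 12 + (today.month - day.month)
    return "rdbms" if age_in_months < hot_months else "archive"
```

A periodic job could use `storage_tier` to pick which partitions to move from the RDBMS to long-term storage.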
Summary
● Code -> Build -> Deploy -> Manage -> Burn, Burn, Burn -> Re-Design -> Re-Code -> Re-Build -> Re-Deploy -> Burn, Burn
vs.
● Design -> Code -> Build -> Deploy -> Manage -> Burn Less
Q & A
Thank You!
Gaurav (gaurav@metripping.com), Shanker (shanker@syscredence.com)

Contenu connexe

Tendances

Webinar slides: How to Migrate from Oracle DB to MariaDB
Webinar slides: How to Migrate from Oracle DB to MariaDBWebinar slides: How to Migrate from Oracle DB to MariaDB
Webinar slides: How to Migrate from Oracle DB to MariaDBSeveralnines
 
Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...
Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...
Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...Severalnines
 
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBWebinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBSeveralnines
 
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...OpenNebula Project
 
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControlWebinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControlSeveralnines
 
Introducing the ultimate MariaDB cloud, SkySQL
Introducing the ultimate MariaDB cloud, SkySQLIntroducing the ultimate MariaDB cloud, SkySQL
Introducing the ultimate MariaDB cloud, SkySQLMariaDB plc
 
How to power microservices with MariaDB
How to power microservices with MariaDBHow to power microservices with MariaDB
How to power microservices with MariaDBMariaDB plc
 
Introducing the R2DBC async Java connector
Introducing the R2DBC async Java connectorIntroducing the R2DBC async Java connector
Introducing the R2DBC async Java connectorMariaDB plc
 
Getting started in the cloud for developers
Getting started in the cloud for developersGetting started in the cloud for developers
Getting started in the cloud for developersMariaDB plc
 
CCV: migrating our payment processing system to MariaDB
CCV: migrating our payment processing system to MariaDBCCV: migrating our payment processing system to MariaDB
CCV: migrating our payment processing system to MariaDBMariaDB plc
 
What to expect from MariaDB Platform X5, part 2
What to expect from MariaDB Platform X5, part 2What to expect from MariaDB Platform X5, part 2
What to expect from MariaDB Platform X5, part 2MariaDB plc
 
How Pixid dropped Oracle and went hybrid with MariaDB
How Pixid dropped Oracle and went hybrid with MariaDBHow Pixid dropped Oracle and went hybrid with MariaDB
How Pixid dropped Oracle and went hybrid with MariaDBMariaDB plc
 
TiDB Introduction - Boston MySQL Meetup Group
TiDB Introduction - Boston MySQL Meetup GroupTiDB Introduction - Boston MySQL Meetup Group
TiDB Introduction - Boston MySQL Meetup GroupMorgan Tocker
 
CEPH DAY BERLIN - WELCOME
CEPH DAY BERLIN - WELCOME CEPH DAY BERLIN - WELCOME
CEPH DAY BERLIN - WELCOME Ceph Community
 
MariaDB Enterprise Tools introduction
MariaDB Enterprise Tools introductionMariaDB Enterprise Tools introduction
MariaDB Enterprise Tools introductionMariaDB plc
 
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...OpenNebula Project
 
Introducing TiDB - Percona Live Frankfurt
Introducing TiDB - Percona Live FrankfurtIntroducing TiDB - Percona Live Frankfurt
Introducing TiDB - Percona Live FrankfurtMorgan Tocker
 
2020 07-30 elastic agent + ingest management
2020 07-30 elastic agent + ingest management2020 07-30 elastic agent + ingest management
2020 07-30 elastic agent + ingest managementDaliya Spasova
 
SJTU Summary report
SJTU Summary reportSJTU Summary report
SJTU Summary reportYves Chan
 

Tendances (20)

Webinar slides: How to Migrate from Oracle DB to MariaDB
Webinar slides: How to Migrate from Oracle DB to MariaDBWebinar slides: How to Migrate from Oracle DB to MariaDB
Webinar slides: How to Migrate from Oracle DB to MariaDB
 
Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...
Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...
Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...
 
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBWebinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB
 
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
 
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControlWebinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControl
 
Introducing the ultimate MariaDB cloud, SkySQL
Introducing the ultimate MariaDB cloud, SkySQLIntroducing the ultimate MariaDB cloud, SkySQL
Introducing the ultimate MariaDB cloud, SkySQL
 
How to power microservices with MariaDB
How to power microservices with MariaDBHow to power microservices with MariaDB
How to power microservices with MariaDB
 
Introducing the R2DBC async Java connector
Introducing the R2DBC async Java connectorIntroducing the R2DBC async Java connector
Introducing the R2DBC async Java connector
 
Getting started in the cloud for developers
Getting started in the cloud for developersGetting started in the cloud for developers
Getting started in the cloud for developers
 
Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017
 
CCV: migrating our payment processing system to MariaDB
CCV: migrating our payment processing system to MariaDBCCV: migrating our payment processing system to MariaDB
CCV: migrating our payment processing system to MariaDB
 
What to expect from MariaDB Platform X5, part 2
What to expect from MariaDB Platform X5, part 2What to expect from MariaDB Platform X5, part 2
What to expect from MariaDB Platform X5, part 2
 
How Pixid dropped Oracle and went hybrid with MariaDB
How Pixid dropped Oracle and went hybrid with MariaDBHow Pixid dropped Oracle and went hybrid with MariaDB
How Pixid dropped Oracle and went hybrid with MariaDB
 
TiDB Introduction - Boston MySQL Meetup Group
TiDB Introduction - Boston MySQL Meetup GroupTiDB Introduction - Boston MySQL Meetup Group
TiDB Introduction - Boston MySQL Meetup Group
 
CEPH DAY BERLIN - WELCOME
CEPH DAY BERLIN - WELCOME CEPH DAY BERLIN - WELCOME
CEPH DAY BERLIN - WELCOME
 
MariaDB Enterprise Tools introduction
MariaDB Enterprise Tools introductionMariaDB Enterprise Tools introduction
MariaDB Enterprise Tools introduction
 
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
OpenNebulaConf2017EU: Hyper converged infrastructure with OpenNebula and Ceph...
 
Introducing TiDB - Percona Live Frankfurt
Introducing TiDB - Percona Live FrankfurtIntroducing TiDB - Percona Live Frankfurt
Introducing TiDB - Percona Live Frankfurt
 
2020 07-30 elastic agent + ingest management
2020 07-30 elastic agent + ingest management2020 07-30 elastic agent + ingest management
2020 07-30 elastic agent + ingest management
 
SJTU Summary report
SJTU Summary reportSJTU Summary report
SJTU Summary report
 

Similaire à Designing for operability and managability

introduction to micro services
introduction to micro servicesintroduction to micro services
introduction to micro servicesSpyros Lambrinidis
 
Modern Elastic Datacenter Architecture
Modern Elastic Datacenter ArchitectureModern Elastic Datacenter Architecture
Modern Elastic Datacenter ArchitectureWeston Bassler
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
Data Platform in the Cloud
Data Platform in the CloudData Platform in the Cloud
Data Platform in the CloudAmihay Zer-Kavod
 
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-AriThinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-AriDemi Ben-Ari
 
Database automation guide - Oracle Community Tour LATAM 2023
Database automation guide - Oracle Community Tour LATAM 2023Database automation guide - Oracle Community Tour LATAM 2023
Database automation guide - Oracle Community Tour LATAM 2023Nelson Calero
 
Introducing TiDB Operator
Introducing TiDB OperatorIntroducing TiDB Operator
Introducing TiDB OperatorKevin Xu
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthNicolas Brousse
 
Last Conference 2017: Big Data in a Production Environment: Lessons Learnt
Last Conference 2017: Big Data in a Production Environment: Lessons LearntLast Conference 2017: Big Data in a Production Environment: Lessons Learnt
Last Conference 2017: Big Data in a Production Environment: Lessons LearntMark Grebler
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016aspyker
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Sharma Podila
 
Introduction to Data Engineer and Data Pipeline at Credit OK
Introduction to Data Engineer and Data Pipeline at Credit OKIntroduction to Data Engineer and Data Pipeline at Credit OK
Introduction to Data Engineer and Data Pipeline at Credit OKKriangkrai Chaonithi
 
Oracle EBS Journey to the Cloud - What is New in 2022 (UKOUG Breakthrough 22 ...
Oracle EBS Journey to the Cloud - What is New in 2022 (UKOUG Breakthrough 22 ...Oracle EBS Journey to the Cloud - What is New in 2022 (UKOUG Breakthrough 22 ...
Oracle EBS Journey to the Cloud - What is New in 2022 (UKOUG Breakthrough 22 ...Andrejs Prokopjevs
 
Thinking DevOps in the Era of the Cloud - Demi Ben-Ari
Thinking DevOps in the Era of the Cloud - Demi Ben-AriThinking DevOps in the Era of the Cloud - Demi Ben-Ari
Thinking DevOps in the Era of the Cloud - Demi Ben-AriDemi Ben-Ari
 
Build an Open Source Data Lake For Data Scientists
Build an Open Source Data Lake For Data ScientistsBuild an Open Source Data Lake For Data Scientists
Build an Open Source Data Lake For Data ScientistsShawn Zhu
 
Best Practices with Sitecore
Best Practices with SitecoreBest Practices with Sitecore
Best Practices with SitecoreAnant Corporation
 
Big Data on Cloud Native Platform
Big Data on Cloud Native PlatformBig Data on Cloud Native Platform
Big Data on Cloud Native PlatformSunil Govindan
 
Big Data on Cloud Native Platform
Big Data on Cloud Native PlatformBig Data on Cloud Native Platform
Big Data on Cloud Native PlatformSunil Govindan
 
Next gen software operations models in the cloud
Next gen software operations models in the cloudNext gen software operations models in the cloud
Next gen software operations models in the cloudAarno Aukia
 
Introducing TiDB Operator [Cologne, Germany]
Introducing TiDB Operator [Cologne, Germany]Introducing TiDB Operator [Cologne, Germany]
Introducing TiDB Operator [Cologne, Germany]Kevin Xu
 

Similaire à Designing for operability and managability (20)

introduction to micro services
introduction to micro servicesintroduction to micro services
introduction to micro services
 
Modern Elastic Datacenter Architecture
Modern Elastic Datacenter ArchitectureModern Elastic Datacenter Architecture
Modern Elastic Datacenter Architecture
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Data Platform in the Cloud
Data Platform in the CloudData Platform in the Cloud
Data Platform in the Cloud
 
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-AriThinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
 
Database automation guide - Oracle Community Tour LATAM 2023
Database automation guide - Oracle Community Tour LATAM 2023Database automation guide - Oracle Community Tour LATAM 2023
Database automation guide - Oracle Community Tour LATAM 2023
 
Introducing TiDB Operator
Introducing TiDB OperatorIntroducing TiDB Operator
Introducing TiDB Operator
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
 
Last Conference 2017: Big Data in a Production Environment: Lessons Learnt
Last Conference 2017: Big Data in a Production Environment: Lessons LearntLast Conference 2017: Big Data in a Production Environment: Lessons Learnt
Last Conference 2017: Big Data in a Production Environment: Lessons Learnt
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016
 
Introduction to Data Engineer and Data Pipeline at Credit OK
Introduction to Data Engineer and Data Pipeline at Credit OKIntroduction to Data Engineer and Data Pipeline at Credit OK
Introduction to Data Engineer and Data Pipeline at Credit OK
 
Oracle EBS Journey to the Cloud - What is New in 2022 (UKOUG Breakthrough 22 ...
Oracle EBS Journey to the Cloud - What is New in 2022 (UKOUG Breakthrough 22 ...Oracle EBS Journey to the Cloud - What is New in 2022 (UKOUG Breakthrough 22 ...
Oracle EBS Journey to the Cloud - What is New in 2022 (UKOUG Breakthrough 22 ...
 
Thinking DevOps in the Era of the Cloud - Demi Ben-Ari
Thinking DevOps in the Era of the Cloud - Demi Ben-AriThinking DevOps in the Era of the Cloud - Demi Ben-Ari
Thinking DevOps in the Era of the Cloud - Demi Ben-Ari
 
Build an Open Source Data Lake For Data Scientists
Build an Open Source Data Lake For Data ScientistsBuild an Open Source Data Lake For Data Scientists
Build an Open Source Data Lake For Data Scientists
 
Best Practices with Sitecore
Best Practices with SitecoreBest Practices with Sitecore
Best Practices with Sitecore
 
Big Data on Cloud Native Platform
Big Data on Cloud Native PlatformBig Data on Cloud Native Platform
Big Data on Cloud Native Platform
 
Big Data on Cloud Native Platform
Big Data on Cloud Native PlatformBig Data on Cloud Native Platform
Big Data on Cloud Native Platform
 
Next gen software operations models in the cloud
Next gen software operations models in the cloudNext gen software operations models in the cloud
Next gen software operations models in the cloud
 
Introducing TiDB Operator [Cologne, Germany]
Introducing TiDB Operator [Cologne, Germany]Introducing TiDB Operator [Cologne, Germany]
Introducing TiDB Operator [Cologne, Germany]
 

Dernier

Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01KreezheaRecto
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdfKamal Acharya
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...SUHANI PANDEY
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLManishPatel169454
 

Dernier (20)

(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
KubeKraft presentation @CloudNativeHooghly

Designing for Operability and Manageability

  • 1. Designing for Operability and Manageability
      Gaurav Bahrani, CTO, MeTripping
      Shanker Balan, Managing Consultant, sysCredence
  • 2. Introduction
      ● Gaurav Bahrani, CTO, MeTripping
        ○ Building intelligent search engine for travel
        ○ Expertise in building large scale distributed systems
          ■ SQL, NoSQL, Big Data
          ■ Database engines
          ■ Fault-tolerant systems
        ○ ex-VPE Cloud Lending Solutions (Fin-tech startup), ex-Yahoo, ex-MS, ex-HP
      ● Shanker Balan, Freelance DevOps Consultant
        ○ Infrastructure & Cloud
        ○ DevOps Consulting For Startups
          ■ Infibeam, Instamojo, Logistimo, Widas, Quintype, dAlchemy IOT
        ○ ex-InMobi, ex-Yahoo
  • 3. Agenda
      1. MeTripping - Introduction
      2. Operability & Manageability Challenges
      3. Design & Architecture Best Practices
      4. Q & A
  • 5. MeTripping - Introduction (2)
      Architecture
      [Diagram labels: dynamic data, static data]
      Challenges
      ● Scale and performance
      ● Varying user traffic
      ● Data integration with 10s of data providers - different formats and SLAs
  • 6. Operability / Manageability Challenges
      ● Infrastructure & Environment
      ● Build / Release Process
      ● Metrics & Availability
      ● Scaling & Cost Management
      ● Security & Compliance
      ● Team Structure
  • 7. Infrastructure & Environment
      ● OS Standardisation
        ○ Latest LTS Releases / Minimal Container OS
        ○ Minimal Docker Images (Alpine / Atomic)
      ● Package Management
        ○ Tarball Installation vs. Package Repos
        ○ Adopt Docker
      ● Config Management
        ○ Hand Manage
        ○ Ansible vs. Chef vs. Puppet
      ● Service Management
        ○ Manual start / stop of services
        ○ Supervisor vs. Systemd
  • 8. Build & Release Process
      ● Build on laptops
      ● Using IDE For Deployment
      ● Hand Manage artifacts to remote servers
      ● Version Management
  • 9. Metrics & Availability
      ● Health Checks & External Service Availability
        ○ Site24x7 / Uptime Robot / Gomez
      ● Server Health Monitoring
        ○ CloudWatch, Datadog, Nagios, Sensu etc
      ● Application Performance Monitoring
        ○ Istio / Hystrix
        ○ New Relic, AppDynamics, Elastic APM, Stackdriver
        ○ CloudWatch, Sysdig
      ● Logs (ELK)
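As a rough illustration of the health-check bullet above, the sketch below aggregates per-dependency checks into a single status that an external monitor could poll. The function name `run_health_checks` and the status strings are assumptions made for this example; the slides do not describe how MeTripping implements its checks.

```python
from typing import Callable, Dict


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run each named check and report overall and per-dependency status."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "fail"
        except Exception:
            # A check that raises is treated as a failed dependency.
            results[name] = "fail"
    healthy = all(status == "ok" for status in results.values())
    return {"status": "ok" if healthy else "degraded", "checks": results}
```

A web handler could return this dictionary as JSON, with a non-200 status code when `status` is not `ok`, so that uptime services see dependency failures and not only process liveness.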
  • 10. Security & Compliance
      ● Secure Coding Guidelines
        ○ OWASP Top 10
        ○ Follow Industry Best Practices (PCI, HIPAA)
      ● Access Controls
        ○ Central User Management
        ○ Do not use shared accounts
        ○ Follow least privilege model
      ● Restrict Network Access
        ○ Use both Public & Private Networks
        ○ Restrict login access only to trusted networks
        ○ Protect Admin Pages with Google SSO + .htaccess
  • 11. Application Availability and Scalability
      ● Resource allocation issues
        ○ Compute
          ■ Using old generation servers
          ■ Using “burstable” instances for production
          ■ Using high CPU instances without looking at actual CPU utilisation
        ○ Storage
          ■ Using magnetic storage
          ■ Under-provisioning / over-provisioning of storage
          ■ Provisioned IOPS with Databases
          ■ Using ephemeral storage
        ○ Network
          ■ Ephemeral IPs for Internet facing servers
          ■ SSL Termination on Application (Apache / Nginx)
          ■ Nginx / Apache as Application Load Balancers
          ■ Serving static assets from application
          ■ Mapping domains to Load Balancer IPs
  • 12. Managing Costs
      ● Use less SaaS & PaaS
        ○ Binpack with Docker
        ○ Run local MySQL, Elasticsearch, Kafka, ELK etc
      ● Separate Accounts For BUs & Environments
        ○ Non Prod Environments (staging, dev etc)
        ○ Prod Environments
      ● Shutdown Non Prod Environments when not in use
      ● Housekeep regularly
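The "binpack" idea can be sketched as a first-fit-decreasing placement of containers onto hosts by memory footprint. This is only an illustration of the general technique, not the scheduler the authors used; the function name `binpack`, the container names, and the single-resource (memory only) model are assumptions.

```python
def binpack(container_mem_mb: dict, host_capacity_mb: int) -> list:
    """Place containers on as few hosts as first-fit-decreasing finds.

    Returns one list of container names per host.
    """
    hosts = []  # each entry: {"free": remaining MB, "containers": [names]}
    # Largest containers first, so small ones fill the leftover gaps.
    ordered = sorted(container_mem_mb.items(), key=lambda kv: kv[1], reverse=True)
    for name, mem in ordered:
        if mem > host_capacity_mb:
            raise ValueError(f"{name} needs {mem} MB, host has {host_capacity_mb} MB")
        for host in hosts:
            if host["free"] >= mem:
                host["free"] -= mem
                host["containers"].append(name)
                break
        else:
            # No existing host has room: open a new one.
            hosts.append({"free": host_capacity_mb - mem, "containers": [name]})
    return [host["containers"] for host in hosts]
```

For example, four containers needing 600, 500, 400 and 300 MB fit on two 1000 MB hosts instead of four separate machines.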
  • 13. Team Structure
      ● DevOps engineers are the hardest to hire (and retain)
      ● Training freshers in DevOps is time consuming
      ● What works well
        ○ Make Engineering Self Sufficient With Operations (Dev+Ops)
          ■ Make monitoring and deployment self-service
        ○ Use Infrastructure As Code tools (Terraform)
        ○ Rotate oncall within the Dev Team
      ● Have a shared team to manage Infra
        ○ Account management
        ○ IT Stuff
        ○ Backup / Restore etc
  • 14. Design & Architecture Best Practices
      ● System instrumentation - Systems and application monitoring
      ● Web-services architecture
      ● System standardisation (Docker)
        ○ Consistent environments
        ○ Simplified builds / releases
        ○ Scalable architecture
      ● Data systems best practices
        ○ Design for scale and performance
  • 15. System Instrumentation - Systems / application monitoring
      ● Application monitoring setup is a “must-have” requirement for all applications
        ○ Helps identify system and application deficiencies
        ○ Helps identify problems proactively
        ○ Results in efficient (performant and cost-effective) systems
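One minimal way to instrument application code is a timing decorator that records how long each call takes. The sketch below keeps samples in an in-memory dictionary purely for illustration; the names `timed` and `TIMINGS` are hypothetical, and a real setup would more likely forward these samples to one of the monitoring tools listed earlier.

```python
import time
from collections import defaultdict
from functools import wraps

# Metric name -> list of observed durations in seconds (in-memory, for illustration).
TIMINGS = defaultdict(list)


def timed(name):
    """Decorator that records the wall-clock duration of every call under `name`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the call raises, so failures are timed too.
                TIMINGS[name].append(time.perf_counter() - start)
        return wrapper
    return decorator
```

Decorating a request handler with `@timed("search")` would then accumulate one sample per request under the `search` key.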
  • 16. Web-services architecture
      ● Create web-services and not a “spider-web” of services
      ● Create fewer “power packed” services vs. many, many “simplistic” services
        ○ Push down complex data relationships into application code / database
      ● Create separate services for different data response times
        ○ Keep web-services for data stored in Redis / Memcached / Elasticsearch separate from web-services for data from RDBMS
      ● Use tools such as Postman and Swagger to author and document web-services
      [Diagram labels: Elasticsearch, Postgres / Mongo, Web Crawler, Hadoop / Spark, Middle Tier, Redis]
  • 17. System standardisation (1)
      ● Standard AMI for all systems
  • 18. System standardisation (2)
      ● Minimalistic “CoreOS” hosts, with configuration managed as infrastructure code using Terraform
  • 19. System standardisation (3)
      ● Standard base Docker image for all containers
        ○ OS: Ubuntu 16.04
        ○ Python: 3.4
        ○ Setup non-system user
  • 20. System standardisation (4)
      ● Separate Git repository for build and configurations
        ○ MeTrippingDeloyment has docker-compose YAML files for build and deployment settings for dev / stage / prod environments
        ○ .env files contain environment settings (sourced in by docker-compose)
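To make the `.env` idea concrete, here is a simplified reader for `KEY=VALUE` settings files. It is a sketch only: it handles blank lines and `#` comments but not quoting or variable interpolation, so it is not a reimplementation of docker-compose's own parsing, and the function name `parse_env` and the sample keys are assumptions.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            # Not a KEY=VALUE line; ignored in this simplified sketch.
            continue
        settings[key.strip()] = value.strip()
    return settings
```

Keeping one such file per environment (dev / stage / prod) lets the same compose files be reused with different settings.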
  • 21. System standardisation (5)
      ● Build:
        docker-compose.sh -f docker-compose-common.yml -rv v1 -rt 2018.03.19 build mt-ranker-build
      ● Deploy:
        docker-compose.sh -f docker-compose-staging.yml -rv v1 -rt 2018.03.19 up -d mt-ranker
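The build and deploy invocations above share one shape, which a release script could assemble programmatically. The sketch below only reproduces the argument order shown on the slide; `docker-compose.sh` is the authors' own wrapper, the meaning of its `-rv` and `-rt` flags is not stated in the deck, and the helper name `compose_command` is hypothetical.

```python
def compose_command(compose_file: str, rv: str, rt: str, action: list) -> list:
    """Build the argument list for the docker-compose.sh wrapper shown on the slide.

    `rv` and `rt` are passed through to the wrapper's -rv / -rt flags unchanged;
    their semantics are defined by the wrapper, not by this helper.
    """
    return ["docker-compose.sh", "-f", compose_file, "-rv", rv, "-rt", rt] + list(action)
```

The resulting list can be handed to `subprocess.run(...)` without a shell, which avoids quoting problems when values come from CI variables.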
  • 22. Data Systems Best Practices
      ● Embrace hybrid (SQL + NoSQL + Big Data) system design
        ○ Store transaction data in RDBMS
          ■ Consider data partitioning
          ■ Move archive data to Big Data systems with Long Term Storage Backend
        ○ Store dimension / non-transaction data in NoSQL
          ■ MongoDB vs. CouchDB vs. Elasticsearch / Solr
        ○ Move complex data joins to backend data pipelines
        ○ Simplify star schema
      ● System design considerations
        ○ Use “non-constrained” CPUs
        ○ Use SSDs for data
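One common way to combine partitioning with archival is to partition transaction tables by month and move partitions past a retention window to long-term storage. The sketch below illustrates that bookkeeping only; monthly granularity, the 12-month default, the table name `bookings` in the usage note, and both function names are assumptions, not details from the deck.

```python
from datetime import date


def partition_name(table: str, day: date) -> str:
    """Name of the monthly partition that a row dated `day` belongs to."""
    return f"{table}_{day.year:04d}_{day.month:02d}"


def is_archivable(day: date, today: date, keep_months: int = 12) -> bool:
    """True when the month containing `day` is at least `keep_months` months old."""
    age_in_months = (today.year - day.year) * 12 + (today.month - day.month)
    return age_in_months >= keep_months
```

For instance, a row dated 19 March 2018 in a hypothetical `bookings` table maps to the partition `bookings_2018_03`, and a nightly job could export every partition for which `is_archivable` returns true before dropping it from the RDBMS.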
  • 23. Summary
      ● Code -> Build -> Deploy -> Manage -> Burn, Burn, Burn -> Re-Design -> Re-Code -> Re-Build -> Re-Deploy -> Burn, Burn
        vs.
      ● Design -> Code -> Build -> Deploy -> Manage -> Burn Less
  • 24. Q & A
  • 25. Thank You! Gaurav (gaurav@metripping.com), Shanker (shanker@syscredence.com)