Slides from my talk at the Edge Engineering Meetup on June 9, 2016 at Netflix HQ, Los Gatos, CA. The talk covers why developer productivity is important for the Netflix experience-based API system and takes a look at the kinds of problems we attempt to solve for Netflix developers.
This talk was part of a series of talks about the Netflix Edge.
http://www.slideshare.net/danieljacobson/netflix-edge-engineering-open-house-presentations-june-9-2016
12. $ newt auto-deploy -d
NeWT: Local Container Development
(diagram: a local nodeJS project is built and run via docker build / run inside a local container on a Docker Machine, with a file watcher / live reload trigger, a file watcher agent, and a node-inspector debugger attached)
13. $ newt auto-deploy -d
NeWT: Local Container Development
(diagram: the local container talks to cloud microservices; a cloud proxy terminates security, and a discovery agent bridges service discovery between the local system and the cloud)
15. Mantis - Stream Processing Platform
• Low latency, high throughput, highly efficient
• Handles bursty or large-scale loads
• Extensible programming model
600 jobs in production, 8M messages/sec at peak, 100Gbps network throughput
28. • Scaling developer productivity with business growth
• Providing a fully managed PaaS experience to client developers
• Shift-left insights to power smart development
• Curated, blended visualizations that simplify devops
In conclusion...
At the Netflix Edge Developer Experience team, we are all about translating developer productivity into Netflix customer delight.
Wait, what developer experience?
Let’s get a show of hands -- how many of you are developers who write code and ship applications in your daily life?
Good, so you know how important developer experience is to being productive at your work.
But who are these developers that we are talking about?
The Netflix Edge is all about an experience based API -- Netflix client application developers write an API that creates the best experience possible for their device.
Check out http://techblog.netflix.com/2013/01/optimizing-netflix-api.html, http://techblog.netflix.com/2014/03/the-netflix-dynamic-scripting-platform.html
We are talking about these internal Netflix client application developers
The innovation velocity for these client applications is very high - there are nearly 700 client adaptor applications deployed today, deploying dozens of times a day.
They are authored by 15+ client teams, totaling ~200 developers.
For a service tier that is funneling billions of requests a day, there is a large appetite for high-velocity changes.
We would like those developers to be able to develop rapidly, deploy reliably and operate their application effectively.
And given Netflix’s experimentation driven culture, those applications are constantly evolving based on AB tests, from which they learn and do further development.
While a client developer does get to have a lot of fun creating cool new UI experiences for customers, they are also the final feature integration point, for both client and server code.
The slightest friction causes a lot of pain, missed deadlines and suboptimal features
So developer productivity at Edge leads to faster, more reliable innovation of product, which in turn helps keep our 81M subscriber base happy and growing.
Our strategy to achieve developer productivity is to invest in tools, insights and automation and grow their value as our service grows.
Let’s dive a little deeper.
Let’s take a look at innovations we are making in the areas of app development and management
This is today’s awesome dynamic scripting API server, where apps run on the JVM
At the demo stations you will get to learn about Primer, our dynamic app delivery and deployment system. With Primer a developer can push one of these apps to production globally, effecting change for customers within five minutes.
However, given our future scale, there is a developer ergonomics challenge with this architecture, that we would like to solve.
First, there is a tech stack mismatch -- most Netflix UIs are JS, and the Groovy stack at API makes for an unnatural fit
The API JVM is large and complex, which means that devs cannot debug their apps by running the server locally
Also a complex, overloaded and changing application profile makes it hard to provide guarantees about performance of any individual script in production
With the edge rearchitecture, we are separating client app scripts into their own process-isolated services implemented as Docker containers
The ergonomics story improves tremendously.
But remember, UI developers typically do not operate services. They just want to write JS, not operate a service tier.
Stakes are very high - the criticality of the component means that developers have to manage a lot of concerns.
What starts out as developmental concerns, quickly grows into various aspects of managing a server application at scale.
What developers need is a Platform as a Service Solution
We are excited about something we are working on towards this, called NeWT, or the Netflix Workflow Toolkit.
NeWT brings Docker-container-based, managed application development concepts to a developer’s hands
NeWT itself is a command line tool, but it represents wrapping all of the platform facilities underneath to simplify app development and operations.
A NeWT project gets all of these subsystems initialized and wrapped. It’s also about the backend systems, maintaining them on behalf of the application developer
Our goal is that developers have to just bring their javascript code!
Here’s a different view of the various systems that NeWT abstracts
You might see some familiar open source systems there, and a few others that are Netflix specific telemetry and container cloud systems.
The PaaS experience is not just about a CLI but also about the corresponding UI experience
This is a preview of our Edge PaaS UI, which provides user / team personalized access to apps with integrations into other platform systems, much like its CLI counterpart.
It also has deep integration into operational insights systems, which we will talk about shortly.
NeWT is also our main container tooling wrapper
Recall that today’s API platform prevents effective debugging of client application code
With NeWT and containers we are looking to turn that around
How many of you love live reload debugging?
Lots - oh cool, so you will love this
None - well, I hope I can entice you towards using live reload debugging by the end of the day
<walk through>
newt auto-deploy takes your nodeJS project (or pulls an image from production)
Provisions a Docker Machine, then builds and runs your app inside a container on that machine
Installs a file watcher agent that monitors your code
And as you make edits, pushes the changes to your container and respins the Node process
A debugger connection is also established seamlessly via the Docker Machine, allowing you to debug as you edit.
Our client apps typically expect to terminate security at our proxy layer, so when doing local development, it would be cumbersome to run the proxy locally too.
Instead, NeWT will launch a network agent that creates a cloud-like service discovery and registration setup
And traffic can flow seamlessly from the proxy in the cloud, to the local container, and then on to downstream cloud systems
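In essence, the discovery agent registers the local container under the same scheme the cloud uses so the proxy can route to it; a minimal sketch, assuming illustrative field names rather than the actual discovery schema:

```javascript
// Hypothetical registration record a local discovery agent might publish
// so cloud routing can reach the local container. Field names are made up
// for illustration; this is not the real discovery schema.
function buildLocalRegistration(appName, localIp, port) {
  return {
    app: appName,
    instanceId: `${localIp}:${appName}:${port}`,
    ipAddr: localIp,
    port: port,
    status: 'UP',
    // Tag the instance so routing can tell it apart from cloud instances.
    metadata: { environment: 'local' },
  };
}
```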
Let’s switch gears a little bit. Now that client application developers run services, we need to extend devops workflows to them
They need to be able to operate their deployed code effectively and/or understand client application behavior quickly.
We have numerous curated insights tools, we don’t have time to cover them all, but let’s look at a few of them
Before we look at our actual solutions, I would like to tip my hat towards Mantis which my colleagues in the Edge Realtime Events team work on
So, what is Mantis? A low-latency, high-throughput stream processing platform
It can handle bursty or large-scale loads because it is sharded and auto-scalable
It is highly efficient because queries are evaluated at source, and you stream only what matches a query
You can chain jobs, with a variety of sources and sinks
And the numbers speak for themselves…
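The "queries evaluated at source" idea can be sketched in a few lines; the event shape here is an assumption for illustration:

```javascript
// Sketch of source-side query evaluation: each instance applies the
// query predicate locally, so only matching events cross the network.
// The event shape ({ durationMs }) is illustrative.
function* matchingEvents(events, predicate) {
  for (const event of events) {
    if (predicate(event)) {
      yield event; // non-matching events never leave the source
    }
  }
}
```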
Mantis powers a lot of the insights tools you will be seeing next
Once an app is deployed, “How does one know the aggregate health of the application, say, globally?”
For this purpose, we created application specific dashboards, with critical health metrics laid out together so that an engineer can draw correlations.
It blends historical metric data with real-time visualizations
It also blends contextual information, such as API pushes or server pushes, to point to the source of problems
Let’s say you have found a latency problem
How do you surgically analyze those slow requests? You will want to collect samples
Enter our queryable real-time data explorer
A user can pick a streaming data source, enter a set of conditions, in this specific case, say requests that take > 5 seconds
Hit Submit
And they get aggregated results from all the cloud instances that matched their query
Here you see all the slow requests being listed
We are talking about JS developers here - they can use our real-time JavaScript mapper to filter, map, and reduce that stream into an actionable dataset. In this example, they choose to further ignore slow requests from Mexico...
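A mapper of this kind might look like the following sketch; the event fields (country, deviceType) are illustrative, not the actual stream schema:

```javascript
// Sketch of a user-written mapper over the slow-request stream: drop
// requests from Mexico, then reduce to a count per device type.
// Event field names are assumptions for illustration.
function slowRequestSummary(events) {
  return events
    .filter(e => e.country !== 'MX') // further ignore Mexico
    .reduce((acc, e) => {
      acc[e.deviceType] = (acc[e.deviceType] || 0) + 1;
      return acc;
    }, {});
}
```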
They can also turn this data stream into a numeric metric, and then plot graphs or create alerts on deviations of that metric’s value
This creates a nimble system for transient, on-demand metrics. You no longer have to code up all metrics ahead of time in your source
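Turning such a stream into a numeric metric with deviation alerts could look like this minimal sketch, assuming a simple trailing-mean threshold (the factor of 2 is an arbitrary illustrative choice):

```javascript
// Sketch of alerting on a stream-derived metric: compare the current
// interval's value against the trailing mean of recent intervals.
// The threshold factor is an illustrative choice, not the real logic.
function deviates(history, current, factor = 2) {
  const mean = history.reduce((sum, v) => sum + v, 0) / history.length;
  return current > mean * factor;
}
```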
Maybe you have identified a few devices exhibiting those slow requests
At this point a developer can use our session tracing tools to get a view of their device session
Here is an example
Here you see the client’s view of requests over time, plotted on a timeline that you can zoom in and out of
Maybe you spotted one or two specific slow requests in that session
You can drill down and get a server side call graph / trace for that specific request
We try to highlight hotspots in the call graph, as well as annotate each node with rich node specific insights data when available
And let’s say you identify that the hotspot is within your service
You can then get a method level execution profile within your request
Surgical insights and alerting are possible when you know the specific dimension that has issues and are trying to debug an issue after it has happened.
But in reality, our applications have numerous dimensions and long tail characteristics -- a device could be having issues only in a certain country, and only for a given title.
But the cardinality of the data can be really high for each of those dimensions
What if we could automatically analyze all combinations of a set of known dimensions - say country / title, device / title, or UI version / title - and alert about anomalies in real time? Not only that, but provide relevant debug data right next to the alert
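Such a dimension-combination check could be sketched as follows, assuming error rates keyed by a "country|title" string; the multiplier and rate floor are illustrative choices, not the actual algorithm:

```javascript
// Sketch of the dimension-combination analysis: given current and
// historical error rates keyed by a dimension pair (e.g. "country|title"),
// flag buckets whose rate jumped well past the historical rate.
// The factor and floor values are illustrative assumptions.
function anomalousBuckets(current, historical, factor = 3, floor = 0.01) {
  const anomalies = [];
  for (const [bucket, rate] of Object.entries(current)) {
    const baseline = historical[bucket] || 0;
    // Require both a relative jump and a minimum absolute rate,
    // so tiny noisy buckets don't fire alerts.
    if (rate > Math.max(baseline * factor, floor)) anomalies.push(bucket);
  }
  return anomalies;
}
```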
We have just such a system, based on Mantis
Here you see a specific title starting to show an increase in errors relative to historical values
Future work here is looking towards auto-triaging and enriching the alert signal with a set of correlated data
Thus you can send a more targeted alert
In conclusion, you have gotten a flavor for ideas around
Providing a managed PaaS experience
Shift-left insights for powering smart development
Curated, blended insights for simplifying devops
For the tech fans out here, here is a short but by no means comprehensive set of technologies we employ in these solutions
Come talk to us at the demo stations to learn more or if you have a great idea, come tell us what we could be doing differently!