Uber Networking: Challenges and
Opportunities
Ganesh Srinivasan & Minh Pham
Where do users need Uber the most?
Rider
Application
Partner
Application
On The Road and On The Go
Last-Mile Latency
Latency, latency everywhere
Control Plane Latency
User Plane Latency
Core Network Latency
Internet Routing Latency
Last-Mile Latency (cont.)
Control and User-Plane Latency
               3G              4G
Control Plane  200 - 2500 ms   50 - 100 ms
User Plane     50 ms           5 - 10 ms
Last-Mile Latency (cont.)
Core Network Latency
LTE          HSPA+          HSPA           EDGE           GPRS
40 - 50 ms   100 - 200 ms   150 - 400 ms   600 - 750 ms   600 - 750 ms
Data from AT&T for deployed 2G - 4G networks
Handovers
Handovers are seamless, or not?
Handovers between cell towers
Handovers between different
networks
On AT&T's network, it takes 6.5s to switch from LTE to HSPA+.
Dead Zones
Where’s your coverage?
Loss of connectivity is not the
exception but the rule.
More chances for the network to become unavailable or for transient failures to occur.
Real-time Interactions
What makes Uber run?
There are many real-time interactions between a rider and a driver, and most of them have to happen in real time to matter.
Celestial
Global network heatmap
Location
Time
Carrier
Device
Signal Strength
Latency
Dynamic Network Client
Adapt to any network conditions
Rule-based system
● City, Carrier, Device
● Fine location, Time
Configure different parameters
● Timeout
● Retry
● Protocol
● Number of connections
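A minimal Java sketch of how such a rule-based lookup could be wired up; the rule keys (city, carrier, device), parameter names, and default values here are illustrative assumptions, not Uber's actual client.

import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical rule key: the dimensions the slide lists (city, carrier, device).
record RuleKey(String city, String carrier, String device) {}

// Hypothetical per-rule parameters: timeout, retries, protocol, connection count.
record NetworkParams(Duration requestTimeout, int maxRetries, String protocol, int connections) {}

class DynamicNetworkConfig {
  // Rules are consulted first; a catch-all default applies when no rule matches.
  private final Map<RuleKey, NetworkParams> rules = new LinkedHashMap<>();
  private final NetworkParams defaults =
      new NetworkParams(Duration.ofSeconds(30), 2, "https", 2);

  void addRule(RuleKey key, NetworkParams params) {
    rules.put(key, params);
  }

  // Pick parameters for the current context, falling back to the defaults.
  NetworkParams resolve(String city, String carrier, String device) {
    return rules.getOrDefault(new RuleKey(city, carrier, device), defaults);
  }
}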
uTimeout
Context is king
Suggest timeout based on
context: location, carrier, time,
etc.
Examples
● Dispatch Timeout
● Push TTL
Suggested Pickup Points
No more dead zones
Guiding riders and drivers to avoid dead zones.
Dead-zone avoidance is integrated with suggested pickup points to create a smoother overall user experience.
Prediction and Planning
Future-time is the new real-time
Advance Route planning
● Connectivity
● Handovers
● Dead zones
Thank you
Proprietary and confidential © 2016 Uber Technologies, Inc. All rights reserved. No part of this document may be
reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any
information storage or retrieval systems, without permission in writing from Uber. This document is intended only for the
use of the individual or entity to whom it is addressed and contains information that is privileged, confidential or otherwise
exempt from disclosure under applicable law. All recipients of this document are notified that the information contained
herein includes proprietary and confidential information of Uber, and recipient may not make use of, disseminate, or in any
way disclose this document or any of the enclosed information to any person other than employees of addressee to the
extent necessary for consultations with authorized personnel of Uber.
Ganesh Srinivasan & Minh Pham
Mobile Platform, Uber
(Later day) Evolution of High
Performance Networking in
Chromium:
Speculation + SPDY → QUIC
Jim Roskind jar @ chromium.org
Opinions expressed are mine.
Presented to Amazon on 5/12/2016
Use of High Performance
Client-side Instrumentation in
Chromium (without
explaining how Histograms
work in Chrome)
Opinions expressed are still mine
Who is Jim Roskind
● 7+ years of Chromium development work at Google
○ Making Chromium faster… often in/around networking
○ Driving and/or implementing instrumentation design/development
● Many years at Netscape, working in/around Navigator
○ e.g., Java Security Architect, later VP/Chief Scientist
○ Helped to “free the source” of Mozilla
● InfoSeek co-founder
○ Implemented Python’s Profiler (used for 20 years!!!)
● Sleight of hand card magician
Overview
1. Example of Client Side Instrumentation: Histograms
2. Review of SPDY pros/cons and QUIC
3. Instrumentation of Experiments leading to QUIC Protocol Design
a. Include forward-looking QUIC elements (not yet in QUIC!)
Example:
How long does TCP Connecting take?
● Monitor duration from connection request, until availability for data
transfer
○ To see actual instrumentation code, [search for TCP_CONNECTION_LATENCY on cs.chromium.org to find src/net/socket/transport_client_socket_pool.cc]
● In Chromium, for your own browsing results, visit:
○ about:histograms/Net.TCP_Connection_Latency
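Before the sample output that follows, here is a rough Java sketch of the measurement pattern being described: time the interval from connection request until the socket is usable, then drop the sample into an exponentially bucketed histogram. The bucket bounds and class names are assumptions; the real instrumentation is C++ inside Chromium's network stack, reachable via the code-search pointer above.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicLongArray;

// Illustrative timing histogram with exponentially spaced buckets, loosely
// mirroring the shape of Net.TCP_Connection_Latency. Not Chromium's code.
class ConnectLatencyHistogram {
  private final long[] upperBoundsMs = {10, 20, 40, 80, 160, 320, 640, 1280, 2560, 5120, Long.MAX_VALUE};
  private final AtomicLongArray counts = new AtomicLongArray(upperBoundsMs.length);

  void record(long elapsedMs) {
    for (int i = 0; i < upperBoundsMs.length; i++) {
      if (elapsedMs <= upperBoundsMs[i]) {
        counts.incrementAndGet(i);
        return;
      }
    }
  }

  // Measure "connection request until availability for data transfer".
  long timedConnect(String host, int port, int timeoutMs) throws IOException {
    long start = System.nanoTime();
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), timeoutMs);
      long elapsedMs = (System.nanoTime() - start) / 1_000_000;
      record(elapsedMs);
      return elapsedMs;
    }
  }
}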
Histogram: Net.TCP_Connection_Latency recorded 3481 samples, average = 301.3ms
0 ...
11 --O (3 = 0.1%) {0.0%}
12 -------------------------------------------------O (132 = 3.8%) {0.1%}
14 ------------------------------------------------------------------------O (195 = 5.6%) {3.9%}
16 ---------------------------------------------------O (137 = 3.9%) {9.5%}
18 -------------------------------------------------O (132 = 3.8%) {13.4%}
20 ---------------------------------------------------O (208 = 6.0%) {17.2%}
23 -----------------------------------------------O (189 = 5.4%) {23.2%}
26 -----------------------------------------O (165 = 4.7%) {28.6%}
29 ----------------------------------------O (216 = 6.2%) {33.4%}
33 -----------------------------------O (192 = 5.5%) {39.6%}
37 --------------------------------O (219 = 6.3%) {45.1%}
42 -----------------------------O (199 = 5.7%) {51.4%}
48 -------------------O (129 = 3.7%) {57.1%}
54 -------------------O (130 = 3.7%) {60.8%}
61 --------------O (98 = 2.8%) {64.5%}
69 ------------------O (120 = 3.4%) {67.3%}
78 ------------------------------O (200 = 5.7%) {70.8%}
88 -----------------------------------O (237 = 6.8%) {76.5%}
100 ---------------------O (140 = 4.0%) {83.3%}
113 ----------------O (110 = 3.2%) {87.4%}
128 ----------O (69 = 2.0%) {90.5%}
145 -----O (36 = 1.0%) {92.5%}
164 -----O (35 = 1.0%) {93.5%}
186 --------O (56 = 1.6%) {94.5%}
211 ----O (28 = 0.8%) {96.2%}
239 -O (10 = 0.3%) {97.0%}
271 O (0 = 0.0%) {97.2%}
307 -O (5 = 0.1%) {97.2%}
348 ...
446 -O (4 = 0.1%) {97.4%}
505 ...
941 O (1 = 0.0%) {97.5%}
1065 -O (7 = 0.2%) {97.5%}
1206 O (3 = 0.1%) {97.7%}
1365 -O (4 = 0.1%) {97.8%}
1546 ...
2243 O (2 = 0.1%) {97.9%}
2540 O (0 = 0.0%) {98.0%}
2876 ------O (43 = 1.2%) {98.0%}
3256 ...
8795 -O (6 = 0.2%) {99.2%}
9958 ...
20979 -O (5 = 0.1%) {99.4%}
23753 -O (4 = 0.1%) {99.5%}
26894 -O (5 = 0.1%) {99.7%}
30451 ...
39037 -O (7 = 0.2%) {99.8%}
TCP over Comcast
from my home
Mode 14ms
Median 37ms
Mean 301ms
97% under 271ms
1.2% around 3 seconds!
...perhaps because Windows
retransmits SYN at 3 seconds!
Sample of Global TCP
Connection Latency on Windows
● Over 9 billion samples in graph
● Includes 20% under 15ms
○ Probably preconnections
● Mode around 70ms
● Median around 60ms
○ Excluding preconnects, median around 80ms
● 90% under 300ms
● 1% around 3 seconds!?!
Note: change from 11 to 12 ms is a graphical artifact
Network Stack Evolution
Sample Features Driven By Measurements
● Static page analysis, and DNS Pre-resolution
● Speculative race of second TCP connection
○ Most critical on Windows machines
● SDCH (Shared Dictionary Compression over HTTP)
○ Historically used and evaluated for Google search
● Simplistic Personalized Machine Learning: Sub-resource Speculation
○ Visit about:DNS to see what *your* Chromium has learned about *your* sites!
○ DNS pre-resolution of speculated sub-resources
○ TCP pre-connection of speculated sub-resources
● MD5 Retirement
○ ...only after use became globally infrequent
SPDY (HTTP/2): Benefits
● Multiplex multitude of HTTP requests
○ Removed HTTP/1 restriction(?) of 6 pending requests
● Multiplexed (prioritized) responses
○ Send responses ASAP (rather than in the order HTTP pipelining requires)
○ Server push can send results before being requested!
● Shared congestion control pipeline
○ Reduced variance (separate HTTP responses don’t fight)
● Always encrypted (via TLS)
SPDY (HTTP/2): Issues
● TCP is slow to connect (SYN… SYN-ACK round trip)
○ TCP Fastopen worked to help
● TLS is slow to connect (CHLO SHLO handshakes)
○ Snap-start worked to help
○ Large certificate chains result in losses and delays
● TCP and TLS have head-of-line (HOL) blocking
○ OS requires in-order TCP delivery
○ TLS uses still larger encrypted blocks (often with block chaining)
● Congestion Avoidance Algorithms evolve slowly
○ 5-15 year trial/deployment cycle
QUIC: Improving upon SPDY
● Focus on Latency: 0-RTT Connection with Encryption
○ Speculative algorithms collapse together all HELLO messages
○ Compressed certificate chains reduce impact of packet loss during connections
● Remove HOL blocking
○ Each IP packet can be separately deciphered, and data can be delivered
● Congestion Control Algorithms free to Rapidly Evolve
○ Move from OS to application space
○ Precise packet loss info via rebundling (improvement over TCP retransmission)
○ Algorithms can cater to application, mobile environment, etc. etc.
● More details: QUIC: Design Document and Specification Rationale
Reachability Question:
Can UDP be used by Chrome users?
● Can UDP packets consistently reach Google??
○ Gamers use UDP… but are they “the lucky few” with fancy connections?
○ How often is it blocked?
● What size packets should be used?
○ Don’t trust “common wisdom”
Recording results of experiments:
Research for QUIC development
● PMTU (Path Max Transmission Unit) won’t work for UDP
○ UDP streams are sessionless, and there is no API to “get” an ICMP response!?
○ ...so we needed a good initial estimate of packet sizes for QUIC
● Stand up UDP echo servers around the world
○ Test a variety of UDP packet sizes (learn about the “real” world!)
○ Use two histograms, recording data for random packet sizes.
■ For each size, number of UDP packets sent by client
■ For each size, number of successful ACK responses
● About 5-7% of Chrome users couldn’t reach Google via UDP
○ QUIC has to fall-back gracefully to TCP (and often SPDY)
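A hedged Java sketch of the kind of client-side probe described above: send random-sized packets to a UDP echo server and keep two tallies per size (packets sent vs. echoes received). The size buckets, timeout, and server handling are assumptions, not the Chrome experiment's actual code.

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.util.Random;

class UdpReachabilityProbe {
  private static final int[] SIZES = {100, 500, 1200, 1350, 1450};
  private final long[] sentPerSize = new long[SIZES.length];
  private final long[] ackedPerSize = new long[SIZES.length];

  void probe(String echoHost, int echoPort, int attempts) throws Exception {
    Random random = new Random();
    InetAddress address = InetAddress.getByName(echoHost);
    try (DatagramSocket socket = new DatagramSocket()) {
      socket.setSoTimeout(2000); // treat 2 s of silence as "no response"
      for (int i = 0; i < attempts; i++) {
        int bucket = random.nextInt(SIZES.length);
        byte[] payload = new byte[SIZES[bucket]];
        socket.send(new DatagramPacket(payload, payload.length, address, echoPort));
        sentPerSize[bucket]++;                    // histogram 1: packets sent per size
        try {
          byte[] reply = new byte[SIZES[bucket]];
          socket.receive(new DatagramPacket(reply, reply.length));
          ackedPerSize[bucket]++;                 // histogram 2: echoes received per size
        } catch (SocketTimeoutException lost) {
          // no echo: counts as blocked or lost for this size
        }
      }
    }
  }
}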
QUIC/UDP Connectivity
User based: one vote per user per size
Usage based: one vote per user per 30 minutes of usage
Future QUIC MTU gains
● QUIC uses (static / conservative) 1350 MTU size for (IPv4) UDP packets
○ Download payload size currently around 1331 bytes of data (per QUIC packet) max
■ 19 bytes QUIC overhead + UDP overhead (28 for IPv4; 48 for IPv6)
■ Currently max is around 96.6% efficient for IPv4 (1331 / 1378)
● Instead of relying on PMTU, integrate exploration of MTU into QUIC
○ Periodically transmit larger packets, such as padded ACK packets
■ Monitor results, without assuming congestive loss
● Efficiency is important to large data transfers (YouTube? Netflix?)
● P2P may allow extreme efficiency, with potential for Jumbo packets
How quickly will NAT (Network Address
Translation) drop its bindings?
● NAT boxes (e.g., home routers) “understand” TCP, and will warn (reset
connection?) when they drop a binding
● NAT boxes don’t “understand” UDP connections
○ They can’t notify anything when they drop a NAT binding
● Use an echo server that accepts a delay parameter
○ Echo server can “wait” before sending its ACK response
■ See if the NATing router still properly routes response (i.e., has intact binding)
○ Evaluate “probability” of success for each delay
■ Use two histograms, with buckets based on delay
■ One counts attempts. One counts successes.
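A sketch, under the assumption that the echo server honors a delay parameter carried in the payload, of how a client could test whether a NAT binding survives a given idle period. The payload format and timeouts are made up for illustration; the results would feed the attempts/successes histograms described above.

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

class NatBindingProbe {
  // Returns true if the delayed echo arrived, i.e. the NAT binding survived.
  boolean bindingSurvives(String echoHost, int echoPort, int delaySeconds) throws Exception {
    InetAddress address = InetAddress.getByName(echoHost);
    byte[] request = ("delay=" + delaySeconds).getBytes(StandardCharsets.UTF_8);
    try (DatagramSocket socket = new DatagramSocket()) {
      socket.setSoTimeout((delaySeconds + 5) * 1000); // allow slack beyond the requested delay
      socket.send(new DatagramPacket(request, request.length, address, echoPort));
      try {
        byte[] reply = new byte[64];
        socket.receive(new DatagramPacket(reply, reply.length));
        return true;   // count a success for this delay bucket
      } catch (SocketTimeoutException e) {
        return false;  // count only an attempt for this delay bucket
      }
    }
  }
}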
QUIC can control NAT In The Future
● Port Control Protocol (RFC 6887)
○ Not deployed today… but QUIC can evolve to use it as it becomes available
Creative use of Histogram:
Packet loss statistics
● Make 21 requests to a UDP Echo server
○ Request that echo server ACK each numbered packet
○ Histogram with 21 buckets records arrival of each possible packet number
● Look at impact of pacing UDP packets
○ Either “blast” or send at “reasonable pacing rate”
■ “Reasonable pacing” is based on an initial blast to estimate bandwidth
Packet 2, in unpaced initial transfer, is
almost twice as likely to be lost as
packets 1 or 3!?!?! The problem “goes
away” after initial transfer.
Without pacing, buffer-full(?)
losses commonly appear after
12 or 16 packets are sent.
Pacing improves survival rate for later packets
Packet loss statistics:
How much does packet size matter?
● Make 21 requests to a UDP Echo server
○ Request that echo server ACK each numbered packet
○ Histogram with 21 buckets to record arrival of each possible packet number
● Look at impact of packet sizes:
○ 100 vs 500 vs 1200 bytes
Smaller 100 byte
packets are lost more
often initially, and
packet 2 is especially
vulnerable!
Loss “cliff” at 16
unpaced-packets is
independent of
packet sizes!
Future QUIC Gains around 0-RTT
● 2nd packet is critical to effective 0-RTT connection
○ 2.5%+ “extra” probability of losing packet number 2, above and beyond 1-2%
○ Redundantly transmit packet 2 contents proactively!
● 1st packet contains critical CHLO (crypto handshake)
○ 1-2% probability of that packet being lost (critical path for packet number 2!!!)
● Proactive redundancy in 0-RTT handshake/request gains 5+% reliability
○ Uplink channel is underutilized, so redundancy is “cost free”
○ RTO of at least 200ms ⇒ Average savings of at least 10ms
● See “Quicker QUIC Connections” for more details
Estimate Potential of FEC for UDP packets
● Sent 21 numbered packets to an ACKing echo server
○ Create 21 distinct histograms, one histogram for each prefix of first-k packets
■ There are (effectively) about 21 distinct histograms! (one per prefix)
○ Increment the nth bucket if n out of k packets were ACKed
● Example: When sending first 17 packets, find probability of getting 17 vs 16
vs 15 vs … acks, by recording in a single histogram
○ If we get 16 or more acks, then a simple XOR FEC would recover (without retransmission)
○ If we get 15 or more acks, then 2-packet-correcting FEC would recover.
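To make the "simple XOR FEC" remark concrete, here is a small Java illustration of single-loss recovery: one parity packet is the XOR of k equal-length data packets, so any single missing packet can be rebuilt from the parity plus the k-1 survivors (the "16 or more acks out of 17" case above). This is only a demonstration of the arithmetic, not QUIC's FEC implementation.

class XorFec {
  // Parity packet: byte-wise XOR of all (equal-length) data packets.
  static byte[] parity(byte[][] packets) {
    byte[] p = new byte[packets[0].length];
    for (byte[] packet : packets) {
      for (int i = 0; i < p.length; i++) {
        p[i] ^= packet[i];
      }
    }
    return p;
  }

  // Recover the one missing packet (the null entry) from the survivors plus the parity.
  static byte[] recover(byte[][] packetsWithOneNull, byte[] parity) {
    byte[] missing = parity.clone();
    for (byte[] packet : packetsWithOneNull) {
      if (packet == null) continue;
      for (int i = 0; i < missing.length; i++) {
        missing[i] ^= packet[i];
      }
    }
    return missing;
  }
}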
Pacing significantly
helps after about 12
packets are sent. (blue
vs green line)
1-FEC reduces
retransmits much more
than 2-FEC would help
FEC Caveats:
They are not good for everything!
● NACK based transmits are more efficient
○ Don’t waste bandwidth on FEC when BDP is much smaller than total payload
○ It is better to observe a loss, and *only* then retransmit
● Largest potential gains are for stream creation (client side)
○ Client upload bandwidth is usually underutilized
○ Payload is tiny (compressed HTTP GET?), and it is all on the critical path for a response
● Smaller (but possible) gain potentials for tail loss probe via FEC packet
○ Don’t use if tail latency is not critical, or bandwidth is at a premium
Summary:
Client side histograms are very useful!!
● Creative application provides tremendous utility
● Simple developer API provides wide-spread use
○ Developers will actually measure, before and after deploying!!!
○ There are 2100 *active* histograms in a recent Chrome release!!!
● Mozilla and Chromium now have supporting code
○ Open source is the source ;-)
● Features, such as Networking protocols, can greatly benefit from detailed
instrumentation and analysis
Acknowledgements:
Topics described were massive team efforts
● Thanks to the many members of the Google Chrome team for facilitating
this work, and producing a Great Product to build upon!
● Special thanks to the QUIC Team!
● Extra special shout-out for their support on several discussed topics to:
○ Mike Belshe, Roberto Peon: SPDY and pre-QUIC discussions
○ Jeff Bailey: UDP echo test server rollout
○ Raman Tenneti: UDP echo servers; QUIC team member
○ Thanks to scores of Googlers for reviews and contributions to QUIC Design/Rationale!
● Thanks to Google, for providing a place to change the Internet world!
○ Linus Upson: Thanks for providing Google Management Cover
gRPC: Universal RPC
Makarand Dharmapurikar, Eric Anderson
History
Google has had 4 generations of internal RPC
systems, called Stubby
● Used in all production applications and systems
● Over 10^10 RPCs per second, fleet-wide
● Separate IDL; APIs for C++, Java, Python, Go
● Tightly coupled with infrastructure (infeasible to use externally)
Very happy with Stubby
● Services available from any language
● One integration point for load balancing, auth,
logging, tracing, accounting, billing, quota
gRPC History
Need solution for more connected world
● Cloud needs same high performance
● Use same APIs from Mobile/Browser
gRPC is the next generation of Stubby.
Goal: Usable everywhere
● Servers to Mobile to microcontrollers (IoT)
● Awesome networks to horrible networks
● Lots more languages/platforms
● Must support pluggability
● Open Source; developed in the open
gRPC History
Overview
● Android, iOS; 10+ languages
○ Idiomatic, language-specific APIs
● Payload agnostic. We’ve implemented Protobuf
● HTTP/2
○ Binary, multiplexing
● QUIC support in process of open-sourcing (via Cronet)
○ No head-of-line blocking; 0 RTT
● Layered and pluggable
○ Use-specific hooks. e.g., naming, LB
○ Metadata. e.g., tracing, auth
● Streaming with flow control. No need for long polling!
● Timeout and cancellation
gRPC Features
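As a small illustration of the timeout/cancellation feature, the sketch below attaches a per-call deadline to the blocking Greeter stub generated from the IDL shown in the example slides later in the deck; in gRPC Java a missed deadline surfaces as a StatusRuntimeException carrying DEADLINE_EXCEEDED.

import io.grpc.ManagedChannel;
import io.grpc.StatusRuntimeException;
import java.util.concurrent.TimeUnit;

class DeadlineExample {
  // Attach a per-call deadline so a stalled request fails fast instead of hanging.
  // GreeterGrpc, HelloRequest, and HelloReply are the generated Greeter classes.
  static HelloReply sayHelloWithDeadline(ManagedChannel channel) {
    try {
      return GreeterGrpc.newBlockingStub(channel)
          .withDeadlineAfter(300, TimeUnit.MILLISECONDS)
          .sayHello(HelloRequest.newBuilder().setName("world").build());
    } catch (StatusRuntimeException e) {
      // e.getStatus() reports DEADLINE_EXCEEDED (or CANCELLED) here
      return null;
    }
  }
}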
Key insights. Mobile is not that different
● Google already translating 1:1 REST, with Protobuf, to RPCs
● Very high-performance services care about memory and CPU
● Microcontrollers make mobile look beefy
● High latency cross-continent. Home networks aren’t great. Black holes happen
● Many features convenient everywhere, like tracing and streaming
Universal RPC - Mobile and cloud
● Mobile depends on Cloud
● Developers should expect same great experience
● Some unique needs, but not overly burdensome
○ Power optimization, platform-specific network integration (for resiliency)
gRPC and Mobile
Compatibility with ecosystem (current or planned)
● Supports generic HTTP/2 reverse proxies
○ Nghttp2, HAProxy, Apache (untested), Nginx (in progress), GCLB (in progress)
● grpc-gateway
○ A combined gRPC + REST server endpoint
● Name resolver, client-side load balancer
○ etcd (Go only)
● Monitoring/Tracing
○ Zipkin, Open Tracing (in progress)
gRPC: Universal RPC
Example
Hello, world!
service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}
message HelloRequest {
  string name = 1;
}
message HelloReply {
  string message = 1;
}
Example (IDL)
// Create shareable virtual connection (may have 0-to-many actual connections; auto-reconnects)
ManagedChannel channel = ManagedChannelBuilder.forAddress(host, port).build();
GreeterBlockingStub blockingStub = GreeterGrpc.newBlockingStub(channel);
HelloRequest request = HelloRequest.newBuilder().setName("world").build();
HelloReply response = blockingStub.sayHello(request);
// To release resources, as necessary
channel.shutdown();
Example (Client)
// Bind the Greeter service to a port, start serving, and block until shutdown
Server server = ServerBuilder.forPort(port)
    .addService(new GreeterImpl())
    .build()
    .start();
server.awaitTermination();

// Service implementation: build the reply, emit it, then complete the call
class GreeterImpl extends GreeterGrpc.AbstractGreeter {
  @Override
  public void sayHello(HelloRequest req, StreamObserver<HelloReply> responseObserver) {
    HelloReply reply = HelloReply.newBuilder().setMessage("Hello, " + req.getName()).build();
    responseObserver.onNext(reply);
    responseObserver.onCompleted();
  }
}
Example (Server)
Some of the adopters
Site: grpc.io
Mailing List: grpc-io@googlegroups.com
Twitter Handle: @grpcio
Amazing mobile data pipelines
Karthik Ramgopal
About us
▪ World’s largest professional social network.
▪ 433M members worldwide.
▪ > 50% of members access LinkedIn on mobile.
▪ Huge growth in India and China.
About me
▪ Mobile Infrastructure Engineer
▪ Android platform and Sitespeed lead
LinkedIn app portfolio
▪ LinkedIn Flagship
▪ Lookup
▪ Pulse
▪ Job Seeker
▪ Elevate
▪ Groups
▪ Sales Navigator
▪ Recruiter
▪ Student Job Seeker
▪ Lynda.com
The leaky pipe
▪ Mobile Networks are flaky
▪ Speeds range from 80 Kbps (GPRS/India) to over 10 Mbps (LTE/US)
▪ Last-mile latency
▪ Routing/peering issues
▪ Frequent disconnects and degradation are common
Diversity in devices
▪ Fragmented Android ecosystem. Older
iPhones prevalent in emerging markets.
▪ Lowest-end devices have 256 MB of RAM and single-core CPUs.
How do we optimize?
▪ Network connect
▪ Server time
▪ Response download/upload
▪ Parsing and caching
▪ Robust client side infrastructure
▪ Measure, measure and measure
Network connect
▪ Sprinkle PoPs and CDNs close to members
▪ Early initialization
▪ Custom DNS cache
▪ SSL session cache
▪ Retries and timeouts tuned by network type
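A minimal sketch of "retries and timeouts tuned by network type" using the JDK's HttpURLConnection; the network classification, timeout values, and retry counts are illustrative assumptions, not LinkedIn's production settings.

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class NetworkTunedClient {
  enum NetworkType { WIFI, LTE, HSPA, GPRS }

  // Pick connect/read timeouts from a coarse network classification before issuing a request.
  static HttpURLConnection open(URL url, NetworkType type) throws IOException {
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    switch (type) {
      case WIFI, LTE -> { connection.setConnectTimeout(3_000);  connection.setReadTimeout(10_000); }
      case HSPA      -> { connection.setConnectTimeout(8_000);  connection.setReadTimeout(20_000); }
      case GPRS      -> { connection.setConnectTimeout(15_000); connection.setReadTimeout(45_000); }
    }
    return connection;
  }

  // Slower networks earn more retry attempts before giving up.
  static int retriesFor(NetworkType type) {
    return switch (type) {
      case WIFI, LTE -> 1;
      case HSPA -> 2;
      case GPRS -> 3;
    };
  }
}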
Response download/upload
▪ Native multiplexing using SPDY.
▪ Custom dispatcher/response processor
▪ Content resumption
▪ Rest.li multiplexer
▪ Progressive JPEG for images
Payload size reduction
▪ Delta sync
▪ Brotli compression
▪ SDCH
Parsing
▪ Stream parse and decode
▪ Schema aware JSON parser
▪ Custom image decoder
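A sketch of the stream-parse idea using Jackson's streaming API as a stand-in (the slides do not name a parser): tokens are consumed as bytes arrive, and only the fields the schema cares about are materialized. The field name and class are hypothetical.

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.io.IOException;
import java.io.InputStream;

class StreamingProfileParser {
  // Walk tokens directly off the response stream instead of buffering the whole body.
  static String extractName(InputStream body) throws IOException {
    try (JsonParser parser = new JsonFactory().createParser(body)) {
      while (parser.nextToken() != null) {
        if (parser.getCurrentToken() == JsonToken.FIELD_NAME && "name".equals(parser.getCurrentName())) {
          parser.nextToken();            // advance to the field's value
          return parser.getValueAsString();
        }
      }
    }
    return null;
  }
}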
Caching
▪ Traditional request/response caches are
passé.
▪ Fission: Decompose and cache
▪ Memory mapped disk cache
▪ No memory cache
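A minimal sketch of a memory-mapped disk cache built on java.nio: the file is mapped once and entries are read straight out of the mapping, so the OS page cache effectively replaces a separate in-process memory cache. The entry layout and sizing are assumptions, not Fission's actual design.

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedDiskCache implements AutoCloseable {
  private final FileChannel channel;
  private final MappedByteBuffer buffer;

  MappedDiskCache(Path file, long sizeBytes) throws IOException {
    channel = FileChannel.open(file,
        StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);
    // Map a fixed-size region once; subsequent reads hit the OS page cache.
    buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, sizeBytes);
  }

  void writeEntry(int offset, byte[] value) {
    buffer.position(offset);
    buffer.put(value);
  }

  byte[] readEntry(int offset, int length) {
    byte[] value = new byte[length];
    buffer.position(offset);
    buffer.get(value);
    return value;
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}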
Thank You!
Questions?