Revolutions have a common pattern in technology and this is no different for the API space. This presentation discusses that pattern and goes through various API revolutions. It also uses Netflix as an example of how some revolutions evolved and where things may be headed.
13. Scientific Practice: Kuhn’s View
[Diagram labels: Experiments on Current Assumption; Anomalies from Experiments; New Assumption; Scientific Revolution; Assumption (repeated); axis: Time]
14. Scientific Practice: Kuhn’s View
[Diagram labels: Experiments on Current Assumption; Anomalies from Experiments; New Paradigm; Assumption (repeated); axis: Time]
58. [Diagram: Network Border; OSFA API; dependent services: AUTH, START-UP, A/B TESTS, MEMBER DATA, RECOMMENDATIONS, MOVIE DATA, RATINGS, SIMILAR MOVIES]
59. [Diagram: CLIENT CODE; Network Border; SERVER CODE; OSFA API; dependent services: AUTH, START-UP, A/B TESTS, MEMBER DATA, RECOMMENDATIONS, MOVIE DATA, RATINGS, SIMILAR MOVIES]
60. [Diagram: USER INTERFACE RENDERING; Network Border; DATA GATHERING, FORMATTING, AND DELIVERY; OSFA API; dependent services: AUTH, START-UP, A/B TESTS, MEMBER DATA, RECOMMENDATIONS, MOVIE DATA, RATINGS, SIMILAR MOVIES]
64. [Diagram: Network Border; Groovy Layer; JAVA API; dependent services: AUTH, START-UP, A/B TESTS, MEMBER DATA, RECOMMENDATIONS, MOVIE DATA, RATINGS, SIMILAR MOVIES]
65. [Diagram: CLIENT CODE; Network Border; CLIENT ADAPTER CODE (WRITTEN BY CLIENT TEAMS, DYNAMICALLY UPLOADED TO SERVER); SERVER CODE; JAVA API; dependent services: AUTH, START-UP, A/B TESTS, MEMBER DATA, RECOMMENDATIONS, MOVIE DATA, RATINGS, SIMILAR MOVIES]
66. [Diagram: USER INTERFACE RENDERING; Network Border; DATA FORMATTING AND DELIVERY; DATA GATHERING; JAVA API; dependent services: AUTH, START-UP, A/B TESTS, MEMBER DATA, RECOMMENDATIONS, MOVIE DATA, RATINGS, SIMILAR MOVIES]
68. Recipe for Targeted APIs
API providers that have a:
• small number of targeted API consumers
• very close relationships with API consumers
• increasing divergence of needs across these API consumers
• strong desire for optimization by the API consumers
• optimized APIs offer high value proposition
69. Recipe for Targeted APIs
API providers that have a:
• small number of targeted API consumers
• very close relationships with API consumers
• increasing divergence of needs across these API consumers
• strong desire for optimization by the API consumers
• optimized APIs offer high value proposition
• a generous helping of chocolate (to keep engineers happy)
74. The Structure of API Revolutions
@daniel_jacobson
djacobson@netflix.com
http://www.linkedin.com/in/danieljacobson
http://www.slideshare.net/danieljacobson
Image courtesy of SakeThrajan
Editor's notes
Thomas Kuhn published The Structure of Scientific Revolutions in 1962. The book was pretty controversial at the time, and in fact, still offers some pretty contentious views.
Kuhn describes the predominant view of scientific practice as an effort to continuously “discover” reality. Scientific discoveries therefore build on top of past discoveries over time, continually getting closer to a comprehensive view of reality.
Eventually, in principle, science will discover the full truth about reality.
Kuhn’s view is that science does not “discover” reality or build on top of past “discoveries”. Rather, he believes that the majority of scientific work (which he labels “normal science”) is focused on puzzle solving on top of an initial assumption.
For example, the common belief centuries ago was that the earth was the center of the universe, with all of the planets and the sun revolving around it (the geocentric view).
Given that assumption, normal science builds hypotheses and performs experiments against it. The hypotheses can be proven or disproven within that context and they can build on each other, but they are not progressing towards an unveiling of reality.
During the course of normal science, however, anomalies are encountered. These anomalies are often cast aside as errors in observation or testing, or for some other reason. But over time, they mount up or become so powerful in number or significance that they cannot be ignored.
Regarding the geocentric view, the phases of Venus became a very powerful anomaly. This anomaly essentially demonstrates that the shadows and reflections on Venus from the Sun, as well as how it moves throughout the sky, make it impossible for it to revolve around the Earth.
When the anomalies encountered are large or frequent enough, they give rise to a competing point of view, a competing assumption. Hypotheses and experiments begin on the new assumption, typically by scientists who otherwise do not practice normal science on the initial assumption.
The phases of Venus anomaly ultimately surfaced the competing assumption that the Sun is at the center and the Earth is one of many other objects revolving around it (the heliocentric view).
If the competing assumption gains enough traction through revolutionary science and becomes strong enough, there is a scientific revolution where the original assumption is completely overthrown in favor of the new one. Kuhn coined the term “Paradigm Shift” to represent this. It is also important to note that paradigm shifts can take a long time to develop and to conclude. But these shifts are absolute, meaning only one of these paradigms can be the focus of normal science.
To be clear, this revolution is not one where the new paradigm is necessarily better than the old one, as neither truly represents reality. It is just a new assumption that appears to fill the holes of the original or is more representative of modern thinking. In fact, over time, the new paradigm will likely suffer its own anomalies and could very well fall prey to another competing paradigm.
So, what does Kuhn’s theory have to do with APIs? I see this same pattern often in technology and specifically in the API world. The following slides will demonstrate this pattern.
Two quick examples: One of the more prominent revolutions is the shift from SOAP to REST. A decade ago, overwhelmingly, the common approach for API development was SOAP. The problems with SOAP are well-documented and include difficulty in implementing and inability to be opened effectively. As REST emerged, it clearly solved many of the issues that SOAP presented. As of a year ago, the ratio of REST to non-REST implementations (with SOAP still being #2) was overwhelming. And for those that adopt REST, the adoption is absolute for each service, meaning when providers replace a SOAP service they often terminate it completely. The consistent and pervasive nature of this makes it a revolution.
Similarly, in the earlier stages of API development, XML was the format for delivering content. With the emergence of the slimmer and more efficient JSON, more and more providers are retiring their XML offerings and/or launching new services with only JSON support. JSON is overwhelmingly the favorite and, when adopted, is often used to the exclusion of XML.
But those examples, while interesting, are not particularly meaty. In fact, they are much more tactical in nature. I am more interested in bigger revolutions, such as the move from public APIs to private APIs. Going further back, content was often trapped in cages just like this bird. The cages could be legacy databases, Flash applications or other kinds of non-structured documents (such as HTML, Word or PDF files).
With the incredible distribution capabilities of the web, many fought to liberate their content and open it to the world. At first, this came in the form of approaches like RSS. In more recent years, we started seeing more and more open APIs. But the most powerful part of this important revolution is the act of liberating the content. The openness has varying value depending on the nature and quality of the content, the breadth and reach of delivery, the quality of the delivering brand, etc. But the value of liberating the content, the freeing of the bird, is unquestionable!
For most, once the content is liberated, it doesn’t need to go to the world. Letting the bird fly around the house is where most of the value is. The liberation from the cage is critical though!
In fact, many companies who have opened up their content to the world (letting the bird free) have seen tremendous traffic from internal services relative to their public feeds or APIs. These four companies all have public APIs, but the overwhelming traffic comes from their branded applications built internally or through direct partnerships.
To demonstrate the power of the internal API use case and the fact that this revolution is very real, I will use Netflix as an example. This example not only shows the evolution of the revolution, it also surfaces new revolutions that result from the change in audience.
Netflix is focused on being the best, global Internet streaming video provider.
We now have more than 33 million global subscribers in more than 50 countries and territories.
Those subscribers consume more than a billion hours of streaming video a month which accounts for 33% of the peak Internet traffic in the US.
Our 33 million subscribers are watching shows (like House of Cards) and movies on virtually any device that has a streaming video screen. We are now on more than 800 different device types.
All of these metrics, and others discussed later, demonstrate the massive scale at which Netflix operates.
All of this started, however, with the launch of streaming in 2007. At the time, we were only streaming on computer-based players (i.e., no devices, mobile phones, etc.). At this time, the content was also not fully liberated.
Shortly after streaming launched, in 2008, we launched our REST API. I describe it as a One-Size-Fits-All (OSFA) type of implementation because the API itself sets the rules and requires anyone who interfaces with it to adhere to those rules. Everyone is treated the same.
The OSFA API launched to support the 1,000 flowers model. That is, we would plant the seeds in the ground (by providing access to our content) and see what flowers sprout up in the myriad fields throughout the US. The 1,000 flowers are public API developers. At the launch of the public API, the content was fully liberated and the bird was set free to fly around in the open world.
And at launch, the API was exclusively targeted towards and consumed by the 1,000 flowers (i.e., external developers). So all of the API traffic was coming from them.
Some examples of the flowers…
But as streaming gained more steam…
The API evolved to support more of the devices that were getting built. The 1,000 flowers were still supported as well, but as the devices ramped up, they became a bigger focus. And the bird was now mostly flying around the house with occasional visits to the open world.
Meanwhile, the balance of requests by audience had completely flipped. Overwhelmingly, the majority of traffic was coming from Netflix-ready devices and a shrinking percentage was from the 1,000 flowers. The rough distribution of the bird’s flying habits is now more than 1000-to-1 in favor of flying in the house.
And to support this revolution in the API target, the organizational structure of product engineering has morphed as well. The API is in the skinny part of the hourglass, brokering content and algorithmic output from the dependency layers to the UIs. In this model, each team specializes in solving specific problems for the product pipeline, making each team (and each engineer) highly impactful for the success of the company.
Laura Merling, in the first keynote from the API Strategy Conference, correctly stated that “APIs are a means to an end”…
As a result, when a revolution occurs in the audience of the API (as described in previous slides for Netflix), to achieve the desired end the means needs to change as well. The API design needs to change to support the new audience effectively.
Netflix did a significant review of the API relative to the new charter. We focused our discussion on these three areas and included many teams in our introspection – most notably the various UI teams.
With the adoption of the devices, API traffic took off! We went from about 600 million requests per month to about 42 BILLION requests in just two years.
Today, we are doing more than 2B incoming requests per day. That kind of growth and those kinds of numbers seem great. Who wouldn’t want those numbers, right?
Especially if you are an organization like NPR serving web pages that have ads on them. If NPR.org was serving 2B requests a day, each one of those requests would create impressions for the ad which translates into revenue (and potentially increased CPM at those levels).
But the API traffic is not serving pages with ads. Rather, we are delivering documents like this, in the form of XML…
Or like this, in the form of JSON.
Growth in traffic, especially if it were to continue at this rate, does not directly translate into revenue. Instead, it is more likely to translate into costs. Supporting massive traffic requires major infrastructure to support the load, expenses in delivering the bits, engineering costs to build and support more complex systems, etc.
So our first realization was that we could potentially significantly reduce the chattiness between the devices and the API while maintaining the same or better user experience. Rather than handling 2 billion requests per day, could we have the same UI at 300 million instead? Or less? Could having more optimized delivery of the metadata improve the performance and experience for our customers as well?
With more than 800 different device types supported, we learned that the variability across them can also play a role in some of that chattiness. Different devices have different characteristics and capabilities that could influence the interaction model with the API.
For example, screen size could significantly affect what the API should deliver to the UI. TVs with bigger screens can potentially fit more titles and more metadata per title than a mobile phone. Do we need to send all of the extra bits for fields or items that are not needed, requiring the device itself to drop items on the floor? Or can we optimize the delivery of those bits on a per-device basis?
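One way to picture the per-device question above is a field allowlist and a row size keyed by device type. The following is a minimal sketch in Python, not Netflix's implementation; the device names, field names, row sizes, sample catalog, and the `shape_for_device` helper are all hypothetical.

```python
# Hypothetical per-device field allowlists; names are illustrative only.
DEVICE_FIELDS = {
    "tv": ["id", "title", "synopsis", "box_art", "rating"],
    "phone": ["id", "title", "box_art"],
}

# Hypothetical number of titles each device type can show in one row.
DEVICE_ROW_SIZE = {"tv": 4, "phone": 2}


def shape_for_device(titles, device):
    """Keep only the fields and the number of titles a device is assumed to need."""
    fields = DEVICE_FIELDS[device]
    limit = DEVICE_ROW_SIZE[device]
    return [{name: title[name] for name in fields} for title in titles[:limit]]


# Made-up sample data for illustration.
catalog = [
    {"id": 1, "title": "Title A", "synopsis": "Placeholder synopsis A", "box_art": "a.jpg", "rating": 4.0},
    {"id": 2, "title": "Title B", "synopsis": "Placeholder synopsis B", "box_art": "b.jpg", "rating": 3.5},
    {"id": 3, "title": "Title C", "synopsis": "Placeholder synopsis C", "box_art": "c.jpg", "rating": 5.0},
]

tv_row = shape_for_device(catalog, "tv")        # all fields, up to 4 titles
phone_row = shape_for_device(catalog, "phone")  # fewer fields, up to 2 titles
```

With shaping done on the server, the phone never receives the synopsis or rating fields, so it has nothing to drop on the floor.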
Different devices have different controlling functions as well. For devices with swipe technologies, such as the iPad, do we need to pre-load a lot of extra titles in case a user swipes the row quickly to see the last of 500 titles in their queue? Or for up-down-left-right controllers, would devices be more optimized by fetching a few items at a time when they are needed? Other devices support voice or hand gestures or pointer technologies. How might those impact the user experience and therefore the metadata needed to support them?
The technical specs on these devices differ greatly. Some have significant memory space while others do not, impacting how much data can be handled at a given time. Processing power and hard-drive space could also play a role in how the UI performs, in turn potentially influencing the optimal way for fetching content from the API. All of these differences could result in different potential optimizations across these devices.
Finally, the OSFA model also seemed to slow the innovation rate of our various UI teams (as well as the API team itself). This became one of the most important considerations in our research.
Many UI teams needing metadata means many requests to the API team. In the OSFA world, we essentially needed to funnel these requests and then prioritize them. That meant that some teams would need to wait for API work to be done. It also meant that, because they all shared the same endpoints, we were often adding variations to the endpoints, resulting in a more complex system as well as a lot of spaghetti code. Making teams wait due to prioritization was exacerbated by the fact that tasks took longer because the technical debt was increasing, causing time to build and test to increase. Moreover, many of the incoming requests were asking us to do more of the same kinds of customizations. This created a spiral that would be very difficult to break out of…
All of these aforementioned issues are essentially anomalies in the current OSFA paradigm. For us, these anomalies carve a path for a revolution (meaning, an opportunity for us to overthrow our current OSFA paradigm with a solution that makes up for the OSFA deficiencies).
We evolved our discussion towards what ultimately became a discussion between resource-based APIs and experience-based APIs.
The original OSFA API was very resource oriented with granular requests for specific data, delivering specific documents in specific formats.
The interaction model looked basically like this, with (in this example) the PS3 making many calls across the network to the OSFA API. The API ultimately called back to dependent services to get the corresponding data needed to satisfy the requests.
In this mode, there is a very clear divide between the Client Code and the Server Code. That divide is the network border.
And the responsibilities have the same distribution as well. The Client Code handles the rendering of the interface (as well as asking the server for data). The Server Code is responsible for gathering, formatting and delivering the data to the UIs.
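The resource-based interaction described above can be sketched as a client that gathers each resource with its own request and then assembles the screen itself. This is an illustrative Python sketch under stated assumptions, not the actual OSFA API: the `OsfaApi` class, its method names, and the sample data are hypothetical, and each method call stands in for one request across the network border.

```python
class OsfaApi:
    """Stand-in for a resource-based API; each method call models one network request."""

    def __init__(self):
        self.request_count = 0
        # Made-up sample data for illustration.
        self._recommendations = {"user-1": [101, 102, 103]}
        self._titles = {101: "Title A", 102: "Title B", 103: "Title C"}
        self._ratings = {101: 4.0, 102: 3.5, 103: 5.0}

    def get_recommendations(self, user_id):
        self.request_count += 1
        return list(self._recommendations[user_id])

    def get_movie(self, movie_id):
        self.request_count += 1
        return {"id": movie_id, "title": self._titles[movie_id]}

    def get_rating(self, movie_id):
        self.request_count += 1
        return self._ratings[movie_id]


def build_home_screen(api, user_id):
    """Client code: fetch each resource separately, then assemble the screen."""
    rows = []
    for movie_id in api.get_recommendations(user_id):
        movie = api.get_movie(movie_id)
        movie["rating"] = api.get_rating(movie_id)
        rows.append(movie)
    return rows


api = OsfaApi()
screen = build_home_screen(api, "user-1")
print(len(screen), "titles assembled with", api.request_count, "requests")
```

In this toy setup, three titles cost seven requests (one for the list, then two per title), which is the kind of chattiness that grows with every row on the screen.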
And ultimately, it works. The PS3 interface looks like this and was populated by this interaction model.
But we believe this is not the optimal way to handle it. In fact, assembling a UI through many resource-based API calls is akin to pointillism paintings. The picture looks great when fully assembled, but it is done by assembling many points put together in the right way.
We have decided to pursue an experience-based approach instead. Rather than making many API requests to assemble the PS3 home screen, the PS3 will potentially make a single request to a custom, optimized endpoint.
In an experience-based interaction, the PS3 can potentially make a single request across the network border to a scripting layer (currently Groovy), in this example to provide the data for the PS3 home screen. The call goes to a very specific, custom endpoint for the PS3 or for a shared UI. The Groovy script then interprets what is needed for the PS3 home screen and triggers a series of calls to the Java API running in the same JVM as the Groovy scripts. The Java API is essentially a series of methods that individually know how to gather the corresponding data from the dependent services. The Java API then returns the data to the Groovy script, which then formats and delivers the very specific data back to the PS3.
In this model, the border between Client Code and Server Code is no longer the network border. It is now back on the server. The Groovy is essentially a client adapter written by the client teams.
And the distribution of work changes as well. The client teams continue to handle UI rendering, but now are also responsible for the formatting and delivery of content. The API team, in terms of the data side of things, is responsible for the data gathering and hand-off to the client adapters. Of course, the API team does many other things, including resiliency, scaling, dependency interactions, etc. This model is essentially a platform for API development.
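The split of responsibilities in the experience-based model can be sketched as follows. The system described here uses Groovy scripts over a Java API; the sketch below uses Python purely for brevity, and every name in it (`InternalApi`, `Server`, `ps3_home_adapter`, the `/ps3/home` path, the sample data) is a hypothetical stand-in rather than the real platform.

```python
class InternalApi:
    """Stand-in for the server-side API: in-process methods that gather data."""

    def __init__(self):
        # Made-up sample data for illustration.
        self._recommendations = {"user-1": [101, 102, 103]}
        self._titles = {101: "Title A", 102: "Title B", 103: "Title C"}
        self._ratings = {101: 4.0, 102: 3.5, 103: 5.0}

    def recommendations(self, user_id):
        return list(self._recommendations[user_id])

    def movie(self, movie_id):
        return {"id": movie_id, "title": self._titles[movie_id]}

    def rating(self, movie_id):
        return self._ratings[movie_id]


def ps3_home_adapter(api, user_id):
    """Client adapter (owned by the client team): formats data for one device."""
    rows = []
    for movie_id in api.recommendations(user_id):
        rows.append({"title": api.movie(movie_id)["title"], "stars": api.rating(movie_id)})
    return {"device": "ps3", "rows": rows}


class Server:
    """Hosts uploaded adapters and counts requests that cross the network border."""

    def __init__(self, api):
        self.api = api
        self.network_requests = 0
        self.endpoints = {}

    def register(self, path, adapter):
        # Models a client team uploading its adapter to the server.
        self.endpoints[path] = adapter

    def handle(self, path, user_id):
        self.network_requests += 1
        return self.endpoints[path](self.api, user_id)


server = Server(InternalApi())
server.register("/ps3/home", ps3_home_adapter)
response = server.handle("/ps3/home", "user-1")
```

The adapter still makes several data-gathering calls, but they are in-process method calls on the server; only one request crosses the network border, and the response already has the shape the device wants.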
If resource-based APIs assemble data like pointillism, experience-based APIs assemble data like a photograph. The experience-based approach captures and delivers it all at once.
In terms of revolutions, Netflix may just be a lone anomaly that will be cast away as just that. Given my many conversations with other API providers, however, I suspect that the anomalies encountered with the OSFA APIs are becoming more pervasive. This will likely result in a broader revolution at some point in the future (who knows when…). That said, this design is not for everyone, even if you are experiencing the anomalies that I have discussed. Here is a recipe for those to whom something like this could apply…
And don’t forget a generous helping of chocolate for your engineers!
As I have said, these revolutions happen often in technology. We are constantly on a quest to plug the leaks in our previous systems by replacing them with new, improved systems. The hope is that the paradigm shift results in fewer or smaller leaks. But make no mistake, there will be leaks and anomalies in the new system!
So don’t get too comfortable with any system that you support. Don’t get married to any technology, guideline, protocol, etc. They are all just means to an end.
So expect another revolution! And because we live in a world of revolutions…