facebookthrift-151001153400-lva1-app6891.pptx

  1. FACEBOOK THRIFT
  2. • Facebook is a social networking service. People have been “facebooking” each other for about 7 years now, making Facebook the most used social network, with over 500 million users worldwide. 50% of active users log on to Facebook on any given day. The average user has 130 friends. People spend over 700 billion minutes per month on Facebook. There are over 900 million objects that people interact with (pages, groups, events and community pages). INTRODUCTION
  3. • Thrift is an interface definition language and binary communication protocol. • It is used as a remote procedure call (RPC) framework and was developed at Facebook for "scalable cross-language services development". • It combines a software stack with a code generation engine to build services that work efficiently across C#, C++, Java, Perl, PHP, Python, Ruby and Smalltalk. • It is now an open source project hosted by the Apache Software Foundation. THRIFT
  4. • Scribe (log server) is a server for aggregating log data streamed in real time from many other servers. It is useful for logging a wide array of data and is built on top of Thrift. • Cassandra is a database management system designed to handle large amounts of data spread out across many servers. It powers Facebook’s Inbox Search feature and provides a structured key-value store with eventual consistency. • HipHop for PHP is a source code transformer for PHP script code, created to save server resources. HipHop transforms PHP source code into optimized C++ and then uses g++ to compile it to machine code. The Back End
  5. • The primary idea behind Thrift is that it consists of a language-neutral stack, implemented across various programming languages, and an associated code generation engine that transforms a simple interface and data definition language into client and server remote procedure call libraries. • Thrift is designed to be as simple as possible for developers, who can define all the necessary data structures and interfaces for a complex service in a single short file. • This file is called the Thrift Interface Definition Language file, or Thrift IDL file. • The developers identified some important features while evaluating the technical challenges of cross-language interactions in a networked environment. Thrift Design Features
  6. • Transport: Each language must have a common interface to bidirectional raw data transport. Consider a scenario with two servers, one written in Java and the other in Python. A typical service written in Java should be able to send raw data to a common interface that is understood by the server running Python, and vice versa. The transport layer should be able to move the raw data between the two ends. The specifics of how this transport is implemented shouldn’t matter to the service developer. The same application code should be able to run against TCP stream sockets, raw data in memory or files on disk. • Protocol: In order to transport the raw data, it has to be encoded into a particular format such as binary, XML, etc. The transport layer therefore uses a particular protocol to encode and decode the data. Again, the application developer does not need to be concerned with this; the developer only cares that the data can be read and written in a deterministic manner. Types
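The split between transport and protocol described on this slide can be sketched roughly as follows. This is a minimal illustration, not Thrift's generated code: the class names `MemoryTransport` and `SimpleBinaryProtocol`, their methods, and the length-prefixed string layout are all assumptions made for the example.

```python
import io
import struct


class MemoryTransport:
    """Hypothetical transport: moves raw bytes through an in-memory buffer."""

    def __init__(self, data: bytes = b"") -> None:
        self._buf = io.BytesIO(data)

    def write(self, data: bytes) -> None:
        self._buf.write(data)

    def read(self, n: int) -> bytes:
        return self._buf.read(n)

    def getvalue(self) -> bytes:
        return self._buf.getvalue()


class SimpleBinaryProtocol:
    """Hypothetical protocol: turns typed values into bytes on any transport."""

    def __init__(self, transport) -> None:
        self._t = transport

    def write_i32(self, value: int) -> None:
        # Network byte order, signed 32-bit integer.
        self._t.write(struct.pack("!i", value))

    def read_i32(self) -> int:
        return struct.unpack("!i", self._t.read(4))[0]

    def write_string(self, value: str) -> None:
        # Length prefix followed by the UTF-8 bytes (layout assumed here).
        raw = value.encode("utf-8")
        self.write_i32(len(raw))
        self._t.write(raw)

    def read_string(self) -> str:
        length = self.read_i32()
        return self._t.read(length).decode("utf-8")


def roundtrip(user_id: int, name: str):
    """Encode two values, then decode them from the same raw bytes."""
    out = MemoryTransport()
    writer = SimpleBinaryProtocol(out)
    writer.write_i32(user_id)
    writer.write_string(name)
    reader = SimpleBinaryProtocol(MemoryTransport(out.getvalue()))
    return reader.read_i32(), reader.read_string()
```

Because the protocol only calls `read` and `write`, the in-memory transport could be swapped for one backed by a socket or a file without touching the application code, which is the property the slide describes.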
  7. • Versioning: For services to be robust, they must be able to evolve from their present version. To incorporate new features, the data types involved in a service should provide a mechanism to add or delete fields of an object, or alter the argument list of a function, without any interruption in service. This is called versioning. • Processors: Processors process the data streams and carry out remote procedure calls. Cont.
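One common way to get the versioning property described on this slide is to tag every field with a numeric identifier so that a reader can skip fields it does not know about. The sketch below assumes that approach; the helper names `encode_fields` and `decode_known` and the 6-byte field header are hypothetical and are not Thrift's actual wire format.

```python
import struct

# Assumed header per field: 2-byte field id + 4-byte payload length.
_HEADER = "!hI"
_HEADER_SIZE = struct.calcsize(_HEADER)


def encode_fields(fields: dict) -> bytes:
    """Serialize {field_id: payload_bytes} as a sequence of tagged fields."""
    out = bytearray()
    for field_id, payload in fields.items():
        out += struct.pack(_HEADER, field_id, len(payload))
        out += payload
    return bytes(out)


def decode_known(data: bytes, known_ids: set) -> dict:
    """Read tagged fields, keeping known ids and skipping unknown ones."""
    result = {}
    offset = 0
    while offset < len(data):
        field_id, length = struct.unpack_from(_HEADER, data, offset)
        offset += _HEADER_SIZE
        if field_id in known_ids:
            result[field_id] = data[offset:offset + length]
        # Unknown fields are skipped using the length, not rejected.
        offset += length
    return result
```

An old reader that only knows field 1 can still decode a message from a newer writer that also sends field 2, and a new reader simply finds field 2 absent in messages from an old writer.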
  8. • Thrift has been employed in a large number of applications at Facebook, including search, logging, mobile, ads and the developer platform. Two specific usages are discussed below. • Search • Logging Facebook Thrift Services
  9. Facebook serves 570 billion page views per month. There are more photos on Facebook than on all other photo sites combined. More than 3 billion photos are uploaded every month. Facebook’s systems serve 1.2 million photos per second. More than 25 billion pieces of content (status updates, comments, etc.) are shared every month. Facebook has more than 30,000 servers (and this number is from last year!). Facebook’s scaling challenge
  10. • Linux & Apache • PHP • Memcache • Haystack • BigPipe How Does Facebook Work?
  11. There are more than 20 billion uploaded photos on Facebook, and each one is saved in four different resolutions, resulting in more than 80 billion stored photos. Handling billions of photos is not enough; performance is also critical. Facebook serves around 1.2 million photos per second. Haystack is Facebook’s high-performance photo storage and retrieval system, a highly scalable object store used to serve Facebook’s immense number of photos. Strictly speaking, Haystack is an object store, so it doesn’t necessarily have to store photos. Haystack stores photo data inside 10 GB buckets, with 1 MB of metadata for every GB stored. Haystack
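The figures on this slide can be checked with a few lines of arithmetic. The constants are taken from the slide; treating 1 GB as 1024 MB is an assumption made only to express the metadata overhead as a fraction.

```python
# Figures quoted on the slide.
PHOTOS_UPLOADED = 20_000_000_000
RESOLUTIONS_PER_PHOTO = 4
BUCKET_SIZE_GB = 10
METADATA_MB_PER_GB = 1

# Each upload is stored once per resolution.
stored_images = PHOTOS_UPLOADED * RESOLUTIONS_PER_PHOTO

# Metadata carried by one full bucket.
metadata_mb_per_bucket = BUCKET_SIZE_GB * METADATA_MB_PER_GB

# Overhead as a fraction of the data, assuming 1 GB = 1024 MB.
metadata_fraction = METADATA_MB_PER_GB / 1024

print(stored_images)           # 80000000000
print(metadata_mb_per_bucket)  # 10
```

So a full 10 GB bucket carries about 10 MB of metadata, an overhead of roughly 0.1% under the stated assumption.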
  12. • Pipelining web pages for high performance. • BigPipe is a dynamic web page serving system developed by Facebook. • Facebook uses it to serve each web page in sections (called “pagelets”) for optimal performance. • BigPipe is a fundamental redesign of the dynamic web page serving system. The general idea is to pipeline pagelets through several execution stages inside web servers and browsers. • BigPipe breaks the page generation process into several stages. • The first three stages are executed by the web server, and the last four stages are executed by the browser. BIGPIPE
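The idea of serving a page in sections can be sketched with a generator that sends a page skeleton first and then one chunk per pagelet. This is only an illustration of the concept, not BigPipe's implementation: `stream_page`, the placeholder markup and the client-side `fill` function are hypothetical names.

```python
import json
from typing import Callable, Dict, Iterator


def stream_page(pagelets: Dict[str, Callable[[], str]]) -> Iterator[str]:
    """Yield a page skeleton, then one chunk per pagelet as it is rendered."""
    # The skeleton goes out first so the browser can start work immediately.
    placeholders = "".join(f'<div id="{name}"></div>' for name in pagelets)
    yield f"<html><body>{placeholders}"

    # Each pagelet is rendered and flushed independently of the others.
    for name, render in pagelets.items():
        html = json.dumps(render())  # escape the markup as a JS string literal
        yield f'<script>fill("{name}", {html})</script>'

    yield "</body></html>"
```

A caller would write each yielded chunk to the response as soon as it is available, for example `for chunk in stream_page({"feed": render_feed}): send(chunk)`, where `render_feed` and `send` are likewise placeholders.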
  13. • Free and open source, high-performance, distributed memory object caching system. • Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from the results of database calls, API calls, or page rendering. • The system uses a client–server architecture. The servers maintain a key-value associative array; the clients populate this array and query it. • The servers keep the values in RAM; if a server runs out of RAM, it discards the oldest values. • Clients can read each other's cached data. MEMCACHE
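The eviction behavior described on this slide can be sketched with a small in-process cache. This is not memcached itself: `TinyCache` is a hypothetical class, capacity is counted in entries rather than bytes of RAM, and "oldest" is interpreted here as least recently used, which is an assumption.

```python
from collections import OrderedDict
from typing import Any, Optional


class TinyCache:
    """Hypothetical key-value cache that evicts the least recently used entry."""

    def __init__(self, max_items: int) -> None:
        self._max_items = max_items
        self._items: "OrderedDict[str, Any]" = OrderedDict()

    def set(self, key: str, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        # Over capacity: drop the entry at the "old" end of the ordering.
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)

    def get(self, key: str) -> Optional[Any]:
        if key not in self._items:
            return None  # cache miss; the caller falls back to the database
        self._items.move_to_end(key)  # mark as recently used
        return self._items[key]
```

A typical caller checks `get` first and, on a miss, runs the expensive database or API call and stores the result with `set`.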
  14. • Facebook has a system called Gatekeeper that lets it run different code for different sets of users. • This lets Facebook do gradual releases of new features, activate certain features only for Facebook employees, etc. • Gatekeeper also lets Facebook do something called “dark launches”, which means activating elements of a certain feature behind the scenes before it goes live. Gradual releases and dark launches
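A gate of the kind described on this slide can be sketched as a function that decides, per user, whether a feature is on. The slide does not say how Gatekeeper works internally, so everything here is an assumption: the function names, the employee allow-list, and the hash-based percentage rollout.

```python
import hashlib
from typing import AbstractSet


def in_rollout(feature: str, user_id: int, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets per feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def is_enabled(
    feature: str,
    user_id: int,
    *,
    percent: int,
    employee_ids: AbstractSet[int] = frozenset(),
) -> bool:
    """Enable for listed employees, plus a percentage of everyone else."""
    return user_id in employee_ids or in_rollout(feature, user_id, percent)
```

With `percent=0` and a non-empty `employee_ids`, the feature is visible to employees only; raising `percent` step by step gives a gradual release, and the same user always gets the same answer for a given feature.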
  15. • The Facebook Platform provides a set of APIs and tools which enable 3rd party developers to integrate with the "open graph". • The Graph API is the core of the Facebook Platform, enabling developers to read and write data to Facebook. Facebook Platform
  16. • The Graph API presents a simple, consistent view of the Facebook social graph, uniformly representing objects in the graph (e.g., people, photos, events, and pages) and the connections between them (e.g., friend relationships, shared content, and photo tags). • It is a RESTful API for accessing data on the Facebook graph. • Every object in the social graph has a unique ID. You can access the properties of an object by requesting https://graph.facebook.com/ID • Alternatively, people and pages with usernames can be accessed using their username as an ID. All responses are JSON objects. The Graph API
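The URL scheme and JSON responses described on this slide can be sketched as below. The sketch makes no network request; the helper names and the sample response body are made up for illustration, and only the base URL comes from the slide.

```python
import json
from urllib.parse import quote

GRAPH_BASE = "https://graph.facebook.com"  # base URL as given on the slide


def object_url(object_id: str) -> str:
    """Build the URL for an object from its numeric ID or username."""
    return f"{GRAPH_BASE}/{quote(str(object_id), safe='')}"


def parse_object(body: str) -> dict:
    """Decode a response body, which the slide says is a JSON object."""
    return json.loads(body)


# Hypothetical response body, not real Graph API output.
sample_body = '{"id": "12345", "name": "Example Page"}'
obj = parse_object(sample_body)
print(object_url(obj["id"]))  # https://graph.facebook.com/12345
print(obj["name"])            # Example Page
```

The same `object_url` helper accepts a username in place of the numeric ID, matching the alternative the slide mentions.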
  17. • FBML is a variant-evolved subset of HTML with some elements removed. • It allows Facebook application developers to customize the "look and feel" of their applications, to a limited extent. • It is the specification of how to encode content so that Facebook's servers can read and publish it. • FBML plays an important role in building applications. It is used to tap into various Facebook elements when building applications. • It operates a lot like HTML and makes it easy to do various tasks, such as: sending a user e-mail, embedding Flash video, creating a dashboard, and posting on a wall. Facebook Markup Language
  18. • Facebook also allows the use of regular HTML tags, such as <a href="#"></a>, which is used to generate a hyperlink. Facebook also allows the use of many more HTML tags for building applications. FBML
  19. • The new Messages interweaves your chats, texts and emails. It’s a central place to control all of your private communication, both on and off Facebook. • Simply put, it can be a single inbox for all of your messages, no matter how you choose to send them. • A facebook.com email address • SMS from Facebook • Chat history Facebook’s New Messages
  20. • Facebook Connect is a set of APIs from Facebook that enable Facebook members to log on to third-party websites, applications, mobile devices and gaming systems with their Facebook identity. Facebook Connect
  21. • Unlike other social networks such as Friendster, MySpace, and Twitter, all of which have run into serious scalability issues at different points during their growth, Facebook has been mostly reliable throughout its rise. • Facebook uses JavaScript heavily and relies on its own in-house PHP wrapper called XHP, on HipHop (which optimizes PHP), and on many more technologies. • Facebook has developed many technologies in-house to serve its own needs, for example Cassandra. RELIABILITY
  22. • Thrift generates both the server and client interfaces for a given service, and in a consistent manner, so client calls will be more consistent. • Related to the above: Thrift's RPC-like behavior means that you get type safety. • Thrift supports various protocols, not just HTTP. If you are dealing with large volumes of service calls, or have bandwidth requirements, the client and server can transparently switch to more efficient transports. • Thrift is a mature piece of software, well tested and widely used. Advantages of Thrift:
  23. • Thrift is poorly documented. • It is more work to get started on the client side when the clients are directly building the calling code. It is less work for the service owner if they are building libraries for clients. • It is yet another dependency. Disadvantages: