Netflix in the Cloud
Nov 6, 2010
Adrian Cockcroft
@adrianco #netflixcloud
http://www.linkedin.com/in/adriancockcroft

Combined Slides
For both Developer and Operations Audiences (2-3 hours of content)
Teaser intro slides also at slideshare.net/adrianco
Oct 14th Beta #devops subset - Cloud Computing Meetup
Nov 3rd GA – QConSF Developer Oriented subset
Nov 6th, 2010 – Combined slides at slideshare.net/adrianco

With more than 16 million subscribers in the United States and Canada, Netflix, Inc. is the world’s leading Internet subscription service for enjoying movies and TV shows.
Source: http://ir.netflix.com
  
Why Give This Talk?

Netflix is Path-finding
The Cloud ecosystem is evolving very fast
Share with and learn from the cloud community

We want to use clouds, not build them
Cloud technology should be a commodity
Public cloud and open source for agility and scale

We are looking for talent
Netflix wants to connect with the very best engineers

Why Use AWS?

We stopped building our own datacenters
Capacity growth rate is accelerating, unpredictable
Product launch spikes - iPhone, Wii, PS3, XBox
Datacenter is large inflexible capital commitment
  
Customers
Q3 year/year +52% Total and +145% Streaming
[Chart: customers per quarter, 2009Q2 through 2010Q3; y-axis from 0 to 18]
Source: http://ir.netflix.com
  
Leverage AWS Scale
“the biggest public cloud”
AWS investment in tooling and automation
AWS zones for high availability, scalability
AWS skills are common on resumes…

Leverage AWS Feature Set
“two years ahead of the others”
EC2, S3, SDB, SQS, EBS, EMR, ELB, ASG, IAM, RDB…

“The cloud lets its users focus on delivering differentiating business value instead of wasting valuable resources on the undifferentiated heavy lifting that makes up most of IT infrastructure.”
Werner Vogels
Amazon CTO
  
Netflix Deployed on AWS
Content:  Video Masters, EC2, S3, CDN
Logs:     S3, EMR Hadoop, Hive, Business Intelligence
Play:     DRM, CDN routing, Bookmarks, Logging
WWW:      Search, Movie Choosing, Ratings, Similars
API:      Metadata, Device Config, TV Movie Choosing, Mobile iPhone
  
Movie Encoding farm (2009)
Content: Video Masters, EC2, S3, CDN
•  Tens of thousands of videos
•  Thousands of EC2 instances
•  Encoding apps on MS Windows
•  ~100 speed/format permutations
•  Petabytes of S3
•  Content Delivery Networks
“Netflix is one of the largest customers of the biggest CDNs Akamai and Limelight”

Hadoop - Elastic Map-Reduce (2009)
Logs: S3, EMR Hadoop, Hive, Business Intelligence
•  Web Access Logs
•  Streaming Service Logs
•  Terabyte per day scale
•  Easy Hadoop via Amazon EMR
•  Hive SQL “Data Mart”
•  Gateway to Datacenter BI
Slideshare.net talks
evamtse “Netflix: Hive User Group” http://slidesha.re/aqJLAC
adrianco “Crunch Your Data In The Cloud” http://slidesha.re/dx4oCK
  
Streaming Service Back-end (early 2010)
Play: DRM, CDN routing, Bookmarks, Logging
•  PC/Mac Silverlight Player Support
•  Highly available “play button”
•  DRM Key Management
•  Generate route to stream on CDN
•  Lookup bookmark for user/movie
•  Update bookmark for user/movie
•  Log quality of service

Web site, a page at a time (through 2010)
WWW: Search, Movie Choosing, Ratings, Similars
•  Clean presentation layer rewrite
•  Search auto-complete
•  Search backend and landing page
•  Movie and genre choosing
•  Star ratings and recommendations
•  Similar movies
•  Page by page to 80% of views
   (leave account signup in Datacenter for now)

API for TV devices and iPhone etc. (2010)
API: Metadata, Device Config, TV Movie Choosing, Mobile iPhone
•  REST API: developer.netflix.com
•  Interfaces to everything else
•  TV Device Configuration
•  Personalized movie choosing
•  iPhone Launch in the cloud only
“Netflix is an API for streaming to TVs (we also do DVD’s and a web site)”
  
Netflix EC2 Instances per Account
[Chart: EC2 instances per account; labeled accounts are Encoding, Test and Production, and Log Analysis]
  
Learnings…

Datacenter oriented tools don’t work
Ephemeral instances
High rate of change

Cloud Tools Don’t Scale for Enterprise
Too many are “Startup” oriented
Built our own tools
Drove vendors hard

“fork-lifted” apps don’t work well
Fragile
Too many datacenter oriented assumptions

Faster to re-code from scratch
•  Opportunity to pay down technical debt
•  Re-architected and re-wrote most of the code
•  Fine grain web services
•  Leveraged many open source Java projects
•  Systematically instrumented
•  “NoSQL” SimpleDB backend

“In the datacenter, robust code is best practice. In the cloud, it’s essential.”

Takeaway
Netflix is path-finding the use of public AWS cloud to replace in-house IT for non-trivial applications with hundreds of developers and thousands of systems.
(Pause for questions before we dive into details)
  
What, Why and How?
The details…

Synopsis
•  The Goals
    –  Faster, Scalable, Available and Productive
•  Anti-patterns and Cloud Architecture
    –  The things we wanted to change and why
•  Cloud Bring-up Strategy
    –  Developer Transitions and Tools
•  Roadmap and Next Steps
  
Goals
•  Faster
    –  Lower latency than the equivalent datacenter web pages and API calls
    –  Measured as mean and 99th percentile
    –  For both first hit (e.g. home page) and in-session hits for the same user
•  Scalable
    –  Avoid needing any more datacenter capacity as subscriber count increases
    –  No central vertically scaled databases
    –  Leverage AWS elastic capacity effectively
•  Available
    –  Substantially higher robustness and availability than datacenter services
    –  Leverage multiple AWS availability zones
    –  No scheduled down time, no central database schema to change
•  Productive
    –  Optimize agility of a large development team with automation and tools
    –  Leave behind complex tangled datacenter code base (~8 year old architecture)
    –  Enforce clean layered interfaces and re-usable components
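
The "Faster" goal is measured as mean and 99th percentile latency. A minimal sketch of how those two numbers differ on the same data, assuming a nearest-rank percentile definition and made-up latency samples (neither comes from the slides):

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [20.0] * 98 + [400.0, 900.0]

mean_ms = mean(latencies_ms)            # pulled up slightly by the outliers
p99_ms = percentile(latencies_ms, 99)   # exposes the slow tail directly
```

The mean stays close to the typical request, while the 99th percentile shows what the slowest one percent of requests experience, which is why the two are tracked together.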
  
Cloud Architecture Patterns
Where do we start?

Datacenter Anti-Patterns
What do we currently do in the datacenter that prevents us from meeting our goals?

Architecture
•  Software Architecture
    –  The abstractions and interfaces that developers build against
•  Systems Architecture
    –  The service instances that define availability, scalability
•  Compose-ability
    –  software architecture that is independent of the systems architecture
    –  decoupled flexible building block components

Rewrite from Scratch
Not everything is cloud specific
Pay down technical debt
Robust patterns
  
Old Datacenter vs. New Cloud Arch

Old Datacenter                  New Cloud Arch
Central SQL Database            Distributed Key/Value NoSQL
Sticky In-Memory Session        Shared Memcached Session
Chatty Protocols                Latency Tolerant Protocols
Tangled Service Interfaces      Layered Service Interfaces
Instrumented Code               Instrumented Service Patterns
Fat Complex Objects             Lightweight Serializable Objects
Components as Jar Files         Components as Services
  
The Central SQL Database
•  Datacenter has a central database
    –  Everything in one place is convenient until it fails
    –  Customers, movies, history, configuration
•  Schema changes require downtime
Anti-pattern impacts scalability, availability

The Distributed Key-Value Store
•  Cloud has many key-value data stores
    –  More complex to keep track of, do backups etc.
    –  Each store is much simpler to administer DBA
    –  Joins take place in java code
•  No schema to change, no scheduled downtime
•  Latency for Memcached vs. Oracle vs. SimpleDB
    –  Memcached is dominated by network latency <1ms
    –  Oracle for simple queries is a few milliseconds
    –  SimpleDB has replication and REST overheads >10ms
  
The Sticky Session
•  Datacenter Sticky Load Balancing
    –  Efficient caching for low latency
    –  Tricky session handling code
    –  Middle tier load balancer has issues in practice
•  Encourages concentrated functionality
    –  one service that does everything
Anti-pattern impacts productivity, availability

The Shared Session
•  Cloud Uses Round-Robin Load Balancing
    –  Simple request-based code
    –  External shared caching with memcached
•  More flexible fine grain services
    –  Works better with auto-scaled instance counts

Chatty Opaque and Brittle Protocols
•  Datacenter service protocols
    –  Assumed low latency for many simple requests
•  Based on serializing existing java objects
    –  Inefficient formats
    –  Incompatible when definitions change
Anti-pattern causes productivity, latency and availability issues
  
Robust and Flexible Protocols
•  Cloud service protocols
    –  JSR311/Jersey is used for REST/HTTP service calls
    –  Custom client code includes service discovery
    –  Support complex data types in a single request
•  Apache Avro
    –  Evolved from Protocol Buffers and Thrift
    –  Includes JSON header defining key/value protocol
    –  Avro serialization is half the size and several times faster than Java serialization, more work to code

Persisted Protocols
•  Persist Avro in Memcached
    –  Save space/latency (zigzag encoding, half the size)
    –  Less brittle across versions
    –  New keys are ignored
    –  Missing keys are handled cleanly
•  Avro protocol definitions
    –  Can be written in JSON or generated from POJOs
    –  It’s hard, needs better tooling
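
The space saving attributed to zigzag encoding comes from how Avro writes integers: the sign is folded into the low bit so small negative and positive values both become small unsigned numbers, which are then written as variable-length bytes. A minimal standalone sketch of that integer encoding (written by hand for illustration, not taken from the Avro library or from Netflix code):

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF


def zigzag_encode(n: int) -> int:
    """Map a signed 64-bit integer to unsigned: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return ((n << 1) ^ (n >> 63)) & MASK_64


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def varint(z: int) -> bytes:
    """Write 7 bits per byte, low-order group first.
    The high bit of each byte marks that more bytes follow."""
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_long(n: int) -> bytes:
    """Zigzag then varint, so values near zero take a single byte."""
    return varint(zigzag_encode(n))
```

With this scheme any value from -64 to 63 fits in one byte, where a fixed-width long would always take eight.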
  
Tangled Service Interfaces
•  Datacenter implementation is exposed
    –  Oracle SQL queries mixed into business logic
•  Tangled code
    –  Deep dependencies, false sharing
•  Data providers with sideways dependencies
    –  Everything depends on everything else
Anti-pattern affects productivity, availability

Untangled Service Interfaces
•  New Cloud Code With Strict Layering
    –  Compile against interface jar
    –  Can use spring runtime binding to enforce
•  Service interface is the service
    –  Implementation is completely hidden
    –  Can be implemented locally or remotely
    –  Implementation can evolve independently

Untangled Service Interfaces
Two layers:
•  SAL - Service Access Library
    –  Basic serialization and error handling
    –  REST or POJO’s defined by data provider
•  ESL - Extended Service Library
    –  Caching, conveniences
    –  Can combine several SALs
    –  Exposes faceted type system (described later)
    –  Interface defined by data consumer in many cases

Service Interaction Pattern
Sample Swimlane Diagram

Service Architecture Patterns
•  Internal Interfaces Between Services
    –  Common patterns as templates
    –  Highly instrumented, observable, analytics
    –  Service Level Agreements – SLAs
•  Library templates for generic features
    –  Instrumented Netflix Base Servlet template
    –  Instrumented generic client interface template
    –  Instrumented S3, SimpleDB, Memcached clients
  
Service Request Instruments Every Step in the call
[Diagram: request cycle between CLIENT and SERVICE, with a timestamp recorded at each step]
CLIENT: Request Start Timestamp, Request End Timestamp
Client outbound serialize start timestamp
Client outbound serialize end timestamp
Client Network send timestamp
Service Network receive timestamp
Service inbound serialize start timestamp
Service inbound serialize end timestamp
SERVICE: execute request start timestamp, execute request end timestamp
Service outbound serialize start timestamp
Service outbound serialize end timestamp
Service network send timestamp
Client network receive timestamp
Inbound deserialize start timestamp
Inbound deserialize end timestamp
  
Boundary Interfaces
•  Isolate teams from external dependencies
    –  Fake SAL built by cloud team
    –  Real SAL provided by data provider team later
    –  ESL built by cloud team using faceted objects
•  Fake data sources allow development to start
    –  e.g. Fake Identity SAL for a test set of customers
    –  Development solidifies dependencies early
    –  Helps external team provide the right interface
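
The fake-then-real SAL approach can be sketched as an interface with a stand-in implementation. The names `IdentitySAL`, `FakeIdentitySAL`, `lookup_customer` and `greeting` below are hypothetical, chosen only to illustrate the idea of a "Fake Identity SAL for a test set of customers":

```python
from abc import ABC, abstractmethod


class IdentitySAL(ABC):
    """Hypothetical Service Access Library interface for customer identity."""

    @abstractmethod
    def lookup_customer(self, customer_id: str) -> dict:
        """Return the customer record or raise LookupError."""


class FakeIdentitySAL(IdentitySAL):
    """Stand-in that serves a fixed test set of customers from memory."""

    def __init__(self, customers):
        self._customers = dict(customers)

    def lookup_customer(self, customer_id: str) -> dict:
        try:
            return dict(self._customers[customer_id])
        except KeyError:
            raise LookupError(f"unknown customer {customer_id!r}") from None


def greeting(sal: IdentitySAL, customer_id: str) -> str:
    """Consumer code depends only on the interface, so the real SAL
    can replace the fake later without changes here."""
    return f"Welcome back, {sal.lookup_customer(customer_id)['name']}"


fake = FakeIdentitySAL({"c1": {"name": "Test User"}})
```

Because `greeting` is written against the interface, the consuming team can start work before the data provider team delivers the real implementation.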
  
One Object That Does Everything
•  Datacenter uses a few big complex objects
    –  Movie and Customer objects are the foundation
    –  Good choice for a small team and one instance
    –  Problematic for large teams and many instances
•  False sharing causes tangled dependencies
    –  Unproductive re-integration work
Anti-pattern impacting productivity and availability

An Interface For Each Component
•  Cloud uses faceted Video and Visitor
    –  Basic types hold only the identifier
    –  Facets scope the interface you actually need
    –  Each component can define its own facets
•  No false-sharing and dependency chains
    –  Type manager converts between facets as needed
    –  video.asA(PresentationVideo) for www
    –  video.asA(MerchableVideo) for middle tier
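
A rough sketch of the faceted-type idea in Python. Only the `asA` call shape and the facet names `PresentationVideo` and `MerchableVideo` come from the slide; the fields, the `TypeManager` API and the lookup tables are assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PresentationVideo:
    """Facet for the web tier (fields are illustrative)."""
    video_id: int
    title: str


@dataclass(frozen=True)
class MerchableVideo:
    """Facet for the middle tier (fields are illustrative)."""
    video_id: int
    genre: str


class TypeManager:
    """Hypothetical registry that builds a requested facet from an identifier."""

    def __init__(self):
        self._builders = {}

    def register(self, facet_cls, builder):
        self._builders[facet_cls] = builder

    def convert(self, video_id, facet_cls):
        if facet_cls not in self._builders:
            raise TypeError(f"no facet registered for {facet_cls.__name__}")
        return self._builders[facet_cls](video_id)


class Video:
    """Basic type: holds only the identifier."""

    def __init__(self, video_id, types):
        self.video_id = video_id
        self._types = types

    def as_a(self, facet_cls):
        return self._types.convert(self.video_id, facet_cls)


# Made-up data sources standing in for real services.
TITLES = {42: "Example Title"}
GENRES = {42: "Drama"}

types = TypeManager()
types.register(PresentationVideo, lambda vid: PresentationVideo(vid, TITLES[vid]))
types.register(MerchableVideo, lambda vid: MerchableVideo(vid, GENRES[vid]))

video = Video(42, types)
```

Each caller asks only for the facet it needs, so a change to the merchandising fields does not force the web tier to recompile against them.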
  
Software Architecture Patterns
•  Object Models
    –  Basic and derived types, facets, serializable
    –  Pass by reference within a service
    –  Pass by value between services
•  Computation and I/O Models
    –  Service Execution using Best Effort
    –  Common thread pool management

Netflix Systems Architecture
  
API
[Diagram of the API deployment, with these labeled elements]
AWS EC2: Front End ELB, Discovery Service, API Proxy, API etc., API ELB, Component Services, API, memcached, memcached
SQS, Replication
Netflix Data Center: Oracle, Oracle, Oracle
AWS Storage: EBS, S3, SimpleDB
  
Netflix Undifferentiated Lifting
•  Middle Tier Load Balancing
•  Discovery (local DNS)
•  Encryption Services
•  Caching
•  Distributed App Management
We want cloud vendors to do all this for us as well!

Load Balancing in AWS
•  Middle tier currently not supported in AWS
    –  ELB are public-facing only
    –  Cannot apply security group settings
•  ELB vertical scalability for concentrated clients
    –  Too few proxy IP addresses leads to hot spots
•  ELB needs support for balancing heuristics
    –  Proportional balance across Availability Zones
    –  Weighted Least connections, Weighted Round Robin
•  Zone aware routing
    –  Default to instances in the same Availability Zone
    –  Falls back to cross-zone on failure
  
Discovery
•  Discovery Service (Redundant instances per zone)
    –  Simple REST interface
    –  Cloud apps register with Discovery
•  Apps send heartbeats every 30 sec to renew lease
    –  App evicted after 3 missed heartbeats
    –  Can re-register if the problem was transient
•  Apps can store custom metadata
    –  Version number, AMI id, Availability Zone, etc.
•  Software Round-robin Load Balancer
    –  Query Discovery for instances of specific application
    –  Baked into Netflix REST client (JSR311/Jersey based)
AWS Middle-tier ELB would eliminate most use cases
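
The lease and round-robin behaviour described above can be sketched in a few lines. This is an in-memory illustration only: the class and method names are hypothetical, the real service sits behind a REST interface, and eviction is simplified to "no heartbeat for three intervals":

```python
HEARTBEAT_SECONDS = 30
MISSED_BEATS_BEFORE_EVICTION = 3


class DiscoveryRegistry:
    """In-memory stand-in for a discovery service with heartbeat leases."""

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time in seconds
        self._instances = {}     # (app, instance_id) -> lease record

    def register(self, app, instance_id, metadata=None):
        self._instances[(app, instance_id)] = {
            "last_beat": self._clock(),
            "metadata": dict(metadata or {}),  # e.g. version, AMI id, zone
        }

    def heartbeat(self, app, instance_id):
        """Renew the lease. Returns False if the instance was already
        evicted, in which case the caller should re-register."""
        entry = self._instances.get((app, instance_id))
        if entry is None:
            return False
        entry["last_beat"] = self._clock()
        return True

    def evict_expired(self):
        """Drop instances whose last heartbeat is older than three intervals."""
        deadline = self._clock() - HEARTBEAT_SECONDS * MISSED_BEATS_BEFORE_EVICTION
        expired = [k for k, e in self._instances.items() if e["last_beat"] < deadline]
        for key in expired:
            del self._instances[key]
        return expired

    def instances(self, app):
        return sorted(iid for (a, iid) in self._instances if a == app)


class RoundRobinClient:
    """Client-side software load balancer over the registered instances."""

    def __init__(self, registry, app):
        self._registry = registry
        self._app = app
        self._next = 0

    def choose(self):
        live = self._registry.instances(self._app)
        if not live:
            raise LookupError(f"no instances registered for {self._app}")
        instance = live[self._next % len(live)]
        self._next += 1
        return instance
```

Passing the clock in as a function keeps the lease logic testable without waiting for real time to elapse.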
  
Database Migration
•  Why SimpleDB?
    –  No DBA’s in the cloud, Amazon hosted service
    –  Work started two years ago, fewer viable options
    –  Worked with Amazon to speed up and scale SimpleDB
•  Alternatives?
    –  Investigating adding Cassandra and Membase to the mix
    –  Need several options to match use cases well
•  Detailed SimpleDB Advice
    –  Sid Anand - QConSF Nov 5th – Netflix’ Transition to High Availability Storage Systems
    –  Blog - http://practicalcloudcomputing.com/
    –  Download Paper PDF - http://bit.ly/bhOTLu

Oracle to SimpleDB
(See Sid’s paper for details)
•  SimpleDB Domains
    –  De-normalize multiple tables into a single domain
    –  Work around size limits (10GB per domain, 1KB per key)
    –  Shard data across domains to scale
    –  Key – Use distributed sequence generator, GUID or natural unique key such as customer-id
    –  Implement a schema validator to catch bad attributes
•  Application layer support
    –  Do GROUP BY and JOIN operations in the application
    –  Compose relations in the application layer
    –  Check constraints on read, and repair data as a side effect
•  Do without triggers, PL/SQL, clock operations
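
Two of the points above, sharding across domains and doing JOIN and GROUP BY in the application, can be sketched without any SimpleDB client. The domain naming scheme, the CRC32 hash choice and the row shapes are assumptions for illustration; the slides do not say how keys were mapped to domains:

```python
import zlib
from collections import defaultdict


def domain_for(key: str, base: str, shard_count: int) -> str:
    """Pick a domain by hashing the item key so data spreads across shards.
    CRC32 is used because it is stable across processes and runs."""
    shard = zlib.crc32(key.encode("utf-8")) % shard_count
    return f"{base}_{shard}"


def join_on(left, right, key):
    """Hash join performed in the application, since the store has no JOIN."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def group_count(rows, key):
    """GROUP BY key with COUNT(*), done in the application."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[key]] += 1
    return dict(counts)


# Made-up rows standing in for items fetched from two domains.
customers = [{"customer_id": "c1", "name": "A"}, {"customer_id": "c2", "name": "B"}]
rentals = [
    {"customer_id": "c1", "title": "X"},
    {"customer_id": "c1", "title": "Y"},
    {"customer_id": "c2", "title": "X"},
]
joined = join_on(customers, rentals, "customer_id")
per_title = group_count(joined, "title")
```

A natural key such as customer-id keeps all of one customer's items in the same shard, so the common per-customer lookups touch a single domain.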
  
Tools and Automation
•  Developer and Build Tools
    –  Jira, Eclipse, Hudson, Ivy, Artifactory
    –  Builds, creates .war file, .rpm, bakes AMI and launches
•  Custom Netflix Application Console
    –  AWS Features at Enterprise Scale (hide the keys!)
    –  Auto Scaler Group is unit of deployment to production
•  Open Source + Support
    –  Apache, Tomcat, OpenJDK, CentOS
•  Monitoring Tools
    –  Keynote – service monitoring and alerting
    –  AppDynamics – Developer focus for cloud
    –  EpicNMS – flexible data collection and plots http://epicnms.com
    –  Nimsoft NMS – ITOps focus for Datacenter + Cloud alerting

Netflix App Console

Auto Scaling Groups

ASG Configuration

Monitoring Tools
  
Monitoring Vision
•  Problem
    –  Too many tools, each with a good reason to exist
    –  Hard to get an integrated view of a problem
    –  Too much manual work building dashboards
    –  Tools are not discoverable, views are not filtered
•  Solution
    –  Get vendors to add deep linking and embedding
    –  Integration “portal” ties everything together
    –  Dynamic portal generation, relevant data, all tools

Cloud Monitoring Mechanisms
•  Keynote
    –  External URL monitoring
•  Amazon CloudWatch
    –  Metrics for ELB and Instances
•  AppDynamics
    –  End to end transaction view showing resources used
    –  Powerful real time debug tools for latency, CPU and Memory
•  Nimsoft NMS
    –  Scalable and reliable monitoring and alerting, integration portal
•  Epic
    –  Flexible and easy to use to extend and embed plots
•  Logs
    –  High capacity logging and analysis framework
    –  Hadoop (log4j -> chukwa -> EMR)

Snapshots for a Business Transaction
Sort Call Graphs to Top, pick a slow one

Drill in to Slow Call

Slow Asynchronous S3 Write – no big deal…
  
Cloud Bring-Up Strategy
Simplest and Soonest

Shadow Traffic Redirection
•  Early attempt to send traffic to cloud
    –  Real traffic stream to validate cloud back end
    –  Uncovered lots of process and tools issues
    –  Uncovered Service latency issues
•  TV Device calls Datacenter API
    –  Returns Genre/movie list for a customer
    –  Asynchronously duplicates request to cloud
    –  Start with send-and-forget mode, ignore response
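
The send-and-forget shadowing described above might look like the following. This is a sketch under assumptions: `ShadowingHandler`, `datacenter_call` and `cloud_call` are hypothetical names, and a thread pool stands in for whatever asynchronous mechanism was actually used:

```python
from concurrent.futures import ThreadPoolExecutor


class ShadowingHandler:
    """Serve every request from the datacenter and asynchronously
    duplicate it to the cloud, ignoring the cloud's response."""

    def __init__(self, datacenter_call, cloud_call, executor):
        self._datacenter_call = datacenter_call
        self._cloud_call = cloud_call
        self._executor = executor

    def handle(self, request):
        # Queue a copy for the cloud first so the shadow never waits on
        # the datacenter path, then answer from the datacenter as before.
        self._executor.submit(self._shadow, dict(request))
        return self._datacenter_call(request)

    def _shadow(self, request):
        try:
            self._cloud_call(request)
        except Exception:
            # Send-and-forget: a cloud failure must never reach the customer.
            pass
```

The customer-facing response depends only on the datacenter call, so cloud errors and latency show up in cloud-side logs and monitoring rather than in the product.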
  
Shadow Redirect Instances
Modified Datacenter Instances: Datacenter Service
Modified Cloud Instances: Cloud Service (One request per visit)
Data Sources: queueservice, videometadata
  
First	
  Web	
  Pages	
  in	
  the	
  Cloud	
  
Starz	
  Page	
  
First Page
•  First full page – Starz Channel Genre
    –  Simplest page, no sub-genres, minimal personalization
    –  Lots of investment in new Struts based page design
    –  Uses identity cookie to lookup in member info svc
•  New “merchweb” front end instance
    –  movies.netflix.com points to merchweb instance
•  Uncovered lots of latency issues
    –  Used memcached to hide S3 and SimpleDB latency
    –  Improved from slower to faster than Datacenter
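
Using a cache to hide data-store latency, as in the last bullet above, usually follows the cache-aside pattern. The sketch below is only an illustration of that pattern: TtlCache is an in-process stand-in for memcached (no real memcached client is used), and cached_fetch and load_from_store are hypothetical names; the slides do not say how the cache was wired in.

```python
import time


class TtlCache:
    """In-process stand-in for memcached: get/set with an expiry time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)


def cached_fetch(cache, key, load_from_store, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to the slow store, then fill the cache."""
    value = cache.get(key)
    if value is None:
        value = load_from_store(key)  # e.g. a slow S3 or SimpleDB read
        cache.set(key, value, ttl_seconds)
    return value
```

Only the first read of a key within the TTL pays the store's latency; later reads are served from memory until the entry expires.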
  
Starz Page Cloud Instances
[Diagram, labels only: Front End: merchweb; Middle Tier: starz, memcached (multiple requests per visit); Data Sources: queueservice, rentalhistory, videometadata]
  
Controlled Cloud Transition
•  WWW calling code chooses who goes to cloud
   –  Filter out corner cases, send percentage of users
   –  The URL that customers see is
      http://movies.netflix.com/WiContentPage?csid=1
   –  If problem, redirect to old Datacenter page
      http://www.netflix.com/WiContentPage?csid=1
•  Play Button and Star Rating Action redirect
   –  Point URLs for actions that create/modify data
      back to datacenter to start with
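
One way to picture the "send percentage of users" choice above is a deterministic hash bucket per customer. This is a sketch under assumptions: the two base URLs and the page path come from the slide, but the hashing scheme and the names in_cloud_cohort, page_url, is_corner_case and cloud_healthy are hypothetical and not taken from the talk.

```python
import hashlib

CLOUD_BASE = "http://movies.netflix.com"    # cloud page host, from the slide
DATACENTER_BASE = "http://www.netflix.com"  # old datacenter page host, from the slide


def in_cloud_cohort(customer_id, percent):
    """Deterministically place a stable fraction of customers in the cloud cohort."""
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < percent


def page_url(customer_id, percent, is_corner_case=False, cloud_healthy=True,
             path="/WiContentPage?csid=1"):
    """Pick the page URL for a customer; fall back to the datacenter on any doubt."""
    if is_corner_case or not cloud_healthy:
        return DATACENTER_BASE + path
    if in_cloud_cohort(customer_id, percent):
        return CLOUD_BASE + path
    return DATACENTER_BASE + path
```

Because the bucket depends only on the customer id, a customer keeps seeing the same version between visits, and raising the percentage only ever adds customers to the cloud cohort.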
  
Developers	
  
Cloud Developer Setup
•  Cloud Boot Camp
     –  Room full of engineers sharing the pain for 1-2 days
     –  Built a very rough prototype working web site
     –  Get everyone hands-on in the cloud with a new code base
     –  Debug lots of tooling and conceptual issues very fast
     –  Member info in SimpleDB with developer’s accounts only
•  Cloud Specific Key Setup
     –  It’s a pain, need to configure your IDE’s JVM
     –  Needed to integrate with AWS security model
•  Startup Guide Wiki Pages
     –  What object facets already exist, how to make your own
     –  What components already exist or are work in progress
  
Developer Instances Collision
 Sam and Rex both want to deploy web front end for development
[Diagram, labels only: Sam; Rex; web in test account]
  
Per-Service Namespace Routing
           Developers choose what to share

     Sam              Rex              Mike
   web-sam          web-rex          web-dev
 backend-dev      backend-dev      backend-mike
  
Developer Service Namespaces
•  Developer specific service instances
   –  Configured via Java properties at runtime
   –  Routing implemented by REST client library
•  Server Configuration
   –  Configure discovery service version string
   –  Registers as <appname>-<namespace>
•  Client Configuration
   –  Route traffic on per-service basis including
      namespace
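
The namespace scheme above can be illustrated with a small resolver. This is only a sketch: the talk describes Java properties and a REST client library, while the Python below, the "route.<service>" property key format, the "dev" default namespace and every function name are assumptions made for illustration.

```python
def registered_name(app_name, namespace):
    """Name a server instance registers under: <appname>-<namespace>."""
    return f"{app_name}-{namespace}"


def parse_route_overrides(properties, prefix="route."):
    """Read per-service namespace overrides from a flat properties mapping.

    The 'route.<service>=<namespace>' key format is hypothetical.
    """
    return {key[len(prefix):]: value
            for key, value in properties.items()
            if key.startswith(prefix)}


def resolve_target(service, overrides, default_namespace="dev"):
    """Pick the registered name a client should call for one service."""
    namespace = overrides.get(service, default_namespace)
    return registered_name(service, namespace)
```

With this shape, a developer who overrides only the web service reaches a private web instance (web-sam) while still sharing the common backend (backend-dev), which matches the Sam and Mike examples on the routing slide.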
  
Current Status

WWW Page by Page during Q2/Q3/Q4
•  Simplest possible page first
    –  Minimal dependencies
•  Add pages as dependent services come online
•  Home page – most complex and highest traffic
•  Leave low traffic pages for later cleanup

     gradual migration from Datacenter pages
  
Big-Bang Transition
•  iPhone Launch (August/Sept)
   –  No capacity in the datacenter, cloud only
   –  App Store gates release, not gradual, can’t back out
   –  Market is huge (existing and new customers)
   –  Has to work at large scale on day one
•  Datacenter Shadow Redirect Technique
   –  Used to stress back-end and data sources
•  SOASTA Cloud Based Load Generation
   –  Used to stress test API and end-to-end functionality
  
Current Work for Cloud Platform
•  Drive latency and availability goals
    –  More Aggressive caching
    –  Improving Fault and latency robustness
•  Logging and monitoring portal/dashboards
    –  Working to integrate tools and data sources
    –  Need better observability and automation
•  Evaluating a range of NoSQL choices
    –  Broad set of use cases, no single winner
    –  Good topic for another talk…
  
Wrap Up

Next Few Years…
•  “System of Record” moves to Cloud
      –  Master copies of data live only in the cloud, with backups etc.
      –  Cut the datacenter to cloud replication link
•  International Expansion – Global Clouds
      –  Rapid deployments to new markets
•  GPU Clouds optimized for video encoding
•  Cloud Standardization
      –  Cloud features and APIs should be a commodity not a differentiator
      –  Differentiate on scale and quality of service
      –  Competition also drives cost down
      –  Higher resilience
      –  Higher scalability

      We would prefer to be an insignificant customer in a giant cloud
  
Remember the Goals
                Faster
               Scalable
               Available
              Productive

Track progress against these goals
  
Takeaway

Netflix is path-finding the use of public AWS cloud to replace in-house IT for non-trivial applications with hundreds of developers and thousands of systems.

       http://www.linkedin.com/in/adriancockcroft
               @adrianco #netflixcloud
  

The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 

Dernier (20)

Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 

Netflix on Cloud - combined slides for Dev and Ops

  • 1. Netflix in the Cloud
       Nov 6, 2010
       Adrian Cockcroft
       @adrianco #netflixcloud
       http://www.linkedin.com/in/adriancockcroft
  • 2. Combined Slides
       For both Developer and Operations Audiences (2-3 hours of content)
       Teaser intro slides also at slideshare.net/adrianco
       Oct 14th Beta #devops subset - Cloud Computing Meetup
       Nov 3rd GA – QConSF Developer Oriented subset
       Nov 6th, 2010 – Combined slides at slideshare.net/adrianco
  • 3. With more than 16 million subscribers in the United States and Canada, Netflix, Inc. is the world’s leading Internet subscription service for enjoying movies and TV shows.
       Source: http://ir.netflix.com
  • 4. Why Give This Talk?
  • 5. Netflix is Path-finding
       The Cloud ecosystem is evolving very fast
       Share with and learn from the cloud community
  • 6. We want to use clouds, not build them
       Cloud technology should be a commodity
       Public cloud and open source for agility and scale
  • 7. We are looking for talent
       Netflix wants to connect with the very best engineers
  • 9. We stopped building our own datacenters
       Capacity growth rate is accelerating, unpredictable
       Product launch spikes - iPhone, Wii, PS3, XBox
       Datacenter is large inflexible capital commitment
  • 10. Customers
       Q3 year/year +52% Total and +145% Streaming
       [Chart: customers, vertical axis 0 to 18, quarterly from 2009Q2 to 2010Q3]
       Source: http://ir.netflix.com
  • 11. Leverage AWS Scale
       “the biggest public cloud”
       AWS investment in tooling and automation
       AWS zones for high availability, scalability
       AWS skills are common on resumes…
  • 12. Leverage AWS Feature Set
       “two years ahead of the others”
       EC2, S3, SDB, SQS, EBS, EMR, ELB, ASG, IAM, RDB…
  • 13. “The cloud lets its users focus on delivering differentiating business value instead of wasting valuable resources on the undifferentiated heavy lifting that makes up most of IT infrastructure.”
       Werner Vogels, Amazon CTO
  • 14. Netflix Deployed on AWS
       [Diagram, five columns]
       Content: Video Masters, EC2, S3, CDN
       Logs: S3, EMR Hadoop, Hive, Business Intelligence
       Play: DRM, CDN routing, Bookmarks, Logging
       WWW: Search, Movie Choosing, Ratings, Similars
       API: Metadata, Device Config, TV Movie Choosing, Mobile iPhone
  • 15. Movie Encoding farm (2009)
       • Tens of thousands of videos
       • Thousands of EC2 instances
       • Encoding apps on MS Windows
       • ~100 speed/format permutations
       • Petabytes of S3
       • Content Delivery Networks
       “Netflix is one of the largest customers of the biggest CDNs Akamai and Limelight”
       [Diagram column: Content: Video Masters, EC2, S3, CDN]
  • 16. Hadoop - Elastic Map-Reduce (2009)
       • Web Access Logs
       • Streaming Service Logs
       • Terabyte per day scale
       • Easy Hadoop via Amazon EMR
       • Hive SQL “Data Mart”
       • Gateway to Datacenter BI
       Slideshare.net talks
       evamtse “Netflix: Hive User Group” http://slidesha.re/aqJLAC
       adrianco “Crunch Your Data In The Cloud” http://slidesha.re/dx4oCK
       [Diagram column: Logs: S3, EMR Hadoop, Hive, Business Intelligence]
  • 17. Streaming Service Back-end (early 2010)
       • PC/Mac Silverlight Player Support
       • Highly available “play button”
       • DRM Key Management
       • Generate route to stream on CDN
       • Lookup bookmark for user/movie
       • Update bookmark for user/movie
       • Log quality of service
       [Diagram column: Play: DRM, CDN routing, Bookmarks, Logging]
  • 18. Web site, a page at a time (through 2010)
       • Clean presentation layer rewrite
       • Search auto-complete
       • Search backend and landing page
       • Movie and genre choosing
       • Star ratings and recommendations
       • Similar movies
       • Page by page to 80% of views (leave account signup in Datacenter for now)
       [Diagram column: WWW: Search, Movie Choosing, Ratings, Similars]
  • 19. API for TV devices and iPhone etc. (2010)
       • REST API: developer.netflix.com
       • Interfaces to everything else
       • TV Device Configuration
       • Personalized movie choosing
       • iPhone Launch in the cloud only
       “Netflix is an API for streaming to TVs (we also do DVD’s and a web site)”
       [Diagram column: API: Metadata, Device Config, TV Movie Choosing, Mobile iPhone]
  • 20. Netflix EC2 Instances per Account
       [Chart series: Encoding, Test and Production, Log Analysis]
  • 22. Datacenter oriented tools don’t work
       Ephemeral instances
       High rate of change
  • 23. Cloud Tools Don’t Scale for Enterprise
       Too many are “Startup” oriented
       Built our own tools
       Drove vendors hard
  • 24. “fork-lifted” apps don’t work well
       Fragile
       Too many datacenter oriented assumptions
  • 25. Faster to re-code from scratch
       • Opportunity to pay down technical debt
       • Re-architected and re-wrote most of the code
       • Fine grain web services
       • Leveraged many open source Java projects
       • Systematically instrumented
       • “NoSQL” SimpleDB backend
  • 26. “In the datacenter, robust code is best practice. In the cloud, it’s essential.”
  • 27. Takeaway
       Netflix is path-finding the use of public AWS cloud to replace in-house IT for non-trivial applications with hundreds of developers and thousands of systems.
       (Pause for questions before we dive into details)
  • 28. What, Why and How?
       The details…
  • 29. Synopsis
       • The Goals
           – Faster, Scalable, Available and Productive
       • Anti-patterns and Cloud Architecture
           – The things we wanted to change and why
       • Cloud Bring-up Strategy
           – Developer Transitions and Tools
       • Roadmap and Next Steps
  • 30. Goals
       • Faster
           – Lower latency than the equivalent datacenter web pages and API calls
           – Measured as mean and 99th percentile
           – For both first hit (e.g. home page) and in-session hits for the same user
       • Scalable
           – Avoid needing any more datacenter capacity as subscriber count increases
           – No central vertically scaled databases
           – Leverage AWS elastic capacity effectively
       • Available
           – Substantially higher robustness and availability than datacenter services
           – Leverage multiple AWS availability zones
           – No scheduled down time, no central database schema to change
       • Productive
           – Optimize agility of a large development team with automation and tools
           – Leave behind complex tangled datacenter code base (~8 year old architecture)
           – Enforce clean layered interfaces and re-usable components
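The "Faster" goal is measured as mean and 99th percentile latency. A minimal sketch of that measurement follows, assuming latencies are collected as a list of milliseconds and using the nearest-rank definition of a percentile. The function name and the sample data are illustrative, not taken from the slides.

```python
import statistics


def latency_summary(samples_ms):
    """Return (mean, p99) for a list of request latencies in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Nearest-rank 99th percentile: rank = ceil(0.99 * n), computed with
    # integer arithmetic to avoid floating-point rounding surprises.
    rank = (99 * len(ordered) + 99) // 100
    return statistics.fmean(ordered), ordered[rank - 1]


# Illustrative data: most requests are fast, two are slow outliers.
samples = [20.0] * 98 + [400.0, 900.0]
mean_ms, p99_ms = latency_summary(samples)
print(f"mean={mean_ms:.1f} ms, p99={p99_ms:.1f} ms")
```

The two numbers answer different questions: the mean barely moves when a few requests are slow, while the 99th percentile exposes the tail that individual users actually notice.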
  • 31. Cloud Architecture Patterns
       Where do we start?
  • 32. Datacenter Anti-Patterns
       What do we currently do in the datacenter that prevents us from meeting our goals?
  • 33. Architecture
       • Software Architecture
           – The abstractions and interfaces that developers build against
       • Systems Architecture
           – The service instances that define availability, scalability
       • Compose-ability
           – software architecture that is independent of the systems architecture
           – decoupled flexible building block components
  • 34. Rewrite from Scratch
       Not everything is cloud specific
       Pay down technical debt
       Robust patterns
  • 35. Old Datacenter vs. New Cloud Arch
       Old Datacenter | New Cloud Arch
       Central SQL Database | Distributed Key/Value NoSQL
       Sticky In-Memory Session | Shared Memcached Session
       Chatty Protocols | Latency Tolerant Protocols
       Tangled Service Interfaces | Layered Service Interfaces
       Instrumented Code | Instrumented Service Patterns
       Fat Complex Objects | Lightweight Serializable Objects
       Components as Jar Files | Components as Services
  • 36. The Central SQL Database
       • Datacenter has a central database
           – Everything in one place is convenient until it fails
           – Customers, movies, history, configuration
       • Schema changes require downtime
       Anti-pattern impacts scalability, availability
  • 37. The Distributed Key-Value Store
       • Cloud has many key-value data stores
           – More complex to keep track of, do backups etc.
           – Each store is much simpler to administer [graphic label: DBA]
           – Joins take place in java code
       • No schema to change, no scheduled downtime
       • Latency for Memcached vs. Oracle vs. SimpleDB
           – Memcached is dominated by network latency <1ms
           – Oracle for simple queries is a few milliseconds
           – SimpleDB has replication and REST overheads >10ms
  • 38. The Sticky Session
       • Datacenter Sticky Load Balancing
           – Efficient caching for low latency
           – Tricky session handling code
           – Middle tier load balancer has issues in practice
       • Encourages concentrated functionality
           – one service that does everything
       Anti-pattern impacts productivity, availability
  • 39. The Shared Session
       • Cloud Uses Round-Robin Load Balancing
           – Simple request-based code
           – External shared caching with memcached
       • More flexible fine grain services
           – Works better with auto-scaled instance counts
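The shared-session pattern can be sketched as follows. This is an illustration only: a plain dictionary stands in for the external memcached cluster, and the class names, instance names and session layout are hypothetical. The point it shows is that any instance can serve any request because no session state lives inside an instance.

```python
import itertools


class SharedSessionStore:
    """Stand-in for an external memcached cluster (a plain dict here)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class Instance:
    """A stateless web instance: all session state lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id, movie_id):
        # Read the session, update it, and write it back on every request.
        session = self.store.get(session_id) or {"viewed": []}
        session["viewed"].append(movie_id)
        self.store.set(session_id, session)
        return self.name, list(session["viewed"])


store = SharedSessionStore()
instances = [Instance("web-1", store), Instance("web-2", store)]
round_robin = itertools.cycle(instances)  # no stickiness: requests rotate

first = next(round_robin).handle("session-42", "movie-a")
second = next(round_robin).handle("session-42", "movie-b")
print(first, second)
```

Because the second request lands on a different instance and still sees the first request's state, instances can be added or removed by an auto scaler without losing sessions.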
  • 40. Chatty Opaque and Brittle Protocols
       • Datacenter service protocols
           – Assumed low latency for many simple requests
       • Based on serializing existing java objects
           – Inefficient formats
           – Incompatible when definitions change
       Anti-pattern causes productivity, latency and availability issues
  • 41. Robust and Flexible Protocols
       • Cloud service protocols
           – JSR311/Jersey is used for REST/HTTP service calls
           – Custom client code includes service discovery
           – Support complex data types in a single request
       • Apache Avro
           – Evolved from Protocol Buffers and Thrift
           – Includes JSON header defining key/value protocol
           – Avro serialization is half the size and several times faster than Java serialization, more work to code
  • 42. Persisted Protocols
       • Persist Avro in Memcached
           – Save space/latency (zigzag encoding, half the size)
           – Less brittle across versions
           – New keys are ignored
           – Missing keys are handled cleanly
       • Avro protocol definitions
           – Can be written in JSON or generated from POJOs
           – It’s hard, needs better tooling
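Two ideas from the slide above can be illustrated without the Avro library itself. The sketch below shows zigzag encoding followed by a variable-length byte encoding, which is how Avro stores integers compactly, and a plain-dictionary illustration of a reader that ignores unknown keys and fills defaults for missing ones. The `read_record` helper and its field names are hypothetical and are not the Avro API.

```python
def zigzag_encode(n):
    """Map a signed 64-bit integer to an unsigned one so small magnitudes stay small."""
    return (n << 1) ^ (n >> 63)


def zigzag_decode(z):
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def varint(z):
    """Encode an unsigned integer 7 bits at a time, low-order group first."""
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_record(stored, reader_fields):
    """Keep only fields the reader knows; use the reader's default when one is missing."""
    return {name: stored.get(name, default) for name, default in reader_fields.items()}


# A small negative number fits in one byte instead of eight.
encoded = varint(zigzag_encode(-1))

# An old reader sees a record written by newer code, and vice versa.
reader_fields = {"customer_id": None, "bookmark_seconds": 0}
from_newer_writer = read_record({"customer_id": "c1", "bookmark_seconds": 90, "device": "tv"}, reader_fields)
from_older_writer = read_record({"customer_id": "c1"}, reader_fields)
```

The tolerant reader is what makes cached records "less brittle across versions": entries written by one release stay readable after the record definition changes.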
  • 43. Tangled Service Interfaces
       • Datacenter implementation is exposed
           – Oracle SQL queries mixed into business logic
       • Tangled code
           – Deep dependencies, false sharing
       • Data providers with sideways dependencies
           – Everything depends on everything else
       Anti-pattern affects productivity, availability
  • 44. Untangled Service Interfaces
       • New Cloud Code With Strict Layering
           – Compile against interface jar
           – Can use spring runtime binding to enforce
       • Service interface is the service
           – Implementation is completely hidden
           – Can be implemented locally or remotely
           – Implementation can evolve independently
  • 45. Untangled Service Interfaces
       Two layers:
       • SAL - Service Access Library
           – Basic serialization and error handling
           – REST or POJO’s defined by data provider
       • ESL - Extended Service Library
           – Caching, conveniences
           – Can combine several SALs
           – Exposes faceted type system (described later)
           – Interface defined by data consumer in many cases
  • 46. Service Interaction Pattern
       Sample Swimlane Diagram
  • 47. Service Architecture Patterns
       • Internal Interfaces Between Services
           – Common patterns as templates
           – Highly instrumented, observable, analytics
           – Service Level Agreements – SLAs
       • Library templates for generic features
           – Instrumented Netflix Base Servlet template
           – Instrumented generic client interface template
           – Instrumented S3, SimpleDB, Memcached clients
  • 48. Service Request Instruments Every Step in the call
       [Diagram: timestamps recorded at each step]
       CLIENT: request start and request end timestamps; outbound serialize start and end timestamps; network send timestamp; network receive timestamp; inbound deserialize start and end timestamps
       SERVICE: network receive timestamp; inbound serialize start and end timestamps; execute request start and end timestamps; outbound serialize start and end timestamps; network send timestamp
  • 49. Boundary Interfaces
       • Isolate teams from external dependencies
           – Fake SAL built by cloud team
           – Real SAL provided by data provider team later
           – ESL built by cloud team using faceted objects
       • Fake data sources allow development to start
           – e.g. Fake Identity SAL for a test set of customers
           – Development solidifies dependencies early
           – Helps external team provide the right interface
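The SAL/ESL layering and the fake-SAL technique described above could be sketched like this. All class and method names are hypothetical; the slides name the layers but not their signatures. The sketch shows an ESL that is written against the SAL interface only, so a fake SAL can be swapped for the real one later without changing consumer code.

```python
from abc import ABC, abstractmethod


class IdentitySAL(ABC):
    """Service Access Library interface (hypothetical name and method)."""

    @abstractmethod
    def fetch_member(self, customer_id):
        ...


class FakeIdentitySAL(IdentitySAL):
    """Fake data source covering a small test set of customers."""

    _TEST_CUSTOMERS = {"c1": {"id": "c1", "name": "Test User One"}}

    def __init__(self):
        self.calls = 0

    def fetch_member(self, customer_id):
        self.calls += 1
        return self._TEST_CUSTOMERS[customer_id]


class IdentityESL:
    """Extended Service Library: adds caching on top of whichever SAL it is given."""

    def __init__(self, sal):
        self._sal = sal
        self._cache = {}

    def member_name(self, customer_id):
        if customer_id not in self._cache:
            self._cache[customer_id] = self._sal.fetch_member(customer_id)
        return self._cache[customer_id]["name"]


fake = FakeIdentitySAL()
esl = IdentityESL(fake)  # later: IdentityESL(real_sal) with no other change
names = [esl.member_name("c1"), esl.member_name("c1")]
print(names, "SAL calls:", fake.calls)
```

The second lookup is served from the ESL cache, so the SAL is called once. Replacing the fake with a remote implementation changes only the object passed to the constructor.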
  • 50. One Object That Does Everything
       • Datacenter uses a few big complex objects
           – Movie and Customer objects are the foundation
           – Good choice for a small team and one instance
           – Problematic for large teams and many instances
       • False sharing causes tangled dependencies
           – Unproductive re-integration work
       Anti-pattern impacting productivity and availability
  • 51. An Interface For Each Component
       • Cloud uses faceted Video and Visitor
           – Basic types hold only the identifier
           – Facets scope the interface you actually need
           – Each component can define its own facets
       • No false-sharing and dependency chains
           – Type manager converts between facets as needed
           – video.asA(PresentationVideo) for www
           – video.asA(MerchableVideo) for middle tier
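A rough sketch of the faceted-object idea follows, written in Python rather than the Java used at Netflix. Only the `asA` call and the facet names `PresentationVideo` and `MerchableVideo` come from the slide; the `TypeManager` API, the builder functions and the facet fields are assumptions made for illustration.

```python
class Video:
    """Basic type: holds only the identifier."""

    def __init__(self, video_id, type_manager):
        self.video_id = video_id
        self._types = type_manager

    def as_a(self, facet_cls):
        # Python spelling of the slide's video.asA(...)
        return self._types.convert(self, facet_cls)


class PresentationVideo:
    """Facet for the web tier (fields are hypothetical)."""

    def __init__(self, video_id, title):
        self.video_id, self.title = video_id, title


class MerchableVideo:
    """Facet for the middle tier (fields are hypothetical)."""

    def __init__(self, video_id, genres):
        self.video_id, self.genres = video_id, genres


class TypeManager:
    """Converts a basic type into whichever facet a component asks for."""

    def __init__(self):
        self._builders = {}

    def register(self, facet_cls, builder):
        self._builders[facet_cls] = builder

    def convert(self, video, facet_cls):
        return self._builders[facet_cls](video.video_id)


types = TypeManager()
# Each component registers a builder for its own facet; real builders
# would call the relevant service instead of returning canned data.
types.register(PresentationVideo, lambda vid: PresentationVideo(vid, title=f"Title for {vid}"))
types.register(MerchableVideo, lambda vid: MerchableVideo(vid, genres=["drama"]))

video = Video("v100", types)
page_view = video.as_a(PresentationVideo)
merch_view = video.as_a(MerchableVideo)
```

Each consumer depends only on the facet it requested, which is how the design avoids the false sharing caused by one large Movie object.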
  • 52. Software Architecture Patterns
       • Object Models
           – Basic and derived types, facets, serializable
           – Pass by reference within a service
           – Pass by value between services
       • Computation and I/O Models
           – Service Execution using Best Effort
           – Common thread pool management
  • 54. [Architecture diagram labels]
       AWS EC2: Front End ELB, API Proxy, API ELB, API, Discovery Service, Component Services, SQS, memcached
       AWS Storage: EBS, S3, SimpleDB
       Netflix Data Center: Oracle
       Replication
  • 55. Netflix Undifferentiated Lifting
       • Middle Tier Load Balancing
       • Discovery (local DNS)
       • Encryption Services
       • Caching
       • Distributed App Management
       We want cloud vendors to do all this for us as well!
  • 56. Load Balancing in AWS
       • Middle tier currently not supported in AWS
           – ELB are public-facing only
           – Cannot apply security group settings
       • ELB vertical scalability for concentrated clients
           – Too few proxy IP addresses leads to hot spots
       • ELB needs support for balancing heuristics
           – Proportional balance across Availability Zones
           – Weighted Least connections, Weighted Round Robin
       • Zone aware routing
           – Default to instances in the same Availability Zone
           – Falls back to cross-zone on failure
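The zone-aware routing rule at the end of the slide above can be expressed in a few lines. This is a sketch under assumed data shapes: the instance records, the zone names and the `healthy` flag are illustrative and do not describe the actual Netflix client.

```python
def choose_instances(instances, local_zone):
    """Prefer healthy instances in the caller's zone; otherwise use any healthy instance."""
    healthy = [i for i in instances if i["healthy"]]
    same_zone = [i for i in healthy if i["zone"] == local_zone]
    # An empty same-zone list is falsy, which triggers the cross-zone fallback.
    return same_zone or healthy


# Hypothetical fleet spread over two Availability Zones.
fleet = [
    {"id": "i-a1", "zone": "us-east-1a", "healthy": True},
    {"id": "i-a2", "zone": "us-east-1a", "healthy": False},
    {"id": "i-b1", "zone": "us-east-1b", "healthy": True},
]

local = [i["id"] for i in choose_instances(fleet, "us-east-1a")]
fallback = [i["id"] for i in choose_instances(fleet, "us-east-1c")]
print(local, fallback)
```

A load-balancing policy such as round robin would then be applied to the returned candidates.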
  • 57. Discovery
       • Discovery Service (Redundant instances per zone)
           – Simple REST interface
           – Cloud apps register with Discovery
       • Apps send heartbeats every 30 sec to renew lease
           – App evicted after 3 missed heartbeats
           – Can re-register if the problem was transient
       • Apps can store custom metadata
           – Version number, AMI id, Availability Zone, etc.
       • Software Round-robin Load Balancer
           – Query Discovery for instances of specific application
           – Baked into Netflix REST client (JSR311/Jersey based)
       AWS Middle-tier ELB would eliminate most use cases
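The lease and round-robin behaviour described above can be modelled in memory. In this sketch the 30 second heartbeat and the three-missed-heartbeats eviction rule come from the slide; the `Registry` class, its method names and the choice to pass the clock in explicitly are assumptions made so the example is deterministic.

```python
HEARTBEAT_SECONDS = 30
MISSED_BEFORE_EVICTION = 3


class Registry:
    """In-memory model of a discovery service with leases and round-robin lookup."""

    def __init__(self):
        self._last_seen = {}  # (app, instance_id) -> time of last heartbeat
        self._metadata = {}   # (app, instance_id) -> custom metadata
        self._cursor = {}     # app -> round-robin position

    def register(self, app, instance_id, now, metadata=None):
        self._last_seen[(app, instance_id)] = now
        self._metadata[(app, instance_id)] = metadata or {}

    def heartbeat(self, app, instance_id, now):
        """Renew the lease; returns False if the instance must re-register."""
        if (app, instance_id) not in self._last_seen:
            return False
        self._last_seen[(app, instance_id)] = now
        return True

    def instances(self, app, now):
        """Evict expired leases, then list live instance ids for the app."""
        limit = HEARTBEAT_SECONDS * MISSED_BEFORE_EVICTION
        for key, seen in list(self._last_seen.items()):
            if now - seen > limit:
                del self._last_seen[key]
                del self._metadata[key]
        return sorted(iid for (name, iid) in self._last_seen if name == app)

    def next_instance(self, app, now):
        """Software round-robin over the live instances of one application."""
        live = self.instances(app, now)
        if not live:
            return None
        index = self._cursor.get(app, 0) % len(live)
        self._cursor[app] = index + 1
        return live[index]


registry = Registry()
registry.register("api", "i-1", now=0, metadata={"zone": "us-east-1a"})
registry.register("api", "i-2", now=0)
picks = [registry.next_instance("api", now=10) for _ in range(3)]

for t in (30, 60, 90):  # only i-1 keeps renewing its lease
    registry.heartbeat("api", "i-1", now=t)
survivors = registry.instances("api", now=100)
```

After i-2 is evicted, a later heartbeat from it returns `False`, which models the slide's point that an instance can re-register if the problem was transient.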
  • 58. Database Migration
       • Why SimpleDB?
           – No DBA’s in the cloud, Amazon hosted service
           – Work started two years ago, fewer viable options
           – Worked with Amazon to speed up and scale SimpleDB
       • Alternatives?
           – Investigating adding Cassandra and Membase to the mix
           – Need several options to match use cases well
       • Detailed SimpleDB Advice
           – Sid Anand - QConSF Nov 5th – Netflix’ Transition to High Availability Storage Systems
           – Blog - http://practicalcloudcomputing.com/
           – Download Paper PDF - http://bit.ly/bhOTLu
  • 59. Oracle to SimpleDB (See Sid’s paper for details)
       • SimpleDB Domains
           – De-normalize multiple tables into a single domain
           – Work around size limits (10GB per domain, 1KB per key)
           – Shard data across domains to scale
           – Key – Use distributed sequence generator, GUID or natural unique key such as customer-id
           – Implement a schema validator to catch bad attributes
       • Application layer support
           – Do GROUP BY and JOIN operations in the application
           – Compose relations in the application layer
           – Check constraints on read, and repair data as a side effect
       • Do without triggers, PL/SQL, clock operations
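Two of the techniques listed above, sharding by key across domains and doing JOIN and GROUP BY in the application, might look roughly like this. The sketch uses in-memory lists and dictionaries rather than SimpleDB calls; the domain naming scheme, the shard count and the record fields are all assumptions.

```python
import hashlib
from collections import defaultdict


def domain_for(customer_id, domain_count=4, prefix="ratings"):
    """Pick a shard (domain name) from a stable hash of the natural key."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).hexdigest()
    return f"{prefix}_{int(digest, 16) % domain_count}"


def join_and_count(ratings, titles):
    """Application-side JOIN of ratings to titles, followed by GROUP BY title."""
    counts = defaultdict(int)
    for rating in ratings:
        title = titles.get(rating["movie_id"])
        if title is None:
            # Constraint check on read: skip a rating that references
            # a movie the title lookup does not know about.
            continue
        counts[title] += 1
    return dict(counts)


ratings = [
    {"customer_id": "c1", "movie_id": "m1"},
    {"customer_id": "c2", "movie_id": "m1"},
    {"customer_id": "c2", "movie_id": "m2"},
    {"customer_id": "c3", "movie_id": "m9"},  # dangling reference
]
titles = {"m1": "First Movie", "m2": "Second Movie"}

per_title = join_and_count(ratings, titles)
shard = domain_for("c1")
```

A stable hash keeps all items for one customer in the same domain, so single-customer reads touch one shard while total capacity grows with the number of domains.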
  • 60. Tools and Automation
       • Developer and Build Tools
           – Jira, Eclipse, Hudson, Ivy, Artifactory
           – Builds, creates .war file, .rpm, bakes AMI and launches
       • Custom Netflix Application Console
           – AWS Features at Enterprise Scale (hide the keys!)
           – Auto Scaler Group is unit of deployment to production
       • Open Source + Support
           – Apache, Tomcat, OpenJDK, CentOS
       • Monitoring Tools
           – Keynote – service monitoring and alerting
           – AppDynamics – Developer focus for cloud
           – EpicNMS – flexible data collection and plots http://epicnms.com
           – Nimsoft NMS – ITOps focus for Datacenter + Cloud alerting
  • 65. Monitoring Vision
       • Problem
           – Too many tools, each with a good reason to exist
           – Hard to get an integrated view of a problem
           – Too much manual work building dashboards
           – Tools are not discoverable, views are not filtered
       • Solution
           – Get vendors to add deep linking and embedding
           – Integration “portal” ties everything together
           – Dynamic portal generation, relevant data, all tools
  • 66. Cloud Monitoring Mechanisms
       • Keynote
           – External URL monitoring
       • Amazon CloudWatch
           – Metrics for ELB and Instances
       • AppDynamics
           – End to end transaction view showing resources used
           – Powerful real time debug tools for latency, CPU and Memory
       • Nimsoft NMS
           – Scalable and reliable monitoring and alerting, integration portal
       • Epic
           – Flexible and easy to use to extend and embed plots
       • Logs
           – High capacity logging and analysis framework
           – Hadoop (log4j -> chukwa -> EMR)
  • 68. Snapshots for a Business Transaction
       Sort Call Graphs to Top, pick a slow one
  • 69. Drill in to Slow Call
       Slow Asynchronous S3 Write – no big deal…
  • 70. Cloud Bring-Up Strategy
       Simplest and Soonest
  • 71. Shadow Traffic Redirection
       • Early attempt to send traffic to cloud
           – Real traffic stream to validate cloud back end
           – Uncovered lots of process and tools issues
           – Uncovered Service latency issues
       • TV Device calls Datacenter API
           – Returns Genre/movie list for a customer
           – Asynchronously duplicates request to cloud
           – Start with send-and-forget mode, ignore response
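The send-and-forget shadowing described above can be sketched with a thread pool. The function names and return values are stand-ins, and the deliberately failing cloud call is there to show that a shadow failure never reaches the customer-facing response.

```python
from concurrent.futures import ThreadPoolExecutor

shadow_pool = ThreadPoolExecutor(max_workers=2)
shadow_log = []  # records which requests were duplicated to the cloud


def datacenter_genre_list(customer_id):
    """Stand-in for the existing datacenter API call."""
    return ["Drama", "Comedy"]


def cloud_genre_list(customer_id):
    """Stand-in for the new cloud back end; it is allowed to fail."""
    shadow_log.append(customer_id)
    raise RuntimeError("cloud back end not ready yet")


def handle_request(customer_id):
    # Duplicate the request asynchronously and ignore the outcome:
    # the returned future is dropped, so errors stay inside it.
    shadow_pool.submit(cloud_genre_list, customer_id)
    # The customer always gets the datacenter answer.
    return datacenter_genre_list(customer_id)


result = handle_request("c1")
# Waiting here only makes the example finish deterministically;
# a long-running service would keep the pool alive.
shadow_pool.shutdown(wait=True)
print(result, shadow_log)
```

A natural later step, not shown here, is to stop ignoring the shadow response and compare it with the datacenter result.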
  • 72. Shadow Redirect Instances
       [Diagram]
       Datacenter Instances: Modified Datacenter Service
       Cloud Instances: Modified Cloud Service
       Data Sources: queueservice, videometadata
       One request per visit
  • 73. First Web Pages in the Cloud
  • 75. First Page
       • First full page – Starz Channel Genre
           – Simplest page, no sub-genres, minimal personalization
           – Lots of investment in new Struts based page design
           – Uses identity cookie to lookup in member info svc
       • New “merchweb” front end instance
           – movies.netflix.com points to merchweb instance
       • Uncovered lots of latency issues
           – Used memcached to hide S3 and SimpleDB latency
           – Improved from slower to faster than Datacenter
  • 76. Starz Page Cloud Instances
       [Diagram]
       Front End: merchweb
       Middle Tier: starz, memcached
       Data Sources: queueservice, rentalhistory, videometadata
       Multiple requests per visit
  • 77. Controlled Cloud Transition
       • WWW calling code chooses who goes to cloud
           – Filter out corner cases, send percentage of users
           – The URL that customers see is http://movies.netflix.com/WiContentPage?csid=1
           – If problem, redirect to old Datacenter page http://www.netflix.com/WiContentPage?csid=1
       • Play Button and Star Rating Action redirect
           – Point URLs for actions that create/modify data back to datacenter to start with
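The controlled transition above amounts to a small routing decision. In this sketch the two URLs are the ones shown on the slide; the hash-based bucketing, the function names and the `is_corner_case` and `cloud_healthy` flags are assumptions about how such a filter might be written.

```python
import hashlib

CLOUD_URL = "http://movies.netflix.com/WiContentPage?csid=1"
DATACENTER_URL = "http://www.netflix.com/WiContentPage?csid=1"


def bucket(customer_id):
    """Map a customer to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def page_url(customer_id, cloud_percent, cloud_healthy=True, is_corner_case=False):
    """Send a percentage of ordinary users to the cloud page, everyone else to the datacenter."""
    if is_corner_case or not cloud_healthy:
        # Filtered-out cases and any cloud problem fall back to the old page.
        return DATACENTER_URL
    return CLOUD_URL if bucket(customer_id) < cloud_percent else DATACENTER_URL


print(page_url("c1", cloud_percent=10))
```

Hashing the customer id keeps each user on the same side of the split between visits, and raising `cloud_percent` moves more users over without redeploying.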
  • 79. Cloud  Developer  Setup   •  Cloud  Boot  Camp   –  Room  full  of  engineers  sharing  the  pain  for  1-­‐2  days   –  Built  a  very  rough  prototype  working  web  site   –  Get  everyone  hands-­‐on  in  the  cloud  with  a  new  code  base   –  Debug  lots  of  tooling  and  conceptual  issues  very  fast   –  Member  info  in  SimpleDB  with  developer’s  accounts  only   •  Cloud  Specific  Key  Setup   –  It’s  a  pain,  need  to  configure  your  IDE’s  JVM   –  Needed  to  integrate  with  AWS  security  model   •  Startup  Guide  Wiki  Pages   –  What  object  facets  already  exist,  how  to  make  your  own   –  What  components  already  exist  or  are  work  in  progress  
  • 80. Developer  Instances  Collision   Sam  and  Rex  both  want  to  deploy  web  front  end  for   development   Sam   Rex   web  in   test   account  
  • 81. Per-­‐Service  Namespace  RouJng   Developers  choose  what  to  share   Sam   Rex   Mike   web-­‐sam   web-­‐rex   web-­‐dev   backend-­‐dev   backend-­‐dev   backend-­‐mike  
  • 82. Developer Service Namespaces • Developer specific service instances – Configured via Java properties at runtime – Routing implemented by REST client library • Server Configuration – Configure discovery service version string – Registers as <appname>-<namespace> • Client Configuration – Route traffic on per-service basis including namespace
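The namespace scheme on the last two slides can be sketched as follows. This is a toy model under stated assumptions, not Netflix's discovery service or REST client library: `DiscoveryRegistry`, `NamespaceRouter`, the `overrides` dict (standing in for the Java properties mentioned above), the `dev` default, and the addresses are all hypothetical. Only the `<appname>-<namespace>` naming and the Sam and Mike examples come from the slides.

```python
class DiscoveryRegistry:
    """Toy stand-in for a discovery service, keyed by '<appname>-<namespace>'."""

    def __init__(self):
        self._instances = {}

    def register(self, app_name, namespace, address):
        # Server side: each instance registers under its namespaced name.
        name = f"{app_name}-{namespace}"
        self._instances.setdefault(name, []).append(address)
        return name

    def lookup(self, name):
        return list(self._instances.get(name, []))


class NamespaceRouter:
    """Client side: one namespace choice per service, with a shared default."""

    def __init__(self, registry, overrides=None, default_namespace="dev"):
        self._registry = registry
        self._overrides = dict(overrides or {})   # stand-in for runtime properties
        self._default = default_namespace

    def resolve(self, app_name):
        return f"{app_name}-{self._overrides.get(app_name, self._default)}"

    def instances(self, app_name):
        return self._registry.lookup(self.resolve(app_name))


registry = DiscoveryRegistry()
registry.register("web", "sam", "10.0.0.1:8080")      # hypothetical addresses
registry.register("web", "dev", "10.0.0.2:8080")
registry.register("backend", "dev", "10.0.0.3:8080")
registry.register("backend", "mike", "10.0.0.4:8080")

# Sam runs a private web tier but shares the dev backend; Mike does the reverse.
sam = NamespaceRouter(registry, overrides={"web": "sam"})
mike = NamespaceRouter(registry, overrides={"backend": "mike"})
```

Here `sam.resolve("backend")` yields `backend-dev` while `mike.resolve("backend")` yields `backend-mike`, so each developer overrides only the services they are changing and shares the rest.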
  • 84. WWW Page by Page during Q2/Q3/Q4 • Simplest possible page first – Minimal dependencies • Add pages as dependent services come online • Home page – most complex and highest traffic • Leave low traffic pages for later cleanup Gradual migration from Datacenter pages
  • 85. Big-Bang Transition • iPhone Launch (August/Sept) – No capacity in the datacenter, cloud only – App Store gates release, not gradual, can’t back out – Market is huge (existing and new customers) – Has to work at large scale on day one • Datacenter Shadow Redirect Technique – Used to stress back-end and data sources • SOASTA Cloud Based Load Generation – Used to stress test API and end-to-end functionality
  • 86. Current Work for Cloud Platform • Drive latency and availability goals – More aggressive caching – Improving fault and latency robustness • Logging and monitoring portal/dashboards – Working to integrate tools and data sources – Need better observability and automation • Evaluating a range of NoSQL choices – Broad set of use cases, no single winner – Good topic for another talk…
  • 88. Next Few Years… • “System of Record” moves to Cloud – Master copies of data live only in the cloud, with backups etc. – Cut the datacenter to cloud replication link • International Expansion – Global Clouds – Rapid deployments to new markets • GPU Clouds optimized for video encoding • Cloud Standardization – Cloud features and APIs should be a commodity not a differentiator – Differentiate on scale and quality of service – Competition also drives cost down – Higher resilience – Higher scalability We would prefer to be an insignificant customer in a giant cloud
  • 89. Remember the Goals • Faster • Scalable • Available • Productive Track progress against these goals
  • 90. Takeaway Netflix is path-finding the use of public AWS cloud to replace in-house IT for non-trivial applications with hundreds of developers and thousands of systems. http://www.linkedin.com/in/adriancockcroft @adrianco #netflixcloud