Inter-cloud object storage: Colony
15/Oct/2012
NTT DATA INTELLILINK
Motonobu Ichimura




Copyright © 2012 NTT DATA Corporation
EtherPad




                   http://etherpad.openstack.org/grizzly-colony




Copyright © 2012 NTT DATA INTELLILINK Corporation
Agenda



•What is Colony?
–Our goal
–Use cases
•How to make Swift network (or region) aware
–Problems with the original Swift code
–Our modification
–Investigation
–Conclusion
•Future plan
–Problems to tackle (and being tackled)
–Collaboration




What is Colony?




Goal: academic community cloud


[Figure: Academic Community Cloud. Labels: Univ.-A Cloud, Univ.-B Cloud, Univ.-X Cloud, Education Cloud, Research Cloud, ..., Intercloud services, Science Information Network]



Intercloud object storage service
Colony federates cloud object storage services, like Swift, to achieve an intercloud object storage service.

[Figure: several clouds, each running Nova, Glance, and Swift, with Swift instances labeled "Swift for intercloud use" and "Swift for local use"]

Users’ points of view
[Figure: Cloud services as seen by users]
•Cloud-A / Swift-A: Containers A1, A2, A3 (Objects A1-1, A1-2, A1-3) and Inter-cloud Containers I1, I4, I13 (Objects I4-1, I4-2, I4-3)
•Cloud-B / Swift-B: Containers B1, B2, B3 (Objects B1-1, B1-2, B1-3) and Inter-cloud Containers I1, I8, I10 (Objects I1-1, I1-2, I1-3)
•Swift-I (geographically distributed): Inter-cloud Containers I1, I2, I3, I4 (Objects I1-1, I1-2, I1-3, I4-1, I4-2, I4-3)

Inter-cloud object storage service: Colony

Colony achieves the federation
•Authenticate with Shibboleth IdP
•Provide seamless access to multiple Swifts

[Figure: a Cloud-A user and the Shibboleth IdP in front of Colony. Colony stack on Ubuntu: Apache (mod_wsgi, mod_shib), Colony-horizon, Colony-keystone, Colony-dispatcher, Squid, Slapd. Behind Colony: Swift-I and Swift-A, each shown with Swift, Colony-Keystone, and Slapd]


Use cases
We plan to use Colony as:

•Object storage for cloud-to-cloud migration
•Object storage to deliver VM images around Japan
•Object storage to store big data




Software components developed in Colony
•Colony-Horizon: based on diablo/stable Horizon with some enhancements
–Multi-region support: users can choose which Swift is used to store/retrieve objects
–Swift container ACL and metadata support
–Swift object metadata support
–Segmented upload support for objects larger than 5 GB …
•Colony-Keystone: based on diablo/stable Keystone with some enhancements
–Authenticate with Shibboleth
–%{tenant_name} can be used in endpointTemplates, in addition to %{tenant_id}, to federate cloud services
•Colony-Dispatcher (new)
–Relay requests to multiple object services (and merge the responses for clients)
–Relay requests to a specific object service indicated by the URI
–Choose the “nearest” swift-proxy server to relay requests to
–Copy objects among different Swifts
•Utilities (new)
–Tools to simplify the admin tasks needed to federate object storage services


Colony-horizon

Users can choose which Swift to use.

[Figure: Colony-horizon showing Swift-I and Swift-A as choices]




Colony-keystone
Modifications to Keystone
•Add an ePPN field to the Keystone schema
•Add REST API services to create a token by ePPN ('/token_by/eppn') and by email address ('/token_by/email')
•Add a REST API service to register/update the ePPN ('/users/{user_id}/eppn')

Authentication flow (figure labels: Shibboleth IdP, Shibboleth SP, Colony-Horizon, Colony-Keystone)
0-1. User registration by mail_addr
0-2. Associate ePPN to mail_addr by initial access
1. ID/passwd
2. Attribute: ePPN, mail_addr
3. Attribute: ePPN
4. auth_token

Colony-dispatcher
1. A Swift client can send requests to Swift-A and Swift-I through the Colony Dispatcher.
2. The Colony Dispatcher merges the responses from each Swift and sends the result to the Swift client.

Requests modified for merging responses:
•Account Info
•Container List
•X-Copy-from/to

The response merged by the Colony Dispatcher has a prefix that indicates which Swift is used for storage, e.g. A:container1, A:container2, I:container1, I:container2.

[Figure: Swift Client, Colony Dispatcher, and Swift Proxy servers in front of Swift-A (local) and Swift-I (intercloud)]
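The prefix-based merge described above can be sketched as follows. This is only an illustration under assumptions, not the Colony Dispatcher's actual code: the function names, the one-letter labels "A" and "I", and the use of ":" as the separator are taken from the slide's example listing or invented for this sketch.

```python
def merge_container_listings(listings):
    """Merge per-Swift container lists into one prefixed, sorted list.

    `listings` maps a label for each Swift (hypothetical: "A" for the
    local Swift, "I" for the intercloud Swift) to the container names
    that Swift returned.
    """
    merged = []
    for label, containers in listings.items():
        for name in containers:
            merged.append("%s:%s" % (label, name))
    return sorted(merged)


def split_prefixed_name(prefixed):
    """Recover (label, container) so a request can be relayed to one Swift."""
    label, _, name = prefixed.partition(":")
    return label, name


listing = merge_container_listings({
    "I": ["container1", "container2"],
    "A": ["container1", "container2"],
})
print(listing)
print(split_prefixed_name("I:container2"))
```

Keeping the prefix in the merged listing lets the client name a container unambiguously even when both Swifts hold a container with the same name.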
Caching
The Colony Dispatcher can use a cache proxy (such as Squid) per Swift proxy to retrieve objects from remote Swifts.

[Figure: Swift Client (A:container1, A:container2, I:container1, I:container2), Colony Dispatcher, a Cache (Proxy), and Swift Proxy servers in front of Swift-A (local) and Swift-I (intercloud)]
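A read-through cache of the kind described above might look like the sketch below. It is a toy in-memory stand-in for a cache proxy such as Squid, written under assumptions: the class, the `fetch_remote` callable, and the example path are all hypothetical, and it has no eviction or invalidation.

```python
class ReadThroughCache(object):
    """Tiny in-memory stand-in for a cache proxy placed before a remote Swift."""

    def __init__(self, fetch_remote):
        # fetch_remote(path) -> bytes: hypothetical callable that performs
        # the slow GET against the remote Swift proxy.
        self._fetch_remote = fetch_remote
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self._store:
            self.hits += 1
            return self._store[path]
        # Cache miss: go to the remote Swift once and remember the body.
        self.misses += 1
        body = self._fetch_remote(path)
        self._store[path] = body
        return body


remote_calls = []


def fake_remote_get(path):
    """Pretend remote GET that records how often it is called."""
    remote_calls.append(path)
    return b"payload for " + path.encode("utf-8")


cache = ReadThroughCache(fake_remote_get)
first = cache.get("/v1/AUTH_demo/container1/obj")
second = cache.get("/v1/AUTH_demo/container1/obj")
print(len(remote_calls), cache.hits, cache.misses)
```

Only the first GET crosses the wide-area link; the repeat is served locally, which is the effect the cache proxy is meant to provide.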
How to make Swift network aware




Current implementation




Problems with the original Swift code

•PUT/GET performance
–The Swift proxy waits until the object has been put to the storage servers.
–The Swift proxy randomly chooses the node from which to retrieve an object.




Test Environments

Sapporo (x2)
  CPU: Intel(R) Xeon(R) CPU E7-8870 (40 cores)
  Mem: 126GB
  NIC: 1000baseT/Full
  Network label: 900MBps (0.1msec)

Tokyo (x2)
  CPU: AMD Opteron 6128 2000 MHz (16 cores)
  Mem: 32GB
  NIC: 10000baseT/Full
  Network label: 9900MBps

PUT operation

An object PUT is always limited by the worst case.

[Figure: Client and Proxy in Tokyo; three storage servers in Tokyo and three in Sapporo]

Object's location




PUT object's throughput @Tokyo (Bytes/sec)




GET operation

[Figure: Client and Proxy in Tokyo; three storage servers in Tokyo and three in Sapporo. Labels: "High-bandwidth, low-latency" (shown twice), "1/replications"]

Object's location




GET object's throughput @Tokyo (Bytes/sec)




Performance is degraded by the network between Sapporo and Tokyo.
Our modification




How to solve - Basic Idea

•Limitations
–Don’t modify the data structures (including the ring)
–Minimize customization

•Add some rules to the ring’s data structure
–Zone information is treated as a decimal number, so the difference between zoneA and zoneB is taken to represent the distance between zoneA and zoneB
•Add some zone hints to the Swift proxy servers
•Change the order of nodes for the proxy server




How to solve




Proxy server configuration:

    [app:proxy-server]
    nearby_mode = false
    own_zone = 100
    near_distance = 10

•Sapporo (zones 200-202): the proxy, which has zone info (200) and zone distance (10), considers storage servers in zones 200-210 to be located near the proxy.
•Tokyo (zones 100-102): the proxy, which has zone info (100) and zone distance (10), considers storage servers in zones 100-110 to be located near the proxy.
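As a worked example of the zone rule, the sketch below classifies a few zone numbers for each proxy. It mirrors the `isnearby` check shown on the Code slide; the function name `is_nearby` and the sample zone list are assumptions for illustration. Note that under that check the upper bound is exclusive, so zone 110 itself does not count as near a proxy with own_zone 100 and near_distance 10.

```python
def is_nearby(own_zone, zone, near_distance):
    """True when own_zone <= zone < own_zone + near_distance."""
    return own_zone <= zone < own_zone + near_distance


# Settings from the slide: Tokyo proxy (zone 100), Sapporo proxy (zone 200).
tokyo = {"own_zone": 100, "near_distance": 10}
sapporo = {"own_zone": 200, "near_distance": 10}

# Sample zone numbers chosen for this sketch.
zones = [100, 101, 102, 109, 110, 200, 201, 202]

near_tokyo = [z for z in zones
              if is_nearby(tokyo["own_zone"], z, tokyo["near_distance"])]
near_sapporo = [z for z in zones
                if is_nearby(sapporo["own_zone"], z, sapporo["near_distance"])]

print(near_tokyo)
print(near_sapporo)
```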


PUT operation

  The proxy initially puts objects to the nearest storage servers, using zone information and
  zone distance. The object replicator then replicates them to their proper locations asynchronously.

[Figure: Client and Proxy (zone_info: 100, zone_distance: 10) in Tokyo with Storage A, B, C; Sapporo with Storage D, F, G]

PUT operation

  This is the same situation as when all storage servers located in Sapporo are down.

[Figure: Client and Proxy in Tokyo with Storage A, B, C; Sapporo Storage D, E, F each marked ×. Label: "Hinted hand off"]

GET operation


1. First, try to retrieve the object from a storage server near the proxy.
2. After that, try to retrieve the object from a storage server indicated as a primary zone.

[Figure: Client and Proxy in Tokyo; three storage servers in Tokyo and three in Sapporo]
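The two-step GET order above can be sketched as a simple reordering of the candidate node list: nearby nodes first, the remaining primary nodes after them. This is an illustrative sketch, not the patch itself; the function name and the sample node dictionaries (including the `device` values) are hypothetical, and only the `zone` key matches the node dicts used in the slides' code.

```python
def order_nodes_for_get(nodes, own_zone, near_distance):
    """Return nodes with those near the proxy first, others afterwards."""

    def near(node):
        return own_zone <= node["zone"] < own_zone + near_distance

    near_nodes = [n for n in nodes if near(n)]
    other_nodes = [n for n in nodes if not near(n)]
    return near_nodes + other_nodes


# Hypothetical primary nodes for one object, in ring order.
primary = [
    {"zone": 200, "device": "sapporo-d"},
    {"zone": 100, "device": "tokyo-a"},
    {"zone": 201, "device": "sapporo-e"},
]

from_tokyo = order_nodes_for_get(primary, own_zone=100, near_distance=10)
from_sapporo = order_nodes_for_get(primary, own_zone=200, near_distance=10)
print([n["zone"] for n in from_tokyo])
print([n["zone"] for n in from_sapporo])
```

Because the relative order inside each group is preserved, a proxy with no nearby replica falls back to the original ring order.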

DELETE operation


1. First, try to delete the object from a storage server near the proxy.
2. After that, try to delete the object from a storage server indicated as a primary zone.

[Figure: Client and Proxy in Tokyo; three storage servers in Tokyo and three in Sapporo]

Code

Adding get_near_nodes() to ring.py:

    def get_near_nodes(self, account, container, obj, own_zone, near_distance):
        """
        Get the partition and nodes, same as get_nodes, but return only
        the nodes whose zone is near own_zone.

        :param account: account name
        :param container: container name
        :param obj: object name
        :param own_zone: first zone number treated as near this proxy
        :param near_distance: zones from own_zone up to (but not including)
                              own_zone + near_distance are treated as near
        :returns: a tuple of (partition, list of node dicts)
        """
        part, nodes = self.get_nodes(account, container, obj)

        def isnearby(one, other, distance):
            if one <= other and one + distance > other:
                return True
            return False

        # Keep the primary nodes that are near the proxy.
        near_nodes = []
        for node in nodes:
            if isnearby(own_zone, node['zone'], near_distance):
                near_nodes.append(node)
        # Fill up with nearby handoff nodes.
        if len(near_nodes) <= self.replica_count:
            for node in self.get_more_nodes(part):
                if isnearby(own_zone, node['zone'], near_distance):
                    near_nodes.append(node)
                if len(near_nodes) >= self.replica_count:
                    break
        return part, near_nodes

and then modify proxy/server.py to use get_near_nodes() for each method:

    @@ -1044,6 +1056,14 @@ def POST(self, req):
                 container_partition, containers, _junk, req.acl, _junk = \
                     self.container_info(self.account_name, self.container_name,
                         account_autocreate=self.app.account_autocreate)
    +            if self.app.nearby_mode:
    +                partition, near_nodes = self.app.object_ring.get_near_nodes(
    +                    self.account_name, self.container_name, self.object_name,
    +                    self.app.own_zone, self.app.near_distance)
    +                print 'before nodes: %s' % containers
    +                containers = near_nodes + \
    +                    [cont for cont in containers if cont['zone'] not in [c['zone'] for c in near_nodes]]
    +                print 'after nodes: %s' % containers
                 if 'swift.authorize' in req.environ:
                     aresp = req.environ['swift.authorize'](req)
                     if aresp:


Investigation


[Chart: PUT average (bytes/sec) @Sapporo. Series: Original, Patched. Object sizes: 1K, 1M, 10M, 100M, 1G. Y-axis: 0 to 40,000,000]

[Chart: PUT average (bytes/sec) @Tokyo. Series: Original, Patched. Object sizes: 1K, 1M, 10M, 100M, 1G. Y-axis: 0 to 160,000,000]



Using Cache
 What about the case where all objects are located in remote areas?

[Figure: Client and Proxy in Kyusyu; three storage servers and a Proxy in Tokyo; three storage servers in Sapporo]


Colony-Dispatcher as a cache

Colony-Dispatcher can act as a swift-proxy-proxy with a cache mechanism.






Investigation – Cache effectiveness
        Using Colony-Dispatcher as a cache can give good performance when retrieving
        objects from remote areas.

[Chart: GET average (bytes/sec) @Sapporo. Object sizes: 1K, 1M, 10M, 100M, 1G. Y-axis: 0 to 350,000,000. Four series, all labeled "Column K" in the legend]

[Chart: GET average (bytes/sec) @Tokyo. Object sizes: 1K, 1M, 10M, 100M, 1G. Y-axis: 0 to 250,000,000. Four series, all labeled "Column K" in the legend]


Conclusion



•Re-ordering the nodes by region for the proxy resolves the GET/PUT performance issues
–This feature can be implemented with minimal customization (fewer than 50 lines of code).
•Using a cache is a good idea for inter-cloud use




Our future plan




Problems to tackle



•Object’s location
•Adding a region concept to the ring structure might help.
–Primary nodes isolated by region


•Replication performance
       – Key factor
               • We aggressively used the hinted hand-off mechanism to
                        –   Using UDT instead of TCP for replication
                        –   Using pyinotify for I/O-event-driven replication
                        –   Separating the network used for replication
                        –   Hop-by-hop replication




Are you interested in Colony ?



•Please contact me if you are interested in the Colony project.
–We want to collaborate with people who want to use/develop Swift as an inter-cloud object store.




Are you interested in academic clouds?



•If you are interested in how to integrate clouds using Dodai and Colony:
–My colleague (guan-san) will give a presentation about Dodai (Cluster as a service) at 17:20 @Manchester A

–Yokoyama-san (a member of NII) might talk about how to integrate both Colony and Dodai in a lightning talk (LT)




Thank you.




Q&A



•Please phrase your question using simple grammar if possible.





