
@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day

Link to video: https://youtu.be/lDXdf5q8Yw8

At Indeed, we use massive amounts of data to build our products and services. At first, we relied on rsync to distribute these data to our servers. This rsync system lasted for ten years before we started to encounter scaling challenges. So we built a new system on top of BitTorrent to improve latency, reliability, and throughput. Today, terabytes of data flow around the world every day between our servers. In this talk, we will describe what we needed, what we created, and the lessons we learned building a system at this scale.


@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day

  1. 1. RAD How We Replicate Terabytes of Data Around the World Every Day
  2. 2. Jason Koppe System Administrator
  3. 3. Indeed is the #1 external source of hire. 64% of US job searchers search on Indeed each month. [Chart: unique visitors in millions, 2009-2015, growing to 180M] 180 million unique users, 80.2M unique US visitors per month, 16M jobs, 50+ countries, 28 languages
  4. 4. How We Build Systems fast simple resilient scalable
  5. 5. fast
  6. 6. Fast
  7. 7. [Chart: Job Search browser rendering time in milliseconds, Feb 24 - Mar 8; median ~0.5 seconds]
  8. 8. simple
  9. 9. 2004 launch: a few servers, 1.8m US jobs
  10. 10. 2004 Aggregation MySQL Job Search Every job on the web
  11. 11. relational database, accessed across the network
  12. 12. NOT fast at full text search NOT a search engine
  13. 13. 2004 Indeed 1999 Lucene
  14. 14. LuceneTM a high-performance, full featured text search engine library
  15. 15. LuceneTM NOT a remote database, files must be on local disk
  16. 16. MySQL Database Server Lucene Index Server Index Builder /data/jobindex
  17. 17. Index Builder Index Builder Index Builder Index Builder /data/jobindex /data/jobindex /data/jobindex /data/jobindex MySQL
  18. 18. MySQL Database Server Indexer Server Index Builder /data/jobindex Search Engine /data/jobindex 4 Search Servers
  19. 19. any combination of data, not just lucene
  20. 20. lucene + model
  21. 21. lucene + model bitset
  22. 22. lucene + model bitset lucene + custom binary
  23. 23. lucene + model bitset lucene + custom binary json + csv
  24. 24. MySQL Database Server Index Builder Producer Artifact Artifact Consumers Search Engine
  25. 25. MySQL Database Server Index Builder Producer Artifact Artifact Consumers Search Engine Artifact is read-optimized data stored in a directory on the file system
  26. 26. Producer creates and updates a data artifact Database Server Index Builder Producer Artifact Artifact Consumers Search Engine MySQL
  27. 27. Consumer reads a data artifact Database Server Index Builder Producer Artifact Artifact Consumers Search Engine MySQL
  28. 28. produce once, consume many times
  29. 29. MySQL Database Server Index Builder Producer Artifact Artifact Consumers Search Engine Benefit: minimize database access
  30. 30. MySQL Database Server Index Builder Producer Artifact Artifact Consumers Search Engine Benefit: compute artifact once
  31. 31. MySQL Database Server Index Builder Producer Artifact Artifact Consumers Search Engine Benefit: scale consumers independently
  32. 32. MySQL Expensive Index Builder Producer Artifact Artifact Commodity Search Engine Benefit: scale consumers independently
  33. 33. MySQL Database Server Index Builder Producer Artifact Artifact Consumers Search Engine Benefit: separate code deployables
  34. 34. fast resilient scalable
  35. 35. Producer artifact Search Engine Consumers artifact Index Builder
  36. 36. Producer artifact Search Engine Consumers artifact Index Builder
  37. 37. rsync efficient point-to-point file transfer utility
  38. 38. 1 consumers should reload data regularly
  39. 39. 1 consumers should reload data regularly 2 roll back
  40. 40. 1 consumers should reload data regularly 2 roll back 3 data reload should not interrupt requests
  41. 41. artifact versioning
  42. 42. $ ls -d jobindex.* jobindex.1 jobindex.2 jobindex.3 new directory for new version
  43. 43. $ ls -d jobindex.* jobindex.1 jobindex.2 jobindex.3 jobindex.latest -> jobindex.3 symlink to know current version
  44. 44. $ ls -d jobindex.* jobindex.1 jobindex.2 jobindex.3 jobindex.4 jobindex.latest -> jobindex.4 load new data
  45. 45. $ ls -d jobindex.* jobindex.1 jobindex.2 jobindex.3 jobindex.4 jobindex.latest -> jobindex.3 roll back
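A minimal sketch of the version-directory-plus-symlink scheme on the preceding slides. The directory layout (jobindex.N, jobindex.latest) comes from the slides; the function name and the temporary-link rename trick are assumptions, not Indeed's actual code.

```python
import os

def activate_version(base: str, version: int) -> None:
    """Atomically point <base>.latest at <base>.<version>.

    Creating the symlink under a temporary name and renaming it over the old
    link means readers always see either the previous or the new version,
    never a missing or half-written link.  (Hypothetical helper, not Indeed's code.)
    """
    target = f"{os.path.basename(base)}.{version}"   # e.g. "jobindex.4"
    link = f"{base}.latest"                          # e.g. "/data/jobindex.latest"
    tmp = f"{link}.tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)   # new link under a temporary name
    os.replace(tmp, link)     # atomic on POSIX: replaces the old symlink

# Loading new data (slide 44) and rolling back (slide 45) are the same operation:
#   activate_version("/data/jobindex", 4)   # switch to the new version
#   activate_version("/data/jobindex", 3)   # roll back to the previous one
```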
  46. 46. each new version takes disk space & time
  47. 47. [Chart: with a normal disk copy, total bytes on disk grow with the number of versions]
  48. 48. [Chart: ...and disk latency grows with the number of versions]
  49. 49. [Chart: ...and version create time grows with the number of versions]
  50. 50. 1.8m jobs, change <2% per hour
  51. 51. all jobs 00:00 AM
  52. 52. all jobs 00:00 AM all jobs 04:00 AM new jobs changed jobs
  53. 53. all jobs 00:00 AM all jobs 04:00 AM new jobs changed jobs unchanged
  54. 54. incremental updates
  55. 55. save disk space & time
  56. 56. share data between versions
  57. 57. file1.bin file2.bin file3.bin 3GB jobindex.1
  58. 58. file1.bin file2.bin file3.bin 3GB jobindex.1 file1.bin file2.bin file3.bin jobindex.2
  59. 59. file1.bin file2.bin file3.bin 3GB jobindex.1 file1.bin file2.bin file3.bin file4.bin 4GB jobindex.2
  60. 60. file1.bin file2.bin file3.bin 3GB jobindex.1 file1.bin file2.bin file3.bin file4.bin 4GB jobindex.2 file1.bin file2.bin file3.bin file4.bin file5.bin 5GB jobindex.3
  61. 61. file1.bin file2.bin file3.bin 3GB jobindex.1 file1.bin file2.bin file3.bin file4.bin 4GB jobindex.2 file1.bin file2.bin file3.bin file4.bin file5.bin 5GB jobindex.3 = 12GB+ +
  62. 62. 5GB file1.bin file2.bin file3.bin 3GB jobindex.1 file1.bin file2.bin file3.bin file4.bin 1GB jobindex.2 file1.bin file2.bin file3.bin file4.bin file5.bin 1GB jobindex.3 =+ +
  63. 63. file1.bin file2.bin file3.bin file4.bin jobindex.2 file1.bin file2.bin file3.bin file5.bin jobindex.3 deleted 1GB 1GB = 5GB+ 2GB file4.bin
  64. 64. remove the file a symlink references, and the data is gone
  65. 65. hardlink additional name for an existing file
  66. 66. hardlink != symlink
  67. 67. file1.bin file2.bin file3.bin 3GB jobindex.1 file1.bin file2.bin file3.bin file4.bin 1GB jobindex.2 file1.bin file2.bin file3.bin file4.bin file5.bin 1GB jobindex.3 = 5GB+ +
  68. 68. file1.bin file2.bin file3.bin file4.bin 4GB jobindex.2 file1.bin file2.bin file3.bin file4.bin file5.bin 1GB jobindex.3 = 5GB+
  69. 69. file1.bin file2.bin file3.bin file4.bin file5.bin 5GB jobindex.3 = 5GB
  70. 70. remove last hardlink, data is gone
  71. 71. artifact versions: symlinks + hardlinks + rsync
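One common way to combine the three pieces named on this slide is rsync's --link-dest option, which hardlinks a destination file against the previous version's copy when it is unchanged and only transfers the files that differ. A sketch of that approach; the exact flags and layout Indeed used are not given in the talk.

```python
import subprocess

def replicate_version(src: str, prev_version_dir: str, new_version_dir: str) -> None:
    """Copy a new artifact version, hardlinking files that did not change.

    --link-dest compares each destination file against the previous version
    directory and creates a hardlink instead of a copy when it is unchanged,
    so only new or modified files take additional disk space and transfer time.
    (Sketch only; paths and flags are illustrative.)
    """
    subprocess.run(
        [
            "rsync", "-a",
            "--link-dest", prev_version_dir,   # hardlink unchanged files from here
            src + "/",                         # trailing slash: copy contents, not the dir
            new_version_dir,
        ],
        check=True,
    )

# Example: build jobindex.4 next to jobindex.3, sharing unchanged files on disk.
#   replicate_version("producer:/data/jobindex.4", "/data/jobindex.3", "/data/jobindex.4")
```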
  72. 72. scale: single producer, many consumers
  73. 73. [Chart: Job Search browser rendering time in milliseconds, Feb 24 - Mar 8; median ~0.5 seconds]
  74. 74. fast simple resilient scalable How We Build Systems
  75. 75. 2004 Indeed 1999 Lucene 2008 6 countries
  76. 76. 2004 Indeed 1999 Lucene 2008 6 countries 2009 23 countries
  77. 77. [Chart: jobs added or modified each month, growing from 1.8M in 2004 through 4.0M, 5.2M, and 7.1M to 22.5M in 2009]
  78. 78. 2004 Indeed 1999 Lucene 2008 6 countries 2009 23 countries 2nd datacenter
  79. 79. Producer Consumers artifacts DC1 Staging Consumers artifacts DC2 multi-dc rsync Staging Consumers artifacts DC3
  80. 80. Producer Consumers artifacts DC1 Staging Consumers artifacts DC2 Staging Consumers artifacts DC3 minimize Internet bandwidth
  81. 81. 2011 52 countries 4 datacenters 2004 Indeed 1999 Lucene 2008 6 countries 2009 23 countries
  82. 82. [Chart: jobs added or modified each month, growing from 1.8M in 2004 to 32.5M in 2011]
  83. 83. rsync system growing pains
  84. 84. Simple: serially copy one artifact at a time DC1 Producer Artifacts DC2 Staging Artifacts
  85. 85. Problem: serial copying can cause delays Producer Staging New New New Old DC1 DC2
  86. 86. Workaround: copy separately in “streams” (small, large1, large2) Producer DC1 Staging DC2
  87. 87. Simple: point-to-point datacenter rsync paths DC4 DC3 DC2 DC1
  88. 88. Problem: Internet, why did you do that? Down DC4 DC3 DC2 DC1
  89. 89. Workaround: shift replication path DC4 DC3 DC2 DC1
  90. 90. Scale: few consumers with rsync Producer Artifacts Consumers
  91. 91. Consumers Producer Grow: many consumers with rsync Artifacts Consumers
  92. 92. Consumers Producer Problem: too many consumers with rsync Artifacts Consumers network 100% used
  93. 93. Workaround: add more network bandwidth Consumers Producer Artifacts Consumers
  94. 94. Workaround: add staging tiers Consumers Producer Artifacts Staging Artifacts Artifacts Staging Artifacts Staging Artifacts Consumers Consumers Consumers Consumers Consumers Consumers Consumers Staging
  95. 95. rsync growth required sysad intervention
  96. 96. 2011 52 countries 2004 Indeed 1999 Lucene 2008 6 countries 2009 23 countries 2014 rsync growth
  97. 97. 100 artifacts, adding +1 producer each month
  98. 98. producing 1,761 TB per month
  99. 99. over 200 consumers, +2 each month
  100. 100. replicating over 21,931 TB per month
  101. 101. staging tiers or network bandwidth, quarterly
  102. 102. modify replication path, monthly
  103. 103. requiring too much intervention from system administrators
  104. 104. [Chart: January to December 2014: sysad +50%, dev +100%]
  105. 105. 2011 52 countries 2004 Indeed 1999 Lucene 2008 6 countries 2009 23 countries 2014 rsync limits
  106. 106. Julie Scully Software Engineer
  107. 107. Jobsearch backend team produces a lot of data
  108. 108. RAD “Resilient Artifact Distribution”
  109. 109. Design Goals Minimize network bottlenecks Loose coupling Automatic recovery Developer empowerment System-wide visibility 1 2 3 4 5
  110. 110. Design Goals Minimize network bottlenecks Loose coupling Automatic recovery Developer empowerment System-wide visibility 3 4 5 1 2
  111. 111. Design Goals Minimize network bottlenecks Loose coupling Automatic recovery Developer empowerment System-wide visibility 1 2 5 4 3
  112. 112. Design Goals Minimize network bottlenecks Loose coupling Automatic recovery Developer empowerment System-wide visibility 1 2 3 5 4
  113. 113. Design Goals Minimize network bottlenecks Loose coupling Automatic recovery Developer empowerment System-wide visibility 1 2 3 4 5
  114. 114. Design Goals Minimize network bottlenecks Loose coupling Automatic recovery Developer empowerment System-wide visibility 1 2 3 4 5
  115. 115. No more point-to-point
  116. 116. Bittorrent: Would it work? Sample replication to 3 consumers; measure time and network traffic https://github.com/shevek/ttorrent
  117. 117. Network Test (total MB received + transmitted for a 700MB artifact), rsync: Producer 2,240; Consumer 1 746; Consumer 2 747; Consumer 3 747
  118. 118. Network Test (total MB received + transmitted for a 700MB artifact), rsync vs. bittorrent: Producer 2,240 vs. 782; Consumer 1 746 vs. 1,226; Consumer 2 747 vs. 1,225; Consumer 3 747 vs. 1,245
  119. 119. Network Test (total MB received + transmitted for a 700MB artifact), rsync vs. bittorrent: Producer 2,240 vs. 782; Consumer 1 746 vs. 1,226; Consumer 2 747 vs. 1,225; Consumer 3 747 vs. 1,245; Total 4,481 vs. 4,480
  120. 120. Timing Test: rsync 24 minutes vs. bittorrent 5.5 minutes
  121. 121. How does bittorrent work?
  122. 122. Data split into small pieces of equal size
  123. 123. Hash computed for each piece
  124. 124. File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1
  125. 125. File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1 Piece 1: 75 MB Piece 2: 75 MB Piece 3: 75 MB Piece 4: 75 MB Piece 5: 50 MB
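A small sketch of the piece model on the preceding slides, assuming the SHA-1 piece hashing that BitTorrent actually uses: the artifact's files are treated as one continuous byte stream, cut into fixed-size pieces, and each piece is hashed. The function name and directory walk are illustrative and not taken from the ttorrent library.

```python
import hashlib
import os

def piece_hashes(directory: str, piece_length: int) -> list:
    """Hash fixed-size pieces of the concatenation of all files in a directory.

    Files are read in sorted order and treated as one continuous stream, so a
    piece can span a file boundary (like the pieces on the slide that cover
    parts of two files).  BitTorrent hashes each piece with SHA-1.
    """
    hashes, buf = [], b""
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name), "rb") as f:
            while True:
                chunk = f.read(piece_length - len(buf))
                if not chunk:          # end of this file, keep filling from the next one
                    break
                buf += chunk
                if len(buf) == piece_length:
                    hashes.append(hashlib.sha1(buf).hexdigest())
                    buf = b""
    if buf:                            # final, possibly shorter piece
        hashes.append(hashlib.sha1(buf).hexdigest())
    return hashes
```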
  126. 126. torrent metadata file
  127. 127. { files:file1.bin,100MB; file2.bin,200MB; file3.bin,50MB } { piecelength:75MB } { infohash:XSDJSK;JDISJLD;DJKJDB;KDJB OP;FJEIODK; } .torrent metadata file:
  128. 128. { files:file1.bin,100MB; file2.bin,200MB; file3.bin,50MB } { piecelength:75MB } { infohash:XSDJSK;JDISJLD;DJKJDB;KDJB OP;FJEIODK; } .torrent metadata file:
  129. 129. { files:file1.bin,100MB; file2.bin,200MB; file3.bin,50MB } { piecelength:75MB } { infohash:XSDJSK;JDISJLD;DJKJDB;KDJB OP;FJEIODK; } .torrent metadata file:
  130. 130. Tracker Coordinator of the download
  131. 131. Seeder Any client providing data
  132. 132. Seeder Data I have pieces for info hash Tracker .torrent Info Hash File manifest
  133. 133. Data .torrent Info Hash File manifest Seeder Tracker Info hash peer Map Ok! I have pieces for info hash
  134. 134. Consumer Any client downloading data
  135. 135. Peers for infohash Consumer Tracker .torrent Info Hash File manifest Tracker URL Map Info hash peer How a consumer gets the first piece
  136. 136. Peers for infohash Peerlist Consumer Tracker .torrent Info Hash File manifest Tracker URL Map Info hash peer How a consumer gets the first piece
  137. 137. Data .torrent Info Hash File manifest Consumer/ Seeder I have pieces for infohash Tracker Info hash peer Map It is also a seeder
  138. 138. Consumer 1 Seeding as it downloads Consumer 2 Seeding as it downloads Consumer 3 Seeding as it downloads Seeder SWARM
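A conceptual sketch of the tracker's role in the preceding slides: it only maps an info hash to the peers that have announced pieces for it, and returns that peer list to anyone who asks, which is how a consumer finds somewhere to fetch its first piece. Real BitTorrent trackers speak an HTTP or UDP announce protocol; this in-memory toy only shows the data flow.

```python
from collections import defaultdict

class Tracker:
    """Toy coordinator: info hash -> set of peers that claim to have pieces."""

    def __init__(self):
        self.peers_by_infohash = defaultdict(set)

    def announce(self, info_hash: str, peer: str) -> set:
        """A seeder or consumer says 'I have pieces for this info hash'.

        The tracker records the peer and returns the other known peers, so a
        new consumer immediately learns where to download from, and once it
        has pieces it is listed as a seeder for everyone else (the swarm).
        """
        self.peers_by_infohash[info_hash].add(peer)
        return set(self.peers_by_infohash[info_hash]) - {peer}

# tracker = Tracker()
# tracker.announce("XSDJSK...", "seeder-1:6881")            # original seeder registers
# peers = tracker.announce("XSDJSK...", "consumer-1:6881")  # consumer gets the seeder back
```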
  139. 139. Didn’t quite meet our needs
  140. 140. Piece 1: HASH1 Piece 2: HASH2 Piece 3: HASH3 Piece 4: HASH4 Piece 5: HASH5 File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1
  141. 141. jobindex.2 File4.bin (50MB) File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1
  142. 142. jobindex.2 File4.bin (50MB) File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) Piece 1: HASH1 Piece 2: HASH2 Piece 3: HASH3 Piece 4: HASH4 Piece 5: HASH6 Piece 1: HASH1 Piece 2: HASH2 Piece 3: HASH3 Piece 4: HASH4 Piece 5: HASH5 Piece 6: HASH7 File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1
  143. 143. jobindex.2 File4.bin (50MB) File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) Piece 1: HASH1 Piece 2: HASH2 Piece 3: HASH3 Piece 4: HASH4 Piece 5: HASH6 Piece 1: HASH1 Piece 2: HASH2 Piece 3: HASH3 Piece 4: HASH4 Piece 5: HASH5 Piece 6: HASH7 File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1
  144. 144. File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1 jobindex.2 File4.bin (50MB) File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.2 File0.bin (50MB) File3.bin (50MB) File1.bin (100MB) File2.bin (200MB)
  145. 145. Piece 1: HASH6 Piece 2: HASH7 Piece 3: HASH8 Piece 4: HASH9 Piece 5: HASH10 Piece 1: HASH1 Piece 2: HASH2 Piece 3: HASH3 Piece 4: HASH4 Piece 5: HASH5 Piece 6: HASH11 File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1 jobindex.2 File4.bin (50MB) File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.2 File0.bin (50MB) File3.bin (50MB) File1.bin (100MB) File2.bin (200MB)
  146. 146. Control sort order?
  147. 147. jobindex.2 File3.bin (50MB) File1.bin (150MB) File2.bin (200MB) Piece 1: HASH6 Piece 2: HASH7 Piece 3: HASH8 Piece 4: HASH9 Piece 5: HASH10 Piece 1: HASH1 Piece 2: HASH2 Piece 3: HASH3 Piece 4: HASH4 Piece 5: HASH5 Piece 6: HASH11 File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1
  148. 148. File3.bin (50MB) File1.bin (100MB) File2.bin (200MB) jobindex.1 Piece 1: HASH6 Piece 2: HASH7 Piece 3: HASH8 Piece 4: HASH9 Piece 5: HASH10 Piece 1: HASH1 Piece 2: HASH2 Piece 3: HASH3 Piece 4: HASH4 Piece 5: HASH5 Piece 6: HASH11 File3.bin (50MB) File1.bin (150MB) File2.bin (200MB) jobindex.2
  149. 149. hash each file?
  150. 150. Compare files not pieces
  151. 151. { files:file1.bin,100MB,DATETIME; file2.bin,200MB,DATETIME; file3.bin,50MB,DATETIME } { piecelength:75MB } ... .torrent metadata file contents:
  152. 152. File1.bin (100MB) File2.bin (200MB) File3.bin (50MB) jobindex.1 Piece 1: File 0, File1 Piece 2: File 1 Piece 3: File 1, File 2 Piece 4: File 2 Piece 5: File 2, File 3 Piece 6: File 3 File1.bin (100MB) File2.bin (200MB) File3.bin (50MB) jobindex.2 File0.bin (50MB)
  153. 153. File1.bin (100MB) File2.bin (200MB) File3.bin (50MB) jobindex.1 File1.bin (100MB) File2.bin (200MB) File3.bin (50MB) jobindex.2 File0.bin (50MB) Piece 1: File 0, File1 Piece 2: File 1 Piece 3: File 1, File 2 Piece 4: File 2 Piece 5: File 2, File 3 Piece 6: File 3
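The modification described above swaps piece-level reuse for file-level reuse: a file listed in the new version's metadata that matches a file already on disk can be hardlinked instead of downloaded, and only the genuinely new files come from the swarm. The manifest shape and the matching rule below (name and size; the real metadata also records a datetime) are assumptions based on the fields shown on the earlier metadata slide.

```python
import os

def plan_download(manifest: list, prev_dir: str, new_dir: str) -> list:
    """Hardlink files we already have; return the names that must be downloaded.

    `manifest` is a list of (name, size) entries taken from the new version's
    torrent metadata.  A file in the previous version with the same name and
    size is reused via hardlink; everything else is fetched from the swarm.
    """
    os.makedirs(new_dir, exist_ok=True)
    to_download = []
    for name, size in manifest:
        old_path = os.path.join(prev_dir, name)
        if os.path.isfile(old_path) and os.path.getsize(old_path) == size:
            os.link(old_path, os.path.join(new_dir, name))   # extra name, no copy
        else:
            to_download.append(name)
    return to_download

# With the slides' layout, file1-3.bin are reused from jobindex.1 and file0.bin is downloaded:
# plan_download([("file0.bin", 50 << 20), ("file1.bin", 100 << 20),
#                ("file2.bin", 200 << 20), ("file3.bin", 50 << 20)],
#               "/data/jobindex.1", "/data/jobindex.2")
```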
  154. 154. Bittorrent Evaluation Result: substantially faster; drastically reduces network load on the producer machine; horizontally scalable
  155. 155. Design Goals Automatic recovery Developer empowerment System-wide visibility 3 4 5 Loose coupling 2 Minimize network bottlenecks 1
  156. 156. Service-oriented architecture
  157. 157. Headwater The beginning of a river
  158. 158. Headwater Host Data Producer Data Publish my data
  159. 159. Headwater takes ownership of the data (hardlink + read-only)
  160. 160. Headwater Host Data Producer Data Publish my data Will do!
  161. 161. Headwater Host Data Producer Data
  162. 162. create the .torrent metadata file
  163. 163. Headwater The beginning of a river River Course the water carves across the landscape
  164. 164. Rhone: multi-master coordinator service (multiple Rhone instances coordinated with Zookeeper)
  165. 165. Rhone Headwater Host Data Producer Data
  166. 166. Rhone Headwater Host Data Producer Data (data.version torrent metadata)
  167. 167. Rhone Headwater Host Data Producer Data (data.version torrent metadata)
  168. 168. Rhone Headwater Host Data Producer Data Tracker .torrent metadata can be retrieved data.version torrent metadata
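A rough sketch of the publish flow on the preceding slides, written as plain functions: the producer hands a directory to Headwater, Headwater takes ownership of the data (hardlinks it into its own tree and drops write permission), builds the torrent metadata, and registers the new version with Rhone so the tracker can serve it. Every name and signature here is illustrative; the talk does not show Headwater's actual API.

```python
import os
import stat

def headwater_publish(producer_dir: str, owned_dir: str, register) -> None:
    """Take ownership of a producer's artifact and register it (illustrative).

    Hardlinking each file into Headwater's directory and removing write
    permission means the producer can no longer change the bytes being
    replicated.  `register` stands in for the call that stores the new
    version and its torrent metadata in Rhone.
    """
    os.makedirs(owned_dir, exist_ok=True)
    manifest = []
    for name in sorted(os.listdir(producer_dir)):
        src, dst = os.path.join(producer_dir, name), os.path.join(owned_dir, name)
        os.link(src, dst)                                           # hardlink, no copy
        os.chmod(dst, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)   # read-only inode
        manifest.append({"name": name, "size": os.path.getsize(dst)})
    register(os.path.basename(owned_dir), {"files": manifest})      # new version -> Rhone
```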
  169. 169. Headwater The beginning of a river River Course the water carves across the landscape Delta The end of the river
  170. 170. Subscribe to data! Delta Host Data Consumer
  171. 171. Make all subscribed artifacts available
  172. 172. RhoneDelta Host Data Consumer Headwater Host Data Producer Data
  173. 173. Delta Data Consumer Rhone Host
  174. 174. Tracker Delta Host Data ConsumerData /rad/data
  175. 175. Delta Host Data ConsumerData Where’s the latest data? /rad/data
  176. 176. It’s at /rad/data Delta Host Data ConsumerData Where’s the latest data? /rad/data
  177. 177. Delta Host Data ConsumerData /rad/data
  178. 178. Keep all subscribed artifacts current
  179. 179. Delta Data Consumer Rhone Host
  180. 180. Rhone Data Host Artifact Availability Flow Delta Headwater Host Data Consumer Data Producer Data
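On the consumer side, the contract sketched on the previous slides is simple: subscribe to an artifact once, then always read it through a stable path under /rad/data that Delta keeps pointing at the latest downloaded version. The /rad/data path comes from the slides; the helper below, its polling behavior, and the per-artifact symlink are assumptions about what a consumer-side client might look like.

```python
import os
import time

RAD_ROOT = "/rad/data"          # stable local path maintained by Delta (from the slides)

def wait_for_artifact(name: str, poll_seconds: int = 30) -> str:
    """Block until Delta has made the subscribed artifact available locally.

    Delta downloads versions and repoints the entry under /rad/data, so the
    consumer only resolves a local path; it never talks to the network itself.
    (Hypothetical helper; the real client library is not shown in the talk.)
    """
    path = os.path.join(RAD_ROOT, name)
    while not os.path.exists(path):
        time.sleep(poll_seconds)
    return os.path.realpath(path)   # e.g. /rad/data/jobindex.4

# index_dir = wait_for_artifact("jobindex")
# ... open the Lucene index (or any other artifact) from index_dir ...
```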
  181. 181. Design Goals Automatic recovery Developer empowerment System-wide visibility 4 5 Minimize network bottlenecks 1 Loose coupling 2 3
  182. 182. Rhone Headwater Host Data Producer Data Crash!
  183. 183. Rhone Headwater Data Producer Datadata.version torrent metadata Tracker Crash! Host
  184. 184. Development philosophy: Make recovery the common case
  185. 185. Durable state with atomic filesystem operations
  186. 186. All service calls are idempotent
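The two recovery principles above combine into a small pattern: state is written to a temporary file and renamed into place, so a crash leaves either the complete old state or the complete new state on disk, and operations are written so that replaying them after a crash changes nothing. A sketch under those assumptions, not RAD's actual persistence code.

```python
import json
import os

def save_state(path: str, state: dict) -> None:
    """Durably persist state with an atomic rename.

    Writing to a temp file, syncing it, and renaming over the old file means
    a crash at any point leaves a complete old or new file, never half of one.
    """
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)            # atomic on POSIX filesystems

def record_version(path: str, artifact: str, version: int) -> None:
    """Idempotent update: replaying the same call after a crash is harmless."""
    state = {}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    if state.get(artifact) == version:
        return                       # already recorded; a retry changes nothing
    state[artifact] = version
    save_state(path, state)
```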
  187. 187. RAD handles network recovery
  188. 188. DC4 DC3 DC2 DC1 rsync is point-to-point
  189. 189. DC1 DC4 DC3 DC2 bittorrent peer-to-peer
  190. 190. Down DC1 DC4 DC3 DC2 No problem with bittorrent swarm
  191. 191. RAD treats each artifact independently
  192. 192. Design Goals Developer empowerment System-wide visibility 5 Minimize network bottlenecks 1 Loose coupling 2 Automatic recovery 3 4
  193. 193. Adding a new artifact in the rsync system
  194. 194. Ask System Administrators
  195. 195. Adding a new artifact in the RAD system
  196. 196. Declare it in the code
  197. 197. REST API is language agnostic
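Because the interface is plain REST, declaring or subscribing to an artifact does not require a client library in any particular language. The endpoint and payload below are made up purely to illustrate the idea; the talk does not document the real API.

```python
import json
import urllib.request

def subscribe(delta_url: str, artifact: str) -> None:
    """POST a subscription to a hypothetical Delta REST endpoint."""
    body = json.dumps({"artifact": artifact}).encode()
    req = urllib.request.Request(
        f"{delta_url}/subscriptions",             # made-up endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:     # urlopen raises on HTTP error statuses
        resp.read()

# subscribe("http://localhost:8080", "jobindex")
```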
  198. 198. Design Goals System-wide visibility Minimize network bottlenecks 1 Loose coupling 2 Automatic recovery 3 Developer empowerment 4 5
  199. 199. Rhone already knows all artifacts
  200. 200. Rhone stores a list of versions by artifact: artifactA (versions 4, 5, 6); artifactB (versions 221, 226, 227, 228); artifactC (version 1)
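The per-artifact version list on this slide is essentially a map from artifact name to its ordered list of published versions, which is what the Deltas and RADAR can query. A trivial illustration using the values from the slide:

```python
# Illustration of the structure Rhone tracks (values taken from the slide):
versions_by_artifact = {
    "artifactA": [4, 5, 6],
    "artifactB": [221, 226, 227, 228],
    "artifactC": [1],
}

def latest(artifact: str) -> int:
    """Latest published version for an artifact (highest version number)."""
    return max(versions_by_artifact[artifact])
```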
  201. 201. Heartbeats from Delta and Headwater
  202. 202. Rhone has system-wide view
  203. 203. RADAR: Developers can easily see where their data is
  204. 204. RADAR: Developers can easily see where their data is
  205. 205. RADAR: Developers can easily see where their data is
  206. 206. RADAR: Developers can easily see where their data is
  207. 207. start simple and iterate
  208. 208. 2011 52 countries 2004 Indeed 2008 6 countries 2009 23 countries 2014 rsync limits 1st artifact migrated to RAD
  209. 209. Lesson learned: prevent people from using the system incorrectly
  210. 210. We made configuration TOO easy
  211. 211. New Requirement: protect the disks
  212. 212. Delta: prevent downloading artifacts that would fill the disk (and alarm)
  213. 213. 2011 52 countries 2004 Indeed 2008 6 countries 2009 23 countries 2014 rsync limits 1st artifact migrated to RAD 2015 critical artifacts migrated
  214. 214. 2011 52 countries 2004 Indeed 2008 6 countries 2009 23 countries 2014 rsync limits 1st artifact migrated to RAD 2015 critical artifacts migrated 2016 80 RAD artifacts
  215. 215. 2011 52 countries 2004 Indeed 2008 6 countries 2009 23 countries 2014 rsync limits 1st artifact migrated to RAD 2015 critical artifacts migrated 2016 80 RAD artifacts 100 artifacts in 10 years
  216. 216. 100 artifacts in 10 years 2011 52 countries 2004 Indeed 2008 6 countries 2009 23 countries 2014 rsync limits 1st artifact migrated to RAD 2015 critical artifacts migrated 2016 80 RAD artifacts 80 new artifacts in 1 year
  217. 217. RAD Stats, March 23, 2016: Producer side, 56 unique producers, 7,666 versions published; Consumer side, 670 unique consumers, 52,357 versions downloaded
  218. 218. [Chart: duration of JobIndex replication in RAD vs. rsync, Jan 18-19]
  219. 219. replicating over 65,193 TB per month
  220. 220. Learn More Engineering blog & talks http://indeed.tech Open Source http://opensource.indeedeng.io Careers http://indeed.jobs Twitter @IndeedEng
