
[NYJavaSig] Riding the Distributed Streams - Feb 2nd, 2017


Distributed Streams on NYJavaSig

Published in: Technology

  1. This slide blank on purpose
  2. > whoami • Solutions Architect @Hazelcast • Hang out with awesome people • @gamussa in internetz • Please follow me on Twitter, I’m very interesting ©
  3. Agenda • Refreshing knowledge on Java 8 Streams • Distribute and Conquer • Distributed Data • Distributed Streams • How we did all this
  4. Java 8 Streams
  5. Java 8 Streams… • An abstraction that represents a sequence of elements • Not a data structure • Conveys elements from a source through a pipeline of operations • Operations don’t modify the source
  6. Why should I care about the Stream API? • You’re a Java developer
  7. What does a regular Java developer think about Scala? Advanced
  8. Why should I care about the Stream API? • You’re a Java developer • Many Java developers know Java • It’s all about data processing
  9. java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() • sorted()
  10. Problem • One does not simply put all Big Data in one machine
  11. Problem • Data doesn’t fit on just one machine
  12. Problem • One does not simply put all Big Data in one machine • Data is too important to have it on only one machine
  13. CACHES
  14. Replication or Sharding? http://book.mixu.net/distsys/single-page.html
  15. Solution • Use a Distributed Map, aka IMap
  16. What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2 Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
  17. Green Primary Green Backup Green Shard
  18. Problem • Lambda serialization
  19. [No text on slide]
  20. Solution • Serializable versions of the interfaces • Introducing DistributedStream
  21. [No text on slide]
  22. Jet Streams
  23. What’s Hazelcast Jet? • General purpose distributed data processing framework • Based on a Directed Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink
  24. DAG
  25. Job Execution
  26. Future (It’s bright!) • Memory module for processing big data • Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
  27. Your fuel, our Jet Engine • Public release – Feb 7th • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note: viktor@hazelcast.com • Follow @hazelcast and @gamussa (duh!!) • Your questions: #hazelcast #hazelcastjet
  28. Conclusion • The Java Stream API provides a very wide range of data processing tools • War and Peace is a big (a lot of data) book! • Now we’re pretty sure that Andrew and Pierre are the main characters
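
The stream operations listed on slide 9, the lambda serialization problem from slides 18 to 20, and the word count hinted at in the conclusion can be tied together in a small single-JVM sketch. This is an illustration under assumptions, not the code shown in the talk: `SerializableFunction`, `SerializablePredicate`, `roundTrip` and `topWords` are hypothetical names rather than Hazelcast API, and the byte-array round trip only stands in for shipping a lambda to another cluster member.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamSketch {

    // Hypothetical helper interfaces (NOT Hazelcast API): a lambda becomes
    // serializable when its target type also extends Serializable.
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}
    interface SerializablePredicate<T> extends Predicate<T>, Serializable {}

    // Serializes a value to bytes and reads it back. This stands in for
    // sending a lambda over the network; everything stays in one JVM here.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("lambda could not be serialized", e);
        }
    }

    // Returns the most frequent words, highest count first (ties broken alphabetically).
    static List<String> topWords(List<String> lines, int limit) {
        SerializableFunction<String, Stream<String>> splitWords =
                line -> Arrays.stream(line.toLowerCase().split("\\W+"));
        SerializablePredicate<String> nonEmpty = word -> !word.isEmpty();

        // Use the deserialized copies, as a remote member would.
        Function<String, Stream<String>> shippedSplit = roundTrip(splitWords);
        Predicate<String> shippedFilter = roundTrip(nonEmpty);

        Map<String, Long> counts = lines.stream()
                .flatMap(shippedSplit)
                .filter(shippedFilter)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.comparingByKey()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Pierre met Andrew.",
                "Andrew and Pierre talked; Pierre listened.");
        System.out.println(topWords(lines, 2)); // prints [pierre, andrew]
    }
}
```

A plain `java.util.function.Function` lambda would fail in `roundTrip` with a `NotSerializableException`, which is the problem slide 18 points at; declaring the target type as `Serializable` is one way around it within standard Java.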
