DATASERVICES
PROCESSING (BIG) DATA THE MICROSERVICE WAY
Dr. Josef Adersberger (@adersberger), QAware GmbH
http://www.datasciencecentral.com
ENTERPRISE
http://www.cardinalfang.net/misc/companies_list.html
?
PROCESSING
BIG DATA
All things distributed:
‣ distributed processing
‣ distributed databases
FAST DATA
Low latency and high throughput:
‣ stream processing
‣ messaging
‣ event-driven
SMART DATA
Data to information:
‣ machine (deep) learning
‣ advanced statistics
‣ natural language processing
‣ semantic web
DATA PROCESSING: data -> information
SYSTEM INTEGRATION: information -> blended information
APIS: information -> systems
UIS: information -> user
SOLUTIONS
The {big, smart, fast} data Swiss Army knives
node
Distributed Data
Distributed Processing
Driver data flow
Icon credits: Nimal Raj (database), Arthur Shlain (console) and alvarobueno (tasklist)
DATA SERVICES
{BIG, FAST, SMART} DATA
MICROSERVICE
BASIC IDEA: DATA PROCESSING WITHIN A GRAPH OF MICROSERVICES
Microservice (aka Dataservice)
Message Queue
Sources Processors Sinks
DIRECTED GRAPH OF MICROSERVICES EXCHANGING DATA VIA MESSAGING
BASIC IDEA: COHERENT PLATFORM FOR MICRO- AND DATASERVICES
CLUSTER OPERATING SYSTEM
IAAS ON PREM LOCAL
MICROSERVICES
DATASERVICES
MICROSERVICES PLATFORM
DATASERVICES PLATFORM
OPEN SOURCE DATASERVICE PLATFORMS
‣ Open source project based on the Spring stack
‣ Microservices: Spring Boot
‣ Messaging: Kafka 0.9, Kafka 0.10, RabbitMQ
‣ Standardized API with several open source implementations
‣ Microservices: JavaEE micro container
‣ Messaging: JMS
‣ Open source by Lightbend (partly commercialised & proprietary)
‣ Microservices: Lagom, Play
‣ Messaging: Akka
ARCHITECT’S VIEW
- ON SPRING CLOUD DATA FLOW
DATASERVICES
BASIC IDEA: DATA PROCESSING WITHIN A GRAPH OF MICROSERVICES
Sources Processors Sinks
DIRECTED GRAPH OF SPRING BOOT MICROSERVICES EXCHANGING DATA VIA MESSAGING
Stream
App
Message Broker
Channel
THE BIG PICTURE
SPRING CLOUD DATA FLOW SERVER (SCDF SERVER)
TARGET RUNTIME
SPI
API
LOCAL
SCDF Shell
SCDF Admin UI
Flo Stream Designer
THE BIG PICTURE
SPRING CLOUD DATA FLOW SERVER (SCDF SERVER)
TARGET RUNTIME
MESSAGE BROKER
APP
SPRING BOOT
SPRING FRAMEWORK
SPRING CLOUD STREAM
SPRING INTEGRATION
BINDER
APP
APP
APP
CHANNELS (input/output)
THE VEINS: SCALABLE DATA LOGISTICS WITH MESSAGING
Sources Processors Sinks
STREAM PARTITIONING: TO BE ABLE TO SCALE MICROSERVICES
BACK PRESSURE HANDLING: TO BE ABLE TO COPE WITH PEAKS
STREAM PARTITIONING
input (provider)
output instances (consumer group)
message partitioning
PARTITION KEY -> PARTITION SELECTOR -> PARTITION INDEX
f(message) -> field, f(field) -> index, f(index) -> pindex
pindex = index % output instances
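The key -> selector -> index chain can be sketched in plain Java. This is only an illustration under stated assumptions, not Spring Cloud Stream's implementation: the class `PartitioningSketch` and its method are hypothetical, and `Math.floorMod` replaces the bare `%` from the slide so that a negative index (for example from `hashCode()`) still maps to a valid instance.

```java
import java.util.function.Function;
import java.util.function.ToIntFunction;

// Hypothetical helper for illustration; not part of Spring Cloud Stream.
final class PartitioningSketch {

    private PartitioningSketch() {}

    /**
     * @param message         the outbound message
     * @param keyExtractor    f(message) -> field (the partition key)
     * @param selector        f(field) -> index (the partition selector)
     * @param outputInstances number of consumer instances
     * @return pindex in the range [0, outputInstances)
     */
    static <M> int partitionIndex(M message,
                                  Function<M, ?> keyExtractor,
                                  ToIntFunction<Object> selector,
                                  int outputInstances) {
        Object key = keyExtractor.apply(message);
        int index = selector.applyAsInt(key);
        // floorMod keeps the result non-negative even for negative indexes
        return Math.floorMod(index, outputInstances);
    }
}
```

With this sketch, `PartitioningSketch.partitionIndex("user-42", m -> m, Object::hashCode, 8)` returns the same value between 0 and 7 on every call, so all messages with that key would reach the same instance.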
BACK PRESSURE HANDLING
1. Signals if (message) pressure is too high
2. Regulates inbound (message) flow
3. (Data) retention lake
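As a rough in-process analogy for these three roles, a bounded queue can signal overload, throttle the producer and hold the backlog. The class `BackPressureSketch` is hypothetical and deliberately simplified; in the architecture described here the message broker plays these roles, and its retention is far larger than a small in-memory buffer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical in-process illustration; in the deck the message broker plays these roles.
final class BackPressureSketch {

    // (3) bounded buffer standing in for the retention store
    private final BlockingQueue<String> buffer;

    BackPressureSketch(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    /** (1) and (2): returns false when the buffer is full, telling the producer to slow down. */
    boolean tryAccept(String message) {
        return buffer.offer(message);
    }

    /** Consumer side: returns the oldest buffered message, or null if the buffer is empty. */
    String next() {
        return buffer.poll();
    }

    /** Number of messages currently waiting. */
    int backlog() {
        return buffer.size();
    }
}
```

A producer that gets `false` from `tryAccept` can retry later or reduce its send rate; which reaction is appropriate depends on the source.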
DISCLAIMER: THERE IS ALSO A TASK EXECUTION MODEL (WE WILL IGNORE)
‣ short-lived
‣ finite data set
‣ programming model = Spring Cloud Task
‣ starters available for JDBC and Spark as data source/sink
CONNECTED CAR PLATFORM
EDGE SERVICE
MQTT Broker (apigee Link)
MQTT Source
Data Cleansing
Realtime traffic analytics
KPI ANALYTICS
Spark
DASHBOARD
react-vis
Presto
Masterdata Blending
Camel
Kafka
Kafka
ESB
gRPC
DEVELOPER’S VIEW
- ON SPRING CLOUD DATA FLOW
DATASERVICES
ASSEMBLING A STREAM
▸ App starters: a set of pre-built apps aka dataservices
▸ Composition of apps with Linux-style pipe syntax:
http | magichappenshere | log
Starter app
Custom app
https://www.pinterest.de/pin/272116002461148164
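The pipe syntax reads like function composition: a source emits messages, a processor transforms them, a sink consumes them. The following in-memory analogy only illustrates that shape. `PipeSketch` is a hypothetical name, and real Spring Cloud Data Flow apps are separate Spring Boot processes connected through a message broker rather than direct method calls.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical in-memory analogy of "source | processor | sink".
final class PipeSketch {

    private PipeSketch() {}

    /** Pulls count items from the source, runs each through the processor and hands it to the sink. */
    static <A, B> void run(Supplier<A> source,
                           Function<A, B> processor,
                           Consumer<B> sink,
                           int count) {
        for (int i = 0; i < count; i++) {
            sink.accept(processor.apply(source.get()));
        }
    }
}
```

For example, `PipeSketch.<String, String>run(() -> "hello", s -> s.toUpperCase(), System.out::println, 3)` plays the role of a three-message run of `http | magichappenshere | log`.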
MORE PIPES
with parameters:
twitterstream --consumerKey=<CONSUMER_KEY> --consumerSecret=<CONSUMER_SECRET> --accessToken=<ACCESS_TOKEN> --accessTokenSecret=<ACCESS_TOKEN_SECRET> | log
with explicit input channel & analytics:
:tweets.twitterstream > field-value-counter --fieldName=lang --name=language
with SpEL expression and explicit output type:
:tweets.twitterstream > filter --expression=#jsonPath(payload,'$.lang')=='en' --outputType=application/json
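What a field-value counter computes can be shown with a small in-memory sketch: one running count per distinct value of a chosen field, such as `lang`. The class `FieldValueCounterSketch` is hypothetical and is not the starter app's implementation; it only illustrates the counting idea.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory counter; not the field-value-counter starter app itself.
final class FieldValueCounterSketch {

    private final Map<String, Long> counts = new TreeMap<>();

    /** Records one occurrence of the given field value, e.g. the "lang" of a tweet. */
    void accept(String fieldValue) {
        counts.merge(fieldValue, 1L, Long::sum);
    }

    /** Returns a copy of the current counts, sorted by field value. */
    Map<String, Long> snapshot() {
        return new TreeMap<>(counts);
    }
}
```

Feeding it the values `en`, `de`, `en` leaves the snapshot with a count of 2 for `en` and 1 for `de`.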
OUR SAMPLE APPLICATION: WORLD MOOD
https://github.com/adersberger/spring-cloud-dataflow-samples
twitterstream
Starter app
Custom app
filter (lang=en)
log
twitter ingester (test data)
tweet extractor (text)
sentiment analysis (StanfordNLP)
field-value-counter
DEVELOPING CUSTOM APPS: THE VERY BEGINNING
https://start.spring.io
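import java.util.Iterator;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.messaging.Source;
import org.springframework.context.annotation.Bean;
import org.springframework.integration.annotation.InboundChannelAdapter;
import org.springframework.integration.annotation.Poller;
import org.springframework.integration.core.MessageSource;
import org.springframework.messaging.support.GenericMessage;

@SpringBootApplication
@EnableBinding(Source.class)
public class TwitterIngester {

    private Iterator<String> lines;

    // Polled every 200 ms; each poll emits one tweet on the output channel.
    @Bean
    @InboundChannelAdapter(value = Source.OUTPUT,
            poller = @Poller(fixedDelay = "200", maxMessagesPerPoll = "1"))
    public MessageSource<String> twitterMessageSource() {
        return () -> new GenericMessage<>(emitTweet());
    }

    // Starts over with the test data once all tweets have been emitted.
    private String emitTweet() {
        if (lines == null || !lines.hasNext()) lines = readTweets();
        return lines.next();
    }

    private Iterator<String> readTweets() {
        //…
    }
}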
PROGRAMMING MODEL: SOURCE
@RunWith(SpringRunner.class)
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
public class TwitterIngesterTest {

    @Autowired
    private Source source;

    @Autowired
    private MessageCollector collector;

    @Test
    public void tweetIngestionTest() throws InterruptedException {
        // Take 100 messages from the output channel and check that none is empty.
        for (int i = 0; i < 100; i++) {
            Message<String> message = (Message<String>) collector.forChannel(source.output()).take();
            assert (message.getPayload().length() > 0);
        }
    }
}
PROGRAMMING MODEL: SOURCE TESTING
PROGRAMMING MODEL: PROCESSOR (WITH STANFORD NLP)
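import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.CoreMap;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Processor;
import org.springframework.messaging.handler.annotation.SendTo;
import org.springframework.tuple.Tuple;
import org.springframework.tuple.TupleBuilder;

@SpringBootApplication
@EnableBinding(Processor.class)
public class TweetSentimentProcessor {

    @Autowired
    StanfordNLP nlp;

    @StreamListener(Processor.INPUT) //input channel with default name
    @SendTo(Processor.OUTPUT) //output channel with default name
    public Tuple analyzeSentiment(String tweet) {
        return TupleBuilder.tuple().of("mood", findSentiment(tweet));
    }

    // Returns the predicted sentiment class of the longest sentence in the tweet (0 if empty).
    public int findSentiment(String tweet) {
        int mainSentiment = 0;
        if (tweet != null && tweet.length() > 0) {
            int longest = 0;
            Annotation annotation = nlp.process(tweet);
            for (CoreMap sentence : annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
                Tree tree = sentence.get(SentimentCoreAnnotations.SentimentAnnotatedTree.class);
                int sentiment = RNNCoreAnnotations.getPredictedClass(tree);
                String partText = sentence.toString();
                if (partText.length() > longest) {
                    mainSentiment = sentiment;
                    longest = partText.length();
                }
            }
        }
        return mainSentiment;
    }
}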
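The selection rule inside `findSentiment` (the longest sentence decides the mood of the whole tweet) can be restated without the NLP library. This is a sketch with hypothetical names and made-up sentiment values, assuming the sentences and their predicted classes are already available as two arrays in the same order.

```java
// Hypothetical, library-free restatement of the "longest sentence wins" rule.
final class LongestSentenceSketch {

    private LongestSentenceSketch() {}

    /**
     * @param sentences  the sentences of one tweet
     * @param sentiments one sentiment class per sentence, in the same order
     * @return the sentiment of the longest sentence, or 0 if there are no sentences
     */
    static int mainSentiment(String[] sentences, int[] sentiments) {
        int mainSentiment = 0;
        int longest = 0;
        for (int i = 0; i < sentences.length; i++) {
            if (sentences[i].length() > longest) {
                mainSentiment = sentiments[i];
                longest = sentences[i].length();
            }
        }
        return mainSentiment;
    }
}
```

The heuristic is cheap, but a short, strongly worded sentence is outvoted by a longer neutral one.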
PROGRAMMING MODEL: PROCESSOR TESTING
@RunWith(SpringRunner.class)
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
public class TweetSentimentProcessorTest {

    @Autowired
    private Processor processor;

    @Autowired
    private MessageCollector collector;

    @Autowired
    private TweetSentimentProcessor sentimentProcessor;

    @Test
    public void testAnalysis() {
        checkFor("I hate everybody around me!");
        checkFor("The world is lovely");
        checkFor("I f***ing hate everybody around me. They're from hell");
        checkFor("Sunny day today!");
    }

    // Sends the text to the input channel and expects the matching "mood" tuple on the output channel.
    private void checkFor(String msg) {
        processor.input().send(new GenericMessage<>(msg));
        assertThat(
                collector.forChannel(processor.output()),
                receivesPayloadThat(
                        equalTo(TupleBuilder.tuple().of("mood", sentimentProcessor.findSentiment(msg)))));
    }
}
DEVELOPING THE STREAM DEFINITIONS WITH FLO
http://projects.spring.io/spring-flo/
RUNNING IT LOCAL
RUNNING THE DATASERVICES
$ redis-server &

$ zookeeper-server-start.sh ./config/zookeeper.properties &

$ kafka-server-start.sh ./config/server.properties &

$ java -jar spring-cloud-dataflow-server-local-1.2.0.RELEASE.jar &

$ java -jar ../spring-cloud-dataflow-shell-1.2.0.RELEASE.jar
dataflow:> app import --uri [1]

dataflow:> app register --name tweetsentimentalyzer --type processor --uri file:///libs/worldmoodindex-0.0.2-SNAPSHOT.jar

dataflow:> stream create tweets-ingestion --definition "twitterstream --consumerKey=A --consumerSecret=B --accessToken=C --accessTokenSecret=D | filter --expression=#jsonPath(payload,'$.lang')=='en' | log" --deploy

dataflow:> stream create tweets-analyzer --definition ":tweets-ingestion.filter > tweetsentimentalyzer | field-value-counter --fieldName=mood --name=Mood"

dataflow:> stream deploy tweets-analyzer --properties "deployer.tweetsentimentalyzer.memory=1024m,deployer.tweetsentimentalyzer.count=8,app.transform.producer.partitionKeyExpression=payload.id"
[1] http://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.kafka-10-apps-maven-repo-url.properties
RUNNING IT IN THE CLOUD
RUNNING THE DATASERVICES
$ git clone https://github.com/spring-cloud/spring-cloud-dataflow-server-kubernetes

$ kubectl create -f src/etc/kubernetes/kafka-zk-controller.yml

$ kubectl create -f src/etc/kubernetes/kafka-zk-service.yml

$ kubectl create -f src/etc/kubernetes/kafka-controller.yml

$ kubectl create -f src/etc/kubernetes/mysql-controller.yml

$ kubectl create -f src/etc/kubernetes/mysql-service.yml

$ kubectl create -f src/etc/kubernetes/kafka-service.yml

$ kubectl create -f src/etc/kubernetes/redis-controller.yml

$ kubectl create -f src/etc/kubernetes/redis-service.yml

$ kubectl create -f src/etc/kubernetes/scdf-config-kafka.yml

$ kubectl create -f src/etc/kubernetes/scdf-secrets.yml

$ kubectl create -f src/etc/kubernetes/scdf-service.yml

$ kubectl create -f src/etc/kubernetes/scdf-controller.yml

$ kubectl get svc # look up the external IP of the "scdf" service: <IP>
$ java -jar ../spring-cloud-dataflow-shell-1.2.0.RELEASE.jar
dataflow:> dataflow config server --uri http://<IP>:9393

dataflow:> app import --uri [2]

dataflow:> app register --type processor --name tweetsentimentalyzer --uri docker:qaware/tweetsentimentalyzer-processor:latest
dataflow:> …
[2] http://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.stream-apps-kafka-09-docker
LESSONS LEARNED
PRO
specialized programming model -> efficient
specialized execution environment -> efficient
support for all types of data (big, fast, smart)
CON
disjoint programming model (data processing <-> services)
maybe a disjoint execution environment (data stack <-> service stack)
BEST USED
further on: as default for {big,fast,smart} data processing
PRO
coherent execution environment (runs on microservice stack)
coherent programming model with emphasis on separation of concerns
basically supports all types of data (big, fast, smart)
CON
has limitations on throughput (big & fast data) due to less optimization (like data affinity, query optimizer, …) and message-wise processing
technology immature in certain parts (e.g. diagnosability)
BEST USED FOR
hybrid applications of data processing, system integration, API, UI
moderate throughput data applications with existing dev team
Message by message processing
TWITTER.COM/QAWARE - SLIDESHARE.NET/QAWARE
Thank you!
Questions?
josef.adersberger@qaware.de
@adersberger
https://github.com/adersberger/spring-cloud-dataflow-samples
BONUS SLIDES
MORE…
▸ Reactive programming
▸ Diagnosability
public Flux<String> transform(@Input("input") Flux<String> input) {
    return input.map(s -> s.toUpperCase());
}
@EnableBinding(Sink::class)
@EnableConfigurationProperties(PostgresSinkProperties::class)
class PostgresSink {

    @Autowired
    lateinit var props: PostgresSinkProperties

    @StreamListener(Sink.INPUT)
    fun processTweet(message: String) {
        Database.connect(props.url, user = props.user, password = props.password,
                driver = "org.postgresql.Driver")
        transaction {
            SchemaUtils.create(Messages)
            Messages.insert {
                it[Messages.message] = message
            }
        }
    }
}

object Messages : Table() {
    val id = integer("id").autoIncrement().primaryKey()
    val message = text("message")
}
PROGRAMMING MODEL: SINK (WITH KOTLIN)
MICRO ANALYTICS SERVICES
Microservice
Dashboard
Microservice …
BLUEPRINT ARCHITECTURE
ARCHITECT’S VIEW
THE SECRET OF BIG DATA PERFORMANCE
Rule 1: Be as close to the data as possible! (CPU cache > memory > local disk > network)
Rule 2: Reduce data volume as early as possible! (as long as you don’t sacrifice parallelization)
Rule 3: Parallelize as much as possible!
Rule 4: Premature diagnosability and optimization
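Rules 2 and 3 can be illustrated on a single machine with a Java parallel stream. This is a toy sketch: the class and method names are hypothetical, and a distributed engine applies the same idea across nodes rather than threads.

```java
import java.util.List;

// Hypothetical single-JVM illustration of "filter early, then parallelize".
final class EarlyReductionSketch {

    private EarlyReductionSketch() {}

    /** Counts the distinct upper-cased words that have at least minLength characters. */
    static long distinctLongWords(List<String> words, int minLength) {
        return words.parallelStream()                      // Rule 3: process elements in parallel
                .filter(w -> w.length() >= minLength)      // Rule 2: drop data before any further work
                .map(String::toUpperCase)
                .distinct()
                .count();
    }
}
```

Placing the filter first means the later steps only see the words that survive it.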
THE BIG PICTURE
http://cloud.spring.io/spring-cloud-dataflow
BASIC IDEA: BI-MODAL SOURCES AND SINKS
Sources Processors Sinks
READ FROM / WRITE TO: FILE, DATABASE, URL, …
INGEST FROM / DIGEST TO: TWITTER, MQ, LOG, …

50 Shades of K8s AutoscalingQAware GmbH
 
Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.QAware GmbH
 
Per Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API GatewaysPer Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API GatewaysQAware GmbH
 
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster QAware GmbH
 

Plus de QAware GmbH (20)

50 Shades of K8s Autoscaling #JavaLand24.pdf
50 Shades of K8s Autoscaling #JavaLand24.pdf50 Shades of K8s Autoscaling #JavaLand24.pdf
50 Shades of K8s Autoscaling #JavaLand24.pdf
 
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
 
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN MainzFully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
 
Down the Ivory Tower towards Agile Architecture
Down the Ivory Tower towards Agile ArchitectureDown the Ivory Tower towards Agile Architecture
Down the Ivory Tower towards Agile Architecture
 
"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!
 
Make Developers Fly: Principles for Platform Engineering
Make Developers Fly: Principles for Platform EngineeringMake Developers Fly: Principles for Platform Engineering
Make Developers Fly: Principles for Platform Engineering
 
Der Tod der Testpyramide? – Frontend-Testing mit Playwright
Der Tod der Testpyramide? – Frontend-Testing mit PlaywrightDer Tod der Testpyramide? – Frontend-Testing mit Playwright
Der Tod der Testpyramide? – Frontend-Testing mit Playwright
 
Was kommt nach den SPAs
Was kommt nach den SPAsWas kommt nach den SPAs
Was kommt nach den SPAs
 
Cloud Migration mit KI: der Turbo
Cloud Migration mit KI: der Turbo Cloud Migration mit KI: der Turbo
Cloud Migration mit KI: der Turbo
 
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
 Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See... Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
 
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
 
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
 
Kubernetes with Cilium in AWS - Experience Report!
Kubernetes with Cilium in AWS - Experience Report!Kubernetes with Cilium in AWS - Experience Report!
Kubernetes with Cilium in AWS - Experience Report!
 
50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling
 
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAPKontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
 
Service Mesh Pain & Gain. Experiences from a client project.
Service Mesh Pain & Gain. Experiences from a client project.Service Mesh Pain & Gain. Experiences from a client project.
Service Mesh Pain & Gain. Experiences from a client project.
 
50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling
 
Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.
 
Per Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API GatewaysPer Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API Gateways
 
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
 


Dataservices: Processing (Big) Data the Microservice Way

  • 1. DATASERVICES PROCESSING (BIG) DATA THE MICROSERVICE WAY Dr. Josef Adersberger ( @adersberger), QAware GmbH
  • 3. BIG DATA: All things distributed: ‣ distributed processing ‣ distributed databases
 SMART DATA: Data to information: ‣ machine (deep) learning ‣ advanced statistics ‣ natural language processing ‣ semantic web
 FAST DATA: Low latency and high throughput: ‣ stream processing ‣ messaging ‣ event-driven
  • 4. DATA PROCESSING: data -> information
 SYSTEM INTEGRATION: information -> blended information
 APIS: information -> systems
 UIS: information -> user
  • 6. The {big,SMART,FAST} data Swiss Army Knives
  • 7. node Distributed Data Distributed Processing Driver data flow icon credits to Nimal Raj (database), Arthur Shlain (console) and alvarobueno (takslist)
  • 9. BASIC IDEA: DATA PROCESSING WITHIN A GRAPH OF MICROSERVICES. DIRECTED GRAPH OF MICROSERVICES EXCHANGING DATA VIA MESSAGING. Diagram labels: Microservice (aka Dataservice), Message Queue, Sources, Processors, Sinks
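The graph idea on this slide can be sketched in plain Java, assuming in-memory queues stand in for the message queues and a single thread stands in for the separate microservices. The class and method names (`DataservicePipelineSketch`, `run`) are hypothetical and not taken from the talk.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;

/** Hypothetical sketch: source -> queue -> processor -> queue -> sink. */
final class DataservicePipelineSketch {

    static List<String> run(List<String> input, Function<String, String> processor) {
        // Each queue models one edge of the directed graph.
        Queue<String> sourceToProcessor = new ArrayDeque<>();
        Queue<String> processorToSink = new ArrayDeque<>();
        List<String> sinkOutput = new ArrayList<>();

        // Source: publishes every input item as a message.
        sourceToProcessor.addAll(input);

        // Processor: consumes a message, transforms it, publishes the result.
        String message;
        while ((message = sourceToProcessor.poll()) != null) {
            processorToSink.add(processor.apply(message));
        }

        // Sink: consumes the processed messages.
        while ((message = processorToSink.poll()) != null) {
            sinkOutput.add(message);
        }
        return sinkOutput;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("hello", "world"), String::toUpperCase));
    }
}
```

In a real deployment each stage would be its own process and the queues would live in a broker; the sketch only shows the direction of the data flow.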
  • 10. BASIC IDEA: COHERENT PLATFORM FOR MICRO- AND DATASERVICES CLUSTER OPERATING SYSTEM IAAS ON PREM LOCAL MICROSERVICES DATASERVICES MICROSERVICES PLATFORM DATASERVICES PLATFORM
  • 11. OPEN SOURCE DATASERVICE PLATFORMS ‣ Open source project based on the Spring stack ‣ Microservices: Spring Boot ‣ Messaging: Kafka 0.9, Kafka 0.10, RabbitMQ ‣ Standardized API with several open source implementations ‣ Microservices: JavaEE micro container ‣ Messaging: JMS ‣ Open source by Lightbend (part. commercialised & proprietary) ‣ Microservices: Lagom, Play ‣ Messaging: akka
  • 12. ARCHITECT’S VIEW - ON SPRING CLOUD DATA FLOW DATASERVICES
  • 13. BASIC IDEA: DATA PROCESSING WITHIN A GRAPH OF MICROSERVICES. DIRECTED GRAPH OF SPRING BOOT MICROSERVICES EXCHANGING DATA VIA MESSAGING. Diagram labels: Sources, Processors, Sinks, Stream, App, Message Broker, Channel
  • 14. THE BIG PICTURE SPRING CLOUD DATA FLOW SERVER (SCDF SERVER) TARGET RUNTIME SPI API LOCAL SCDF Shell SCDF Admin UI Flo Stream Designer
  • 15. THE BIG PICTURE SPRING CLOUD DATA FLOW SERVER (SCDF SERVER) TARGET RUNTIME MESSAGE BROKER APP SPRING BOOT SPRING FRAMEWORK SPRING CLOUD STREAM SPRING INTEGRATION BINDER APP APP APP CHANNELS (input/output)
  • 16. THE VEINS: SCALABLE DATA LOGISTICS WITH MESSAGING Sources Processors Sinks STREAM PARTITIONING: TO BE ABLE TO SCALE MICROSERVICES. BACK PRESSURE HANDLING: TO BE ABLE TO COPE WITH PEAKS
  • 17. STREAM PARTITIONING: input (provider) -> message partitioning -> output instances (consumer group)
 PARTITION KEY -> PARTITION SELECTOR -> PARTITION INDEX
 f(message)->field, f(field)->index, f(index)->pindex
 pindex = index % output instances
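The key -> selector -> index chain on this slide can be written out as three small functions. This is a minimal sketch, assuming a message is a map of fields and the selector is a hash of the key; the names (`PartitionSketch`, `route`) are hypothetical and this is not the Spring Cloud Stream implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of partition key -> selector -> partition index. */
final class PartitionSketch {

    /** f(message) -> field: extract the partition key from the message. */
    static Object partitionKey(Map<String, Object> message, String field) {
        return message.get(field);
    }

    /** f(field) -> index: map the key to a non-negative integer. */
    static int selectorIndex(Object key) {
        // Mask the sign bit so the later modulo never yields a negative value.
        return (key == null ? 0 : key.hashCode()) & Integer.MAX_VALUE;
    }

    /** f(index) -> pindex: pindex = index % output instances. */
    static int partitionIndex(int index, int outputInstances) {
        return index % outputInstances;
    }

    static int route(Map<String, Object> message, String field, int outputInstances) {
        return partitionIndex(selectorIndex(partitionKey(message, field)), outputInstances);
    }

    public static void main(String[] args) {
        Map<String, Object> tweet = new HashMap<>();
        tweet.put("id", 42);
        tweet.put("lang", "en");
        // Integer 42 hashes to 42, and 42 % 8 = 2.
        System.out.println(route(tweet, "id", 8)); // prints 2
    }
}
```

Messages with the same key always land on the same output instance, which is what lets a stateful processor be scaled out.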
  • 18. BACK PRESSURE HANDLING 1. Signals if (message) pressure is too high 2. Regulates inbound (message) flow 3. (Data) retention lake
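The three roles on this slide can be illustrated with a bounded buffer. This is a rough single-threaded sketch, assuming the buffer plays the retention role, a rejected `offer` is the pressure signal, and the caller regulates its own inbound flow. The class `BackPressureSketch` and its methods are hypothetical; a real broker behaves differently in detail.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch of back pressure with a bounded buffer. */
final class BackPressureSketch {
    private final BlockingQueue<String> buffer; // (3) bounded retention buffer
    private int rejected = 0;

    BackPressureSketch(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    /** (1) Signal: returns false when the buffer is full, i.e. pressure is too high. */
    boolean tryPublish(String message) {
        boolean accepted = buffer.offer(message);
        if (!accepted) {
            rejected++;
        }
        return accepted;
    }

    /** Consumer side: takes the oldest message, or null if none is pending. */
    String consume() {
        return buffer.poll();
    }

    int rejectedCount() {
        return rejected;
    }

    public static void main(String[] args) {
        BackPressureSketch bp = new BackPressureSketch(2);
        bp.tryPublish("a");
        bp.tryPublish("b");
        // (2) Regulation: the source holds back "c" until the consumer has caught up.
        if (!bp.tryPublish("c")) {
            bp.consume();
            bp.tryPublish("c");
        }
        System.out.println("rejected once: " + (bp.rejectedCount() == 1));
    }
}
```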
  • 19. DISCLAIMER: THERE IS ALSO A TASK EXECUTION MODEL (WE WILL IGNORE) ‣ short-lived ‣ finite data set ‣ programming model = Spring Cloud Task ‣ starters available for JDBC and Spark as data source/sink
  • 20. CONNECTED CAR PLATFORM. Diagram labels: EDGE SERVICE, MQTT Broker (apigee Link), MQTT Source, Data Cleansing, Realtime traffic analytics, KPI ANALYTICS, Spark, DASHBOARD, react-vis, Presto, Masterdata Blending, Camel, Kafka, Kafka, ESB, gRPC
  • 21. DEVELOPER’S VIEW - ON SPRING CLOUD DATA FLOW DATASERVICES
  • 22. ASSEMBLING A STREAM ▸ App starters: A set of pre-built apps aka dataservices ▸ Composition of apps with Linux-style pipe syntax: http | magichappenshere | log (http, log: Starter app; magichappenshere: Custom app)
  • 23. MORE PIPES (image: https://www.pinterest.de/pin/272116002461148164)
 with parameters:
     twitterstream --consumerKey=<CONSUMER_KEY> --consumerSecret=<CONSUMER_SECRET> --accessToken=<ACCESS_TOKEN> --accessTokenSecret=<ACCESS_TOKEN_SECRET> | log
 with explicit input channel & analytics:
     :tweets.twitterstream > field-value-counter --fieldName=lang --name=language
 with SpEL expression and explicit output type:
     :tweets.twitterstream > filter --expression=#jsonPath(payload,'$.lang')=='en' --outputType=application/json
  • 24. OUR SAMPLE APPLICATION: WORLD MOOD https://github.com/adersberger/spring-cloud-dataflow-samples Diagram labels: Starter app, Custom app, twitterstream, filter (lang=en), log, twitter ingester (test data), tweet extractor (text), sentiment analysis (StanfordNLP), field-value-counter
  • 25. DEVELOPING CUSTOM APPS: THE VERY BEGINNING https://start.spring.io
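  • 26. PROGRAMMING MODEL: SOURCE
 import java.util.Iterator;
 import org.springframework.boot.autoconfigure.SpringBootApplication;
 import org.springframework.cloud.stream.annotation.EnableBinding;
 import org.springframework.cloud.stream.messaging.Source;
 import org.springframework.context.annotation.Bean;
 import org.springframework.integration.annotation.InboundChannelAdapter;
 import org.springframework.integration.annotation.Poller;
 import org.springframework.integration.core.MessageSource;
 import org.springframework.messaging.support.GenericMessage;

 @SpringBootApplication
 @EnableBinding(Source.class)
 public class TwitterIngester {

     private Iterator<String> lines;

     // Polled every 200 ms; each poll emits one tweet on the default output channel.
     @Bean
     @InboundChannelAdapter(value = Source.OUTPUT,
             poller = @Poller(fixedDelay = "200", maxMessagesPerPoll = "1"))
     public MessageSource<String> twitterMessageSource() {
         return () -> new GenericMessage<>(emitTweet());
     }

     // Restarts from the beginning of the test data once all tweets are consumed.
     private String emitTweet() {
         if (lines == null || !lines.hasNext()) lines = readTweets();
         return lines.next();
     }

     private Iterator<String> readTweets() {
         //…
     }
 }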
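  • 27. PROGRAMMING MODEL: SOURCE TESTING
 import static org.junit.Assert.assertTrue;

 import org.junit.Test;
 import org.junit.runner.RunWith;
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.boot.test.context.SpringBootTest;
 import org.springframework.cloud.stream.messaging.Source;
 import org.springframework.cloud.stream.test.binder.MessageCollector;
 import org.springframework.messaging.Message;
 import org.springframework.test.context.junit4.SpringRunner;

 @RunWith(SpringRunner.class)
 @SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
 public class TwitterIngesterTest {

     @Autowired
     private Source source;

     @Autowired
     private MessageCollector collector;

     @Test
     @SuppressWarnings("unchecked")
     public void tweetIngestionTest() throws InterruptedException {
         // Take 100 messages from the output channel and check each has a payload.
         for (int i = 0; i < 100; i++) {
             Message<String> message =
                     (Message<String>) collector.forChannel(source.output()).take();
             // assertTrue instead of the assert keyword, which is a no-op unless the JVM runs with -ea.
             assertTrue(message.getPayload().length() > 0);
         }
     }
 }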
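  • 28. PROGRAMMING MODEL: PROCESSOR (WITH STANFORD NLP)
 import edu.stanford.nlp.ling.CoreAnnotations;
 import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
 import edu.stanford.nlp.pipeline.Annotation;
 import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
 import edu.stanford.nlp.trees.Tree;
 import edu.stanford.nlp.util.CoreMap;
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.boot.autoconfigure.SpringBootApplication;
 import org.springframework.cloud.stream.annotation.EnableBinding;
 import org.springframework.cloud.stream.annotation.StreamListener;
 import org.springframework.cloud.stream.messaging.Processor;
 import org.springframework.messaging.handler.annotation.SendTo;
 import org.springframework.tuple.Tuple;
 import org.springframework.tuple.TupleBuilder;

 @SpringBootApplication
 @EnableBinding(Processor.class)
 public class TweetSentimentProcessor {

     @Autowired
     StanfordNLP nlp;

     // Returns a Tuple (the original slide declared int, which does not match the returned value).
     @StreamListener(Processor.INPUT) // input channel with default name
     @SendTo(Processor.OUTPUT)        // output channel with default name
     public Tuple analyzeSentiment(String tweet) {
         return TupleBuilder.tuple().of("mood", findSentiment(tweet));
     }

     // The sentiment of the longest sentence is taken as the sentiment of the tweet.
     public int findSentiment(String tweet) {
         int mainSentiment = 0;
         if (tweet != null && tweet.length() > 0) {
             int longest = 0;
             Annotation annotation = nlp.process(tweet);
             for (CoreMap sentence : annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
                 Tree tree = sentence.get(SentimentCoreAnnotations.SentimentAnnotatedTree.class);
                 int sentiment = RNNCoreAnnotations.getPredictedClass(tree);
                 String partText = sentence.toString();
                 if (partText.length() > longest) {
                     mainSentiment = sentiment;
                     longest = partText.length();
                 }
             }
         }
         return mainSentiment;
     }
 }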
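  • 29. PROGRAMMING MODEL: PROCESSOR TESTING
 import static org.hamcrest.Matchers.equalTo;
 import static org.junit.Assert.assertThat;
 import static org.springframework.cloud.stream.test.matcher.MessageQueueMatcher.receivesPayloadThat;

 import org.junit.Test;
 import org.junit.runner.RunWith;
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.boot.test.context.SpringBootTest;
 import org.springframework.cloud.stream.messaging.Processor;
 import org.springframework.cloud.stream.test.binder.MessageCollector;
 import org.springframework.messaging.support.GenericMessage;
 import org.springframework.test.context.junit4.SpringRunner;
 import org.springframework.tuple.TupleBuilder;

 @RunWith(SpringRunner.class)
 @SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
 public class TweetSentimentProcessorTest {

     @Autowired
     private Processor processor;

     @Autowired
     private MessageCollector collector;

     @Autowired
     private TweetSentimentProcessor sentimentProcessor;

     @Test
     public void testAnalysis() {
         checkFor("I hate everybody around me!");
         checkFor("The world is lovely");
         checkFor("I f***ing hate everybody around me. They're from hell");
         checkFor("Sunny day today!");
     }

     // Sends a message to the input channel and expects the matching mood tuple on the output channel.
     private void checkFor(String msg) {
         processor.input().send(new GenericMessage<>(msg));
         assertThat(
                 collector.forChannel(processor.output()),
                 receivesPayloadThat(
                         equalTo(TupleBuilder.tuple().of("mood", sentimentProcessor.findSentiment(msg)))));
     }
 }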
  • 30. DEVELOPING THE STREAM DEFINITIONS WITH FLO http://projects.spring.io/spring-flo/
  • 31. RUNNING IT LOCAL RUNNING THE DATASERVICES
 $ redis-server &
 $ zookeeper-server-start.sh ./config/zookeeper.properties &
 $ kafka-server-start.sh ./config/server.properties &
 $ java -jar spring-cloud-dataflow-server-local-1.2.0.RELEASE.jar &
 $ java -jar ../spring-cloud-dataflow-shell-1.2.0.RELEASE.jar
 dataflow:> app import --uri [1]
 dataflow:> app register --name tweetsentimentalyzer --type processor --uri file:///libs/worldmoodindex-0.0.2-SNAPSHOT.jar
 dataflow:> stream create tweets-ingestion --definition "twitterstream --consumerKey=A --consumerSecret=B --accessToken=C --accessTokenSecret=D | filter --expression=#jsonPath(payload,'$.lang')=='en' | log" --deploy
 dataflow:> stream create tweets-analyzer --definition ":tweets-ingestion.filter > tweetsentimentalyzer | field-value-counter --fieldName=mood --name=Mood"
 dataflow:> stream deploy tweets-analyzer --properties "deployer.tweetsentimentalyzer.memory=1024m,deployer.tweetsentimentalyzer.count=8,app.transform.producer.partitionKeyExpression=payload.id"
 [1] http://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.kafka-10-apps-maven-repo-url.properties
  • 33. RUNNING IT IN THE CLOUD RUNNING THE DATASERVICES
 $ git clone https://github.com/spring-cloud/spring-cloud-dataflow-server-kubernetes
 $ kubectl create -f src/etc/kubernetes/kafka-zk-controller.yml
 $ kubectl create -f src/etc/kubernetes/kafka-zk-service.yml
 $ kubectl create -f src/etc/kubernetes/kafka-controller.yml
 $ kubectl create -f src/etc/kubernetes/mysql-controller.yml
 $ kubectl create -f src/etc/kubernetes/mysql-service.yml
 $ kubectl create -f src/etc/kubernetes/kafka-service.yml
 $ kubectl create -f src/etc/kubernetes/redis-controller.yml
 $ kubectl create -f src/etc/kubernetes/redis-service.yml
 $ kubectl create -f src/etc/kubernetes/scdf-config-kafka.yml
 $ kubectl create -f src/etc/kubernetes/scdf-secrets.yml
 $ kubectl create -f src/etc/kubernetes/scdf-service.yml
 $ kubectl create -f src/etc/kubernetes/scdf-controller.yml
 $ kubectl get svc # look up the external IP of "scdf": <IP>
 $ java -jar ../spring-cloud-dataflow-shell-1.2.0.RELEASE.jar
 dataflow:> dataflow config server --uri http://<IP>:9393
 dataflow:> app import --uri [2]
 dataflow:> app register --type processor --name tweetsentimentalyzer --uri docker:qaware/tweetsentimentalyzer-processor:latest
 dataflow:> …
 [2] http://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.stream-apps-kafka-09-docker
  • 35. PRO: specialized programming model -> efficient; specialized execution environment -> efficient; support for all types of data (big, fast, smart)
 CON: disjoint programming model (data processing <-> services); maybe a disjoint execution environment (data stack <-> service stack)
 BEST USED further on: as default for {big,fast,smart} data processing
  • 36. PRO: coherent execution environment (runs on microservice stack); coherent programming model with emphasis on separation of concerns; basically supports all types of data (big, fast, smart)
 CON: has limitations on throughput (big & fast data) due to less optimization (like data affinity, query optimizer, …) and message-wise processing; technology immature in certain parts (e.g. diagnosability)
 BEST USED FOR: hybrid applications of data processing, system integration, API, UI; moderate throughput data applications with existing dev team
 Message by message processing
  • 37. TWITTER.COM/QAWARE - SLIDESHARE.NET/QAWARE Thank you! Questions? josef.adersberger@qaware.de @adersberger https://github.com/adersberger/spring-cloud-dataflow-samples
  • 39. MORE… ▸ Reactive programming ▸ Diagnosability
 public Flux<String> transform(@Input("input") Flux<String> input) {
     return input.map(s -> s.toUpperCase());
 }
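  • 40. PROGRAMMING MODEL: SINK (WITH KOTLIN)
 import org.jetbrains.exposed.sql.Database
 import org.jetbrains.exposed.sql.SchemaUtils
 import org.jetbrains.exposed.sql.Table
 import org.jetbrains.exposed.sql.insert
 import org.jetbrains.exposed.sql.transactions.transaction
 import org.springframework.beans.factory.annotation.Autowired
 import org.springframework.boot.context.properties.EnableConfigurationProperties
 import org.springframework.cloud.stream.annotation.EnableBinding
 import org.springframework.cloud.stream.annotation.StreamListener
 import org.springframework.cloud.stream.messaging.Sink

 @EnableBinding(Sink::class)
 @EnableConfigurationProperties(PostgresSinkProperties::class)
 class PostgresSink {

     @Autowired
     lateinit var props: PostgresSinkProperties

     // Writes every incoming message as one row into the Messages table.
     @StreamListener(Sink.INPUT)
     fun processTweet(message: String) {
         Database.connect(props.url, user = props.user,
                 password = props.password, driver = "org.postgresql.Driver")
         transaction {
             SchemaUtils.create(Messages)
             Messages.insert { it[Messages.message] = message }
         }
     }
 }

 object Messages : Table() {
     val id = integer("id").autoIncrement().primaryKey()
     val message = text("message")
 }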
  • 43. ARCHITECT’S VIEW THE SECRET OF BIG DATA PERFORMANCE
 Rule 1: Be as close to the data as possible! (CPU cache > memory > local disk > network)
 Rule 2: Reduce data volume as early as possible! (as long as you don’t sacrifice parallelization)
 Rule 3: Parallelize as much as possible!
 Rule 4: Premature diagnosability and optimization
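Rules 2 and 3 can be illustrated with a small Java streams sketch, assuming tweets are given as hypothetical `{lang, text}` pairs. Filtering comes first so that later stages see less data, and the stream is parallel so the remaining work is spread over the available cores. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of "reduce early" (Rule 2) and "parallelize" (Rule 3). */
final class EarlyReductionSketch {

    /** Sums the text length of all English tweets; each tweet is {lang, text}. */
    static long totalLengthOfEnglish(List<String[]> tweets) {
        return tweets.parallelStream()               // Rule 3: process in parallel
                .filter(t -> "en".equals(t[0]))      // Rule 2: drop non-English tweets first
                .mapToLong(t -> t[1].length())       // only the reduced set reaches this stage
                .sum();
    }

    public static void main(String[] args) {
        List<String[]> tweets = Arrays.asList(
                new String[] {"en", "hello"},
                new String[] {"de", "hallo welt"},
                new String[] {"en", "hi"});
        // "hello" (5) + "hi" (2) = 7
        System.out.println(totalLengthOfEnglish(tweets)); // prints 7
    }
}
```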
  • 45. BASIC IDEA: BI-MODAL SOURCES AND SINKS Sources Processors Sinks READ FROM / WRITE TO: FILE, DATABASE, URL, … INGEST FROM / DIGEST TO: TWITTER, MQ, LOG, …