3. Origins and Goals
A project created by LinkedIn's architecture team. Its goal is to
provide a high-throughput, easily scalable messaging infrastructure.
4. But... How?
The message servers (aka brokers) concern themselves only with
maintaining the state of the Kafka cluster;
To the outside world, the brokers expose only a view of topics and
offsets, so that clients can keep their own state.
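The broker/client split above can be sketched as a toy model in Scala: the "broker" only stores entries per topic and serves reads by offset, while the client remembers how far it has read. All names here (`ToyBroker`, `ToyClient`, `pollOne`) are illustrative, not part of Kafka's API.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy model: the broker keeps entries per topic and answers reads by
// (topic, offset); it knows nothing about any client's progress.
class ToyBroker {
  private val topics = scala.collection.mutable.Map.empty[String, ArrayBuffer[String]]

  def publish(topic: String, entry: String): Unit =
    topics.getOrElseUpdate(topic, ArrayBuffer.empty) += entry

  def fetch(topic: String, offset: Int): Option[String] =
    topics.get(topic).filter(offset < _.size).map(_(offset))
}

// The client's only state is the offset of the next entry to read.
class ToyClient(broker: ToyBroker, topic: String) {
  private var offset = 0

  def pollOne(): Option[String] = {
    val entry = broker.fetch(topic, offset)
    if (entry.isDefined) offset += 1
    entry
  }
}
```

Because the broker tracks nothing per client, any number of independent readers can consume the same topic at their own pace.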
6. Log? Log as in log files?
$ sudo tail -n 5 -f /var/log/syslog
Jun 4 19:37:07 inception systemd[1]: Time has been ch ...
Jun 4 19:37:07 inception systemd[1]: apt-daily.timer: ...
Jun 4 19:39:13 inception systemd[1]: Started Cleanup ...
Jun 4 19:40:01 inception CRON[5892]: (root) CMD (test ...
Jun 4 19:44:51 inception org.kde.kaccessibleapp[6056] ...
Jun 4 19:49:02 inception ntpd[711]: receive: Unexpect ...
Jun 4 19:55:31 inception kernel: [11996.667253] hrtim ...
9. What is a log?
[Diagram: an append-only, time-ordered sequence of entries: the first
entry, then entries at times (t), (t+1), (t+2), ..., (t+n), with the
next entry always appended at the end.]
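The structure in the diagram can be captured in a few lines of Scala: an append-only sequence where each entry is addressed by its offset, its position in time. The names (`AppendOnlyLog`, `append`, `read`) are illustrative, not Kafka's API.

```scala
import scala.collection.mutable.ArrayBuffer

// Minimal in-memory append-only log: entries are only ever added at the
// end, and each entry is addressed by its offset.
class AppendOnlyLog[A] {
  private val entries = ArrayBuffer.empty[A]

  /** Append an entry and return its offset. */
  def append(entry: A): Long = {
    entries += entry
    entries.size - 1L
  }

  /** Read the entry stored at a given offset, if present. */
  def read(offset: Long): Option[A] =
    if (offset >= 0 && offset < entries.size) Some(entries(offset.toInt))
    else None
}
```

Usage: `log.append("first entry")` returns offset 0, the next append returns 1, and `log.read(0)` always yields the same entry, since nothing is ever rewritten.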
10. Where are logs used?
Whenever we need to record what happened and when it
happened...
11. Where are logs used?
Database systems
– PostgreSQL's Write-Ahead Logging (WAL)
– Oracle 10g/11g Redo Log
Version control systems
– Git, The Stupid Content Tracker
– Subversion
12. Git's log: a log, but not quite
$ git log --oneline
498d410 Fixes message format and adds some logging
e09c955 Enhances the ContainerStub with a StoredObject stub
18fe603 Puts topic name in a configuration companion object
89d9c5d Separates consumer from producer configuration
a9f1a76 Adds a provisory container stub
d800dfe Creates consumer configuration
fa4da8e Removes trash
4808450 Adds kafka producer
333b14f Let there be light
13. Git's reflog: a true log
$ git reflog
498d410 HEAD@{0}: commit (amend): Fixes message format and adds some ...
9147167 HEAD@{1}: commit (amend): Fixes message format and adds some ...
97d8661 HEAD@{2}: commit: Fixes message format and adds some logging
e09c955 HEAD@{3}: commit: Enhances the ContainerStub with a StoredOb ...
18fe603 HEAD@{4}: rebase finished: returning to refs/heads/master
18fe603 HEAD@{5}: rebase: checkout refs/remotes/origin/master
d800dfe HEAD@{6}: rebase finished: returning to refs/heads/master
d800dfe HEAD@{7}: rebase: Creates consumer configuration
fa4da8e HEAD@{8}: rebase: checkout refs/remotes/origin/master
701b3e6 HEAD@{9}: commit: Creates consumer configuration
4808450 HEAD@{10}: commit: Adds kafka producer
333b14f HEAD@{11}: clone: from https://pedroarthur@bitbucket.org/ped ...
21. Kafka overview
[Diagram: producers write to the Kafka cluster, whose brokers are
backed by block storage and coordinated by a Zookeeper cluster;
consumers in Consumer Groups A, B and C read from it.]
22. "Apache Zookeeper is an ... server which enables highly reliable
distributed coordination"
Apache Zookeeper's home page
23. Zookeeper
From the point of view of its programming interfaces, Zookeeper is a
"file system with guarantees"; for example:
– If different clients try to create the same file, only one of them
receives a positive acknowledgement;
– If different clients try to change a file, only one of them receives
the write acknowledgement, and the others must retry the
operation.
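The "only one writer wins, the others retry" guarantee behaves like a versioned compare-and-set. A sketch of that semantics in plain Scala, as an analogy only: `ToyZNode`, `Versioned` and `write` are invented names for illustration, not the real Zookeeper client API.

```scala
import java.util.concurrent.atomic.AtomicReference

// Each read returns the data together with a version number.
final case class Versioned(data: String, version: Int)

// Toy model of a versioned Zookeeper-style write: a write only succeeds
// if the caller passes the version it last read; a concurrent writer
// bumps the version, forcing everyone else to re-read and retry.
class ToyZNode(initial: String) {
  private val state = new AtomicReference(Versioned(initial, 0))

  def read(): Versioned = state.get()

  /** Returns true iff nobody else wrote since `expectedVersion`. */
  def write(data: String, expectedVersion: Int): Boolean = {
    val current = state.get()
    current.version == expectedVersion &&
      state.compareAndSet(current, Versioned(data, current.version + 1))
  }
}
```

If two clients both read version 0 and then both write, exactly one `write` returns true; the loser must call `read()` again before retrying.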
24. Anatomy of a Topic
[Diagram: a topic split into Partition 0 @ broker_a, Partition 1 @
broker_b and Partition 2 @ broker_c; producers write to the topic's
partitions, while Consumers 0, 1 and 2 of Consumer Group A read
from them.]
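Which partition a record lands in is typically derived from its key, so that all records with the same key stay in the same partition and keep their relative order. A sketch of the idea; note that Kafka's default partitioner hashes the serialized key with murmur2, while `hashCode` here is only for illustration:

```scala
// Key-based partitioning sketch: the same key always maps to the same
// partition. floorMod keeps the result non-negative even when hashCode
// is negative.
def partitionFor(key: String, numPartitions: Int): Int =
  Math.floorMod(key.hashCode, numPartitions)
```

This is why per-key ordering holds within a partition but no total order exists across the whole topic.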
26. Anonymized xD
[Diagram: Data Sources A, B1 and B2 feed per-source transformation
steps (with a Flume Master Node coordinating ingestion) into an
A Topic and a B Topic; an A/B data aggregation step produces an A/B
Aggregation Database and an A/B Aggregation Topic, which feed a
Data Enrichment Pipeline and a Data Analysis Engine with its Analysis
Database, exposed through a Query API and a Publish/Subscribe
Integration API.]
28. Producer in Scala (>= 0.9.0.0)
val properties = new Properties() {
  put("bootstrap.servers", "broker1:9092,broker2:9092")
  put("key.serializer",
    "org.apache.kafka.common.serialization.StringSerializer")
  put("value.serializer",
    "org.apache.kafka.common.serialization.StringSerializer")
  put("acks", "1")
  put("retries", "0")
  /* any colour you like */
}

val producer = new KafkaProducer[String, String](properties)
29. Producer in Scala (>= 0.9.0.0)
val message = new ProducerRecord[String, String](
  "the_destination_topic", "entry key", "your data")

/* just try to send data */
val future: Future[RecordMetadata] = producer.send(message)

/* try to send data and call me back after it */
val futureAndCallback: Future[RecordMetadata] =
  producer.send(message,
    new Callback() {
      def onCompletion(
          metadata: RecordMetadata, exception: Exception) {
        /* (metadata XOR exception) is non-null :( */
      }
    })

producer.close() /* release */
30. Consumer in Scala (>= 0.9.0.0)
val properties = new Properties() {
  put("bootstrap.servers", "broker1:9092,broker2:9092")
  put("group.id", "the_group_id")
  put("enable.auto.commit", "true")
  put("auto.commit.interval.ms", "1000")
  put("key.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer")
  put("value.deserializer",
    "org.apache.kafka.common.serialization.StringDeserializer")
  /* any colour you like */
}

val consumer = new KafkaConsumer[String, String](properties)
31. Consumer in Scala (>= 0.9.0.0)
/* subscribe to as many topics as you like */
consumer.subscribe(Arrays.asList("the_destination_topic"))

while (true) {
  val records: /* argument is the timeout in millis */
    ConsumerRecords[String, String] = consumer.poll(100)

  records foreach {
    record: ConsumerRecord[String, String] =>
      log.info(s"${record.topic()} is at ${record.offset()}")
  }
}
33. Jay Kreps' I Heart Logs
"Why a book about logs? That's easy: the humble log is an
abstraction that lies at the heart of many systems, from NoSQL
databases to cryptocurrencies. Even though most engineers don't
think much about them, this short book shows you why logs are
worthy of your attention ..."
Release Date: October 2014
35. Kafkabox: file synchronization
A small proof-of-concept project using Kafka, OpenStack Swift and
inotify; clone it with the following command:
$ git clone http://bitbucket.com/pedroarthur/kafkabox/
36. Implementation
– Inotify: listen for every change in the directory;
– For each change,
– Send the change to Swift via HTTP;
– Write the change to the Kafka topic.
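The "listen for every change" step can be sketched with `java.nio.file.WatchService`, a portable stand-in for inotify (on Linux it is in fact backed by inotify). The Swift upload and the Kafka produce are elided; this only shows the watch loop:

```scala
import java.nio.file.{FileSystems, Files, StandardWatchEventKinds}
import java.util.concurrent.TimeUnit

// Watch a directory for create/modify/delete events. Here we watch a
// temporary directory and trigger one event ourselves to demonstrate.
val dir = Files.createTempDirectory("kafkabox-demo")
val watcher = FileSystems.getDefault.newWatchService()
dir.register(watcher,
  StandardWatchEventKinds.ENTRY_CREATE,
  StandardWatchEventKinds.ENTRY_MODIFY,
  StandardWatchEventKinds.ENTRY_DELETE)

// Simulate a change in the watched directory
Files.createFile(dir.resolve("note.txt"))

// Block (up to 5s) until the event arrives, then handle it
val key = watcher.poll(5, TimeUnit.SECONDS)
if (key != null) {
  key.pollEvents().forEach { event =>
    println(s"${event.kind()}: ${event.context()}")
    // here: PUT the file to Swift, then producer.send(...) to the topic
  }
  key.reset() // re-arm the key so further events are delivered
}
watcher.close()
```

In the real project the loop would run forever, and each event would drive both the HTTP upload and the Kafka write from the steps above.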