2. What’s ElasticSearch?
• “flexible and powerful open source, distributed real-time search and analytics engine for the cloud”
• http://www.elasticsearch.org/
3. What’s ElasticSearch?
• JSON-oriented;
• RESTful API;
• Schema free.
MySQL               ElasticSearch
database            index
table               type
column              field
defined data type   auto-detected
4. What’s ElasticSearch?
• Master nodes & data nodes;
• Replicas and shards are organized automatically;
• Asynchronous transport between nodes.
5. What’s ElasticSearch?
• Refresh every 1 second (newly indexed documents become searchable after a refresh).
6. What’s ElasticSearch?
• Built on Apache Lucene.
• Also has facets, just like Solr.
7. What’s ElasticSearch?
• Give the cluster a name; nodes discover each other automatically by unicast/multicast ping or EC2 key.
• No ZooKeeper needed.
11. Howto Curl
• Query
• Term => { match an exact term (query text is not analyzed) }
• Match => { full-text match (query text is analyzed) }
• Prefix => { match field prefix (not analyzed) }
• Range => { from, to }
• Regexp => { .* }
• Query_string => { this AND that OR thus }
• Must/must_not => {query}
• Should => [{query},{}]
• Bool => {must,must_not,should,…}
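The clauses listed above nest inside one another, so a request body is easiest to read as a Perl data structure. The following is a minimal sketch that only builds and prints the JSON; the helper name `build_bool_query` and the field names (`user`, `message`, `post_date`) are assumptions made for illustration, and nothing is sent to a server.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: combine term, range, match and prefix clauses
# inside a bool query. Field names are illustrative only.
sub build_bool_query {
    my (%args) = @_;
    return {
        query => {
            bool => {
                must     => [ { term => { user => $args{user} } } ],
                must_not => [ { range => { post_date => { to => $args{before} } } } ],
                should   => [
                    { match  => { message => $args{text} } },
                    { prefix => { message => $args{prefix} } },
                ],
            },
        },
    };
}

my $body = build_bool_query(
    user   => 'kimchy',
    before => '2009-01-01',
    text   => 'elastic search',
    prefix => 'elas',
);

# canonical() sorts hash keys so the output is stable between runs.
print JSON::PP->new->canonical->pretty->encode($body);
```

The printed JSON could be passed as the `-d` argument of a curl `_search` call.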
12. Howto Curl
• Filter
$ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=1' -d '{
  "query" : {
    "match_all" : {}
  },
  "filter" : {
    "term" : { "user" : "kimchy" }
  }
}'
Much faster, because filters are cacheable and do not calculate _score.
13. Howto Curl
• Filter
• And => [{filter},{filter}] (only two)
• Not => {filter}
• Or => [{filter},{filter}](only two)
• Script => {"script": "doc['field'].value > 10"}
• Others are like the query DSL
17. Howto Perl – ElasticSearch.pm
use ElasticSearch;
my $es = ElasticSearch->new(
    servers      => 'search.foo.com:9200',
    transport    => 'httptiny',
    max_requests => 10_000,
    trace_calls  => 'log_file',
    no_refresh   => 0,    # or 1
);
• Gets the node list from the /_cluster API of $servers;
• Switches to another random node after $max_requests requests.
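The node-switching behaviour described above can be pictured with a small stand-alone class. This is only a sketch of the idea, not ElasticSearch.pm's actual code: the `NodePool` package, its methods and the host names are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the client's node handling.
package NodePool;

sub new {
    my ( $class, %args ) = @_;
    my $self = bless {
        nodes        => [ @{ $args{nodes} } ],
        max_requests => $args{max_requests},
        count        => 0,
    }, $class;
    $self->{current} = $self->_pick;
    return $self;
}

# Pick a random node from the list.
sub _pick {
    my ($self) = @_;
    my $nodes = $self->{nodes};
    return $nodes->[ int rand @$nodes ];
}

# Return the node for the next request; pick again once
# max_requests requests have gone to the current node.
sub next_node {
    my ($self) = @_;
    if ( $self->{count} >= $self->{max_requests} ) {
        $self->{current} = $self->_pick;
        $self->{count}   = 0;
    }
    $self->{count}++;
    return $self->{current};
}

package main;

my $pool = NodePool->new(
    nodes        => [ 'es1:9200', 'es2:9200', 'es3:9200' ],    # made-up hosts
    max_requests => 3,
);
for my $i ( 1 .. 7 ) {
    my $node = $pool->next_node;
    print "request $i -> $node\n";
}
```

Because the pick is random, the same node can be chosen twice in a row; the sketch only guarantees that a new pick happens every `max_requests` requests.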
18. Howto Perl – ElasticSearch.pm
$es->index(
index => 'twitter',
type => 'tweet',
id => 1,
data => {
user => 'kimchy',
post_date => '2009-11-15T14:12:12',
message => 'trying out Elastic Search'
}
);
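Since the API is RESTful, the call above corresponds to a single HTTP request. The sketch below shows that mapping by printing the equivalent curl command; `index_request` is a hypothetical helper written for this illustration (it is not part of ElasticSearch.pm), and `localhost:9200` is an assumed address.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: turn index/type/id/data into the method, path
# and JSON body of the matching REST call.
sub index_request {
    my (%args) = @_;
    my $path = join '/', '', @args{qw(index type id)};
    my $body = JSON::PP->new->canonical->encode( $args{data} );
    return ( 'PUT', $path, $body );
}

my ( $method, $path, $body ) = index_request(
    index => 'twitter',
    type  => 'tweet',
    id    => 1,
    data  => {
        user      => 'kimchy',
        post_date => '2009-11-15T14:12:12',
        message   => 'trying out Elastic Search',
    },
);

# Print the request as a curl command instead of sending it.
print "curl -X${method} 'http://localhost:9200${path}' -d '${body}'\n";
```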
20. Howto Perl – ElasticSearch.pm
$es->search(
facets => {
wow_facet => {
queryb => { content => 'wow' },
facet_filterb => { status => 'active' },
}
}
)
• ElasticSearch::SearchBuilder (the syntax behind the queryb/facet_filterb keys):
• more Perlish;
• SQL::Abstract-like;
• but I don’t like it.
21. Howto Perl – Elastic::Model
• Tie a Moose object to elasticsearch
package MyApp;
use Elastic::Model;
has_namespace 'myapp' => {
user => 'MyApp::User'
};
no Elastic::Model;
1;
22. Howto Perl – Elastic::Model
package MyApp::User;
use Elastic::Doc;
use DateTime;
has 'name' => (
is => 'rw',
isa => 'Str',
);
has 'email' => (
is => 'rw',
isa => 'Str',
);
has 'created' => (
is => 'ro',
isa => 'DateTime',
default => sub { DateTime->now }
);
no Elastic::Doc;
1;
23. Howto Perl – Elastic::Model
package MyApp::User;
use Moose;
use DateTime;
has 'name' => (
is => 'rw',
isa => 'Str',
);
has 'email' => (
is => 'rw',
isa => 'Str',
);
has 'created' => (
is => 'ro',
isa => 'DateTime',
default => sub { DateTime->now }
);
no Moose;
1;
24. Howto Perl – Elastic::Model
• Connect to db
my $es = ElasticSearch->new( servers => 'localhost:9200' );
my $model = MyApp->new( es => $es );
• Create database and table
$model->namespace('myapp')->index->create();
• CRUD
my $domain = $model->domain('myapp');
$domain->new_doc();    # or $domain->get()
• Search
my $search = $domain->view->type('user')->query(…)->filterb(…);
my $results = $search->search;
say "Total results found: ".$results->total;
while (my $doc = $results->next_doc) {
say $doc->name;
}
25. ES for Dev -- Github
• 20 TB of data;
• 1,300,000,000 files;
• 130,000,000,000 lines of code.
• Uses 26 Elasticsearch storage nodes (each with 2 TB of SSD), managed by Puppet.
• 1 replica + 20 shards.
• https://github.com/blog/1381-a-whole-new-code-search
• https://github.com/blog/1397-recent-code-search-outages
26. ES for Dev – Git::Search
• Thank you, Mateu Hunter!
• https://github.com/mateu/Git-Search
cpanm --installdeps .
cp git-search.conf git-search-local.conf
edit git-search-local.conf
perl -Ilib bin/insert_docs.pl
plackup -Ilib
curl http://localhost:5000/text_you_want
27. ES for Perler -- Metacpan
• search.cpan.org => metacpan.org
• uses ElasticSearch as the API backend;
• uses Catalyst to build the website frontend.
• Learn API:
https://github.com/CPAN-API/cpan-api/wiki/API-docs
• Have a try:
http://explorer.metacpan.org/
28. ES for Perler – index-weekly
• A Perl script (55 lines) that indexes devopsweekly into elasticsearch.
• https://github.com/alcy/index-weekly
• We can do the same thing for perlweekly, right?
29. ES for logging - Logstash
• “logstash is a tool for managing events
and logs. You can use it to collect logs,
parse them, and store them for later use.”
• http://logstash.net/
30. ES for logging - Logstash
• A log is a stream, not a file!
• An event is not necessarily a single line!
31. ES for logging - Logstash
• file/*mq/stdin/tcp/udp/websocket… (34 input plugins now)
32. ES for logging - Logstash
• date/geoip/grok/multiline/mutate… (29 filter plugins now)
33. ES for logging - Logstash
• transfer: stdout/*mq/tcp/udp/file/websocket…
• alert: ganglia/nagios/opentsdb/graphite/irc/xmpp/email…
• store: elasticsearch/mongodb/riak
• (47 output plugins now)
36. ES for logging - Logstash
• Grok(Regexp capture):
%{IP:client:string}
%{NUMBER:bytes:int}
More default patterns at source:
https://github.com/logstash/logstash/tree/master/patterns
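Grok's `%{PATTERN:field:type}` tokens behave much like named regexp captures with an optional type cast. The sketch below imitates that in plain Perl to show the idea; it is not how logstash implements grok, the two patterns are simplified stand-ins for the real IP and NUMBER definitions, and `compile_grok` and `grok_match` are hypothetical helper names.

```perl
use strict;
use warnings;

# Simplified stand-ins for grok's IP and NUMBER patterns
# (the real definitions are stricter).
my %PATTERN = (
    IP     => qr/\d{1,3}(?:\.\d{1,3}){3}/,
    NUMBER => qr/-?\d+(?:\.\d+)?/,
);

# Turn "%{NAME:field:type}" tokens into a regexp with named captures
# and remember the requested type of each field.
sub compile_grok {
    my ($expr) = @_;
    my %types;
    $expr =~ s{%\{(\w+):(\w+)(?::(\w+))?\}}{
        my $pattern = $PATTERN{$1} or die "unknown pattern $1\n";
        $types{$2} = $3 // 'string';
        "(?<$2>$pattern)";
    }ge;
    return ( qr/$expr/, \%types );
}

# Match one line and return the captured fields as a hash reference,
# or nothing if the line does not match.
sub grok_match {
    my ( $expr, $line ) = @_;
    my ( $re, $types ) = compile_grok($expr);
    return unless $line =~ $re;
    my %event = %+;
    for my $field ( keys %event ) {
        $event{$field} = int $event{$field} if $types->{$field} eq 'int';
    }
    return \%event;
}

my $event = grok_match(
    '%{IP:client:string} sent %{NUMBER:bytes:int} bytes',
    '10.2.21.130 sent 512 bytes',
);
print "$_ = $event->{$_}\n" for sort keys %$event;
```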
37. ES for logging - Logstash
For example:
10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET /mediawiki/load.php HTTP/1.1" 304 - "http://som.d.xiaonei.com/mediawiki/index.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10"
38. ES for logging - Logstash
{"@source":"file://chenryn-Lenovo/home/chenryn/test.txt",
"@tags":[],
"@fields":{
"clientip":["10.2.21.130"],
"ident":["-"],
"auth":["-"],
"timestamp":["08/Apr/2013:11:13:40 +0800"],
"verb":["GET"],
"request":["/mediawiki/load.php"],
"httpversion":["1.1"],
"response":["304"],
"referrer":["\"http://som.d.xiaonei.com/mediawiki/index.php\""],
"agent":["\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10\""]
},
"@timestamp":"2013-04-08T03:34:37.959Z",
"@source_host":"chenryn-Lenovo",
"@source_path":"/home/chenryn/test.txt",
"@message":"10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] \"GET /mediawiki/load.php HTTP/1.1\" 304 - \"http://som.d.xiaonei.com/mediawiki/index.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10\"",
"@type":"apache"
}
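To make the step from the raw log line to the event above concrete, here is a rough Perl approximation. It is a sketch only: the regexp is a hand-written simplification of the combined log format, not logstash's own pattern, `parse_line` is a hypothetical helper, and only the `@fields`, `@message` and `@type` keys are reproduced.

```perl
use strict;
use warnings;
use JSON::PP;

# Hand-written approximation of the Apache combined log format.
my $COMBINED = qr{
    ^(?<clientip>\S+)\s+(?<ident>\S+)\s+(?<auth>\S+)\s+
    \[(?<timestamp>[^\]]+)\]\s+
    "(?<verb>\S+)\s+(?<request>\S+)\s+HTTP/(?<httpversion>[\d.]+)"\s+
    (?<response>\d{3})\s+(?<bytes>\S+)\s+
    (?<referrer>"[^"]*")\s+(?<agent>"[^"]*")$
}x;

# Return an event-like hash for one log line, or nothing on no match.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ $COMBINED;
    # Each field value is wrapped in an array, as in the event above.
    my %fields = map { $_ => [ $+{$_} ] } keys %+;
    # A "-" means no byte count, so drop the field.
    delete $fields{bytes} if $fields{bytes}[0] eq '-';
    return { '@type' => 'apache', '@message' => $line, '@fields' => \%fields };
}

my $line = '10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET /mediawiki/load.php HTTP/1.1" 304 - "http://som.d.xiaonei.com/mediawiki/index.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10"';

my $parsed = parse_line($line);
print JSON::PP->new->canonical->pretty->encode($parsed);
```

The referrer and agent values keep their surrounding double quotes, which is why they appear escaped in the JSON.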
45. Build Website using PerlDancer
get '/' => require_role SOM => sub {
my $indices = elsearch->cluster_state->{routing_table}->{indices};
template 'psa/map',
{
providers => [ sort keys %$default_provider ],
datasources =>
[ grep { /^$index_prefix/ && s/$index_prefix// } keys %$indices ],
inputfrom => strftime( "%FT%T", localtime( time() - 864000 ) ),
inputto => strftime( "%FT%T", localtime() ),
};
};
ajax '/api/area' => sub {
my $param = from_json( request->body );
my $index = $index_prefix . $param->{'datasource'};
my $limit = $param->{'limit'} || 50;
my $from = $param->{'from'} || 'now-10d';
my $to = $param->{'to'} || 'now';
my $res = pct_terms( $index, $limit, $from, $to );
return to_json($res);
};
46. use Dancer ':syntax';
47. use Dancer::Plugin::Auth::Extensible;
48. use Dancer::Plugin::Ajax;
49. use Dancer::Plugin::ElasticSearch;
50. use Dancer::Plugin::ElasticSearch;
sub area_terms {
my ( $index, $level, $limit, $from, $to ) = @_;
my $data = elsearch->search(
index => $index,
type => $type,
facets => {
area => {
facet_filter => {
and => [
{ range => { date => { from => $from, to => $to } } },
{ numeric_range => { timeCost => { gte => $level } } },
],
},
terms => {
field => "fromArea",
size => $limit,
}
}
}
);
return $data->{facets}->{area}->{terms};
}
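For readers new to facets: the request above asks for the documents to be filtered by date range and minimum timeCost, then counted per fromArea value, keeping the top $limit buckets. The sketch below does the same counting client-side over a few made-up documents to show the result shape. `terms_facet` is a hypothetical helper, the sample data is invented, and Elasticsearch itself does this work on the shards.

```perl
use strict;
use warnings;

# Count documents per value of $field, keeping only those that pass
# $filter, and return the top $size buckets as { term, count } pairs.
sub terms_facet {
    my ( $docs, $field, $size, $filter ) = @_;
    my %count;
    $count{ $_->{$field} }++ for grep { $filter->($_) } @$docs;
    my @terms = map { +{ term => $_, count => $count{$_} } }
        sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    splice @terms, $size if @terms > $size;
    return \@terms;
}

# Invented sample documents using the field names from the slide.
my @docs = (
    { fromArea => 'Beijing',   timeCost => 1200, date => '2013-04-08T11:13:40' },
    { fromArea => 'Beijing',   timeCost => 1500, date => '2013-04-08T12:00:00' },
    { fromArea => 'Shanghai',  timeCost => 2000, date => '2013-04-08T12:30:00' },
    { fromArea => 'Shanghai',  timeCost => 300,  date => '2013-04-08T13:00:00' },
    { fromArea => 'Guangzhou', timeCost => 1800, date => '2013-03-01T09:00:00' },
);

my ( $from, $to, $level ) = ( '2013-04-01T00:00:00', '2013-04-30T23:59:59', 1000 );

# ISO 8601 timestamps in the same format compare correctly as strings.
my $filter = sub {
    my ($doc) = @_;
    return $doc->{date} ge $from && $doc->{date} le $to && $doc->{timeCost} >= $level;
};

my $terms = terms_facet( \@docs, 'fromArea', 10, $filter );
print "$_->{term}: $_->{count}\n" for @$terms;
```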
51. ES for monitor – oculus(Etsy Kale)
• Kale detects anomalous metrics and checks whether any other metrics look similar.
• http://codeascraft.com/2013/06/11/introducing-kale/
52. ES for monitor – oculus(Etsy Kale)
• https://github.com/etsy/skyline
53. ES for monitor – oculus(Etsy Kale)
• https://github.com/etsy/oculus
54. ES for monitor – oculus(Etsy Kale)
• Imports monitoring data from redis/ganglia into elasticsearch
• Uses native scripts to calculate distance:
script.native:
  oculus_euclidian.type: com.etsy.oculus.tsscorers.EuclidianScriptFactory
  oculus_dtw.type: com.etsy.oculus.tsscorers.DTWScriptFactory
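The two script names refer to Euclidean distance and dynamic time warping (DTW), two common ways to score how alike two time series are. The following Perl sketch shows the textbook form of both measures; it is not a port of Oculus's Java scorers, and the sample series are made up.

```perl
use strict;
use warnings;
use List::Util qw(min sum);

# Euclidean distance between two series of equal length.
sub euclidean {
    my ( $x, $y ) = @_;
    die "series must have equal length\n" unless @$x == @$y;
    return sqrt( sum( 0, map { ( $x->[$_] - $y->[$_] )**2 } 0 .. $#$x ) );
}

# Textbook dynamic time warping with absolute difference as the
# local cost. The series may differ in length.
sub dtw {
    my ( $x, $y ) = @_;
    my $inf = 9**9**9;
    # (len x + 1) by (len y + 1) cost table, initialised to infinity.
    my @d = map { [ ($inf) x ( @$y + 1 ) ] } 0 .. @$x;
    $d[0][0] = 0;
    for my $i ( 1 .. @$x ) {
        for my $j ( 1 .. @$y ) {
            my $cost = abs( $x->[ $i - 1 ] - $y->[ $j - 1 ] );
            $d[$i][$j] = $cost
                + min( $d[ $i - 1 ][$j], $d[$i][ $j - 1 ], $d[ $i - 1 ][ $j - 1 ] );
        }
    }
    return $d[ scalar @$x ][ scalar @$y ];
}

my @a = ( 1, 2, 3, 2, 1 );
my @b = ( 1, 1, 2, 3, 2, 1 );    # same shape as @a, one sample longer

print "dtw: ", dtw( \@a, \@b ), "\n";
print "euclidean: ", euclidean( [ 0, 0 ], [ 3, 4 ] ), "\n";
```

DTW tolerates series that are shifted or stretched in time, which is why a similarity search may use it alongside the plain Euclidean score.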
55. ES for monitor – oculus(Etsy Kale)
• https://speakerdeck.com/astanway/bring-the-noise-continuously-deploying-under-a-hailstorm-of-metrics
56. VBox example
• apt-get install -y git cpanminus virtualbox
• cpanm Rex
• git clone https://github.com/chenryn/esdevops
• cd esdevops
• rex init --name esdevops
Editor's notes
Using LogStash::Outputs::STDOUT with `debug => true`
Schema free, but please define schema using /_mapping or template.json for performance.
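As an illustration of defining the schema up front, the sketch below builds a mapping for the tweet type used earlier and prints it as JSON for the /_mapping API. The field types chosen here (and leaving `user` not analyzed) are assumptions for the example rather than recommendations from the talk, and `tweet_mapping` is a hypothetical helper.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical explicit mapping for the "tweet" type from the earlier
# slides. The type choices are illustrative assumptions.
sub tweet_mapping {
    return {
        tweet => {
            properties => {
                user      => { type => 'string', index => 'not_analyzed' },
                post_date => { type => 'date' },
                message   => { type => 'string' },
            },
        },
    };
}

# canonical() sorts hash keys so the output is stable between runs.
my $json = JSON::PP->new->canonical->pretty->encode( tweet_mapping() );

# The JSON could be sent with, for example:
#   curl -XPUT 'http://localhost:9200/twitter/tweet/_mapping' -d '...'
print $json;
```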