1. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited
Elasticsearch and MIT Sloan Data Analytics Hackathon
Cambridge, MA - May 10, 2014
Elasticsearch
Quick Introduction
About Me
• Igor Motov
• Developer at Elasticsearch Inc.
• Github: imotov
• Twitter: @imotov
About Elasticsearch Inc.
• Founded in 2012
By the people behind Elasticsearch and Apache Lucene
http://www.elasticsearch.com
Headquarters: Amsterdam and Los Altos, CA
• We provide
Training (public & onsite)
Development support
Production support subscription (SLA)
About Elasticsearch
• Real time search and analytics engine
JSON-oriented, Apache Lucene-based
• Automatic Schema Detection
The schema can still be controlled explicitly when needed
• Distributed
Scales Up+Out, Highly Available
• Multi-tenancy
Dynamically create/delete indices
• API centric
Most functionality is exposed through an API
Basic Concepts
• Cluster
a group of nodes sharing the same set of indices
• Node
a running Elasticsearch instance (typically a JVM process)
• Index
a set of documents of possibly different types
stored in one or more shards
• Type
a set of documents in an index that share the same schema
• Shard
a Lucene index, allocated on one of the nodes
Downloading elasticsearch
• http://www.elasticsearch.org/download/
One download for Windows, another for everything else
What’s in a distribution?
.
├── LICENSE.txt
├── NOTICE.txt
├── README.textile
├── bin                      (executable scripts)
│   ├── elasticsearch
│   ├── elasticsearch.in.sh
│   └── plugin
├── config                   (node config files)
│   ├── elasticsearch.yml
│   └── logging.yml
├── data                     (data storage)
│   └── elasticsearch
├── lib                      (libs)
│   ├── elasticsearch-x.y.z.jar
│   └── ...
└── logs                     (log files)
    ├── elasticsearch.log
    └── elasticsearch_index_search_slowlog.log
Configuration (multicast)
• Configuration: config/elasticsearch.yml
cluster.name: "elasticsearch-imotov"    # unique name
Configuration (stand-alone)
• Configuration: config/elasticsearch.yml
cluster.name: "elasticsearch-imotov"    # unique name
network.host: "127.0.0.1"    # listen only on localhost
discovery.zen.ping.multicast.enabled: false    # disable multicast
discovery.zen.ping.unicast.hosts: ["localhost:9300", "localhost:9301", "localhost:9302"]    # search for other nodes on localhost
Starting elasticsearch
• Foreground
$ bin/elasticsearch
• Background
$ bin/elasticsearch -d
Is it running?
$ curl -XGET "http://localhost:9200/?pretty"
{
  "status" : 200,
  "name" : "Kamal",
  "version" : {
    "number" : "1.1.1",
    "build_hash" : "f1585f096d3f3985e73456debdc1a0745f512bbc",
    "build_timestamp" : "2014-04-16T14:27:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.7"
  },
  "tagline" : "You Know, for Search"
}
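The same check can be scripted. The following is a minimal Python sketch that parses a root-endpoint response like the one on this slide and pulls out the fields worth looking at. The helper name `describe_node` is hypothetical and not part of any Elasticsearch client; the field names are taken from the response shown here, and other versions may return a different set of fields. The sketch works on the response body only, so fetching it from a live node is left out.

```python
import json

# Sample body copied from the root-endpoint response on this slide.
SAMPLE_RESPONSE = """
{
  "status" : 200,
  "name" : "Kamal",
  "version" : {
    "number" : "1.1.1",
    "build_hash" : "f1585f096d3f3985e73456debdc1a0745f512bbc",
    "build_timestamp" : "2014-04-16T14:27:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.7"
  },
  "tagline" : "You Know, for Search"
}
"""


def describe_node(body):
    """Hypothetical helper: summarize a root-endpoint response body."""
    info = json.loads(body)
    version = info.get("version", {})
    return {
        # Assumption: a "status" field of 200 means the node answered normally.
        "running": info.get("status") == 200,
        "node_name": info.get("name"),
        "es_version": version.get("number"),
        "lucene_version": version.get("lucene_version"),
    }


summary = describe_node(SAMPLE_RESPONSE)
print(summary)
```

To use it against a running node, pass in the body returned by an HTTP GET on `http://localhost:9200/` instead of the sample string.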
Communicating with Elasticsearch
• REST API
Curl
Ruby
Python
PHP
Perl
JavaScript (community supported)
• Binary Protocol
Java
Pick your client
• Java
included in distribution
• Ruby, PHP, Perl, Python
http://www.elasticsearch.org/blog/unleash-the-clients-ruby-python-php-perl/
• Everything Else
http://www.elasticsearch.org/guide/clients/
Analysis
• By default, strings are
- Divided into words (tokens)
- Converted to lower-case, token by token
Analysis Example
• “Elasticsearch is a powerful open source search
and analytics engine.”
1. elasticsearch
2. is
3. a
4. powerful
5. open
6. source
7. search
8. and
9. analytics
10. engine
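The token list above can be reproduced with a rough Python approximation of the two default steps: split the string into words, then lower-case each one. This is only a sketch. The function name `simple_analyze` is hypothetical, and the regular expression stands in for the real tokenizer, which treats punctuation and Unicode text with more care.

```python
import re


def simple_analyze(text):
    """Rough approximation of default analysis: tokenize, then lower-case."""
    # Step 1: divide the string into words (runs of word characters).
    words = re.findall(r"\w+", text)
    # Step 2: convert every token to lower-case.
    return [word.lower() for word in words]


sentence = "Elasticsearch is a powerful open source search and analytics engine."
tokens = simple_analyze(sentence)

# Print tokens with their 1-based positions, as on the slide.
for position, token in enumerate(tokens, start=1):
    print(position, token)
```

Because indexed text and query text both pass through the same steps, a search for "ELASTICSEARCH" or "elasticsearch" ends up matching the same token.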
Elasticsearch Reference
• http://www.elasticsearch.org/guide/
Ideas for hackathon
• Explore data
wikipedia
twitter
enron emails
• Play with Kibana
• Build Elasticsearch plugins
• Get prizes
Elasticsearch Meetup
http://www.meetup.com/Elasticsearch-Boston/
We are hiring
http://www.elasticsearch.com/about/jobs/