Peggy elasticsearch應用

Field datatypes
 a simple type like string, date, long, double, boolean or ip.
 a type which supports the hierarchical nature of JSON such as object or nested.
 or a specialised type like geo_point, geo_shape, or completion.

get - 取得資料
 http://localhost:9200/_index/_type/_id
 http://localhost:9200/_index/_type/_id?pretty

get search
 http://localhost:9200/_index/_type/_search
 http://localhost:9200/_index/_type/_search?q=xxx&pretty

post search & query_string
{
"query": {
"query_string": {
"query": "*"
}
}
}
=

query_string
{
"query_string" : {
"fields" : ["content", "name"],
"query" : "this AND that"
}
}
{
"query_string": {
"query": "(content:this OR name:this) AND (content:that OR name:that)“
}
}
=

query_string - query
 string
 “手機” 套 = “手機” OR 套
 apple phone = apple OR phone
 title: “手機” OR title:套 = title: (“手機“ 套) ! = (title: “手機” 套)
 boolean
 isPCT: true
 date & range
 dateName: [2012-01-01 TO 2012-12-31]
 dateName: [2012-01-01 TO *]
 dateName: {2011-12-31 TO *]
 range: [ 1 TO 5 ]

query_string - query
 object
 inventorsRaw.name: Nicky
 _missing_ & _exists_
 _missing_: title
 _exists_: title

query_string - nested
{
"query": {
"nested": {
"path": "relatedDocumentsRaw",
"query": {
"query_string": {
"query": "relatedDocumentsRaw.type:*"
}
}
}
}
}

query – size & from
 size (default: 10)
 The size parameter allows you to configure the maximum amount of hits to be
returned.
 from (default: 0)
 The from parameter defines the offset from the first result you want to fetch.
 [query_phase_execution_exception]
 Result window is too large, from + size must be less than or equal to: [10000]
 See the scroll api for a more efficient way to request large data sets.

query – sort & _source
 sort
 Allows to add one or more sort on specific fields.
 _source
 Allows to control how the _source field is returned with every hit.
{
"query": "…",
"size": 5,
"from": 10,
"sort": [{ "pubDate": "desc" }],
"_source": ["pubDate"],
}

query - filter
{
"query": {
"query_string": { "query": "*" }
},
"filter": {
"script": {
"script": {
"lang": "groovy",
"file": "fileNamw",
"params": {
"params1": "date1",
"params2": "date2",
}
}
}
}
}

query - aggregations (aggs)
 The aggregations framework helps provide
aggregated data based on a search query.
 size: 回傳的筆數
 default :10
 size: 0 回傳全部結果
 min_doc_count: 回傳的結果最小筆數
 order: 排序
 date_histogram: 依照日期
 terms: 依照doc_dount 結果
{
"query": "…",
"aggs": {
"date_agg": {
"date_histogram": {
"field": “pubDate",
"interval": "day",
"format": "yyyy-MM-dd",
"order": { "_count": "desc" },
"min_doc_count": 1
} },
"kindCode_agg": {
"terms": {
"field": "kindCode",
"size": 20,
"shard_size": 20
} }
}
}

query - aggregations (aggs)
{
"aggregations": {
"kindCode_agg": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{ "key": "U", "doc_count": 75879 },
{ "key": "A", "doc_count": 73732 },
{ "key": "B", "doc_count": 44115 },
{ "key": "S", "doc_count": 38981 } ] },
"appDocs": {
"buckets": [
{ "key_as_string": "2016-01-06", "key": 1452038400000, "doc_count": 56079 },
{ "key_as_string": "2016-01-27", "key": 1453852800000, "doc_count": 42351 } ] }
}
}

_timestamp field
 Mapping  query result
{
"mappings": {
"my_type": {
"_timestamp": {
"enabled": true
}
}
}
}
{
"_index": "test2",
"_type": "type",
"_id": "2",
"_score": 1,
"_timestamp": 1454051014319,
"_source": {
"name": "Tony",
"day": "1990-03-21"
}
}

scan&scroll
 POST
 http://localhost:9200/{{_index}}/({{type}}/)_search?search_type=scan&scroll=1m
{
"query": {
"query_string": {
"query": “*"
}
}
}

Keeping the search context alive
 The scroll parameter (passed to the search request and to every scroll request)
tells Elasticsearch how long it should keep the search context alive.
 Its value (e.g. 1m, see the section called “Time unitsedit”) does not need to be
long enough to process all data — it just needs to be long enough to process the
previous batch of results.
 Each scroll request (with the scroll parameter) sets a new expiry time.

post
{
"_scroll_id": "c2Nhbjs1OzE5NjMzOkxXdWt2d2V2UVFHTVvdGFsX2hpdHM6MT……..",
"took": 487,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1041712,
"max_score": 0,
"hits": []
}
}

scroll
 Get
 http://localhost:9200/_search/scroll/{{_scroll_id}}?scroll=1m

{
"_scroll_id": "c2Nhbjs1OzE5NjMzOkxXdWt2d2V2UVFHTVvdGFsX2hpdHM6MT……..",
"took": 487,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1041712,
"max_score": 0,
"hits": [ {…….}, {…….}, {…….}, {…….}, {…….}, {…….}, {…….}, {…….}]
}
}
get

Ref.
 Bulk
 https://www.elastic.co/guide/en/elasticsearch/guide/current/bulk.html
 Scan & Scroll
 https://www.elastic.co/guide/en/elasticsearch/guide/current/scan-scroll.html
 http://stackoverflow.com/questions/25453872/why-does-this-elasticsearch-scan-and-
scroll-keep-returning-the-same-scroll-id

Cheaper in Bulk
{ action: { metadata }}n
{ request body }n
{ action: { metadata }}n
{ request body }n
…..

action
 delete
 { "delete": { "_index": "website", "_type": "blog", "_id": "123" }}n
 create
 { "create": { "_index": "website", "_type": "blog", "_id": "123" }} n
 { "title": "My first blog post" } n
 Index
 { "index": { "_index": "website", "_type": "blog" }} n
 { "title": "My second blog post" } n
 update
 { "update": { "_index": "website", "_type": "blog", "_id": "123", "_retry_on_conflict" : 3} } n
 { "doc" : {"title" : "My updated blog post"} } n

 status
 '200': 'OK',
 '201': 'Created',
{ "took": 4,
"errors": false,
"items": [
{ "delete": {
"_index": "website", "_type": "blog", "_id": "123", "_version": 2, "status": 200, "found": true }},
{ "create": {
"_index": "website", "_type": "blog", "_id": "123", "_version": 3, "status": 201 }},
{ "create": {
"_index": "website", "_type": "blog", "_id": "EiwfApScQiiy7TIKFxRCTw", "_version": 1, "status": 201 }},
{ "update": {
"_index": "website", "_type": "blog", "_id": "123", "_version": 4, "status": 200 }}
]
}

Error Example
{ "create": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title": "Cannot create - it already exists" }
{ "index": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title": "But we can update it" }

Error Example
{ "took": 3,
"errors": true,
"items": [
{ "create": {
"_index": "website",
"_type": "blog",
"_id": "123",
"status": 409,
"error": "DocumentAlreadyExistsException [[website][4] [blog][123]: document already exists]" }},
{ "index": {
"_index": "website",
"_type": "blog",
"_id": "123",
"_version": 5,
"status": 200 }}
]
}

Peggy elasticsearch應用

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Peggy elasticsearch應用

Similar to Peggy elasticsearch應用 (20)

More from LearningTech

More from LearningTech (20)

Recently uploaded

Recently uploaded (20)

Peggy elasticsearch應用

Editor's Notes