Getting Started with Elasticsearch
I needed to use Elasticsearch at work, so here are my getting-started notes.
What is it?
A distributed, RESTful search and analytics engine.
https://www.elastic.co/jp/products/elasticsearch
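Features
A search does not return only exact matches for the conditions; it returns the documents that are most relevant to the query.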
http://gihyo.jp/dev/serial/01/js-foundation/0008
Installation
rpm -ivh https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.2.1.rpm
chkconfig --add elasticsearch
service elasticsearch start
Basic operations
https://www.elastic.co/guide/en/elasticsearch/reference/5.2/getting-started.html
Checking that it is running
curl -X GET http://localhost:9200
{
  "name" : "Ovee1U6",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "_-CFvUxgRFy_4Q4lwKWMjw",
  "version" : {
    "number" : "5.2.1",
    "build_hash" : "db0d481",
    "build_date" : "2017-02-09T22:05:32.386Z",
    "build_snapshot" : false,
    "lucene_version" : "6.4.1"
  },
  "tagline" : "You Know, for Search"
}
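The same check can be scripted. The following is a minimal Python sketch using only the standard library, assuming a node on localhost:9200 as above. The helper names `parse_version` and `fetch_root` are made up for illustration, and the sample string is a trimmed copy of the response shown above.

```python
import json
import urllib.request


def parse_version(root_response: str) -> str:
    """Extract version.number from the JSON returned by GET /."""
    return json.loads(root_response)["version"]["number"]


def fetch_root(base_url: str = "http://localhost:9200") -> str:
    """Fetch the root endpoint. Needs a running node, so it is not called here."""
    with urllib.request.urlopen(base_url) as resp:
        return resp.read().decode("utf-8")


# Parse a trimmed copy of the response above, without touching the network.
sample = '{"name": "Ovee1U6", "version": {"number": "5.2.1", "lucene_version": "6.4.1"}}'
print(parse_version(sample))
```

With a node running, `parse_version(fetch_root())` should give the installed version.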
Creating an index
curl -X PUT 'http://localhost:9200/customer?pretty'
{
  "acknowledged" : true,
  "shards_acknowledged" : true
}
Listing indices
curl -X GET 'http://localhost:9200/_cat/indices?v'
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   customer nXJXYUdyRpG_yxh-qTp7_w   5   1          0            0       650b           650b
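If you want to work with this listing in a script, the table can be split into one dictionary per index. This is only a rough sketch: `parse_cat_table` is a hypothetical helper, and it assumes every cell is non-empty and contains no spaces, which holds for the output above but is not guaranteed in general.

```python
def parse_cat_table(text: str) -> list:
    """Turn whitespace-aligned _cat output (with the ?v header) into dicts."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


# The listing shown above, pasted in as a string.
sample = """\
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   customer nXJXYUdyRpG_yxh-qTp7_w   5   1          0            0       650b           650b
"""

indices = parse_cat_table(sample)
print(indices[0]["index"], indices[0]["health"], indices[0]["docs.count"])
```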
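Indexing a document
curl -X PUT 'http://localhost:9200/customer/external/1?pretty' -d '
{
  "name": "John Doe"
}
'
If the index or type does not exist yet, it is created along with the document.
To let Elasticsearch assign the ID, leave it out of the URL and use POST instead of PUT.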
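The PUT-versus-POST rule can be written down as a small helper. This is an illustrative sketch only: `index_request` is a made-up name, and it just returns the method and path you would pass to curl or an HTTP client, following the index/type/id URL layout used in these notes.

```python
from typing import Optional, Tuple


def index_request(index: str, doc_type: str, doc_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (HTTP method, path) for indexing a single document."""
    base = f"/{index}/{doc_type}"
    if doc_id is None:
        # No ID given: POST to the type and let Elasticsearch generate one.
        return ("POST", base)
    # Explicit ID: PUT to the full document path.
    return ("PUT", f"{base}/{doc_id}")


print(index_request("customer", "external", "1"))
print(index_request("customer", "external"))
```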
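Retrieving a document
curl -X GET http://localhost:9200/customer/external/1
The response wraps the stored document in metadata (_index, _type, _id, _version, found); the document itself is under _source:
{
  "name": "John Doe"
}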
Deleting an index
curl -X DELETE 'http://localhost:9200/customer?pretty'
{
  "acknowledged" : true
}
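Update operations
Updating a document
curl -X POST 'http://localhost:9200/customer/external/1/_update?pretty' -d '
{
  "doc": { "age": 20 }
}
'
If the age field already exists it is overwritten; if not, it is added.
Other fields are not affected.
Note that the method is POST, not PUT.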
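To make the merge behavior concrete, here is a local model of what a partial update does to a document. It is not Elasticsearch code, just a sketch of the merge rule described above (new keys are added, existing keys are overwritten, nested objects are merged key by key); `merge_partial` is a hypothetical name.

```python
def merge_partial(source: dict, doc: dict) -> dict:
    """Return a copy of source with the partial document merged into it."""
    merged = dict(source)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged recursively rather than replaced.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Scalars and arrays replace whatever was there before.
            merged[key] = value
    return merged


before = {"name": "John Doe"}
after = merge_partial(before, {"age": 20})
print(after)
```

Merging `{"age": 25}` into the result afterwards changes only age and leaves name alone.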
Updating with a script
curl -X POST 'http://localhost:9200/customer/external/1/_update?pretty' -d '
{
  "script": "ctx._source.age += 5"
}
'
Deleting a document
curl -X DELETE 'http://localhost:9200/customer/external/1?pretty'
Bulk indexing
curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -H 'Content-Type: application/json' -d'
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'
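The bulk body is newline-delimited JSON: an action line, then the document, repeated, with a newline after the last line too. A small sketch of building that body in Python follows; `bulk_index_body` is a made-up helper that only produces the string and sends nothing.

```python
import json


def bulk_index_body(docs: dict) -> str:
    """Build a _bulk request body that indexes each document under its ID."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                      # document line
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_index_body({"1": {"name": "John Doe"}, "2": {"name": "Jane Doe"}})
print(body, end="")
```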
Searching and aggregating
Loading test data
curl -X POST 'localhost:9200/bank/account/_bulk?pretty&refresh' --data-binary "@accounts.json"
curl 'http://localhost:9200/_cat/indices?v'
The sample JSON is available here:
https://raw.githubusercontent.com/elastic/elasticsearch/master/docs/src/test/resources/accounts.json
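The file name must be prefixed with @ so that curl reads the request body from the file. --data-binary sends the file as-is, keeping the newlines that the bulk API requires.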
Syntax 1 (parameters in the URI)
curl -XGET 'localhost:9200/bank/_search?q=*&sort=account_number:asc&pretty'
{
  "took" : 83,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1000,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "0",
        "_score" : null,
        "_source" : {
          "account_number" : 0,
          "balance" : 16623,
          "firstname" : "Bradshaw",
          "lastname" : "Mckenzie",
          "age" : 29,
          "gender" : "F",
          "address" : "244 Columbus Place",
          "employer" : "Euron",
          "email" : "bradshawmckenzie@euron.com",
          "city" : "Hobucken",
          "state" : "CO"
        },
        "sort" : [ 0 ]
      },
      ...
    ]
  }
}
q=* -> matches all documents
sort=account_number:asc -> sorts by account_number in ascending order
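When the URI form is generated from a script, the parameters are best encoded properly instead of pasted together by hand. A minimal sketch, assuming the same localhost:9200 node; `uri_search_url` is a hypothetical helper and it only builds the URL.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def uri_search_url(index: str, query: str = "*", sort: Optional[str] = None,
                   base_url: str = "http://localhost:9200") -> str:
    """Build a URI-search URL with percent-encoded q and sort parameters."""
    params = {"q": query}
    if sort is not None:
        params["sort"] = sort
    return f"{base_url}/{index}/_search?{urlencode(params)}"


url = uri_search_url("bank", sort="account_number:asc")
print(url)
# Decoding the query string gives back the original parameter values.
print(parse_qs(urlsplit(url).query))
```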
Syntax 2 (JSON request body)
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}
'
Specifying the offset and number of hits
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ],
  "from": 10,
  "size": 10
}
'
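from is a zero-based offset and size is the number of hits, so the request above fetches the second page of ten. A small sketch of turning a page number into these two values; `page_window` is a made-up helper, and counting pages from 1 is my own convention here.

```python
def page_window(page: int, per_page: int = 10) -> dict:
    """Translate a 1-based page number into from/size for a search body."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    return {"from": (page - 1) * per_page, "size": per_page}


# Same body as the request above: second page, ten hits per page.
body = {
    "query": {"match_all": {}},
    "sort": [{"account_number": "asc"}],
    **page_window(2),
}
print(body["from"], body["size"])
```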
Selecting the fields to return
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "_source": ["account_number", "balance"]
}
'
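Match search (full-text, not exact match)
The match query analyzes the search text and returns documents whose address contains the term mill; it is not an exact match on the whole field value.
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "address": "mill" } }
}
'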
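To see why match is term-based, here is a very rough local model of it. It is an assumption-laden sketch, not how Elasticsearch is implemented: the real standard analyzer does proper Unicode tokenization and the real query also scores hits, while this version just lowercases, splits on non-word characters, and checks whether any query term appears (match combines terms with OR by default). The addresses are made-up examples.

```python
import re


def analyze(text: str) -> list:
    """Crude stand-in for an analyzer: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def match_any(field_value: str, query: str) -> bool:
    """True if at least one analyzed query term occurs in the field's terms."""
    field_terms = set(analyze(field_value))
    return any(term in field_terms for term in analyze(query))


print(match_any("990 Mill Road", "mill"))    # whole term present
print(match_any("12 Milltown Lane", "mill")) # only a prefix of another term
```

The first address matches even though the field is not literally "mill", and the second does not match because "milltown" is a different term.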
Combining multiple conditions with a bool query
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ],
      "filter": {
        "range": {
          "balance": { "gte": 20000, "lte": 30000 }
        }
      }
    }
  }
}
'
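Bodies like this get long quickly, so in a script it can help to assemble them from parts. A minimal sketch: `bool_query` is a hypothetical helper that only builds the dictionary, leaving out any clause group that is not given.

```python
import json


def bool_query(must=None, must_not=None, filter_clause=None) -> dict:
    """Assemble a search body around a bool query from optional clause groups."""
    clauses = {}
    if must:
        clauses["must"] = list(must)          # conditions that have to match
    if must_not:
        clauses["must_not"] = list(must_not)  # conditions that exclude a document
    if filter_clause:
        clauses["filter"] = filter_clause     # restricts hits without affecting the score
    return {"query": {"bool": clauses}}


# Rebuild the request body used in the curl command above.
body = bool_query(
    must=[{"match": {"age": "40"}}],
    must_not=[{"match": {"state": "ID"}}],
    filter_clause={"range": {"balance": {"gte": 20000, "lte": 30000}}},
)
print(json.dumps(body, indent=2))
```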
Aggregation
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": { "field": "state.keyword" }
    }
  }
}
'
aggs is short for aggregations.
In SQL it would look roughly like this (the terms aggregation returns only the top 10 buckets by default):
SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC
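For comparison, the same grouping can be done on plain Python data. This is a local sketch of what the aggregation computes, not how Elasticsearch runs it. The sample accounts are made up, `group_by_state` is a hypothetical name, and breaking ties by state name is my assumption.

```python
from collections import Counter


def group_by_state(accounts: list, size: int = 10) -> list:
    """Count accounts per state, largest first, keeping the top `size` groups."""
    counts = Counter(account["state"] for account in accounts)
    # Sort by count descending; ties are broken by state name (an assumption).
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:size]


# Made-up sample data, only for illustration.
accounts = [
    {"state": "CO"}, {"state": "TX"}, {"state": "CO"},
    {"state": "ID"}, {"state": "TX"}, {"state": "CO"},
]
print(group_by_state(accounts))
```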
Aggregation (computing an average)
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword",
        "order": { "average_balance": "desc" }
      },
      "aggs": {
        "average_balance": {
          "avg": { "field": "balance" }
        }
      }
    }
  }
}
'
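The nested aggregation groups by state, averages balance within each group, and orders the groups by that average. A local sketch of the same computation on made-up data follows; `average_balance_by_state` is a hypothetical name and this is only a model of the result, not of Elasticsearch internals.

```python
from collections import defaultdict


def average_balance_by_state(accounts: list, size: int = 10) -> list:
    """Average balance per state, highest average first, top `size` groups."""
    balances = defaultdict(list)
    for account in accounts:
        balances[account["state"]].append(account["balance"])
    averages = {state: sum(vals) / len(vals) for state, vals in balances.items()}
    return sorted(averages.items(), key=lambda kv: -kv[1])[:size]


# Made-up sample data, only for illustration.
accounts = [
    {"state": "CO", "balance": 10000},
    {"state": "CO", "balance": 20000},
    {"state": "TX", "balance": 30000},
]
print(average_balance_by_state(accounts))
```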
That covers the basics.