I used a Docker image for testing purposes.
https://www.elastic.co/guide/en/elasticsearch/reference/current/starting-elasticsearch.html
docker run \
-d --rm \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
docker.elastic.co/elasticsearch/elasticsearch:7.12.1
Check the status:
$ curl localhost:9200
{
"name" : "cfd8f0ec08e2",
"cluster_name" : "docker-cluster",
"cluster_uuid" : "f5fi-57iSVWngx0slmijPA",
"version" : {
"number" : "7.12.1",
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "3186837139b9c6b6d23c3200870651f10d3343b7",
"build_date" : "2021-04-20T20:56:39.040728659Z",
"build_snapshot" : false,
"lucene_version" : "8.8.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
https://www.elastic.co/guide/en/elasticsearch/reference/current/add-elasticsearch-nodes.html
When you start an instance of Elasticsearch, you are starting a node. An Elasticsearch cluster is a group of nodes that have the same cluster.name attribute. As nodes join or leave a cluster, the cluster automatically reorganizes itself to evenly distribute the data across the available nodes.
Yes, Elasticsearch stores data! And the data is distributed across the nodes.
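As a minimal sketch (the names my-cluster and node-1 are my own examples, not from the docs), nodes that share this setting in elasticsearch.yml form one cluster:
# elasticsearch.yml on every node
cluster.name: my-cluster
# node.name differs per node
node.name: node-1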
https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-xpack.html
X-Pack is an Elastic Stack extension that provides security, alerting, monitoring, reporting, machine learning, and many other capabilities. By default, when you install Elasticsearch, X-Pack is installed.
If you know FIPS 140-2, this is a cool option.
Elasticsearch offers a FIPS 140-2 compliant mode and as such can run in a FIPS 140-2 enabled JVM. In order to set Elasticsearch in fips mode, you must set the xpack.security.fips_mode.enabled to true in elasticsearch.yml
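As a sketch, that is a one-line setting (it also requires running on a FIPS-enabled JVM, as quoted above):
# elasticsearch.yml
xpack.security.fips_mode.enabled: true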
Visualization, by the way, is the role of Kibana (more on that below).
In Elasticsearch, a data item is called a "document". A document is in JSON format.
The following REST request via cURL stores a document in the index customer with the type _doc:
curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
"name": "John Doe"
}
'
## From file
$ cat es.txt
{
"name": "John Doe"
}
$ curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H "Content-Type: application/json" -d@es.txt
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
## We can POST data without id
curl -X POST "localhost:9200/customer/_doc/" -H 'Content-Type: application/json' -d'
{
"time": "hoo2",
"value": "var2"
}
'
{"_index":"customer","_type":"_doc","_id":"Z-zkIHgBzujlMjfkth6e","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1}
Elasticsearch handles JSON data. The pretty parameter means "return pretty-formatted JSON."
Retrieve a document by its ID (in this example, 1):
$ curl -X GET "localhost:9200/customer/_doc/1?pretty"
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "John Doe"
}
}
https://www.elastic.co/de/blog/found-dive-into-elasticsearch-storage#elasticsearch-paths
Visualize the data in Elasticsearch with Kibana.
This should be done in a smarter way with a user-defined Docker network instead of the deprecated --link flag; a sketch follows after the commands.
docker pull docker.elastic.co/kibana/kibana:7.17.0
docker run --link 6328c3ae0c34:elasticsearch -p 5601:5601 docker.elastic.co/kibana/kibana:7.17.0
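An untested sketch of the Docker-network approach (the network name elastic and the container name elasticsearch are my own choices; note that the Kibana version should match the Elasticsearch version, so 7.12.1 is used for both here):
docker network create elastic
docker run -d --rm --name elasticsearch --net elastic \
-p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
docker.elastic.co/elasticsearch/elasticsearch:7.12.1
docker run --net elastic -p 5601:5601 \
-e "ELASTICSEARCH_HOSTS=http://elasticsearch:9200" \
docker.elastic.co/kibana/kibana:7.12.1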
Check localhost:5601/status. Then go to Discover.
On the client side, Filebeat is required; on the Elasticsearch server side, Logstash is required. The Filebeat configuration looks like the sketch below. Fluentd may also work in place of Logstash; that combination is called the EFK stack. Note that Fluentd does not store logs itself.
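A minimal filebeat.yml sketch (untested; the log path and the Logstash host are placeholder assumptions):
filebeat.inputs:
- type: log
  paths:
    - /var/log/*.log   # assumption: ship everything under /var/log
output.logstash:
  hosts: ["logstash-host:5044"]   # placeholder host; 5044 is the conventional Beats port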
Sure, Kibana itself can also be run in a highly available setup: https://www.elastic.co/guide/en/kibana/current/production.html#high-availability
https://de.slideshare.net/NeilBaker18/elasticsearch-for-beginners
Get a list of indices:
curl http://localhost:9200/_cat/indices
Check the shards and their sizes for all indices:
curl http://localhost:9200/_cat/shards?v
Get the first 2 documents from the index customer:
curl "http://localhost:9200/customer/_search?size=2"
Delete the index customer:
curl -X DELETE http://localhost:9200/customer
Delete the document (id=1) in the index customer:
curl -X DELETE "localhost:9200/customer/_doc/1?pretty"
Get documents from the index customer (change the value of the size parameter; default is 10):
curl -X GET "localhost:9200/customer/_search/?size=10&pretty"
Count the number of documents in the index customer:
curl -X GET "localhost:9200/customer/_count"
Delete the whole index customer again, then put fresh test data:
$ curl -X DELETE "localhost:9200/customer"
curl -X PUT "localhost:9200/customer/_doc/2?pretty" -H 'Content-Type: application/json' -d'
{
"name": "John Doe2"
}
'
curl -X PUT "localhost:9200/customer/_doc/3?pretty" -H 'Content-Type: application/json' -d'
{
"name": "John Doe3"
}
'
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html
$ curl -X GET "localhost:9200/customer/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"name" : "John Doe2"
}
}
}
'
The result:
{
"took" : 887,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.1143606,
"hits" : [
{
"_index" : "customer",
"_type" : "mytype",
"_id" : "2",
"_score" : 1.1143606,
"_source" : {
"name" : "John Doe2"
}
},
{
"_index" : "customer",
"_type" : "mytype",
"_id" : "3",
"_score" : 0.13353139,
"_source" : {
"name" : "John Doe3"
}
},
{
"_index" : "customer",
"_type" : "mytype",
"_id" : "1",
"_score" : 0.13353139,
"_source" : {
"name" : "John Doe"
}
}
]
}
}
Elasticsearch returns not only "John Doe2" but also "John Doe" and "John Doe3", because this is full-text search: the standard analyzer splits "John Doe2" into the tokens john and doe2, and every document contains the token john. You can see the lower scores on the documents "John Doe" and "John Doe3".
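If you want only the exact document, a stricter query helps. An untested sketch with match_phrase, which requires all tokens to appear in order:
curl -X GET "localhost:9200/customer/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match_phrase": {
"name" : "John Doe2"
}
}
}
'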
You can see the details with the _analyze API.
$ curl -X GET "localhost:9200/customer/_analyze?pretty" -H 'Content-Type: application/json' -d'
{
"field": "name",
"text": "3"
}
'
{
"tokens" : [
{
"token" : "3",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<NUM>",
"position" : 0
}
]
}
$ curl -X GET "localhost:9200/customer/_analyze?pretty" -H 'Content-Type: application/json' -d'
{
"field": "name",
"text": "This is a test"
}
'
{
"tokens" : [
{
"token" : "this",
"start_offset" : 0,
"end_offset" : 4,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "is",
"start_offset" : 5,
"end_offset" : 7,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "a",
"start_offset" : 8,
"end_offset" : 9,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "test",
"start_offset" : 10,
"end_offset" : 14,
"type" : "<ALPHANUM>",
"position" : 3
}
]
}
Not tested, but here are some snippets.
Match Berlin in the address.city field and return only the fields name and address:
curl -X GET "localhost:9200/customer/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"address.city" : "Berlin"
}
},
"_source" : ["name","address"]
}
'
I tried to PUT the data as below:
{
"name": "John Doe",
"_id": "myid"
}
And ElasticSearch returned the error:
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "Field [_id] is a metadata field and cannot be added inside a document. Use the index API request parameters."
}
],
"type" : "mapper_parsing_exception",
"reason" : "failed to parse field [_id] of type [_id] in document with id '1'. Preview of field's value: 'myid'",
"caused_by" : {
"type" : "mapper_parsing_exception",
"reason" : "Field [_id] is a metadata field and cannot be added inside a document. Use the index API request parameters."
}
},
"status" : 400
}
As the error message says, the field [_id] is a metadata field and cannot be added inside a document; the index API request parameters must be used instead.
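For example (a sketch reusing the ID myid from my attempt above), the custom ID goes into the URL path:
curl -X PUT "localhost:9200/customer/_doc/myid?pretty" -H 'Content-Type: application/json' -d'
{
"name": "John Doe"
}
'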
If your custom document ID contains special characters, such as forward slashes /, URL-encode them.
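For example (untested sketch), a / in a custom ID becomes %2F:
curl -X PUT "localhost:9200/customer/_doc/my%2Fid?pretty" -H 'Content-Type: application/json' -d'
{
"name": "John Doe"
}
'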
https://www.elastic.co/guide/en/elastic-stack-glossary/current/terms.html