Elasticsearch CRUD Operations Using cURL and the Elasticsearch REST APIs

Elasticsearch (ELS) is a distributed search and analytics engine built on Apache Lucene, a high-performance indexing and full-text search library. A single running instance of Elasticsearch is called a node (it has a unique ID and a name), and a collection of nodes forms a cluster (nodes join a cluster by its cluster name). Before delving into the CRUD operations of Elasticsearch, let us brush up on the core terminology of ELS.

Document, Type, Index, Shard and Replica:
A document is the basic unit of information that can be indexed. It is expressed in JSON and resides within an index. Each document is assigned to a type.
A type is a logical partition of documents (a user-defined grouping); in general, documents that share the same set of fields belong to one type. Note that an index created in Elasticsearch 6.0 or later can contain only one type, and types are removed altogether in later major versions.
The documents of all types together make up an index.
An index can be split into multiple shards. If the documents in an index are too large to fit on the disk of one node, or a single node is too slow to serve all search requests, the index is split across multiple nodes in the cluster; each piece is called a shard. With shards, a search is carried out in parallel on multiple nodes.
A shard can be replicated zero or more times. By default, an index in Elasticsearch 6.x has 5 primary shards and 1 replica of each.
Summary: As a rough analogy, an Elasticsearch cluster can contain multiple indices (databases), which in turn contain types (tables). These types hold multiple documents (rows), and each document has properties (columns).

Elasticsearch exposes a REST API for administering the cluster and for performing CRUD and search operations. Data is exchanged (sent and received) as JSON. In this post we use cURL, a command-line tool for transferring data to and from a server over a variety of protocols (HTTP, FTP, LDAP, IMAP, etc.). Since we are calling a REST API, the protocol used here is HTTP.
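The requests in the rest of this post all follow one URL pattern: host, then index, then optionally type and document ID, then query parameters. As a rough sketch, that pattern can be captured in a small shell function. The helper name es_url and the ES_HOST variable are made up for illustration; they are not part of Elasticsearch or cURL, and the host is assumed to be the default localhost:9200.

```shell
# Assumption: Elasticsearch listens on localhost:9200 (its default HTTP port).
ES_HOST="${ES_HOST:-localhost:9200}"

# Hypothetical helper: join the path segments it is given and append ?pretty.
# Example: es_url customers vendors 1
es_url() {
  url="$ES_HOST"
  for segment in "$@"; do
    url="$url/$segment"
  done
  printf '%s?pretty\n' "$url"
}

# Print the URL of one document and of one index.
es_url customers vendors 1
es_url customers
```

With a node running, the result can be handed straight to cURL, for example: curl -XGET "$(es_url customers vendors 1)"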

Prerequisites for CRUD operations:
1. Download Elasticsearch and start it with the elasticsearch script in the bin directory, using the following command.
➜  elasticsearch-6.1.1 ./bin/elasticsearch -Ecluster.name=devinline_es -Enode.name=devnode

Here -Ecluster.name=devinline_es and -Enode.name=devnode are optional; running the script without them also starts the service with the default configuration. Once it has started successfully, the console should display something like
"[INFO ][o.e.n.Node  ] [devnode] started"

2. Install cURL if it is not already installed on your system. Open a terminal window and verify that cURL is installed using the command
 > curl --version
Now we are ready to start the CRUD operations, which include creating, updating and deleting indices and documents. I am executing all commands from the Elasticsearch directory, but that is not mandatory.
Note: In the commands below, "➜  elasticsearch-6.1.1" is the shell prompt showing the current directory; it is not part of the command.

Elasticsearch CRUD Operation


Create indices (index):

Indices are created using the PUT method. Below we create two indices, customers and products; each response confirms the name of the index created.
➜  elasticsearch-6.1.1 curl -XPUT 'localhost:9200/customers?pretty'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "customers"
}
➜  elasticsearch-6.1.1 curl -XPUT 'localhost:9200/products?pretty'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "products"
}
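The two commands above rely on the default shard and replica counts. An index can also be created with explicit counts by sending a settings body with the PUT request. The sketch below only builds that body; index_settings and create_index are hypothetical helper names, the index name passed to create_index is up to the reader, and the host is assumed to be localhost:9200.

```shell
# Assumption: node reachable at localhost:9200 unless ES_HOST says otherwise.
ES_HOST="${ES_HOST:-localhost:9200}"

# Build the settings body for a given primary shard count ($1)
# and replica count ($2).
index_settings() {
  printf '{"settings":{"number_of_shards":%d,"number_of_replicas":%d}}\n' "$1" "$2"
}

# Hypothetical wrapper: create index $1 with $2 primary shards and $3 replicas.
# It needs a running node, so it is only defined here, not called.
create_index() {
  curl -XPUT "$ES_HOST/$1?pretty" \
       -H 'Content-Type: application/json' \
       -d "$(index_settings "$2" "$3")"
}

# Show the body that would be sent for 3 primary shards and no replicas.
index_settings 3 0
```

The shard count is fixed when the index is created, which is why it is passed at creation time here.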

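Display list of indices (index):

The GET method is used to display indices in ELS. The command below lists all indices created so far. The first line of the response is a header naming the columns: health, status, index name, document count and so on. Currently the health is yellow and the document count is 0 in both indices. Why is the health yellow? Each index has 1 replica configured, and a replica shard is never allocated on the same node as its primary, so on a single-node cluster the replicas remain unassigned.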
➜  elasticsearch-6.1.1 curl -XGET 'localhost:9200/_cat/indices?v&pretty'
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   products  Zeckb9OpSC6areLrEUQr1g   5   1          0            0      1.1kb          1.1kb
yellow open   customers j5KPYo3mRGuf4ahPFgbF0g   5   1          0            0      1.1kb          1.1kb
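Because the _cat output is plain columns, it can be filtered with standard text tools. As a small sketch, the function below (not_green is a made-up name) lists every index whose health is not green, using the sample output above as input so that it runs without a live node.

```shell
# Sample input: the _cat/indices output shown above.
cat_output='health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   products  Zeckb9OpSC6areLrEUQr1g   5   1          0            0      1.1kb          1.1kb
yellow open   customers j5KPYo3mRGuf4ahPFgbF0g   5   1          0            0      1.1kb          1.1kb'

# Skip the header row (NR > 1) and print the index name and health of every
# index whose health column is not "green".
not_green() {
  awk 'NR > 1 && $1 != "green" { print $3, $1 }'
}

printf '%s\n' "$cat_output" | not_green
```

Against a running node the same filter could be fed directly, for example: curl -s 'localhost:9200/_cat/indices?v' | not_green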

Add document to index:

The PUT method is used to create documents in an index. Below, customers is the name of the index in which we want to create the document and vendors is the type name, which is used for logical classification of documents. The number at the end of the URL is the document ID. Here we create two documents in index customers with type name vendors. If we now run the list-indices command, docs.count for customers should show 2.
➜  elasticsearch-6.1.1 curl -XPUT 'localhost:9200/customers/vendors/1?pretty' -d'{"name":"Michael Sharpe","age":22,"gender":"male","email":"michaelsharpe@talkalot.com","phone":"+1 (942) 544-2868","street":"858 Bushwick Court","city":"Dorneyville","state":"American Samoa, 3711"}
' -H 'Content-Type: application/json'
{
  "_index" : "customers",
  "_type" : "vendors",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
➜  elasticsearch-6.1.1 curl -XPUT 'localhost:9200/customers/vendors/2?pretty' -d'{"name":"Abigail Garcia","age":31,"gender":"female","email":"abigailgarcia@talkalot.com","phone":"+1 (928) 499-3611","street":"114 Bulwer Place","city":"Wyano","state":"Utah, 118"}' -H 'Content-Type: application/json'
{
  "_index" : "customers",
  "_type" : "vendors",
  "_id" : "2",
  "_version" : 3,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 4,
  "_primary_term" : 1
}

Display complete document:

The GET method is used to display a document. Here we display the whole document; customers is the index name, vendors is the type name and 1 is the document ID. A document can also be displayed partially.
➜  elasticsearch-6.1.1 curl -XGET 'localhost:9200/customers/vendors/1?pretty'
{
  "_index" : "customers",
  "_type" : "vendors",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "Michael Sharpe",
    "age" : 22,
    "gender" : "male",
    "email" : "michaelsharpe@talkalot.com",
    "phone" : "+1 (942) 544-2868",
    "street" : "858 Bushwick Court",
    "city" : "Dorneyville",
    "state" : "American Samoa, 3711"
  }
}

Display partial document:

By specifying field names in the _source query parameter we can retrieve only those fields. In general, a partial document is retrieved to avoid transferring a large JSON payload over the network.
➜  elasticsearch-6.1.1 curl -XGET 'localhost:9200/customers/vendors/1?pretty&_source=name,email'  
{
  "_index" : "customers",
  "_type" : "vendors",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "name" : "Michael Sharpe",
    "email" : "michaelsharpe@talkalot.com"
  }
}

Update whole document:

The PUT method is also used to update a whole document in a given index. The URL for updating a whole document is the same as for creating one, except that the ID must already exist. Here we update ID=2 with new details. The response reports "result" : "updated" and an incremented _version (7 in the run shown; the number depends on how many times the document has been written before). The whole document is replaced by the new _source, which can be verified by running the display-document command.
➜  elasticsearch-6.1.1 curl -XPUT 'localhost:9200/customers/vendors/2?pretty' -d'{"name":"Bell Fitzgerald","age":35,"gender":"male","email":"bellfitzgerald@talkalot.com","phone":"+1 (958) 567-2131","street":"173 Tompkins Place","city":"Epworth","state":"Puerto Rico, 2262"}' -H 'Content-Type: application/json'
{
  "_index" : "customers",
  "_type" : "vendors",
  "_id" : "2",
  "_version" : 7,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 8,
  "_primary_term" : 1
}

Update partial document:

The POST method is used to partially update a document in a given index. The fields to change are passed in the request body under the key "doc". Here we update the age and city of the customer with id=1. The response indicates that the version of the document has changed to 2. "_update" in the URL indicates that we are performing an update operation.
➜  elasticsearch-6.1.1 curl -XPOST 'localhost:9200/customers/vendors/1/_update?pretty' -d'
{
  "doc": {
     "age": "24",
     "city":"Wyano"
  }
}
' -H 'Content-Type: application/json'
{
  "_index" : "customers",
  "_type" : "vendors",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
Verify that the document has been updated with the new values by running the display-document command with specific fields.
➜  elasticsearch-6.1.1 curl -XGET 'localhost:9200/customers/vendors/1?pretty&_source=name,age,city'
{
  "_index" : "customers",
  "_type" : "vendors",
  "_id" : "1",
  "_version" : 2,
  "found" : true,
  "_source" : {
    "city" : "Wyano",
    "name" : "Michael Sharpe",
    "age" : "24"
  }
}
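The output above shows age stored as the string "24", because the update request quoted the value, while the document was originally created with a numeric age. A variant that keeps age numeric could look like the sketch below. update_vendor is a hypothetical wrapper name, the host is assumed to be localhost:9200, and the function is only defined, since calling it needs a running node.

```shell
# Assumption: node reachable at localhost:9200 unless ES_HOST says otherwise.
ES_HOST="${ES_HOST:-localhost:9200}"

# Partial-update body: the fields under "doc" are merged into the stored
# document. age is written as a JSON number (24), not as the string "24".
update_body='{"doc":{"age":24,"city":"Wyano"}}'

# Hypothetical wrapper around the _update endpoint used above
# (index customers, type vendors, document ID in $1).
update_vendor() {
  curl -XPOST "$ES_HOST/customers/vendors/$1/_update?pretty" \
       -H 'Content-Type: application/json' \
       -d "$update_body"
}

# Show the body that would be sent.
printf '%s\n' "$update_body"
```

Usage with a live node would be: update_vendor 1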

Delete document:

The DELETE method is used to delete a document. We will delete the document with id=1.
➜  elasticsearch-6.1.1 curl -XDELETE 'localhost:9200/customers/vendors/1?pretty'
{
  "_index" : "customers",
  "_type" : "vendors",
  "_id" : "1",
  "_version" : 8,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 7,
  "_primary_term" : 1
}

Delete index:

The DELETE method is also used to delete an index. The command below deletes the index products; on success the server responds with "acknowledged" : true.
➜  elasticsearch-6.1.1 curl -XDELETE 'localhost:9200/products?pretty'           
{
  "acknowledged" : true
}


Multi-doc retrieval / Bulk operations in Elasticsearch (ELS):

Before executing bulk and multi-document retrieval operations, we will create a new index and add a document to it. The commands below create the index products and add a new document with type name shoes to it.
➜  elasticsearch-6.1.1 curl -XPUT 'localhost:9200/products?pretty'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "products"
}
Create a document in index products with type name shoes.
➜  elasticsearch-6.1.1 curl -XPUT 'localhost:9200/products/shoes/1?pretty' -d'
{
  "name": "Nike",
  "size": 8,
  "color": "white"
}
' -H 'Content-Type: application/json'
{
  "_index" : "products",
  "_type" : "shoes",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

Multiple documents retrieval: _mget API

Elasticsearch allows retrieving multiple documents in one request using the "_mget" API. Here we retrieve two documents, one from index customers and another from index products.
➜  elasticsearch-6.1.1 curl -XGET 'localhost:9200/_mget?pretty' -d'       
{
    "docs" : [
        {
            "_index" : "customers",
            "_type" : "vendors",
            "_id" : "2"
        },
        {
            "_index" : "products",
            "_type" : "shoes",
            "_id" : "1"
        }
    ]
}' -H 'Content-Type: application/json'
{
  "docs" : [
    {
      "_index" : "customers",
      "_type" : "vendors",
      "_id" : "2",
      "_version" : 7,
      "found" : true,
      "_source" : {
        "name" : "Bell Fitzgerald",
        "age" : 35,
        "gender" : "male",
        "email" : "bellfitzgerald@talkalot.com",
        "phone" : "+1 (958) 567-2131",
        "street" : "173 Tompkins Place",
        "city" : "Epworth",
        "state" : "Puerto Rico, 2262"
      }
    },
    {
      "_index" : "products",
      "_type" : "shoes",
      "_id" : "1",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "name" : "Nike",
        "size" : 8,
        "color" : "white"
      }
    }
  ]
}
Note: A multi-document retrieval request can take various forms. If we pass the index name in the URL, it need not be repeated in the docs array of the request body for each document.

Retrieve multiple documents with index passed in URL:
Here we pass customers as the index name in the URL, so the entries in the docs array do not specify an index.
➜  elasticsearch-6.1.1 curl -XGET 'localhost:9200/customers/_mget?pretty&_source=name,email' -d'       
{
    "docs" : [
        {
            "_type" : "vendors",
            "_id" : "1"
        },
        {
            "_type" : "vendors",
            "_id" : "2"
        }
    ]
}' -H 'Content-Type: application/json'

{
  "docs" : [
    {
      "_index" : "customers",
      "_type" : "vendors",
      "_id" : "1",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "name" : "Michael Sharpe",
        "email" : "michaelsharpe@talkalot.com"
      }
    },
    {
      "_index" : "customers",
      "_type" : "vendors",
      "_id" : "2",
      "_version" : 7,
      "found" : true,
      "_source" : {
        "name" : "Bell Fitzgerald",
        "email" : "bellfitzgerald@talkalot.com"
      }
    }
  ]
}
Note: Similarly, we can pass the type name in the URL ( 'localhost:9200/customers/vendors/_mget?pretty&_source=name,email' ), and then only the ID has to be passed in each docs entry.
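When both index and type are given in the URL, the _mget API also accepts a shorter body that lists only the IDs under the key "ids". The sketch below builds that body from a list of IDs. ids_body and mget_vendors are made-up helper names, the host is assumed to be localhost:9200, and the IDs are assumed to contain no characters that need JSON escaping.

```shell
# Assumption: node reachable at localhost:9200 unless ES_HOST says otherwise.
ES_HOST="${ES_HOST:-localhost:9200}"

# Turn a list of IDs into the {"ids":[...]} body accepted by _mget.
# Assumes plain IDs with no quotes or backslashes in them.
ids_body() {
  list=""
  for id in "$@"; do
    # Add a comma before every element except the first.
    list="${list:+$list,}\"$id\""
  done
  printf '{"ids":[%s]}\n' "$list"
}

# Hypothetical wrapper: fetch the given vendor IDs from index customers.
# Defined only; calling it needs a running node.
mget_vendors() {
  curl -XGET "$ES_HOST/customers/vendors/_mget?pretty" \
       -H 'Content-Type: application/json' \
       -d "$(ids_body "$@")"
}

# Show the body that would be sent for IDs 1 and 2.
ids_body 1 2
```

Usage with a live node would be: mget_vendors 1 2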

Bulk operations (multiple operations in one request): _bulk API

Using the "_bulk" API, ELS can execute multiple operations in one request. Here we add two new documents to index products with IDs 3 and 4. In the request body, the first line of each pair is the action line, which states the operation and where the document goes, and the next line is the payload: name, size and color.
➜  elasticsearch-6.1.1 curl -XPOST 'localhost:9200/products/_bulk?pretty' -d'
{ "index" : {"_type" : "shoes", "_id" : "3" } }
{"name": "Lucy","size": 7,"color": "white"}
{ "index" : {"_type" : "shoes", "_id" : "4"} }
{"name": "Redtape","size": 11,"color": "Red"}
' -H 'Content-Type: application/json'
{
  "took" : 4,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "products",
        "_type" : "shoes",
        "_id" : "3",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "products",
        "_type" : "shoes",
        "_id" : "4",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}

Bulk operations with index as part of URL:
A bulk request can also take various forms: if the index and type name are passed in the URL, they do not have to be repeated in the request body. The command below creates 2 new documents in products with type name shoes and leaves ID generation to Elasticsearch (the IDs are auto-generated).
➜  elasticsearch-6.1.1 curl -XPOST 'localhost:9200/products/shoes/_bulk?pretty' -d'
{ "index" : {} }
{"name": "Volley","size": 6,"color": "Red"}
{ "index" : {} }
{"name": "Reebok","size": 5,"color": "Black"}
' -H 'Content-Type: application/json'
{
  "took" : 6,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "products",
        "_type" : "shoes",
        "_id" : "63-l6WABB3_D7Pc8PhDx",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "products",
        "_type" : "shoes",
        "_id" : "7H-l6WABB3_D7Pc8PhDx",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}
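Note: Instead of the index keyword we can use "create" to add a document. Unlike index, which replaces an existing document with the same ID, create fails if a document with that ID already exists. In this case the request looks like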
{ "create" : {"_id" : "5" } }
{ "name": "Heelys","size": 11,"color": "black" }
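For larger bulk requests it is often more convenient to keep the body in a file. The _bulk body is newline-delimited JSON and its last line must also end with a newline, so the file should be sent with cURL's --data-binary option; plain -d @file strips the newlines. The sketch below writes such a file and checks it. send_bulk is a hypothetical wrapper name, the temporary file comes from mktemp, and the host is assumed to be localhost:9200.

```shell
# Write the bulk request to a temporary file: one JSON object per line,
# action line first, then the document. The heredoc ends the file with the
# trailing newline that the _bulk endpoint expects.
bulk_file="$(mktemp)"
cat > "$bulk_file" <<'EOF'
{ "index" : {"_id" : "3" } }
{"name": "Lucy","size": 7,"color": "white"}
{ "index" : {"_id" : "4" } }
{"name": "Redtape","size": 11,"color": "Red"}
EOF

# Hypothetical wrapper: send the file in $1 to products/shoes.
# --data-binary keeps the newlines intact. Defined only; needs a running node.
send_bulk() {
  curl -XPOST "${ES_HOST:-localhost:9200}/products/shoes/_bulk?pretty" \
       -H 'Content-Type: application/json' \
       --data-binary "@$1"
}

# Sanity check before sending: two lines per index action, so the count
# printed here should be even.
wc -l < "$bulk_file"
```

Usage with a live node would be: send_bulk "$bulk_file", followed by rm -f "$bulk_file" to clean up.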

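Bulk operation to create, update and delete: The _bulk API can be used to perform create, update and delete operations in one request. In the response below, the delete of ID 2 reports "not_found" with status 404 because no such document exists in products, while "errors" is still false.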
➜  elasticsearch-6.1.1 curl -XPOST 'localhost:9200/products/shoes/_bulk?pretty' -H 'Content-Type: application/json' -d'
{ "index" : {"_id" : "3" } }
{ "name": "Reef","size": 6,"color": "black" }
{ "index" : {"_id" : "4" } }
{ "name": "PFFlys","size": 7,"color": "Red" }
{ "delete" : {"_id" : "2" } }
{ "create" : {"_id" : "5" } }
{ "name": "Heelys","size": 11,"color": "black" }
{ "update" : {"_id" : "1"} }
{ "doc" : {"color" : "Green"} }
'
{
  "took" : 7,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "products",
        "_type" : "shoes",
        "_id" : "3",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "index" : {
        "_index" : "products",
        "_type" : "shoes",
        "_id" : "4",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 200
      }
    },
    {
      "delete" : {
        "_index" : "products",
        "_type" : "shoes",
        "_id" : "2",
        "_version" : 1,
        "result" : "not_found",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 1,
        "status" : 404
      }
    },
    {
      "create" : {
        "_index" : "products",
        "_type" : "shoes",
        "_id" : "5",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "update" : {
        "_index" : "products",
        "_type" : "shoes",
        "_id" : "1",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 1,
        "status" : 200
      }
    }
  ]
}
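Since the top-level "errors" flag stayed false even though one item returned 404, it can be worth scanning the per-item status codes as well. The sketch below does that with awk on the pretty-printed response lines. failed_items is a made-up name, the sample input is an excerpt of the "_id" and "status" lines from the response above, and the approach assumes the ?pretty layout with one key per line; a JSON-aware tool would be more robust.

```shell
# Excerpt of the "_id" and "status" lines from the bulk response above.
bulk_response='        "_id" : "3",
        "status" : 200
        "_id" : "4",
        "status" : 200
        "_id" : "2",
        "status" : 404
        "_id" : "5",
        "status" : 201
        "_id" : "1",
        "status" : 200'

# Remember the most recent _id, then print "id status" for every item whose
# status code is 300 or higher.
failed_items() {
  awk '
    /"_id"/    { id = $3; gsub(/[",]/, "", id) }
    /"status"/ { code = $3 + 0; if (code >= 300) print id, code }
  '
}

printf '%s\n' "$bulk_response" | failed_items
```

Against a running node, the full response of a bulk request sent with curl -s could be piped into failed_items in the same way.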

Note: The GET requests above can also be executed through a browser, for example by opening http://localhost:9200/_cat/indices?v to list the indices on the given node.
Related post: How to create documents from JSON file using "_bulk" API.
