Sunday, March 17, 2013

ElasticSearch cluster setup in 2 minutes


ElasticSearch is built on the popular Apache Lucene search engine library; another popular Lucene-based search server is Apache SOLR. I'm attracted to ElasticSearch because of its simple JSON-style configuration and data access. I don't intend this post to be an ElasticSearch vs. SOLR comparison; I just want to share how simple it was to bring up a two-node ElasticSearch cluster.

SOLR 4.2 now includes advanced support for clustering and data sharding similar to ElasticSearch, but SOLR depends on Apache Zookeeper instances for these features, whereas ElasticSearch needs no additional components. The largest ElasticSearch deployment that I know of is GitHub, which recently migrated from SOLR. Here is an interesting blog post on their experience with this migration:
https://github.com/blog/1397-recent-code-search-outages


Step 1: Download ElasticSearch

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.0.Beta1.tar.gz


Step 2: Setup instances

tar -xvf elasticsearch-0.90.0.Beta1.tar.gz

Rename the instance
mv elasticsearch-0.90.0.Beta1 elastic_inst1

Create another copy for the second instance
cp -R elastic_inst1 elastic_inst2

Step 3: Configure instances

NOTE: This step is not mandatory. ElasticSearch is smart enough to auto-assign a node name, discover other nodes, and add them to a cluster. As a best practice, though, it is recommended to assign node and cluster names explicitly.

vi elastic_inst1/config/elasticsearch.yml
cluster.name: elasticsearch_dc1
node.name: "elastic_inst1"

vi elastic_inst2/config/elasticsearch.yml
cluster.name: elasticsearch_dc1
node.name: "elastic_inst2"
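
If you would rather not edit the files by hand, the same two settings can be appended from the shell. This is only a minimal sketch: set_es_names is a hypothetical helper name, and it assumes the stock elasticsearch.yml, where both settings are still commented out (appending to a file that already sets them would produce duplicate keys).

```shell
# Hypothetical helper: append cluster and node names to an instance's config.
# Usage: set_es_names <instance_dir> <cluster_name> <node_name>
set_es_names() {
  conf="$1/config/elasticsearch.yml"
  # Stop early if the config file is not where we expect it.
  [ -f "$conf" ] || { echo "missing $conf" >&2; return 1; }
  # Leading newline guards against a file that lacks a trailing newline.
  printf '\ncluster.name: %s\nnode.name: "%s"\n' "$2" "$3" >> "$conf"
}
```

For the two instances above, that would be: set_es_names elastic_inst1 elasticsearch_dc1 elastic_inst1, then set_es_names elastic_inst2 elasticsearch_dc1 elastic_inst2.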


Step 4: Start the instances


./elastic_inst1/bin/elasticsearch -f

./elastic_inst2/bin/elasticsearch -f

The -f flag keeps each instance in the foreground, so start them in separate terminals. When you start the second instance, you should see a message like the one below, which indicates that the node was automatically added to the cluster:


[2013-03-17 19:23:36,845][INFO ][cluster.service          ] [elastic_inst2] detected_master [elastic_inst1][NiEZKK6FSNGuLVNrw_OmbQ][inet[/192.168.0.101:9300]], added {[elastic_inst1][NiEZKK6FSNGuLVNrw_OmbQ][inet[/192.168.0.101:9300]],}, reason: zen-disco-receive(from master [[elastic_inst1][NiEZKK6FSNGuLVNrw_OmbQ][inet[/192.168.0.101:9300]]])
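
To confirm the join without scanning the console by eye, the master name can be pulled out of that log line. A small sketch: detected_master is a hypothetical helper name, and the pattern is based only on the message format shown above, so it may need adjusting for other versions.

```shell
# Hypothetical helper: read ElasticSearch log lines on stdin and print the
# name of the master each "detected_master" line reports.
# Prints nothing when no such line is present.
detected_master() {
  sed -n 's/.*detected_master \[\([^]]*\)].*/\1/p'
}
```

Pipe the second instance's console output or log file into detected_master; for the line above it prints elastic_inst1.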

Step 5:  Test instances

Open the following URL in a browser to check cluster health:
http://localhost:9200/_cluster/health?pretty
You should see output similar to this:

{
  "cluster_name" : "elasticsearch_dc1",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
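
The same check can be scripted instead of read from the browser. The sketch below assumes the flat response shape shown above; check_health is a hypothetical helper name, and the sed patterns are a shortcut rather than a real JSON parser.

```shell
# Hypothetical helper: read a _cluster/health response on stdin and succeed
# only if the status is green and the expected number of nodes has joined.
# Usage: curl -s 'http://localhost:9200/_cluster/health?pretty' | check_health 2
check_health() {
  expected_nodes="$1"
  body=$(cat)
  # Extract single fields with sed; adequate for this flat response.
  status=$(printf '%s\n' "$body" | sed -n 's/.*"status" *: *"\([a-z]*\)".*/\1/p')
  nodes=$(printf '%s\n' "$body" | sed -n 's/.*"number_of_nodes" *: *\([0-9]*\).*/\1/p')
  echo "status=$status nodes=$nodes"
  [ "$status" = "green" ] && [ "$nodes" = "$expected_nodes" ]
}
```

The exit code makes it easy to use in a script that should only continue once both nodes are up.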

Step 6: Shut down a node or the entire cluster

You can run these commands from your favorite browser REST API plugin or with curl from the command line. To shut down the entire cluster:

curl -XPOST http://localhost:9200/_cluster/nodes/_shutdown

You should see output similar to this:

{"cluster_name":"elasticsearch_dc1","nodes":{"NiEZKK6FSNGuLVNrw_OmbQ":{"name":"elastic_inst1"},"JC4FSShJQzOeMtpRUu_6Ng":{"name":"elastic_inst2"}}}

To shut down an individual instance within the cluster, use one of the following commands:
curl -XPOST http://localhost:9200/_cluster/nodes/elastic_inst1/_shutdown
curl -XPOST http://localhost:9201/_cluster/nodes/elastic_inst2/_shutdown
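
The cluster-wide and per-node URLs differ only in the node name, so a script can build either one from the same helper. A sketch only: shutdown_url is a hypothetical name, and the endpoint is the one used in this post for the 0.90 release (the _shutdown API was removed in Elasticsearch 2.0, so this will not work on later versions).

```shell
# Hypothetical helper: build the shutdown URL for the whole cluster or,
# when a node name is given, for that single node.
# Usage: shutdown_url <host:port> [node_name]
shutdown_url() {
  if [ -n "${2:-}" ]; then
    echo "http://$1/_cluster/nodes/$2/_shutdown"
  else
    echo "http://$1/_cluster/nodes/_shutdown"
  fi
}
```

For example, curl -XPOST "$(shutdown_url localhost:9200 elastic_inst1)" stops only the first instance.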

13 comments:

  1. Hi Hari,
    How does the cluster recognize its nodes when they're located on different machines?

  2. A very detailed write up on this is available here
    http://www.elasticsearch.org/guide/reference/modules/discovery/zen/

  4. Hi Oren Orgad,
    Q: How does the cluster recognize its nodes when they're located on different machines?
    Ans:
    Step 1: Open config/elasticsearch.yml on each machine.
    Set cluster.name to the same name on every machine.
    Set node.name to a different name on every machine.
    Start ElasticSearch.

    Open the following URL in a browser to check cluster health:
    http://localhost:9200/_cluster/health?pretty
    It shows the cluster status.

  5. Great Article !! It helped me a lot.

  6. Is there any way to set up a cluster on 2 (or more) servers and load-balance them? I searched on Google and found nothing. Please help me.

    Replies
    1. Elasticsearch nodes will find each other via unicast, by default.
      discovery.zen.ping.unicast.hosts: ["127.0.0.1", "[::1]"]

  9. This is really very nice article which I am searching from long time.
    Thank you..!

  10. What other configurations are required apart from cluster & node names? like host ip or something?
