solr-stack

Ambari stack service for easily installing and managing Solr on an HDP cluster

An Ambari Service for Solr

Ambari service for easily installing and managing Solr/HDPSearch, either on existing HDP clusters or on fresh installs via blueprints

Limitations:

  • This is not an officially supported service and is not meant to be deployed in production systems. It is only meant for testing/demo purposes
  • It does not support the Ambari/HDP upgrade process and will cause upgrade problems if it is not removed prior to the upgrade
  • Not tested on secured clusters

Option 1: Deploy on an existing cluster (the steps below use the HDP 2.4 sandbox)

  • Download the HDP 2.4 sandbox VM image (Hortonworks_sanbox_with_hdp_2_4_vmware.ova) from the Hortonworks website
  • Import Hortonworks_sanbox_with_hdp_2_4_vmware.ova into VMware and set the VM memory size to 8GB
  • Now start the VM
  • After it boots up, find the IP address of the VM and add an entry to your machine's hosts file, e.g.
192.168.191.241 sandbox.hortonworks.com sandbox    
  • Connect to the VM via SSH (password hadoop) and start Ambari server
ssh root@sandbox.hortonworks.com
/root/start_ambari.sh
  • To deploy the Solr stack, run the following
VERSION=`hdp-select status hadoop-client | sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/'`
sudo git clone https://github.com/abajwa-hw/solr-stack.git /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/SOLR
  • Restart Ambari
#on sandbox
sudo service ambari restart

#on non-sandbox
sudo service ambari-server restart

  • Then you can click on 'Add Service' from the 'Actions' dropdown menu in the bottom left of the Ambari dashboard:

On bottom left -> Actions -> Add service -> check Solr service -> Next -> Next -> Next -> Deploy

  • Also ensure that the install location you are choosing (/opt/solr by default) does not exist

  • On successful deployment you will see the Solr service listed among the services in Ambari and will be able to start/stop it from there

  • You can see the parameters you configured under the 'Configs' tab
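
The version-detection line in the deploy step above uses a sed expression to reduce the `hdp-select` output to a major.minor stack version. As a rough sketch of what that expression does, the Python below applies the same pattern. The sample input line (including its build number) and the function names are illustrative assumptions and are not part of this repository.

```python
import re

# Same pattern as the sed expression: capture "<digit>.<digit>" right after
# the "hadoop-client - " prefix and ignore the rest of the line.
_STATUS_RE = re.compile(r"hadoop-client - ([0-9]\.[0-9])")


def hdp_stack_version(status_line):
    """Return the major.minor HDP version from one line of hdp-select output.

    Unlike sed, which passes a non-matching line through unchanged,
    this raises ValueError so a bad value never ends up in a path.
    """
    match = _STATUS_RE.match(status_line.strip())
    if match is None:
        raise ValueError("unexpected hdp-select output: %r" % status_line)
    return match.group(1)


def solr_service_dir(version):
    """Directory the git clone step targets for the given stack version."""
    return "/var/lib/ambari-server/resources/stacks/HDP/%s/services/SOLR" % version


if __name__ == "__main__":
    # Hypothetical sample line; the build number is made up for illustration.
    version = hdp_stack_version("hadoop-client - 2.4.0.0-169")
    print(solr_service_dir(version))
```

Checking the computed directory before cloning is a cheap way to catch an empty or malformed VERSION value.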

Option 2: Automated deployment of fresh cluster via blueprint

  • Bring up 4 VMs imaged with RHEL/CentOS 6.x (e.g. node1-4 in this case)

  • On the non-Ambari nodes, install ambari-agent and point it at the Ambari node (node1 in this example)

export ambari_server=node1
curl -sSL https://raw.githubusercontent.com/seanorama/ambari-bootstrap/master/ambari-bootstrap.sh | sudo -E sh
  • On the Ambari node, install ambari-server
export install_ambari_server=true
curl -sSL https://raw.githubusercontent.com/seanorama/ambari-bootstrap/master/ambari-bootstrap.sh | sudo -E sh
yum install -y git
git clone https://github.com/abajwa-hw/solr-stack.git /var/lib/ambari-server/resources/stacks/HDP/2.3/services/SOLR
  • Ensure Solr is only started after Zookeeper
    • Edit the /var/lib/ambari-server/resources/stacks/HDP/2.3/role_command_order.json file to include the following entry:
"SOLR_MASTER-START" : ["ZOOKEEPER_SERVER-START"],
  • Ensure that by default, Solr is started on multiple nodes (3 in this example)
    • Edit the /var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/stack_advisor.py file
      from:
  def getMastersWithMultipleInstances(self):
    return ['ZOOKEEPER_SERVER', 'HBASE_MASTER']      
  def getCardinalitiesDict(self):
    return {
      'ZOOKEEPER_SERVER': {"min": 3},
      'HBASE_MASTER': {"min": 1},
      }

to:

  def getMastersWithMultipleInstances(self):
    return ['ZOOKEEPER_SERVER', 'HBASE_MASTER','SOLR_MASTER']
  def getCardinalitiesDict(self):
    return {
      'ZOOKEEPER_SERVER': {"min": 3},
      'SOLR_MASTER': {"min": 3},
      'HBASE_MASTER': {"min": 1},
      }
      
  • Restart Ambari
service ambari-server restart
service ambari-agent restart    
  • Confirm that all 4 agents registered with the server and that the local agent is still running
curl -u admin:admin -H  X-Requested-By:ambari http://localhost:8080/api/v1/hosts
service ambari-agent status
  • (Optional) Generate the Ambari blueprint and cluster file using the Ambari recommendations API, following the steps below.
    For more details on the bootstrap scripts, see the ambari-bootstrap git repository (https://github.com/seanorama/ambari-bootstrap)
yum install -y python-argparse
git clone https://github.com/seanorama/ambari-bootstrap.git

#optional - limit the services for faster deployment

#for minimal services
export ambari_services="HDFS MAPREDUCE2 YARN ZOOKEEPER HIVE SOLR"

#for most services
#export ambari_services="ACCUMULO FALCON FLUME HBASE HDFS HIVE KAFKA KNOX MAHOUT OOZIE PIG SLIDER SPARK SQOOP MAPREDUCE2 STORM TEZ YARN ZOOKEEPER SOLR"

export deploy=false
cd ambari-bootstrap/deploy
bash ./deploy-recommended-cluster.bash
  • Configure your Solr install by editing /root/ambari-bootstrap/deploy/tempdir*/blueprint.json.
    For example, to include configurations for Ranger audits in Solr, make the changes below:
    {
      "solr-config": {
        "solr.datadir": "/opt/ranger_audit_server",
        "solr.download.location": "HDPSEARCH",
        "solr.znode":"/ranger_audits"
        }  
    },
    {
      "solr-env": {
        "solr.port": "6083"
        }
    },

  • Register Blueprint
curl -u admin:admin -H  X-Requested-By:ambari http://localhost:8080/api/v1/blueprints/recommended -d @blueprint.json
  • Deploy Blueprint
curl -u admin:admin -H  X-Requested-By:ambari http://localhost:8080/api/v1/clusters/solrCluster -d @cluster.json
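
The two JSON edits in this section (the start-order entry in role_command_order.json and the Solr settings in blueprint.json) can also be scripted rather than made by hand. The sketch below is one possible way to do that and rests on assumptions you should check against your own files: that the start-order entries live under a top-level "general_deps" key, and that the blueprint keeps its settings in a top-level "configurations" list of single-key objects. The function names are hypothetical.

```python
import json


def add_start_dependency(role_order, role, depends_on, section="general_deps"):
    """Make `<role>-START` wait for `<depends_on>-START`.

    Assumption: dependencies are grouped under `section` in the parsed
    role_command_order.json; the section is created if it is missing.
    """
    deps = role_order.setdefault(section, {}).setdefault(role + "-START", [])
    entry = depends_on + "-START"
    if entry not in deps:  # keep the edit idempotent
        deps.append(entry)
    return role_order


def set_blueprint_config(blueprint, config_type, properties):
    """Merge `properties` into the blueprint entry for `config_type`.

    Assumption: blueprint["configurations"] is a list of one-key dicts,
    e.g. [{"solr-env": {"solr.port": "6083"}}].
    """
    configurations = blueprint.setdefault("configurations", [])
    for entry in configurations:
        if config_type in entry:
            entry[config_type].update(properties)
            return blueprint
    configurations.append({config_type: dict(properties)})
    return blueprint


if __name__ == "__main__":
    # Small in-memory stand-ins for the real files, for illustration only.
    role_order = add_start_dependency({"general_deps": {}},
                                      "SOLR_MASTER", "ZOOKEEPER_SERVER")
    blueprint = {"configurations": []}
    set_blueprint_config(blueprint, "solr-config", {
        "solr.datadir": "/opt/ranger_audit_server",
        "solr.download.location": "HDPSEARCH",
        "solr.znode": "/ranger_audits",
    })
    set_blueprint_config(blueprint, "solr-env", {"solr.port": "6083"})
    print(json.dumps(role_order, indent=2))
    print(json.dumps(blueprint, indent=2))
```

To apply this to the real files, load each one with `json.load`, call the matching function, and write the result back with `json.dump` before restarting Ambari or registering the blueprint.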

Use Solr

  • Launch the Solr web app by navigating to http://sandbox.hortonworks.com:8983/

  • Alternatively, you can launch it from Ambari via the iFrame view

  • Create a test collection. The command below creates a collection named testCollection with 1 shard and a replication factor of 1; repeat as necessary for other collections

export JAVA_HOME=<JAVA_HOME>

/opt/lucidworks-hdpsearch/solr/bin/solr create -c testCollection \
   -d data_driven_schema_configs \
   -s 1 \
   -rf 1 

  • Check the status of the Solr service, or start/stop it, via the Ambari REST API
export SERVICE=SOLR
export PASSWORD=admin
export AMBARI_HOST=sandbox.hortonworks.com
export CLUSTER=Sandbox

#get service status
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X GET http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

#start service
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Start '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

#stop service
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stop '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
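
The start and stop calls above differ only in the target state ("STARTED" or "INSTALLED") and the free-text context string. As a minimal sketch of the same request built from Python's standard library, the helper below constructs the PUT request without sending it. The function name and its defaults are hypothetical; the URL layout, header and payload shape are copied from the curl commands above.

```python
import base64
import json
import urllib.request


def build_state_request(ambari_host, cluster, service, state,
                        user="admin", password="admin"):
    """Build (but do not send) the PUT request that sets a service's state.

    state is "STARTED" to start the service or "INSTALLED" to stop it,
    mirroring the curl examples.
    """
    url = "http://%s:8080/api/v1/clusters/%s/services/%s" % (
        ambari_host, cluster, service)
    verb = "Start" if state == "STARTED" else "Stop"
    payload = {
        "RequestInfo": {"context": "%s %s via REST" % (verb, service)},
        "Body": {"ServiceInfo": {"state": state}},
    }
    # HTTP basic auth, equivalent to curl's -u user:password.
    token = base64.b64encode(
        ("%s:%s" % (user, password)).encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"X-Requested-By": "ambari",
                 "Authorization": "Basic " + token},
    )


if __name__ == "__main__":
    request = build_state_request(
        "sandbox.hortonworks.com", "Sandbox", "SOLR", "STARTED")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and payload before touching a live cluster.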

Remove Solr service

  • To remove the Solr service:
    • Stop the service via Ambari
    • Delete the service
export SERVICE=SOLR
export PASSWORD=admin
export AMBARI_HOST=sandbox.hortonworks.com
export CLUSTER=Sandbox    
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X DELETE http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
  • Remove artifacts

    #use the same stack version directory the service was cloned into (e.g. VERSION=2.4 on the sandbox)
    rm -rf /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/SOLR
    rm -rf /opt/solr
    
  • Restart Ambari

    service ambari restart
    

Overview

Name With Owner: abajwa-hw/solr-stack
Primary Language: Python
Program Language: Python (Language Count: 2)
Release Count: 0
Created At: 2015-04-01 07:37:40
Pushed At: 2018-01-03 07:28:29
Last Commit At: 2018-01-02 23:28:28
Stargazers Count: 38
Watchers Count: 7
Fork Count: 20
Commits Count: 67
Issues Count: 8
Issue Open Count: 3
Pull Requests Count: 2
Pull Requests Open Count: 0
Pull Requests Close Count: 0