Cleaning up ElasticSearch After Problems with Analytics Service in Workspace ONE Access On-Premises

Cleaning up Unassigned Shards in Workspace ONE Access On-Premises

If you see an error on your VMware Identity Manager appliance dashboard regarding the analytics service, it usually means there is an issue with ElasticSearch, often due to unassigned shards.

There are two options for resolving this, plus an alternate to Option 2.


Option 1:

1. SSH to the appliance, or to one of the appliances in the cluster. Log in as root.

2. Run the following command to determine if you have unassigned shards.

curl 'http://localhost:9200/_cluster/health?pretty'

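On a healthy cluster the output looks something like this (a status other than green, or a non-zero "unassigned_shards" value, indicates a problem):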

{
"cluster_name" : "horizon",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 20,
"active_shards" : 40,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
}

3. Run the following command to view the unassigned shards.
curl -XGET 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED

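4. Delete the unassigned shards by running the following command for each affected index, replacing v3_YYYY-MM-DD with the index name reported in step 3. Note that this deletes the entire index, not just the unassigned shards.
curl -XDELETE 'http://localhost:9200/v3_YYYY-MM-DD/'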
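
Steps 3 and 4 can be combined into a short script. The following is a minimal sketch and not part of the original procedure: the function names unassigned_indices and delete_commands are hypothetical, and it assumes the column order requested in step 3 (index, shard, prirep, state, unassigned.reason). It only prints the delete commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Print the unique index names that have at least one UNASSIGNED shard.
# Reads _cat/shards output (columns: index shard prirep state reason) on stdin.
unassigned_indices() {
  awk '$4 == "UNASSIGNED" { print $1 }' | sort -u
}

# Turn index names (one per line on stdin) into reviewable delete commands.
delete_commands() {
  while IFS= read -r index; do
    if [ -n "$index" ]; then
      printf "curl -XDELETE 'http://localhost:9200/%s/'\n" "$index"
    fi
  done
}

# Example against a live node (left commented out; run on the appliance):
# curl -s 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' \
#   | unassigned_indices | delete_commands
```

Printing the commands rather than executing them is a deliberate choice here, since each delete removes a whole index.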

Once all the unassigned shards have been deleted, refresh the dashboard; the Analytics connection should now be successful.

Running curl 'http://localhost:9200/_cluster/health?pretty' again should report 0 unassigned shards and a status of green.
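
The final health check can also be scripted. This is a rough sketch under stated assumptions: health_field and cluster_is_clean are hypothetical helper names, and the parsing relies on the ?pretty output placing each "key" : value pair on its own line, as in the sample output shown earlier.

```shell
#!/bin/sh
# Extract one field from pretty-printed _cluster/health JSON read on stdin.
# Relies on ?pretty putting each "key" : value pair on its own line.
health_field() {
  awk -v key="\"$1\"" '$1 == key { v = $3; gsub(/[",]/, "", v); print v }'
}

# Succeeds only when status is green and no shards are unassigned.
cluster_is_clean() {
  json=$(cat)
  status=$(printf '%s\n' "$json" | health_field status)
  unassigned=$(printf '%s\n' "$json" | health_field unassigned_shards)
  [ "$status" = "green" ] && [ "$unassigned" = "0" ]
}

# Example against a live node (left commented out; run on the appliance):
# curl -s 'http://localhost:9200/_cluster/health?pretty' | cluster_is_clean && echo "cluster OK"
```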



Option 2:

The other option for cleaning shards is on the console through the file system. This will have to be done manually on all nodes:

NOTE: If using SSH on a Mac or Windows system, anything shown in green text on a black background can be copied and pasted directly from this note. If you are viewing this document in a browser, paste the command into a text editor first and check that the single and double quotes came through as plain straight quotes. If you SSH in as 'sshuser', run the su command to elevate your permissions to root.

    1. Find out which indices have hung shards:
curl -XGET 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED

    2. Stop elasticsearch on all of the nodes:
service elasticsearch stop

    3. Remove the index's files from the disk on each node:
rm -rf /db/elasticsearch/horizon/nodes/0/indices/<INDEX_NAME>

    4. Restart elasticsearch on each node:
service elasticsearch start
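
When several indices are affected, the removal step can be looped over a list of index names. The sketch below is an illustration only: remove_index_dirs and the INDICES_DIR variable are hypothetical names, and the default path assumes node folder 0 as in the command above. It refuses empty names and names containing a slash, so that a blank line in the input cannot turn the rm -rf into a wipe of the whole indices folder.

```shell
#!/bin/sh
# Directory holding one sub-folder per index; override for other node numbers.
INDICES_DIR=${INDICES_DIR:-/db/elasticsearch/horizon/nodes/0/indices}

# Remove the on-disk folder for each index name read from stdin (one per line).
# Run only while elasticsearch is stopped.
remove_index_dirs() {
  while IFS= read -r index; do
    case "$index" in
      ''|.|..|*/*)
        # Refuse names that could make rm -rf escape or wipe INDICES_DIR itself.
        echo "skipping unsafe index name: '$index'" >&2
        continue
        ;;
    esac
    if [ -d "$INDICES_DIR/$index" ]; then
      rm -rf -- "$INDICES_DIR/$index"
      echo "removed $INDICES_DIR/$index"
    else
      echo "not found: $INDICES_DIR/$index" >&2
    fi
  done
}

# Example (left commented out): printf '%s\n' v3_YYYY-MM-DD | remove_index_dirs
```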



Option 2-ALTERNATE:

WARNING: DO NOT DO THIS on a production environment!!
The lazy way to do this is to delete the node's 'indices' folder. The downside is that you lose all other historical data. If you're fine with that, then this is certainly an option. The command is simply "rm -rf /db/elasticsearch/horizon/nodes//indices/".

NOTE: If you delete the indices folder, you will need to recreate it. Its default ownership and permissions are shown below.
drwxr-xr-x 30 elasticsearch users 4096 Sep 12 10:35 indices

To recreate the folder: stop elasticsearch, create the folder, set its user and group ownership, set its permissions, and then restart elasticsearch.
  1. service elasticsearch stop
  2. mkdir indices
  3. chown elasticsearch:users indices/
  4. chmod 755 indices/
  5. Do an ls -l -a on the numbered folder (typically "0") to see if permissions are correctly set.
  6. service elasticsearch start

The commands for the above are shown here.
mkdir /db/elasticsearch/horizon/nodes/0/indices
chown elasticsearch:users /db/elasticsearch/horizon/nodes/0/indices/
chmod 755 /db/elasticsearch/horizon/nodes/0/indices/

Once this is completed, verify that the folder was created with the correct ownership and permissions by running "ls -l -a" on the node folder.
ls -l -a /db/elasticsearch/horizon/nodes/0/
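
The visual check with ls can also be done programmatically. This is a minimal sketch, assuming GNU stat (the -c option) is available on the appliance, which this note does not confirm; check_dir is a hypothetical helper name. Mode 755 corresponds to the drwxr-xr-x listing shown above.

```shell
#!/bin/sh
# Compare a directory's owner, group and mode with the expected values.
# Uses GNU stat (-c); that is an assumption about the appliance's userland.
check_dir() {
  dir=$1; want_owner=$2; want_group=$3; want_mode=$4
  actual=$(stat -c '%U:%G %a' "$dir") || return 2
  if [ "$actual" = "$want_owner:$want_group $want_mode" ]; then
    echo "OK: $dir is $actual"
  else
    echo "MISMATCH: $dir is $actual, expected $want_owner:$want_group $want_mode" >&2
    return 1
  fi
}

# Example (left commented out), using the defaults listed above:
# check_dir /db/elasticsearch/horizon/nodes/0/indices elasticsearch users 755
```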

From that point, restart the elasticsearch service on each node one at a time.


Combined Commands for copy/paste into SSH.

ELASTICSEARCH NODE 0:

NOTE: STOP ELASTICSEARCH ON ALL WORKSPACE ONE ACCESS NODES FIRST
service elasticsearch stop

rm -rf /db/elasticsearch/horizon/nodes/0/indices/
mkdir /db/elasticsearch/horizon/nodes/0/indices
chown elasticsearch:users /db/elasticsearch/horizon/nodes/0/indices/
chmod 755 /db/elasticsearch/horizon/nodes/0/indices/
ls -l -a /db/elasticsearch/horizon/nodes/0/
service elasticsearch start

ELASTICSEARCH NODE 1:

NOTE: STOP ELASTICSEARCH ON ALL WORKSPACE ONE ACCESS NODES FIRST

service elasticsearch stop

rm -rf /db/elasticsearch/horizon/nodes/1/indices/
mkdir /db/elasticsearch/horizon/nodes/1/indices
chown elasticsearch:users /db/elasticsearch/horizon/nodes/1/indices/
chmod 755 /db/elasticsearch/horizon/nodes/1/indices/
ls -l -a /db/elasticsearch/horizon/nodes/1/
service elasticsearch start

BOTH ELASTICSEARCH NODES 0 AND 1:

NOTE: STOP ELASTICSEARCH ON ALL WORKSPACE ONE ACCESS NODES FIRST
service elasticsearch stop

rm -rf /db/elasticsearch/horizon/nodes/0/indices/
mkdir /db/elasticsearch/horizon/nodes/0/indices
chown elasticsearch:users /db/elasticsearch/horizon/nodes/0/indices/
chmod 755 /db/elasticsearch/horizon/nodes/0/indices/
rm -rf /db/elasticsearch/horizon/nodes/1/indices/
mkdir /db/elasticsearch/horizon/nodes/1/indices
chown elasticsearch:users /db/elasticsearch/horizon/nodes/1/indices/
chmod 755 /db/elasticsearch/horizon/nodes/1/indices/
ls -l -a /db/elasticsearch/horizon/nodes/0/
ls -l -a /db/elasticsearch/horizon/nodes/1/
service elasticsearch start

NOTE: You can copy/paste all of the commands into Mac OS Terminal, including carriage returns.
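
Because the blocks above differ only in the node number, the same command list can be generated for any node. This is a sketch for illustration: reset_commands is a hypothetical helper that only prints the commands for review and does not run them, and it uses mkdir, the standard Linux command for creating a directory.

```shell
#!/bin/sh
# Print (not run) the reset commands for one node number, for review.
reset_commands() {
  node=$1
  case "$node" in
    ''|*[!0-9]*) echo "node number must be numeric" >&2; return 1 ;;
  esac
  dir=/db/elasticsearch/horizon/nodes/$node
  cat <<EOF
service elasticsearch stop
rm -rf $dir/indices/
mkdir $dir/indices
chown elasticsearch:users $dir/indices/
chmod 755 $dir/indices/
ls -l -a $dir/
service elasticsearch start
EOF
}

# Example: reset_commands 1
```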



Just some optional notes…

Similar to the above: removing the Elasticsearch data and logs, purging the analytics queue, and triggering a reindex with a curl command instead of the steps above:
rm -rf /db/elasticsearch/horizon

rm -rf /opt/vmware/elasticsearch/logs

rabbitmqctl purge_queue -.analytics.127.0.0.1
curl 'http://localhost:9200/_cluster/health?pretty'
curl 'http://localhost:9200/_cluster/state/nodes,master_node?pretty'
. /usr/local/horizon/scripts/hzn-bin.inc && /usr/local/horizon/bin/curl -v -k -XPUT -H "Authorization:HZN <cookie>" -H "Content-Type: application/vnd.vmware.horizon.manager.systemconfigparameter+json" https://localhost/SAAS/jersey/manager/api/system/config/SearchCalculatorMode -d '{ "name": "SearchCalculatorMode", "values": { "values": ["REINDEX"] } }'