
Jira Data Center Node Status - Part 2

Atlassian, Jira

Back in July, I wrote about the node status saga.

One year and four months later, and after a major jump in versions, we finally have a self-service method to fix this problem.

Atlassian introduced an API for deleting old nodes!

The fix landed in 8.1.0, so this post is behind the times. To be fair (tongue in cheek), Atlassian addressed this issue and produced a solution within 5 years... that's pretty good, right? (thumbs down)

I'm writing this post in November and they put this fix out in July, so I suppose it must not have been terribly urgent.

Looking at this API, maybe there are useful features here! A GET on /rest/api/2/cluster/nodes returns something like this:


[
   {
      "nodeId":"\"ip-10-0-3-7\"",
      "state":"ACTIVE",
      "lastStateChangeTimestamp":1574648449656,
      "ip":"10.0.3.7",
      "cacheListenerPort":40001,
      "nodeBuildNumber":805001,
      "nodeVersion":"8.5.1",
      "alive":true
   }
]
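One detail worth noticing in that response is that "alive" is a JSON boolean, not a string, so a jq filter has to compare it against false rather than "false". Here is a small sketch of such a filter run over saved sample data. The helper name offline_nodes is hypothetical, and the second (offline) node in the sample is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the nodeId of every node that is OFFLINE and not alive.
# Reads the JSON array from stdin, so it can be fed by curl or by a saved file.
offline_nodes() {
  jq -r '.[] | select(.state == "OFFLINE" and .alive == false) | .nodeId'
}

# Sample input: the first node comes from the response above, the second is invented.
sample='[
  {"nodeId":"\"ip-10-0-3-7\"","state":"ACTIVE","lastStateChangeTimestamp":1574648449656,"alive":true},
  {"nodeId":"ip-10-0-3-9","state":"OFFLINE","lastStateChangeTimestamp":1574640000000,"alive":false}
]'

# Only the invented offline node should be listed.
printf '%s\n' "$sample" | offline_nodes
```

Against a live instance, the same function could be fed from the curl call instead of the sample variable.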

Since the response tells us when each node's state last changed, we can compare that timestamp with the current time and decide whether the node has been offline long enough to be removed.
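The comparison itself is just integer arithmetic on epoch milliseconds. A minimal sketch, where minutes_since and is_stale are hypothetical helper names and the 20-minute threshold is only an example value:

```shell
#!/bin/bash
# Hypothetical helper: whole minutes between two epoch-millisecond timestamps.
# $1 = lastStateChangeTimestamp from the API, $2 = "now" in milliseconds.
minutes_since() {
  echo $(( ($2 - $1) / 60000 ))
}

# Hypothetical helper: succeeds (exit 0) when the state change at $1 happened
# more than $3 minutes before $2.
is_stale() {
  [ "$(minutes_since "$1" "$2")" -gt "$3" ]
}

# Current time in epoch milliseconds; date +%s only gives whole seconds.
now_ms=$(( $(date +%s) * 1000 ))

# Timestamp taken from the sample response above.
if is_stale 1574648449656 "$now_ms" 20; then
  echo "state changed more than 20 minutes ago"
else
  echo "state changed within the last 20 minutes"
fi
```

Passing "now" in as an argument, rather than reading the clock inside the function, keeps the check easy to try out with fixed timestamps.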

#!/bin/bash
# Usage: <this script> <username> <password> <base-url>
USRNM=${1}
PSSWD=${2}
BASEURL=${3}
# Fetch the node list once and emit "nodeId timestamp" for each offline node.
# .alive is a JSON boolean, so it is compared with false, not the string "false".
curl -s --user "${USRNM}:${PSSWD}" "${BASEURL}/rest/api/2/cluster/nodes" \
  | jq -r '.[] | select(.state == "OFFLINE" and .alive == false) | "\(.nodeId) \(.lastStateChangeTimestamp)"' \
  | while read -r i lastStateChangeTimestamp
do
  # Current time in epoch milliseconds, the same unit the API uses.
  currentTime=$(( $(date +%s) * 1000 ))
  # Minutes since this particular node last changed state.
  timeDiff=$(( (currentTime - lastStateChangeTimestamp) / 60000 ))
  echo "Time diff = ${timeDiff} min"
  if [ "${timeDiff}" -gt 20 ]; then
    printf '\n\n Node ID: %s is being deleted because it is idle or invalid in the cluster, and changed to inactive more than 20 mins ago.' "${i}"
    # Use the same credentials as the GET above.
    curl -s -X DELETE --user "${USRNM}:${PSSWD}" "${BASEURL}/rest/api/2/cluster/node/${i}"
  else
    printf "\n\n Node ID: %s is inactive...let's give it a few more mins to re-join the cluster" "${i}"
  fi
done

Our clients who use Click2Clone on Data Center would benefit from a new job that periodically checks for this very thing and then removes the stale nodes!
