Time Navigator HA Cluster Agent Configuration
Posted: August 5th, 2010 | Author: rthomson | Filed under: Sysadmin | Tags: atempo, backup, linux, server, software, tina, unix
I’ve been wanting to post about a configuration that allows seamless, uninterrupted file-level backup of storage attached to an active/passive high availability cluster using Atempo’s Time Navigator, and I’m finally going to do it.
The Problem
The initial difficulty lies in the requirement that the data must be consistently backed up at every interval, no matter which cluster node is currently the active node with the backend storage mounted. To do this, an agent must be configured as a cluster resource so that it “follows” the mounting/exporting of the storage to any cluster node. That means N + 1 tina agents are required: if you have two cluster nodes, you need three agents, one local agent to back up each node plus one for the storage as it floats about the cluster nodes depending on failure or migration events.
Luckily for me, the good people at Atempo have engineered the agent in such a way that multiple agents can be run on a single node, each binding to its own IP address and each individually controlled via its own init script. Of course, we need to make some file edits to make all this happen and that’s what I’m going to share!
System Configuration
This configuration is based on CentOS 5.x and Time Navigator 4.2, but the concepts should be mostly portable to other popular Linux or UNIX distributions. The underlying cluster software used for the majority of my experience with this configuration is Heartbeat 2.1.3, right before the Pacemaker split, but it has also been tested more recently on Pacemaker 1.0 / Heartbeat 3.0.x. DRBD provides the active/passive filesystem where I’ve installed the cluster-aware Atempo Time Navigator agent, along with its state and configuration information. It is possible to install a second agent on each cluster node instead and configure each copy identically, but that just seems like more work. DRBD does a great job of making sure the latest cluster-aware tina agent is consistently configured and available on the active cluster node, no matter which node that actually is.
For the purpose of this post, I’m going to assume you already have a working Heartbeat/Pacemaker/DRBD configuration up and running with proper STONITH and all that good jazz. Maybe I’ll cover that some other time.
Installing and Configuring the Agent on DRBD
The first thing that needs to be done is to install the tina agent on a filesystem hosted on DRBD. I generally just copy the Linux-X64.tar or Linux-X86.tar Time Navigator installation archive over with scp, then decompress it and run the install script.
Assuming the dedicated (to this agent resource) DRBD filesystem is mounted as /cluster/tina on the active cluster node:
$ cd /cluster/tina
$ scp user@remote.fqdn:/path/to/Linux-X86.tar ./
$ tar -xf Linux-X86.tar
$ cd Linux-X86
$ ./install.sh
This will bring up the GUI installer. Alternatively use the batch install method, whatever works for you.
Set /cluster/tina as the installation directory and otherwise proceed normally as per site configuration. Unique ports do not need to be used for the second cluster agent, because this agent binds to a floating cluster resource IP address while the local agent binds to (one of) the server’s “real” IP address(es).
Once installed, there is one important edit to make in the tina agent environment configuration scripts named .tina.sh (sh/bash) and .tina.csh (csh/tcsh) located in the installation directory (/cluster/tina). The key change to make in the relevant script is to modify the value where the $TINA environment variable is being set. In .tina.sh that would be changing the line:
TINA=tina
to instead read something like this:
TINA=tina_ha
where tina_ha is a unique identifier for this instance of the agent. Basically, it needs to be anything BUT tina. This is one of two key components that had me tricked for a while. I had first tried modifying the $TINA_SERVICE_NAME environment variable, but that was a giant red herring: setting that variable to something other than tina does not produce the desired effect, despite what looking through the tina environment scripts and init scripts might have you believe.
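If you would rather script that edit than do it by hand, here is a minimal sketch. It assumes the assignment sits alone on its line exactly as TINA=tina (as shown above) and that the new name only contains letters, digits and underscores. The function name set_tina_instance is my own invention, not something shipped with Time Navigator, and the csh variant of the file still needs the equivalent edit by hand.

```shell
#!/bin/sh
# Sketch: give the DRBD-hosted agent its own instance name.
# set_tina_instance ENV_FILE NAME rewrites the line "TINA=tina" to "TINA=NAME".
# Assumption: the line reads exactly "TINA=tina" with nothing else on it.
set_tina_instance() {
    env_file=$1
    name=$2
    [ -f "$env_file" ] || { echo "missing $env_file" >&2; return 1; }
    # The whole point is to use anything BUT "tina".
    [ "$name" != "tina" ] || { echo "name must differ from tina" >&2; return 1; }
    # Refuse empty names and names that would confuse sed or the shell.
    case $name in
        ''|*[!A-Za-z0-9_]*) echo "invalid name: $name" >&2; return 1 ;;
    esac
    # Keep a backup, then rewrite from the backup (no reliance on sed -i).
    cp -p "$env_file" "$env_file.orig" || return 1
    sed "s/^TINA=tina\$/TINA=$name/" "$env_file.orig" > "$env_file"
}
```

Something like `set_tina_instance /cluster/tina/.tina.sh tina_ha` would then leave the original in /cluster/tina/.tina.sh.orig in case you need to back out.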
The second thing we must do is to create an LSB-compliant init script for the cluster-aware tina agent. The LSB compliance is very important to ensure the cluster can manage the resource properly. If any return codes are out of the LSB spec, the cluster will behave erratically and unpredictably when dealing with starting, stopping and monitoring the tina agent.
Since the installation creates a good init script for us, we can copy that script with a new name and edit it.
$ cp /etc/init.d/tina.tina /etc/init.d/tina.tina_ha
$ nano /etc/init.d/tina.tina_ha
First, replace every instance of the local agent’s install path with the cluster agent’s install path. A simple search (Ctrl-W) then replace (Ctrl-R) in nano should suffice.
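The copy and the path replacement can also be done in one pass with sed. This is only a sketch: retarget_init_script is a made-up helper name, and the local agent’s install path is whatever your site uses (it is passed in as a parameter because it is not something I can guess for you). It assumes neither path contains a “|” character or regular-expression metacharacters beyond a plain dot.

```shell
#!/bin/sh
# Sketch: copy an init script while swapping one install path for another.
# retarget_init_script SRC DST OLD_HOME NEW_HOME
# Assumption: neither path contains "|" (used here as the sed delimiter).
retarget_init_script() {
    src=$1
    dst=$2
    old_home=$3
    new_home=$4
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    [ -n "$old_home" ] && [ -n "$new_home" ] || { echo "paths required" >&2; return 1; }
    # Replace every occurrence of the old path, writing the result to DST.
    sed "s|$old_home|$new_home|g" "$src" > "$dst" || return 1
    # Init scripts must be executable for the cluster to run them.
    chmod 755 "$dst"
}
```

For example, `retarget_init_script /etc/init.d/tina.tina /etc/init.d/tina.tina_ha /path/to/local/tina /cluster/tina`, followed by a quick read through the result to confirm nothing still points at the local agent.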
Additionally, we need a small section at the top that will exit the script in case the DRBD filesystem is not mounted. The HA cluster will do resource status checks on all nodes in the cluster and we need the init script to be able to exit with a sane exit code, even if the DRBD filesystem is not accessible (as it is on all passive nodes). Something like this:
if [ -f /cluster/tina/.tina.sh ] ; then
    . /cluster/tina/.tina.sh > /dev/null 2>&1
else
    echo "Unable to start Time Navigator daemon"
    echo "because the \"/cluster/tina/.tina.sh\" file does not exist"
    retval=3
fi
In order to make the script LSB compliant, we need to ensure the correct exit status is returned during the correct operations. Instead of pointing out each specific place I had to edit in order for this to happen, I will simply post my entire “/etc/init.d/tina.tina_ha” init script:
#!/bin/sh
# UPDATED BY SETUP - BEGIN
########################################################
#WARNING :
#THIS FILE IS GENERATED AUTOMATICALLY
#AND WILL BE OVERWRITTEN WHEN UPGRADING
#YOUR VERSION OF Time Navigator PRODUCT
########################################################
PATH="$PATH:/bin:/usr/bin:/sbin:/usr/sbin:/etc:/usr/etc"
export PATH
if [ "${TINA_HOME:+$TINA_HOME}" != "" ] ; then
    if [ "/cluster/tina" != "$TINA_HOME" ] ; then
        echo "Unable to start Time Navigator daemon for \"/cluster/tina\""
        echo "because the Time Navigator environment is already set by \"$TINA_HOME\""
        retval=3
    fi
fi
if [ -f /cluster/tina/.tina.sh ] ; then
    . /cluster/tina/.tina.sh > /dev/null 2>&1
else
    echo "Unable to start Time Navigator daemon"
    echo "because the \"/cluster/tina/.tina.sh\" file does not exist"
    retval=3
fi
# UPDATED BY SETUP - END
# @(#) $Id: rc.tina.orig,v 1.1.6.10.4.4.2.4 2007/09/20 16:26:50 dle Exp $
#
# Time Navigator startup script
# (C) 1999-2005 - Atempo
# tina_daemon starting...
#
OS_TYPE=`uname -s`
if echo "\c" | grep "c">/dev/null ; then
    ECHOMODE=Bsd
else
    ECHOMODE=Sys5
fi
ECHONOCR() {
    if [ "$ECHOMODE" = Bsd ] ; then
        echo -n "$*"
    else
        echo "$*\c"
    fi
}
PING() {
    os_type=`uname -s`
    case $os_type in
        HP-UX) result=`ping $1 -n 2 2>/dev/null`; return $?;;
        *) result=`ping -c 2 $1 2>/dev/null`; return $?;;
    esac
}
ISREDHATLIKE=1
# Source function library.
if [ -f /etc/init.d/functions ] ; then
    . /etc/init.d/functions
elif [ -f /etc/rc.d/init.d/functions ] ; then
    . /etc/rc.d/init.d/functions
else
    ISREDHATLIKE=0
fi
ISSUSE=1
if [ -f /etc/rc.status ] ; then
    . /etc/rc.status
else
    ISSUSE=0
fi
RCStart()
{
    if [ -x ${TINA_HOME}/Bin/ndmpd ] ; then
        echo "Starting NDMP Data Server..."
        ${TINA_HOME}/Bin/ndmpd
    elif [ -x ${TINA_HOME}/Bin/tina_nts ] ; then
        echo "Starting NDMP Tape Server..."
        ${TINA_HOME}/Bin/tina_nts
    fi
    TINA_DAEMON=$TINA_HOME/Bin/tina_daemon
    if [ -x "$TINA_DAEMON" ]; then
        ECHONOCR "Starting Time Navigator ($TINA_SERVICE_NAME)..."
        if [ -d /var/lock/subsys ] ; then
            touch /var/lock/subsys/tina.$TINA_SERVICE_NAME
        fi
        i=1
        while [ $i -le 60 ] ; do
            if [ $OS_TYPE = "Darwin" ] ; then
                echo `date` "Trying to start tina_daemon ($TINA_SERVICE_NAME) daemon" >> /var/log/system.log
            fi
            echo `date` "Trying to start tina_daemon ($TINA_SERVICE_NAME) daemon $i" >> ${TINA_HOME}/Adm/auto_start.log
            hostname=`hostname 2>/dev/null`
            if [ ! -z "$hostname" ] ; then
                echo `date` "Trying to start tina_daemon ($TINA_SERVICE_NAME) daemon: hostname $hostname is defined" >> ${TINA_HOME}/Adm/auto_start.log
                PING $hostname
                status=$?
                if [ $status -eq 0 ] ; then
                    echo `date` "Trying to start tina_daemon ($TINA_SERVICE_NAME) daemon: ping $hostname is ok" >> ${TINA_HOME}/Adm/auto_start.log
                    $TINA_DAEMON
                    sleep 2
                    RCStatus no_mess
                    if [ ! -z "$is_running" ] ; then
                        if [ $OS_TYPE = "Darwin" ] ; then
                            echo `date` "tina_daemon ($TINA_SERVICE_NAME) daemon is started" >> /var/log/system.log
                        fi
                        echo `date` "tina_daemon ($TINA_SERVICE_NAME) daemon is started" >> ${TINA_HOME}/Adm/auto_start.log
                        break
                    else
                        echo `date` "tina_daemon ($TINA_SERVICE_NAME) daemon is not started" >> ${TINA_HOME}/Adm/auto_start.log
                    fi
                else
                    echo `date` "Trying to start tina_daemon ($TINA_SERVICE_NAME) daemon: ping $hostname is ko" >> ${TINA_HOME}/Adm/auto_start.log
                fi
            else
                echo `date` "Trying to start tina_daemon ($TINA_SERVICE_NAME) daemon: hostname is not defined" >> ${TINA_HOME}/Adm/auto_start.log
            fi
            sleep 5
            i=`expr $i + 1`
        done
        if [ $ISREDHATLIKE -eq 1 ]; then
            echo_success
            echo
        elif [ $ISSUSE -eq 1 ]; then
            rc_status -v
        else
            echo
        fi
        # Start ACSLS daemons (mini_el and ssi)
        if [ -d "$TINA_HOME/Vtl" ] ; then
            for VL_path in $TINA_HOME/Vtl/*
            do
                [ ! -d $VL_path ] && continue
                VL_name=`basename $VL_path`
                if [ $VL_name = "Install" -o $VL_name = "Bin" -o $VL_name = "Log" -o $VL_name = "Tmp" ] ; then
                    continue
                fi
                # If there is no tina_stk.conf, give up
                [ ! -f "$VL_path/tina_stk.conf" ] && continue
                [ ! -x "$TINA_HOME/Vtl/Bin/ACSLS/start.sh" ] && continue
                ECHONOCR "Starting ACSLS client daemon for $VL_name virtual library ..."
                $TINA_HOME/Vtl/Bin/ACSLS/start.sh $VL_name
                echo
            done
        fi
    elif [ ! -f ${TINA_HOME}/.ndmp.sh ] ; then
        if [ $ISREDHATLIKE -eq 1 ]; then
            ECHONOCR "Starting Time Navigator (${TINA_SERVICE_NAME})..."
            echo_failure
            echo
        elif [ $ISSUSE -eq 1 ]; then
            rc_failed 1
        else
            echo
        fi
    fi
}
RCStop()
{
    #Stop ndmp daemon
    NDMPDAEMON=""
    if [ -x ${TINA_HOME}/Bin/ndmpd ] ; then
        NDMPDAEMON="ndmpd"
    elif [ -x ${TINA_HOME}/Bin/tina_nts ] ; then
        NDMPDAEMON="tina_nts"
    fi
    if [ ! -z "$NDMPDAEMON" ] ; then
        file="/var/tmp/$NDMPDAEMON.pid"
        if [ -f $file ] ; then
            if [ "$NDMPDAEMON" = ndmpd ] ; then
                echo "Shutting down NDMP Data Server..."
            elif [ "$NDMPDAEMON" = tina_nts ] ; then
                echo "Shutting down NDMP Tape Server..."
            fi
            kill `cat $file`
        fi
    fi
    #Stop Time Navigator daemon
    if [ -x ${TINA_HOME}/Bin/tina_stop ]; then
        if [ -d /var/lock/subsys ] ; then
            rm -f /var/lock/subsys/tina.$TINA_SERVICE_NAME
        fi
        ECHONOCR "Shutting down Time Navigator ($TINA_SERVICE_NAME)..."
        if [ $OS_TYPE = "Darwin" ] ; then
            echo `date` "Stopping tina_daemon ($TINA_SERVICE_NAME) daemon" >> /var/log/system.log
        fi
        echo `date` "Stopping tina_daemon ($TINA_SERVICE_NAME) daemon" >> ${TINA_HOME}/Adm/auto_start.log
        $TINA_HOME/Bin/tina_stop > /dev/null
        retval=0
        if [ $ISREDHATLIKE -eq 1 ]; then
            echo_success
            echo
        elif [ $ISSUSE -eq 1 ]; then
            rc_status -v
        else
            echo
        fi
    elif [ ! -f ${TINA_HOME}/.ndmp.sh ] ; then
        if [ $ISREDHATLIKE -eq 1 ]; then
            echo "Shutting down Time Navigator ($TINA_SERVICE_NAME)..."
            echo_failure
            echo
        elif [ $ISSUSE -eq 1 ]; then
            rc_failed 1
        else
            echo
        fi
    fi
}
RCStatus()
{
    ## Check status with checkproc(8), if process is running
    ## checkproc will return with exit status 0.
    # Exit codes have a slightly different meaning for the status command:
    # 0 - service running
    # 1 - service dead, but /var/run/ pid file exists
    # 2 - service dead, but /var/lock/ lock file exists
    # 3 - service not running
    if [ -f $TINA_HOME/Conf/hosts ] ; then
        host_to_ping=`cat $TINA_HOME/Conf/hosts | grep ^localhostname | awk '{print $2}' 2>/dev/null`
        if [ $? != 0 -o -z "$host_to_ping" ] ; then
            host_to_ping="127.0.0.1"
        fi
    else
        host_to_ping="127.0.0.1"
    fi
    is_running=`$TINA_HOME/Bin/tina_ping -host $host_to_ping -language English | grep "is running"`
    if [ $# -eq 0 ] ; then
        ECHONOCR "Checking for Time Navigator ($TINA_SERVICE_NAME): "
        if [ $OS_TYPE = "Darwin" ] ; then
            echo `date` "Checking tina_daemon ($TINA_SERVICE_NAME) daemon" >> /var/log/system.log
        fi
        echo `date` "Checking tina_daemon ($TINA_SERVICE_NAME) daemon" >> ${TINA_HOME}/Adm/auto_start.log
        if [ ! -z "$is_running" ] ; then
            echo "tina_daemon is running"
            echo `date` "Checking tina_daemon ($TINA_SERVICE_NAME) daemon: tina_daemon is running" >> ${TINA_HOME}/Adm/auto_start.log
            retval=0
        else
            echo "tina_daemon is stopped"
            echo `date` "Checking tina_daemon ($TINA_SERVICE_NAME) daemon: tina_daemon is stopped" >> ${TINA_HOME}/Adm/auto_start.log
            retval=3
        fi
    fi
}
test "$ISSUSE" -eq 1 && rc_reset
case "$1" in
    start)
        RCStart
        retval=0
        ;;
    stop)
        RCStop
        retval=0
        ;;
    start_msg)
        echo "Starting Time Navigator ($TINA_SERVICE_NAME)" ;;
    stop_msg)
        echo "Shutting down Time Navigator ($TINA_SERVICE_NAME)" ;;
    restart)
        RCStop
        sleep 3
        RCStart
        # Mirror "start" so restart does not exit with a stale retval
        retval=0
        ;;
    status)
        RCStatus ;;
    *)
        echo "usage: $0 {start|stop|restart|status}"
        # LSB: exit status 2 means invalid or excess argument(s)
        retval=2
        ;;
esac
exit $retval
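Before handing an init script like this to the cluster, it is worth checking that the exit codes really behave the way an LSB resource expects. Below is a rough sketch of the usual sequence: start returns 0, status while running returns 0, start while already running returns 0, stop returns 0, status while stopped returns 3, and stop while already stopped returns 0. The helper names expect_rc and check_lsb are my own, and be warned that running this really does start and stop the service, so only try it on the active node during a quiet moment.

```shell
#!/bin/sh
# Sketch: check that an init script returns LSB-style exit codes.

# expect_rc WANT CMD... : run CMD quietly and compare its exit status to WANT.
expect_rc() {
    want=$1
    shift
    "$@" > /dev/null 2>&1
    got=$?
    if [ "$got" -eq "$want" ]; then
        echo "ok   (rc=$got): $*"
    else
        echo "FAIL (rc=$got, wanted $want): $*"
        return 1
    fi
}

# check_lsb SCRIPT : walk the start/status/stop sequence and count failures.
# WARNING: this really starts and stops the service behind SCRIPT.
check_lsb() {
    script=$1
    failures=0
    expect_rc 0 "$script" start  || failures=$((failures + 1))
    expect_rc 0 "$script" status || failures=$((failures + 1))
    # Starting an already running service must still succeed.
    expect_rc 0 "$script" start  || failures=$((failures + 1))
    expect_rc 0 "$script" stop   || failures=$((failures + 1))
    # A cleanly stopped service must report 3, not 0 or 1.
    expect_rc 3 "$script" status || failures=$((failures + 1))
    # Stopping an already stopped service must still succeed.
    expect_rc 0 "$script" stop   || failures=$((failures + 1))
    [ "$failures" -eq 0 ]
}
```

Running `check_lsb /etc/init.d/tina.tina_ha` prints one ok/FAIL line per step and returns non-zero if any step misbehaved. On a passive node, where the DRBD filesystem is not mounted, the only thing that matters is that status comes back with 3.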
One final Time Navigator configuration change must be made. The tina agent “hosts” file must be configured to set the “localhostname” of our agent to the FQDN of the floating or virtual IP address service so that the agent will only try to bind to that IP address instead of all IP addresses on the system.
$ cd /cluster/tina/Conf
$ cp hosts.sample hosts
$ nano hosts
Add a line to the file specifying the “localhostname” like so:
localhostname myserver.company.com
For this to work properly, every other tina agent running on the cluster nodes must also have a “localhostname” set in its own “hosts” file. Otherwise those host-based agents will bind to all IP addresses on the host, including the virtual IP address.
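Since the same edit has to be repeated for every agent on every node, a small helper can save some typing. This is just a sketch under the assumption that the “hosts” file takes one “localhostname FQDN” line as shown above. The function name set_localhostname is hypothetical and not part of Time Navigator.

```shell
#!/bin/sh
# Sketch: make sure a tina "hosts" file has exactly one localhostname line.
# set_localhostname HOSTS_FILE FQDN
set_localhostname() {
    hosts_file=$1
    fqdn=$2
    [ -n "$hosts_file" ] && [ -n "$fqdn" ] || {
        echo "usage: set_localhostname HOSTS_FILE FQDN" >&2
        return 1
    }
    tmp_file="$hosts_file.tmp.$$"
    if [ -f "$hosts_file" ]; then
        # Drop any existing localhostname line, keep everything else.
        grep -v '^localhostname' "$hosts_file" > "$tmp_file"
    else
        : > "$tmp_file"
    fi
    echo "localhostname $fqdn" >> "$tmp_file"
    mv "$tmp_file" "$hosts_file"
}
```

For the cluster agent that would be `set_localhostname /cluster/tina/Conf/hosts myserver.company.com`, and for each local agent the same call pointed at that agent’s own Conf/hosts file with the node’s real FQDN.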
That’s it! The tina service can be added to the HA cluster as an LSB resource agent, grouped with your storage resource agents so it will always be running on the same node as your storage.
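For the Pacemaker crm shell, the resource definition might look something like what the sketch below prints. Every resource and group name in it is a placeholder I made up, the monitor interval and timeout are arbitrary, and it assumes your storage resources are not already in a group (if they are, add the agent to the end of that group instead of creating a new one). The function only prints the configuration so you can review it before pasting it into a `crm configure` session.

```shell
#!/bin/sh
# Sketch: print crm shell configuration for the cluster-aware tina agent.
# print_tina_crm_config RESOURCE_ID INIT_SCRIPT GROUP_ID MEMBER...
# MEMBER... are the ids of your existing storage/IP resources (placeholders).
print_tina_crm_config() {
    if [ $# -lt 4 ]; then
        echo "usage: print_tina_crm_config RESOURCE_ID INIT_SCRIPT GROUP_ID MEMBER..." >&2
        return 1
    fi
    rsc_id=$1
    init_script=$2
    group_id=$3
    shift 3
    # The lsb: class refers to a script name under /etc/init.d.
    echo "primitive $rsc_id lsb:$init_script op monitor interval=\"60s\" timeout=\"30s\""
    # Group members start in order on one node, so the agent goes last,
    # after the filesystem and the floating IP address are up.
    echo "group $group_id $* $rsc_id"
}
```

For example, `print_tina_crm_config p_tina tina.tina_ha g_storage p_fs_data p_ip_float` prints a primitive line for lsb:tina.tina_ha and a group line that lists the agent after the storage and IP resources.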
Conclusion
Ok, so I rushed the end. Big deal. Sue me. I doubt anyone cares anyways!