System Tools

This section discusses some system tools that are closely related to the typical properties of the Gigabit networks described in the introduction.

Check Link Up/Down Daemon for Debian Linux

Introduction

Hosts running Linux may crash when a Gigabit link goes DOWN while traffic is still running. This has been observed especially when ICMP packets are sent continuously. The probable cause is a buffer overflow in the AceNIC device driver (a commonly used driver in Linux). The problem has only been observed with Linux V. 2.4.X kernels, not with V. 2.2.X kernels.

Please note that some Lambda-oriented network equipment brings all links DOWN when the Lambda connection becomes unavailable. Host-oriented network equipment generally keeps the other links UP when the external connection becomes unavailable. In normal production situations the software described in this document is therefore generally unnecessary: it is intended for Lambda-oriented networks of this kind.

Description

This package contains a Perl daemon which brings an interface down when the link goes DOWN. When the link is UP again, the interface is brought up again as well. The current UP/DOWN state of the link is read periodically from the system log. By default the check period when the link is DOWN is much longer than when the link is UP. This protects the hosts against rapid DOWN - UP - DOWN - ... sequences.

When the link is DOWN, by default the interface is brought up to force Link UP log messages, which are then looked for in the system log after a short period. However, the system often generates these log messages by itself after the link has come UP. Therefore, with the -if_up_when_link option, actively bringing the interface up can be switched off.

Each up/down interface action of the daemon is logged in the (circular) log files located in the Log subdirectory. Optionally, periodic marks containing a time stamp and the current state of the link can also be printed.

The daemon has been written for Debian Linux (V. 3.0) with V. 2.4.X kernels. It is assumed that the link UP/DOWN state is logged in the system log. By default the log files /var/log/messages* are checked, sorted by modification time. The log files may be gzip compressed. The interfaces are brought up/down with the ifup/ifdown commands, which are available on Debian Linux.
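The log-based check described above can be sketched as follows. This is only an illustration in Python, not the Perl code of the daemon itself. The message format matched by link_regex() (an interface name followed by text ending in "link UP" or "link DOWN") and the period values in check_period() are assumptions; the actual driver messages and daemon defaults may differ.

```python
import glob
import gzip
import os
import re


def link_regex(ifname):
    """Build a pattern for link messages of one interface (assumed format)."""
    return re.compile(r"\b" + re.escape(ifname) + r": .*\blink (UP|DOWN)\b",
                      re.IGNORECASE)


def open_log(path):
    """Open a plain or gzip-compressed log file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")


def latest_link_state(pattern, ifname):
    """Return 'UP', 'DOWN', or None when no link message is found."""
    regex = link_regex(ifname)
    # Newest file first: the first file containing a match decides.
    paths = sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)
    for path in paths:
        state = None
        with open_log(path) as log:
            for line in log:
                match = regex.search(line)
                if match:
                    state = match.group(1).upper()  # last match in the file wins
        if state is not None:
            return state
    return None


def check_period(state, up_period=5, down_period=60):
    """Seconds until the next check; both values are hypothetical."""
    return down_period if state == "DOWN" else up_period
```

A call such as latest_link_state("/var/log/messages*", "eth0") would then give the most recently logged state, after which a daemon could decide whether to run ifdown or ifup.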

A usage message will be given with the command line:

./Bin/check_link_if_daemon -help

Installation

Because this package is only useful in special test situations, no real system installation procedure is provided. Use the following steps to install this package.

Unpack the tar-gzip archive that contains this package in a suitable directory. By default the home directory of the super user, /root, is assumed.
Check the Perl path, specified in the first line of the daemon script ./Bin/check_link_if_daemon after the characters #!.
Copy the startup template ./Init.d/check_link_if_daemon to the appropriate startup directory (often /etc/init.d). When needed, edit the following variables:

ROOT_DIR

Contains the root directory where this package has been installed. The default value is: /root/Check_Link_IF.

DAEMON_OPTS

Contains the daemon options to be used. The command line ./Bin/check_link_if_daemon -help displays the default option values. The default value of this variable is: -if_up_when_link -mark 1200, with:

-if_up_when_link

Only bring the interface up when a link is available again, and do not actively test whether the link is UP (see also the "Description" section above).

-mark 1200

Print a mark in the log every 20 minutes when there has been no output.
Finally, the appropriate startup links for the various run levels should be created. See man init for more information.

Download

The check link daemon tar-gzip archive can be downloaded from this site.

Periodic Ping Commands between a Cluster of Hosts

Description

This package can be used to execute ping commands between a cluster of hosts. At periodic intervals one of the hosts sends a number of ICMP packets (the default is one) to all other hosts in the cluster. The order in which the hosts take their turn is determined by the alphabetical order of the hostnames in the cluster.
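The turn-taking can be illustrated with a minimal Python sketch. The function names and the wrap-around by period index are assumptions made for this illustration; the Perl server may schedule the turns differently.

```python
def host_on_turn(hostnames, period_index):
    """Return the host whose turn it is in the given period (0-based).

    The hosts take turns in alphabetical order; wrapping around after the
    last host is an assumption of this sketch.
    """
    order = sorted(hostnames)
    return order[period_index % len(order)]


def ping_targets(hostnames, sender):
    """Return all cluster hosts except the sender, in alphabetical order."""
    return [host for host in sorted(hostnames) if host != sender]
```

With the hypothetical cluster ["gig3", "gig1", "gig2"], host gig1 sends in period 0, gig2 in period 1, gig3 in period 2, and gig1 again in period 3.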

The main reason for developing this package is that when only UDP traffic is sent to a destination, the destination does not need to answer with any packet. A switch or bridge in between may then forget in its table to which port the destination is connected, because it does not see any packets coming back, and so it goes into flooding mode. If a single ping is run to that destination through the switch, the switch learns the port again and the flooding will not be seen for a few minutes. It will never be seen with TCP, because the ACKs keep the bridge relearning.

Hence the idea to periodically send ICMP packets between a cluster of hosts that are typically situated in the same VLAN and connected to the same switch or bridge.
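The effect described above can be reproduced with a toy model of a learning bridge. This is a simplified Python sketch, not a description of any particular switch: the class name, the port numbers and the ageing time of 300 seconds are assumptions chosen for the illustration.

```python
class LearningSwitch:
    """Toy model of a learning bridge with an ageing address table."""

    def __init__(self, ageing_time=300.0):
        self.ageing_time = ageing_time  # seconds; assumed value
        self.table = {}  # address -> (port, time last seen as a source)

    def forward(self, now, src, dst, in_port):
        """Learn the source address, then return the output port or 'flood'."""
        self.table[src] = (in_port, now)
        entry = self.table.get(dst)
        if entry is None or now - entry[1] > self.ageing_time:
            # Unknown or aged-out destination: send out of all ports.
            self.table.pop(dst, None)
            return "flood"
        return entry[0]
```

In this model a host B that only receives UDP never appears as a source, so its entry ages out and frames to B are flooded. A single ping reply from B refreshes the entry, and frames to B are sent to one port again until the entry ages out once more.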

The main script in this package is a Perl server script that is responsible for the periodic ping commands to the other hosts in the cluster. Parameters can be supplied to this script with program arguments and/or a configuration file. With these parameters a group of cluster hosts is defined by one or more Perl regular expressions that are associated with a unique cluster name identifying that cluster.

After startup the server script first tries to identify the cluster(s) to which its host belongs. For each of these clusters a separate process is spawned to ping the other hosts in the cluster. The other member hosts of the cluster are found by parsing the static hosts file /etc/hosts. Optionally an alternative hosts file may be specified.

In general the cluster hosts are defined by Perl regular expressions that are specified in combination with a cluster name. The member hosts of a cluster are determined by parsing the static hosts file /etc/hosts, or an alternative hosts file specified in the program parameters. When one or more hosts do not respond to the ping packets for some time, the ping periods and the corresponding pinging order are temporarily adjusted such that no holes are introduced in the periodic ping pattern.
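Selecting the member hosts from a hosts file can be sketched in Python as follows. The function name and the choice to take only the first matching name on each line (so that a host with aliases is listed once) are assumptions of this sketch, and Python regular expressions stand in for the Perl ones.

```python
import re


def cluster_members(hosts_text, cluster_res):
    """Return the sorted host names that match any of the cluster regexes."""
    patterns = [re.compile(regex) for regex in cluster_res]
    members = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on blanks
        for name in fields[1:]:                 # fields[0] is the IP address
            if any(pattern.search(name) for pattern in patterns):
                members.add(name)
                break                           # one name per host line
    return sorted(members)
```

The text of /etc/hosts, or of an alternative hosts file, would be passed as hosts_text, together with the regular expressions that belong to one cluster name.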

Requirements

For non-Linux hosts the NIKHEF ping is the required ping command. It can be downloaded from ftp://ftp.nikhef.nl/pub/network/ping.tar.Z and should be installed in its default path: /usr/local/etc/ping.

Installation

This package can be installed with the following steps.

It might be convenient to create a special user to run the periodic ping commands.
Unpack this archive using the command gunzip -c Period_Clust_Ping.tar.gz | tar xfv - in the directory where this package should be installed. No further file installation procedure is needed.
Check the Perl server for a usage message by running Bin/period_clust_ping. A program option -Param Value can be replaced by a corresponding line in the configuration file: Param = Value.
Define the host cluster(s) by adjusting the lines cluster = ClusterRE ClusterName in the configuration file Config/period_clust_ping.conf. Multiple ClusterRE Perl regular expressions may be specified with the same ClusterName.
Put all hosts participating in the periodic ping cluster in the system's static hosts file /etc/hosts. When you do not have root privileges, you may define your own hosts file in Hosts/hosts. That file will be recognised by the Bin/check_period_clust_ping startup script (see below).
The file Crontab/crontab_input can be used to check periodically whether the ping server is running. When required, replace $HOME with the correct subdirectory in which this package resides.
Start the server by executing Bin/check_period_clust_ping. Log messages will appear in the circular log files Log/period_clust_ping_Hostname[.ID].log[.old], with:
- The extension .ID is only used when there is more than one cluster defined.
- The extension .old is used for the circular backup of the log file.
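The configuration line formats mentioned in the steps above (Param = Value and cluster = ClusterRE ClusterName) can be read with a small parser such as the Python sketch below. It is not the parser of the Perl server. Treating lines that start with # as comments, and taking the last blank-separated word of a cluster line as the ClusterName, are assumptions.

```python
def parse_config(text):
    """Return (params, clusters) parsed from configuration file text.

    params maps Param to Value; clusters maps each ClusterName to the list
    of ClusterRE strings specified for it.
    """
    params = {}
    clusters = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # blank line, assumed comment syntax, or no assignment
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "cluster":
            fields = value.rsplit(None, 1)  # last word is the ClusterName
            if len(fields) != 2:
                continue  # malformed cluster line: skip it
            cluster_re, name = fields
            clusters.setdefault(name, []).append(cluster_re)
        else:
            params[key] = value
    return params, clusters
```

Several cluster lines with the same ClusterName accumulate in one list, which corresponds to specifying multiple ClusterRE expressions for one cluster.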

Remark:

Please note that the configuration file is reread when it has been modified. However, when multiple clusters are defined, this feature will probably not work well. In that case it is better to restart the server (see above).