Rob Rati and I gave a tutorial on highly-available job queues at Condor Week this year. While it was not a Wallaby-specific tutorial, we did point out that configuring highly-available job queues is easier for users who manage and deploy their configurations with Wallaby; compare the manual and automated approaches in the following slides:

Configuring highly-available central managers (HA CMs) is rather more involved than configuring highly-available job queues. Here’s what a successful HA CM setup requires:

• all hosts that serve as candidate central managers (CMs) must be included in the CONDOR_HOST variable across the pool
• the had and replication daemons must be set up to run on candidate CMs
• the HAD_LIST and REPLICATION_LIST configuration variables must include a list of candidate CMs and the ports on which the had and replication daemons are running on these hosts
• various tunable settings related to shared-state and failure detection must be set

Wallaby includes HACentralManager, a ready-to-install feature that has sensible defaults for setting up a candidate CM. The tedious work of constructing lists of hostnames and ports – and ensuring that these are set everywhere that they must be – can take great advantage of Wallaby’s scriptability. At the bottom of this post is a simple Wallaby shell command that sets up a highly-available central manager with several nodes serving as candidate CMs. To use it, download it and place it in your WALLABY_COMMAND_DIR (review how to install Wallaby shell command extensions if necessary). Then invoke it with

wallaby setup-ha-cms fred barney wilma betty


The above invocation will set up fred, barney, wilma, and betty as candidate CMs, place the candidate CMs in the PotentialCMs group (creating this group if necessary), and configure Wallaby’s default group to use the highly-available CM cluster. (The setup-ha-cms command takes options to put candidate CMs in a different group or apply this configuration to some subset of the pool; invoke it with --help for more information.)

Once you’ve set up your candidate CMs, be sure to activate the new configuration:

wallaby activate


Of course, wallaby activate will alert you to any problems that prevent your configuration from taking effect. Correct any errors that come up and activate again, if necessary. The setup-ha-cms command is a pretty simple example of automating configuration, but it saves a lot of repetitive and error-prone effort!

UPDATE: The command will now remove all nodes from the candidate CM group before adding any nodes to it. This ensures that if the command is run multiple times with different candidate CM node sets, only the most recent set will receive the candidate CM configuration. (The command as initially posted would apply the candidate CM configuration to every node that was in the candidate CM group at invocation time, but only those nodes that were named in its most recent invocation would actually become candidate CMs.) Thanks to Rob Rati for the observation.