
5.6.3 Corosync

Role of Corosync in a Cluster Stack

Corosync is the messaging and membership layer commonly used in Linux HA clusters, especially together with Pacemaker. In the overall stack:

  • Corosync provides cluster membership (which nodes are alive), reliable ordered messaging between nodes, and quorum information.
  • Pacemaker sits on top as the resource manager, deciding where services run based on the membership and quorum data Corosync supplies.

You normally do not run Pacemaker without Corosync in a classic “Corosync + Pacemaker” stack; they are tightly integrated but are separate projects and daemons.

Corosync Architecture Overview

Key architectural concepts specific to Corosync:

  • The Totem protocol: a token-passing protocol providing reliable, ordered message delivery and fast failure detection.
  • Closed Process Groups (CPG): the messaging API that cluster-aware applications use to exchange messages.
  • votequorum: the service that tracks votes and decides whether the cluster is quorate.
  • cmap: an in-memory key-value database holding configuration and runtime state, inspectable with corosync-cmapctl.

In practice you will mostly interact with the configuration file, daemons, and corosync-* tools; the internal APIs are used by cluster-aware applications (like Pacemaker) rather than directly by admins.

Installing and Enabling Corosync

Installation is distribution-specific but follows the same pattern.

Examples:

  # RHEL / Fedora / CentOS Stream
  sudo dnf install corosync corosync-qdevice
  sudo systemctl enable --now corosync

  # Debian / Ubuntu
  sudo apt install corosync corosync-qdevice
  sudo systemctl enable --now corosync

  # openSUSE / SLES
  sudo zypper install corosync corosync-qdevice
  sudo systemctl enable --now corosync

On many “HA cluster” stacks, additional packages (such as Pacemaker) are installed at the same time; this chapter focuses on the Corosync side only.
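
After installation you can confirm which major version you are running, which matters because defaults (notably the transport) differ between Corosync 2.x and 3.x:

  corosync -v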

Corosync Configuration File (`corosync.conf`)

The main configuration file is usually:

/etc/corosync/corosync.conf

It is cluster-wide: every node in the same cluster must have an identical corosync.conf. Even node-specific values, such as each node's ringX_addr, appear the same way in every copy, because the nodelist describes all nodes rather than just the local one.

A minimal configuration using the udpu (unicast UDP) transport might look like this (Corosync 3.x defaults to the newer knet transport, but udpu remains common and simple):

totem {
    version: 2
    secauth: on
    cluster_name: mycluster
    transport: udpu
}
nodelist {
    node {
        ring0_addr: node1.example.com
        nodeid: 1
    }
    node {
        ring0_addr: node2.example.com
        nodeid: 2
    }
}
quorum {
    provider: corosync_votequorum
}
logging {
    to_syslog: yes
    to_stderr: no
    logfile: /var/log/corosync/corosync.log
    timestamp: on
}

You will usually copy and adapt such a file across all nodes.

`totem` Section

The totem block configures the messaging protocol and cluster basics. Common parameters:

  • version: protocol version; must be 2.
  • cluster_name: a name shared by all nodes of the cluster.
  • secauth: enables authentication (and with it, encryption) of cluster traffic using the shared authkey.
  • transport: the network transport, e.g. udpu (unicast UDP), udp (multicast), or knet on Corosync 3.x.
  • token: token timeout in milliseconds, i.e. how long Corosync waits before declaring a token lost.
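
As a sketch (the values are illustrative, not recommendations), two timing knobs often tuned on lossy networks are token and token_retransmits_before_loss_const:

totem {
    version: 2
    cluster_name: mycluster
    transport: udpu
    token: 3000                               # wait 3000 ms before declaring the token lost
    token_retransmits_before_loss_const: 10   # retransmit attempts before giving up
}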

Multiple Rings in `totem`

To use more than one network path, you can configure multiple rings, e.g.:

totem {
    version: 2
    secauth: on
    cluster_name: mycluster
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.10.0
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 10.10.10.0
        mcastport: 5407
    }
}

Corosync can automatically fail over to another ring if one network path fails, improving cluster robustness.
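
Note that with the older udp/udpu transports (Corosync 2.x), redundant rings use the Redundant Ring Protocol (RRP) and additionally require an rrp_mode setting in totem; with the knet transport (Corosync 3.x), multiple links are handled natively and no rrp_mode is needed. A minimal sketch for the udpu case:

totem {
    # ... other totem settings as in the example above ...
    rrp_mode: passive   # "passive" uses one ring at a time; "active" sends on all rings simultaneously
}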

`nodelist` Section

The nodelist defines all nodes in the cluster:

nodelist {
    node {
        name: node1
        ring0_addr: 192.168.10.11
        ring1_addr: 10.10.10.11
        nodeid: 1
    }
    node {
        name: node2
        ring0_addr: 192.168.10.12
        ring1_addr: 10.10.10.12
        nodeid: 2
    }
}

Key elements:

  • name: a human-readable node name (often the hostname), used by higher layers such as Pacemaker.
  • ring0_addr / ring1_addr: the node's address on each ring; IP addresses are generally preferred over DNS names to avoid a resolver dependency.
  • nodeid: a unique integer identifying the node within the cluster.

Corosync does not use automatic discovery in typical Pacemaker setups; you list all nodes explicitly in nodelist.
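
To confirm which nodes Corosync actually loaded from the nodelist, you can filter the cmap database (the exact key layout may vary slightly between versions):

  corosync-cmapctl | grep '^nodelist'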

`quorum` Section

Quorum is the mechanism that ensures only a partition of nodes holding a majority of the votes can run cluster resources, preventing split-brain situations.

Quorum in Corosync is typically handled by the votequorum service:

quorum {
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1
}

Key parameters:

  • provider: corosync_votequorum selects the standard vote-based quorum service.
  • expected_votes: the total number of votes the full cluster should have (normally the node count, one vote per node).
  • two_node: 1 enables the special two-node mode, in which a single surviving node can remain quorate (and which implicitly enables wait_for_all).

You will typically not implement detailed quorum policy here; reacting to quorum changes (for example, stopping resources when quorum is lost) is handled by Pacemaker, which uses the quorum information Corosync provides.
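
To see the votequorum values actually in effect at runtime, the cmap database can again be filtered (key names may differ slightly between versions):

  corosync-cmapctl | grep votequorum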

`logging` Section

Corosync logging is configured via the logging block:

logging {
    fileline: off
    to_syslog: yes
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    timestamp: on
    debug: off
}

Common parameters:

  • to_syslog / to_logfile / to_stderr: select the logging destinations.
  • logfile: the log file path used when to_logfile: yes (the directory must exist and be writable).
  • timestamp: prefixes each message with a timestamp.
  • fileline: adds source file and line information, mainly useful when debugging Corosync itself.
  • debug: enables verbose debug logging; leave off in normal operation.
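
When testing configuration changes, it is often convenient to follow the messages live:

  journalctl -u corosync -f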

Authentication and Cluster Keys

With secauth: on, Corosync uses a shared key for message authentication and encryption. This key is stored in:

/etc/corosync/authkey

The file must:

  • be owned by root,
  • be readable only by root (mode 0400 or 0600), and
  • be identical, byte for byte, on every node in the cluster.

To generate the key:

sudo corosync-keygen

This will:

  • gather random data (older versions read /dev/random and may pause waiting for entropy; newer versions use /dev/urandom), and
  • write the key to /etc/corosync/authkey with restrictive permissions.

Distribute the generated key securely to all nodes, e.g.:

  sudo scp /etc/corosync/authkey root@node2:/etc/corosync/authkey
  # then, on node2:
  sudo chown root:root /etc/corosync/authkey
  sudo chmod 600 /etc/corosync/authkey

Do not edit authkey manually; always regenerate when you need a new key.
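
A quick sanity check on each node (the exact mode may be 0400 or 0600 depending on how the key was created and copied):

  ls -l /etc/corosync/authkey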

Starting, Stopping, and Status

Corosync is managed through systemd on most distributions.

Common operations:

# Start Corosync
sudo systemctl start corosync
# Enable at boot
sudo systemctl enable corosync
# Check current status
systemctl status corosync
# Stop Corosync
sudo systemctl stop corosync
# Restart (after config changes)
sudo systemctl restart corosync

When running with Pacemaker, cluster management tools may expect both services to be running; always ensure that changes to Corosync are coordinated with the rest of the cluster.
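
For example, when Pacemaker runs on top of Corosync, stop the stack from the top down and start it from the bottom up (service names assume the standard pacemaker unit):

  # stop: resource manager first, then the membership layer
  sudo systemctl stop pacemaker
  sudo systemctl stop corosync
  # start: membership layer first, then the resource manager
  sudo systemctl start corosync
  sudo systemctl start pacemaker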

Inspecting Cluster Membership and Quorum

Corosync provides several CLI tools to inspect membership and quorum. These are particularly useful to verify that Corosync itself is healthy before investigating resource-manager issues.

`corosync-cmapctl`

Displays key-value pairs in Corosync’s configuration and runtime database (cmap / confdb):

# Show all keys and values
corosync-cmapctl
# Filter by category, e.g., runtime membership
corosync-cmapctl | grep runtime.members

Helpful keys:

  • runtime.members.*: current membership, including each node's address and status (on Corosync 2.x these live under runtime.totem.pg.mrp.srp.members.*).
  • nodelist.*: the nodes as loaded from corosync.conf.
  • totem.*: the effective totem settings.
  • runtime.votequorum.*: live quorum state.

Use this tool when you need low-level detail about what Corosync believes about the cluster state.

`corosync-quorumtool`

Shows quorum status and membership:

corosync-quorumtool

Typical output:

Quorum information
------------------
Date:             Fri Dec 12 10:22:34 2025
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          1
Ring ID:          1/12345
Quorate:          Yes
Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           1  
Flags:            2node, Quorate
Membership information
----------------------
    Nodeid      Name
         1      node1 (local)
         2      node2

Key aspects:

  • Quorate: Yes is the most important field: this partition may run resources.
  • Expected votes and Total votes show how many votes the full cluster carries and how many are currently present.
  • Flags reports special modes such as 2node or WaitForAll.
  • The membership table lists node IDs and names and marks the local node.

You can also list just the membership:

corosync-quorumtool -l

This prints only the node table; similarly, corosync-quorumtool -s shows just the quorum status, which is convenient for quick checks or scripts.

Two-Node Clusters and `two_node` vs QDevice

Two-node clusters are common but tricky, because:

Corosync provides two main concepts for this:

  1. two_node mode in votequorum (simple / lab setups):
    • two_node: 1 and expected_votes: 2.
    • The remaining single node can still be quorate after its peer fails.
    • Does not protect you from network partitions where both think the other is gone.
    • Suitable mainly for simple or non-critical setups.
  2. QDevice (Quorum Device) (recommended for production two-node clusters):
    • A separate “arbitrator” node or service (qnetd) that provides an additional vote.
    • Implemented via the corosync-qdevice daemon and a qnetd server.
    • Typical scenario:
      • Node1, Node2, and a QDevice (e.g., small VM elsewhere).
      • Cluster uses 3 votes.
      • Any partition without at least 2 votes (e.g., one node alone) becomes non-quorate.

To use QDevice, you would:

  1. run the corosync-qnetd server on a third, independent host (the arbitrator),
  2. install and enable corosync-qdevice on each cluster node, and
  3. add a device block to the quorum section of corosync.conf, as sketched below.
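
A minimal sketch of the quorum section with a QDevice (the qnetd host name is a placeholder; ffsplit is the usual algorithm for two-node clusters):

quorum {
    provider: corosync_votequorum
    device {
        model: net
        net {
            host: qnetd.example.com   # placeholder: your arbitrator host
            algorithm: ffsplit        # resolves fifty-fifty splits in two-node clusters
        }
    }
}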

Deep QDevice configuration is beyond this chapter; the key point is that Corosync has built-in support to handle quorum in small and asymmetric clusters more safely than two_node alone.

Using Multiple Rings for Redundancy

Multiple rings let Corosync continue working even if one network path fails.

Typical configuration steps:

  1. Provide separate interfaces and IP ranges, e.g.:
    • ring0 on ens33 with 192.168.10.x.
    • ring1 on ens34 with 10.10.10.x.
  2. Configure nodelist with both addresses:
   nodelist {
       node {
           name: node1
           ring0_addr: 192.168.10.11
           ring1_addr: 10.10.10.11
           nodeid: 1
       }
       node {
           name: node2
           ring0_addr: 192.168.10.12
           ring1_addr: 10.10.10.12
           nodeid: 2
       }
   }
  3. Ensure any totem interface definitions match these networks.

Corosync will:

  • send heartbeat traffic across all configured rings,
  • mark a ring as faulty when it stops responding and keep running over the remaining ring(s), and
  • with the RRP-based udp/udpu transports, keep a ring marked faulty until it is re-enabled (e.g., with corosync-cfgtool -r); the knet transport re-enables recovered links automatically.

You can see ring-related status via corosync-cfgtool -s, or at a lower level via corosync-cmapctl (e.g., keys under runtime.totem.pg.mrp.srp on Corosync 2.x).
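
For example:

  # Show the status of each configured ring on this node
  corosync-cfgtool -s

  # Re-enable rings cluster-wide once the underlying fault is repaired (RRP transports)
  sudo corosync-cfgtool -r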

Common Operational Tasks

Below are tasks specifically tied to Corosync management and troubleshooting, not Pacemaker resources.

Rolling Corosync Configuration Changes

When updating corosync.conf:

  1. Edit the file on one node.
  2. Distribute the identical file to all other nodes:
   sudo scp /etc/corosync/corosync.conf node2:/etc/corosync/
  3. Apply the change. For settings that can change at runtime, ask all nodes to reload:
   sudo corosync-cfgtool -R

(-R tells every Corosync instance in the cluster to reload corosync.conf; not all settings are reloadable.)

For settings that cannot be reloaded, restart Corosync one node at a time, ensuring the cluster remains quorate and functional between restarts:
   sudo systemctl restart corosync
  4. Verify cluster membership after each restart:
   corosync-quorumtool

Some settings (like transport, ring definitions) require a full restart. Keep cluster impact in mind.
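
A hypothetical rolling-restart sketch (node names and the 10-second pause are placeholders to adapt; assumes SSH access and sudo rights on each node):

  for host in node1 node2; do
      ssh "$host" sudo systemctl restart corosync
      sleep 10                            # give the node time to rejoin
      ssh "$host" corosync-quorumtool -s  # confirm the cluster is quorate again
  done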

Checking Logs

Corosync logs are crucial when diagnosing membership problems:

    journalctl -u corosync
    sudo less /var/log/corosync/corosync.log

Common message patterns:

  • "A new membership (...) was formed": a membership change (a node joined or left).
  • "Token has not been received in ... ms": token timeouts, usually pointing at network loss or congestion.
  • "Totem is unable to form a cluster because of an operating system or network fault": Corosync cannot reach its peers at all, often due to firewall or routing problems.

Use timestamps and Ring IDs to correlate with observed resource failovers or Pacemaker events.

Typical Corosync Problems and How to Approach Them

A few frequent Corosync-specific issues and troubleshooting angles:

  1. Node not joining the cluster:
    • Verify corosync.conf is identical on all nodes.
    • Check authkey presence and permissions.
    • Confirm network connectivity (ping between ring addresses).
    • Look at journalctl -u corosync for membership or authentication errors.
  2. Frequent membership flapping (nodes joining/leaving):
    • Network instability:
      • Packet loss, congestion, or interface flaps.
    • Token timeout too low:
      • Consider increasing token in totem to allow more time before declaring failure.
    • Misconfigured MTU:
      • Ensure consistent MTU across the cluster networks.
  3. Split-brain scenarios in two-node clusters:
    • Using two_node without a tie-breaker:
      • Consider implementing QDevice.
    • Unreliable inter-node link:
      • Improve network redundancy or quality; consider multiple rings.
  4. Corosync won’t start after config change:
    • Syntax error in corosync.conf:
      • Run Corosync in the foreground temporarily to see the errors:
       sudo systemctl stop corosync
       sudo corosync -f

(Press Ctrl+C to stop, then fix the issue.)

  5. One ring fails but the cluster survives:
    • Check ring status via corosync-cmapctl.
    • Investigate physical interface, VLAN, or routing problems.
    • Ensure both rings are truly independent (ideally separate switches/paths).

Summary

Corosync is the low-level cluster engine providing:

  • cluster membership and failure detection,
  • reliable, ordered messaging between nodes, and
  • quorum information for higher layers such as Pacemaker.

For effective use in high-availability clusters:

  • keep corosync.conf identical on all nodes and roll out changes carefully,
  • protect the shared authkey and distribute it securely,
  • consider multiple rings (or knet links) for network redundancy,
  • prefer a QDevice over bare two_node mode for production two-node clusters, and
  • verify membership and quorum with corosync-quorumtool and corosync-cmapctl before debugging higher layers.
