Introduction
Corosync is a cluster communication engine that provides reliable messaging, membership, and quorum services for high availability clusters. It is typically used together with Pacemaker, which handles resource management, while Corosync focuses on how the nodes in the cluster talk to each other and agree on who is in the cluster at any given time.
In this chapter, you will see what Corosync does in the cluster stack, its main components, and how its configuration and behavior affect clustering and high availability. Details of resource management and failover logic belong to the Pacemaker and broader clustering chapters, so they will not be repeated here.
Role of Corosync in a Cluster Stack
Corosync sits at the core of many Linux HA stacks as the communication and membership layer. Multiple nodes in a cluster run the Corosync service, which maintains a consistent view of which nodes are currently active and reachable. It also transports messages between nodes in a reliable, ordered way.
Pacemaker and other cluster managers build on top of Corosync, subscribing to its membership and quorum events. When Corosync detects that a node has joined, left, or become unreachable, it notifies Pacemaker, which then decides how to reassign resources. Similarly, Pacemaker uses Corosync as its messaging channel so that nodes can exchange cluster state and commands.
Without a dependable and deterministic messaging layer, the higher level clustering logic cannot safely decide where services should run. Corosync is therefore a foundational component, even though it does not manage resources itself.
Corosync Architecture Overview
Corosync is organized around a set of services that run within a single process on each cluster node. The most important logical pieces are the messaging transport, the membership and quorum engine, and the APIs that client applications use.
On each node, the Corosync daemon maintains connections to the other cluster nodes using one or more network interfaces. It sends and receives small messages that contain membership updates, health checks, and data from applications such as Pacemaker. Corosync imposes a total order on messages within a given process group, which simplifies how higher layers implement distributed state machines.
Internally, Corosync can operate in different modes, such as multicast or unicast, depending on the network configuration. It also supports redundant rings, where multiple independent network paths are used to carry traffic, improving resilience to network failures.
Membership and Quorum Functions
Cluster membership is the set of nodes that are currently considered part of the cluster and able to participate in decisions. Corosync continuously observes the liveness of nodes through heartbeat messages and network exchanges. If a node stops responding or loses connectivity, Corosync removes it from the membership and generates an event.
Quorum is related to the size of the cluster partition that currently has the authority to run cluster services. A simple and common policy is that a majority of the configured nodes must be present and able to see each other. If a subset of nodes is split off by a network issue and does not have quorum, it should not modify shared resources, which helps avoid data corruption.
Corosync provides a quorum service that higher layers can query. It calculates quorum based on a configured number of expected votes and the votes of the currently active and reachable nodes. When the membership changes, the quorum state is recomputed and applications receive notifications so they can adjust their behavior.
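The majority rule can be illustrated with a short sketch. This is not Corosync's actual votequorum implementation, just the arithmetic it applies: a partition has quorum only when its active votes exceed half of the expected votes.

```python
# Minimal sketch of majority-vote quorum, as described above.
# Illustration only, not Corosync's actual votequorum code.

def has_quorum(expected_votes: int, active_votes: int) -> bool:
    """Majority rule: strictly more than half of the expected votes."""
    required = expected_votes // 2 + 1
    return active_votes >= required

# A 5-node cluster split 3/2 by a network partition:
print(has_quorum(5, 3))  # True  - the majority side keeps quorum
print(has_quorum(5, 2))  # False - the minority side loses quorum
```

Note that in an even-sized cluster an exact 50/50 split leaves neither side with quorum, which is why two-node clusters need special handling (such as the votequorum two_node option) not covered by this simple rule.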
In a cluster using quorum, resources must not run on a partition that does not have quorum, or they risk conflicting with resources that may still be active elsewhere.
Messaging and APIs
Corosync exposes APIs that cluster-aware applications can use to send and receive messages, observe membership, and query quorum. These APIs are accessed through libraries such as libcpg (Closed Process Group messaging), libcmap (the Configuration Map), and libquorum.
The Closed Process Group interface allows an application running on each node to join a named group. Messages sent to that group from any node are delivered to all group members in the same order. This keeps all nodes in sync regarding cluster state.
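The value of total ordering can be shown with a toy sketch. This is not the CPG API, and Corosync does not use a central sequencer (it uses the Totem protocol); the point is simply that when every member applies the same messages in the same global order, all replicas end up identical.

```python
# Toy illustration of totally ordered group delivery.
# Not the CPG API: a single shared log stands in for the agreed order.

class Sequencer:
    """Holds the single, globally agreed message order."""
    def __init__(self):
        self.log = []  # list of (sender, payload) in delivery order

    def send(self, sender: str, payload: str) -> None:
        self.log.append((sender, payload))

class Member:
    """A group member that applies messages in the agreed order."""
    def __init__(self, name: str):
        self.name = name
        self.state = []

    def deliver_all(self, sequencer: Sequencer) -> None:
        # Every member replays the same ordered log.
        self.state = list(sequencer.log)

seq = Sequencer()
a, b = Member("node1"), Member("node2")
seq.send("node1", "set x=1")
seq.send("node2", "set x=2")
a.deliver_all(seq)
b.deliver_all(seq)
assert a.state == b.state  # identical state on every node
```

If instead each node were free to deliver messages in its own arrival order, node1 might apply "set x=2" last while node2 applies "set x=1" last, and their states would diverge; total ordering rules that out.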
The configuration and map interface provides access to runtime configuration values and status, giving applications a key-value store that is consistent across the cluster membership. The quorum interface allows applications to be notified when the cluster gains or loses quorum.
In modern distributions, higher level cluster software rarely uses these APIs directly in their raw form. Instead, it links against helper libraries and frameworks that wrap the Corosync services in more convenient abstractions. Nonetheless, understanding that Corosync supplies ordered group messaging and cluster state notifications helps explain how Pacemaker and similar tools cooperate across nodes.
Transport and Network Configuration Concepts
Corosync supports different transport modes for moving messages between nodes. The two broad approaches are multicast style and unicast style traffic. In environments where IP multicast is available and allowed, Corosync can use a multicast address to send a single packet that all nodes receive. In more restricted or complex network designs, Corosync can use unicast, where each message is sent directly to each peer.
The choice of transport impacts performance characteristics, network load, and configuration complexity. Unicast often requires more explicit configuration of peer addresses, but it is compatible with a wider range of network infrastructures where multicast is blocked or filtered.
Corosync also supports redundant ring configurations. In this model, each node participates in multiple logical communication rings, each bound to a different physical or virtual network interface. Primary traffic flows on the first ring. If connectivity on that ring is disrupted, Corosync can fail over to the alternate ring to maintain cluster communication.
To gain benefit from redundant rings, each ring should use a separate, independent network path, ideally with different switches and cabling, so that a single failure does not disable all rings.
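As a concrete illustration, in corosync 3.x with the knet transport a two-ring setup is declared by giving each node one address per link. The names and addresses below are examples only and must be adapted to your networks:

```
nodelist {
    node {
        name: node1
        nodeid: 1
        ring0_addr: 192.168.10.1   # first network path
        ring1_addr: 10.0.20.1      # independent second path
    }
    node {
        name: node2
        nodeid: 2
        ring0_addr: 192.168.10.2
        ring1_addr: 10.0.20.2
    }
}
```

With knet, failover between links is handled automatically; the cluster keeps communicating as long as at least one link between the nodes remains usable.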
Configuration File Structure
Corosync reads its configuration from a main configuration file, typically /etc/corosync/corosync.conf. This file uses a structured, section based format, with sections such as totem, quorum, nodelist, and logging. Each section contains parameter assignments that define behavior.
The totem section controls the low level transport layer. Parameters here determine the transport type, timeouts, encryption and authentication options, and ring identifiers used for redundant networks. The nodelist section enumerates the cluster nodes and the network addresses they use for each ring. In many modern setups, this section is auto generated or managed by cluster configuration tools rather than edited entirely by hand.
The quorum section specifies how quorum is calculated and whether tie breaking rules or special vote assignments are used. The logging section defines how and where Corosync writes its log output, including log levels and destinations such as syslog or dedicated log files.
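Putting these sections together, a minimal corosync.conf for a small cluster might look like the sketch below. This assumes corosync 3.x with the knet transport; the cluster name, addresses, and node names are illustrative placeholders:

```
totem {
    version: 2
    cluster_name: example-cluster
    transport: knet
    crypto_cipher: aes256
    crypto_hash: sha256
}

nodelist {
    node {
        name: node1
        nodeid: 1
        ring0_addr: 192.168.10.1
    }
    node {
        name: node2
        nodeid: 2
        ring0_addr: 192.168.10.2
    }
    node {
        name: node3
        nodeid: 3
        ring0_addr: 192.168.10.3
    }
}

quorum {
    provider: corosync_votequorum
}

logging {
    to_syslog: yes
    timestamp: on
}
```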
Because this configuration directly affects how nodes discover and trust each other, it is essential that all cluster nodes share an identical corosync.conf file; since the nodelist section describes every node, the same file can normally be used unchanged on each of them. Typically, the configuration is created on one node and then copied to all others before Corosync is started.
Authentication and Encryption
Corosync can secure its traffic so that only authorized nodes participate in the cluster and so that messages cannot be easily read or altered in transit. Authentication is based on shared keys, and crypto settings in the configuration control the choice of cipher and hash algorithm.
A shared key file is generated and distributed to all legitimate nodes. This key is used to derive message authentication codes and encryption keys. Since the same key must be present on every node in the cluster, it is critical to protect this file with appropriate file permissions and secure distribution practices.
When encryption is enabled, Corosync both authenticates and encrypts its packets. Authentication ensures that packets come from trusted cluster members. Encryption prevents outsiders from simply observing the traffic to gain insight into cluster operations or attempting to craft malicious packets.
All cluster nodes must share the same Corosync key and compatible crypto settings. A mismatch will prevent nodes from forming a cluster and can lead to isolated nodes that cannot join.
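The key is generated once with corosync-keygen, which writes /etc/corosync/authkey, and that file is then copied securely to every node with restrictive permissions. The crypto options live in the totem section; the fragment below is an illustrative corosync 3.x (knet) configuration, and the exact ciphers available depend on the build:

```
totem {
    version: 2
    transport: knet
    crypto_cipher: aes256           # symmetric cipher for packet encryption
    crypto_hash: sha256             # hash used for packet authentication
    keyfile: /etc/corosync/authkey  # shared key, identical on all nodes
}
```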
Runtime Behavior and Cluster Changes
During normal operation, Corosync periodically exchanges heartbeat messages between nodes. These messages are used to detect failures or network issues. If a configured timeout is reached without hearing from a node, Corosync assumes that the node is unreachable and initiates a membership change.
When a membership change occurs, Corosync generates notifications for its clients and, in cooperation with quorum settings, recalculates which partition, if any, has quorum. Applications such as Pacemaker then use these notifications to reconsider where resources should run or whether they should be stopped.
Corosync is also responsible for ensuring that membership changes are applied in a consistent way. Each membership change is associated with a sequence number and a view of which nodes are part of the cluster. All nodes in the surviving partition agree on this view, which prevents conflicting decisions across nodes.
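The failure detection timeouts described above are configured in the totem section. The values below are illustrative, in milliseconds; the distribution defaults are usually sensible and should only be raised deliberately, for example on congested or high-latency links:

```
totem {
    version: 2
    transport: knet
    token: 3000      # ms without the token before a node is suspected failed
    consensus: 3600  # ms allowed to agree on a new membership; must exceed token
}
```

Raising the token timeout makes the cluster more tolerant of brief network glitches at the cost of slower failure detection, and therefore slower failover.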
Logging and Troubleshooting Concepts
Corosync logs provide insight into the cluster communication layer. They typically include messages about nodes joining or leaving, quorum status changes, network errors, and warnings when timeouts are close to being exceeded. By examining these logs, administrators can determine whether observed cluster issues originate from network instability, misconfiguration, or node level problems.
Important troubleshooting information includes timestamps of membership changes, the node IDs considered active at each event, and any errors related to transport or crypto configuration. Command line tools complement the logs: corosync-cfgtool -s reports the status of each ring or link, and corosync-quorumtool -s shows the current quorum state and vote counts. For example, if Corosync reports that a node has repeatedly timed out, this may indicate packet loss, saturated links, or faulty cabling affecting that node.
In many cases, apparent resource level failures turn out to be consequences of unstable cluster communication. Understanding Corosync log messages and their relation to membership and quorum decisions is therefore key when diagnosing intermittent failovers or split brain symptoms.
Interaction with Fencing and Split Brain Prevention
Corosync itself is primarily responsible for membership and quorum, not fencing. However, its view of which nodes are active drives higher level fencing decisions. Fencing is typically configured in Pacemaker and ensures that nodes considered failed are prevented from accessing shared resources.
Split brain describes a situation where two or more partitions of a cluster believe they are active owners of the same data. Corosync mitigates this risk by providing accurate membership and quorum information to applications, which then use fencing to forcibly remove nodes that are no longer trusted.
Although Corosync does not execute fencing actions directly in a typical configuration, it is the source of the information about which nodes have been lost and which partitions have quorum. Misconfigurations in Corosync, especially around nodelist, transport, and quorum, can undermine the integrity of fencing strategies and must therefore be handled carefully.
Summary
Corosync supplies the communication, membership, and quorum foundation on which many Linux high availability clusters are built. It maintains a consistent view of which nodes are active, offers ordered group messaging to cluster aware applications, and computes quorum based on configured policies.
Through its configuration file, Corosync defines how nodes connect, which transport mechanisms are used, how many rings exist, and how security and quorum are handled. Proper configuration and secure key management are essential, and all nodes must share a consistent view of the cluster.
By understanding Corosync’s role at the messaging and membership layer, you gain insight into why higher level cluster behavior looks the way it does and how communication problems propagate upward to resource management and failover decisions, which are covered elsewhere in this course.