Kahibaro

5.6.2 Pacemaker

Overview of Pacemaker in a Cluster Stack

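Pacemaker is a cluster resource manager. In a typical Linux HA stack it sits above the cluster messaging and membership layer (usually Corosync) and relies on fencing agents to isolate failed nodes.

Its jobs are to start, stop, and monitor resources, to decide which node each resource runs on, and to recover resources when a node or service fails.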

You normally do not use Pacemaker alone: you combine it with Corosync and fencing to get a complete HA cluster.

Core Pacemaker Concepts

Cluster Information Base (CIB)

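The CIB is a cluster-wide XML database that holds both configuration and state. Its configuration section contains the cluster options, nodes, resources, and constraints; its status section records the current state of nodes and resources.

On modern setups you manipulate the CIB with tools like pcs or crm rather than editing the XML directly.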

Under the hood, CIB changes are versioned and replicated to all nodes; the Designated Controller (DC) node coordinates this.
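The versioning mentioned above can be pictured with a small sketch. Pacemaker stamps the CIB with three integer attributes (admin_epoch, epoch, num_updates) and compares them in that order. The XML fragments and helper names below are hand-written for illustration and are not real cluster output.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily simplified CIB headers (illustrative only).
CIB_A = '<cib admin_epoch="0" epoch="12" num_updates="3"/>'
CIB_B = '<cib admin_epoch="0" epoch="13" num_updates="0"/>'

VERSION_FIELDS = ("admin_epoch", "epoch", "num_updates")

def cib_version(xml_text):
    """Return (admin_epoch, epoch, num_updates) as a tuple of ints."""
    root = ET.fromstring(xml_text)
    return tuple(int(root.get(field, "0")) for field in VERSION_FIELDS)

def newer_cib(a, b):
    """Return whichever CIB text carries the higher version tuple."""
    # Tuples compare element by element, matching the field priority.
    return a if cib_version(a) >= cib_version(b) else b

print(cib_version(CIB_A), cib_version(CIB_B))
```

Here CIB_B wins because its epoch is higher, even though its num_updates counter is lower.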

Designated Controller (DC)

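At any time, one node is the DC. It runs the scheduler, which computes the actions needed to reach the desired cluster state, and coordinates their execution on the other nodes.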

If the DC fails, another node takes over automatically.

Resources and Resource Agents

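Pacemaker manages resources via resource agents (RAs). An agent is a script or executable that implements a standard set of actions (such as start, stop, and monitor) for one kind of service and reports the result through its exit code.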

Pacemaker itself doesn’t know how to run PostgreSQL or an IP address; it just calls the agent with parameters you configure.
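As a rough illustration of "calling the agent with parameters": OCF agents receive the action as their first argument and each configured parameter as an environment variable prefixed with OCF_RESKEY_. The sketch below only builds such a call and does not execute it. The helper name, agent path, and parameter values are assumptions for the example, and a real invocation needs further variables (such as OCF_ROOT) that Pacemaker sets itself.

```python
import os

OCF_SUCCESS = 0       # action succeeded
OCF_NOT_RUNNING = 7   # monitor found the resource cleanly stopped

def build_agent_call(agent_path, action, params):
    """Build argv and environment for one OCF-style agent invocation."""
    env = dict(os.environ)
    for name, value in params.items():
        # Each resource parameter becomes OCF_RESKEY_<name>.
        env["OCF_RESKEY_" + name] = str(value)
    return [agent_path, action], env

argv, env = build_agent_call(
    "/usr/lib/ocf/resource.d/heartbeat/IPaddr2",  # typical path; may differ
    "monitor",
    {"ip": "192.0.2.10", "cidr_netmask": 24},     # example values
)
print(argv, env["OCF_RESKEY_ip"])
# A caller could then run subprocess.run(argv, env=env) and inspect
# the return code against the constants above.
```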

Resource Classes and Types

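When defining a resource you specify its class (for example ocf, systemd, or lsb), a provider for OCF agents (for example heartbeat), and the agent type (for example IPaddr2).

Example conceptually (not focusing on specific front-ends yet): a virtual IP managed by the IPaddr2 agent is identified as ocf:heartbeat:IPaddr2.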
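A minimal sketch of how such an identifier breaks down, assuming the common colon-separated notation (class:provider:type for OCF agents, class:type for classes without providers). The AgentSpec type and parse_agent_spec function are hypothetical names, not part of any Pacemaker tool.

```python
from typing import NamedTuple, Optional

class AgentSpec(NamedTuple):
    agent_class: str
    provider: Optional[str]   # only OCF agents have a provider
    agent_type: str

def parse_agent_spec(spec):
    """Split 'class:provider:type' or 'class:type' into its parts."""
    parts = spec.split(":")
    if len(parts) == 3:
        return AgentSpec(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        return AgentSpec(parts[0], None, parts[1])
    raise ValueError(f"unrecognized agent spec: {spec!r}")

print(parse_agent_spec("ocf:heartbeat:IPaddr2"))
print(parse_agent_spec("systemd:httpd"))
```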

Resource Stickiness and Migration Threshold

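Pacemaker uses scores to decide where to place resources. Two important notions are resource stickiness, a score that favors keeping a resource on the node where it is already running, and migration threshold, the number of failures on a node after which the resource is moved away from that node.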

These control how aggressively resources are moved after failures or topology changes.
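The interplay can be sketched in a few lines. This is a simplified model, not Pacemaker's scheduler: stickiness is added to the score of the current node, and a node whose fail count has reached the migration threshold is excluded. Node names, scores, and the function name are made up for the example.

```python
INFINITY = 1_000_000  # Pacemaker treats scores of this magnitude as infinite

def effective_scores(base, current_node, stickiness,
                     fail_counts, migration_threshold):
    """Apply stickiness and migration-threshold to per-node base scores."""
    scores = {}
    for node, score in base.items():
        if node == current_node:
            score += stickiness          # prefer staying where we are
        failures = fail_counts.get(node, 0)
        if migration_threshold > 0 and failures >= migration_threshold:
            score = -INFINITY            # too many failures: ban the node
        scores[node] = score
    return scores

base = {"node1": 50, "node2": 100}
healthy = effective_scores(base, "node1", 100, {}, 3)
failing = effective_scores(base, "node1", 100, {"node1": 3}, 3)
print(max(healthy, key=healthy.get))  # stays on node1 thanks to stickiness
print(max(failing, key=failing.get))  # moves to node2 after three failures
```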

Fencing and STONITH in Pacemaker

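Fencing is essential in HA clusters. Pacemaker integrates with STONITH ("Shoot The Other Node In The Head") agents, which power off or reset a node through an out-of-band mechanism such as a management controller or a network power switch.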

You configure fencing devices as Pacemaker resources like any other, but they are treated specially by the cluster engine.

Tools and Interfaces

`pcs` vs `crm` vs Low-Level Tools

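Pacemaker itself exposes core daemons and the XML CIB, but you normally use higher-level tools: pcs (common on RHEL-family distributions), crm from crmsh (common on SUSE), or the low-level tools shipped with Pacemaker, such as crm_mon, crm_resource, and cibadmin.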

Whichever front-end you use, all are ultimately manipulating the same CIB and resource definitions.

High-Level vs Low-Level Configuration

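Pacemaker supports two broad configuration approaches: a high-level one built on constructs such as groups, and a low-level one in which individual resources are tied together with explicit constraints.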

Often you will mix both: use a group for a simple stack (VIP + filesystem + service) and then add specific constraints as needed.

Resource Types in Pacemaker

Primitive Resources

A primitive resource is the basic building block:

Examples (conceptually): a database instance, a virtual IP, a filesystem mount.

Resource Groups

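Groups are ordered, colocated sets of resources that behave as one unit: members start in the listed order, stop in reverse order, and always run on the same node.

Use groups for stacks like a virtual IP, a filesystem, and the service that depends on both.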

Groups simplify configuration by avoiding explicit ordering and colocation constraints between every pair.
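To see what a group saves you from writing, the sketch below expands a member list into the pairwise ordering and colocation relationships it implies between adjacent members. The function names and resource names are illustrative, and the tuples are a simplified stand-in for real constraint objects.

```python
def expand_group(members):
    """Return the (ordering, colocation) pairs a group implies."""
    pairs = list(zip(members, members[1:]))
    # Ordering: start `first`, then `then`.
    ordering = [(first, then) for first, then in pairs]
    # Colocation: `then` must run where `first` runs.
    colocation = [(then, first) for first, then in pairs]
    return ordering, colocation

def stop_sequence(members):
    """Group members stop in the reverse of their start order."""
    return list(reversed(members))

ordering, colocation = expand_group(["vip", "fs", "db"])
print(ordering)
print(colocation)
print(stop_sequence(["vip", "fs", "db"]))
```

With three members the group replaces four explicit constraints; the saving grows with each member added.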

Clones

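Clones run the same primitive on multiple nodes simultaneously, e.g. a monitoring agent or a cluster filesystem daemon that must be active on every node.

Important concepts are the clone-max meta-attribute, which limits the total number of instances, and clone-node-max, which limits the number of instances per node.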

You still may apply constraints to clones (e.g. certain clones should avoid specific nodes).
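A toy model of clone placement, assuming the two limits Pacemaker exposes as the clone-max and clone-node-max meta-attributes, plus a set of banned nodes standing in for location constraints. The real scheduler weighs many more factors; place_clone is a hypothetical helper.

```python
def place_clone(nodes, clone_max, clone_node_max=1, banned=()):
    """Distribute up to clone_max instances over the eligible nodes."""
    eligible = sorted(n for n in nodes if n not in banned)
    placement = {n: 0 for n in eligible}
    remaining = clone_max
    # Fill nodes round by round so instances spread out evenly.
    for _ in range(clone_node_max):
        for node in eligible:
            if remaining == 0:
                return placement
            placement[node] += 1
            remaining -= 1
    return placement

print(place_clone(["a", "b", "c"], clone_max=3))
print(place_clone(["a", "b", "c"], clone_max=2, banned={"b"}))
print(place_clone(["a", "b"], clone_max=3, clone_node_max=2))
```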

Multi-State (Master/Slave) Resources

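Multi-state resources are clones whose instances can be in one of two roles, master or slave. Pacemaker 2.0 and later call them promotable clones, and Pacemaker 2.1 renamed the roles to promoted and unpromoted.

You can express policies like "run the virtual IP only on the node where the database instance holds the master role."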

Constraints and Scheduling

Pacemaker uses a scoring-based scheduler; constraints modify scores or enforce ordering. The three most used categories are location, colocation, and ordering constraints.

Location Constraints

Location constraints influence where a resource is allowed or preferred to run:

Use cases:

Colocation Constraints

Colocation constraints express togetherness or separation between resources:

Examples:

When using groups, basic colocation is handled implicitly within the group, but you still use colocation for interactions between groups or between primitives and clones.

Ordering Constraints

Ordering constraints define start/stop sequence:

Common patterns:

Ordering and colocation together let you build predictable service stacks across nodes.

Scores and Decision-Making

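Pacemaker’s scheduler calculates a final score per resource per node, combining location constraint scores, colocation scores, resource stickiness, and penalties from recorded failures.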

If a resource has equal best scores on multiple nodes, Pacemaker may choose based on tie-breakers (e.g., lexicographic node names) unless constrained otherwise.
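The score combination can be sketched as follows. The infinity rules match Pacemaker's documented arithmetic (anything plus -INFINITY is -INFINITY, even +INFINITY), and nodes with a negative total are treated as unable to run the resource. The tie-break by node name follows the example in the text and is a simplification; the function names and sample scores are invented.

```python
INFINITY = 1_000_000

def add_scores(a, b):
    """Add two scores using Pacemaker-style infinity arithmetic."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY              # -INFINITY always wins
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))

def choose_node(contributions):
    """Pick a node from {node: [scores...]}, or None if none can run it."""
    totals = {}
    for node, scores in contributions.items():
        total = 0
        for score in scores:
            total = add_scores(total, score)
        totals[node] = total
    # A negative total means the resource may not run on that node.
    runnable = {n: t for n, t in totals.items() if t >= 0}
    if not runnable:
        return None
    best = max(runnable.values())
    # Illustrative tie-breaker: lowest node name among the best scores.
    return min(n for n, t in runnable.items() if t == best)

print(choose_node({"node1": [100, -INFINITY], "node2": [50]}))
print(choose_node({"b": [10], "a": [10]}))
```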

Operations and Timeouts

Operations: Start, Stop, Monitor, Promote, Demote

Each resource defines operations with parameters:

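You can have multiple monitors on one resource, for example one per role or one cheap and one thorough check, provided each uses a distinct interval.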
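A small sketch of how operation definitions might be modeled and sanity-checked. The Operation class and check_operations helper are hypothetical. The one rule taken from Pacemaker itself is that two monitor operations on the same resource need different intervals.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operation:
    name: str          # start, stop, monitor, promote, demote
    interval_s: int    # 0 for one-shot actions such as start and stop
    timeout_s: int
    role: str = ""     # optional role for role-specific monitors

def check_operations(ops):
    """Return a list of problems found in a resource's operations."""
    problems = []
    seen = set()
    for op in ops:
        key = (op.name, op.interval_s)
        if key in seen:
            problems.append(f"duplicate {op.name} with interval {op.interval_s}s")
        seen.add(key)
        if op.timeout_s <= 0:
            problems.append(f"{op.name} has a non-positive timeout")
    return problems

ops = [
    Operation("start", 0, 60),
    Operation("monitor", 10, 20, role="Promoted"),
    Operation("monitor", 10, 20, role="Unpromoted"),  # same interval: flagged
]
print(check_operations(ops))
```

Changing the second monitor to an interval of 11 seconds would clear the reported problem.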

Timeouts and Failure Semantics

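Timeouts must be realistic: a timeout shorter than the time a service really needs to start or stop makes Pacemaker treat a slow but healthy operation as a failure.

When an operation fails, Pacemaker increments the resource's fail count on that node and applies the operation's on-fail policy, which by default restarts the resource. A failed stop normally leads to fencing of the node when fencing is enabled.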

Resource Meta-Attributes

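In addition to RA parameters, Pacemaker supports meta-attributes that affect scheduling and how the cluster treats the resource.

Common examples are target-role, is-managed, resource-stickiness, migration-threshold, and failure-timeout.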

These are separate from RA parameters like IP addresses, ports, or paths.

Node Management and Cluster Behavior

Node States

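Pacemaker tracks node states such as online, offline, standby, maintenance, and unclean (a node whose state is unknown and that must be fenced before its resources are recovered).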

You can manually put nodes in standby or maintenance mode for safe maintenance.

Quorum and Two-Node Quirks

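Pacemaker is quorum-aware: by default it stops resources in a partition that has lost quorum.

Special attention is needed for two-node clusters, where the loss of one node leaves the survivor without a majority.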

Pacemaker integrates with the cluster layer’s quorum subsystem; policies like “no-quorum-policy” can be tuned.
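The majority rule and its two-node consequence can be checked with a few lines. This is a simplified model: the quorum calculation really lives in the cluster layer, and only three of the no-quorum-policy values (stop, freeze, ignore) are modeled. The function names and returned strings are illustrative.

```python
def has_quorum(votes_present, expected_votes):
    """True when the partition holds a strict majority of votes."""
    return votes_present > expected_votes // 2

def partition_action(votes_present, expected_votes, no_quorum_policy="stop"):
    """Describe what a partition does under a given no-quorum-policy."""
    if has_quorum(votes_present, expected_votes) or no_quorum_policy == "ignore":
        return "manage resources normally"
    if no_quorum_policy == "freeze":
        return "keep running resources, start nothing new"
    return "stop all resources"

print(has_quorum(2, 3))   # two of three nodes: quorate
print(has_quorum(1, 2))   # one of two nodes: not quorate
print(partition_action(1, 2))
```

The second call shows the two-node problem: a single surviving node can never hold a strict majority on its own.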

Fencing and Recovery Workflow

A typical Pacemaker reaction to a severe failure:

  1. Detect failure (via monitor timeout or Corosync membership changes).
  2. Mark node unclean if it disappears.
  3. Invoke a fence device (STONITH) to power-off/reset the node.
  4. Wait for fencing confirmation.
  5. Recalculate placement and start resources elsewhere, respecting constraints.
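The steps above can be sketched as a single function. The key property it tries to show is that resources are not restarted elsewhere until fencing is confirmed. The fence and choose_node callables, the node names, and the state strings are placeholders for this example, not Pacemaker APIs.

```python
def recover_from_node_loss(lost_node, placement, fence, choose_node):
    """Model the recovery workflow for one lost node.

    placement:   dict mapping resource name -> node name
    fence:       callable(node) -> bool, True once fencing is confirmed
    choose_node: callable(resource) -> node for the new placement
    """
    # Steps 1-2: the node disappeared, so it is considered unclean.
    node_state = "unclean"
    affected = [r for r, n in placement.items() if n == lost_node]

    # Steps 3-4: fence the node and wait for confirmation.
    if not fence(lost_node):
        # Without confirmed fencing nothing is recovered.
        return dict(placement), node_state
    node_state = "offline"

    # Step 5: recalculate placement for the affected resources.
    new_placement = dict(placement)
    for resource in affected:
        new_placement[resource] = choose_node(resource)
    return new_placement, node_state

before = {"vip": "node1", "db": "node1", "web": "node2"}
ok = recover_from_node_loss("node1", before, lambda n: True, lambda r: "node2")
stuck = recover_from_node_loss("node1", before, lambda n: False, lambda r: "node2")
print(ok)
print(stuck)
```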

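Misconfigured fencing can lead to split-brain and data corruption, or to resources that stay blocked because recovery cannot proceed until fencing succeeds.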

Typical Pacemaker Use Patterns

Highly-Available IP + Service

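Simplest pattern: a group containing a virtual IP address and the service that listens on it.

If node A fails, Pacemaker moves the group to node B; clients keep using the same IP address, although connections that were open at the moment of failure typically have to be re-established.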

Active/Passive Database with Storage

Common pattern:

Pacemaker ensures storage role and service placement are consistent.

Distributed Service via Clones

Pattern for cluster-wide agents/services:

Pacemaker gradually starts/stops clone instances across nodes, respecting overall cluster health.

Monitoring and Troubleshooting Pacemaker

Status and Cluster View

Useful concepts:

You can observe:

Common Misconfigurations

Typical issues specific to Pacemaker:

Diagnosing often involves:

Design Considerations for Pacemaker-Based Clusters

When designing with Pacemaker:

Pacemaker gives you a powerful policy engine; your main task as an administrator is to express the correct policies (constraints, meta-attributes, fencing strategy) for your environment.
