Understanding Cluster Resources
In a high-availability (HA) cluster, resources are the things the cluster starts, stops, moves, and monitors to provide services. Typical examples:
- IP addresses (virtual IPs)
- Filesystem mounts
- Web servers (`apache`, `nginx`)
- Databases (MySQL, PostgreSQL)
- Custom applications or scripts
Cluster resource management is about:
- Defining what these resources are
- Specifying how they depend on each other
- Controlling where and how they run
- Ensuring they are monitored and recovered on failure
The concrete tools and syntax differ (e.g., Pacemaker, Corosync integration, etc.), but the concepts below are common across modern Linux HA stacks.
Resource Types and Agents
Most HA stacks use resource agents to manage resources in a standardized way. A resource agent knows how to:
- Start the service
- Stop the service
- Check if it’s healthy
- Promote/demote when needed (for master/slave or primary/secondary roles)
Common agent families:
- LSB (`/etc/init.d`-style scripts)
- systemd units
- OCF (Open Cluster Framework) agents (very common with Pacemaker)
- Specialized agents for cloud resources (e.g., AWS, Azure, GCP)
You typically create a resource by choosing a type/agent and providing parameters. Conceptually:
```text
# Pseudo-example, not tied to a specific tool
create resource vip type=IPaddr2 params ip=10.0.0.100 cidr_netmask=24
create resource web type=apache params configfile=/etc/httpd/conf/httpd.conf
```

The agent abstracts the complexity. The cluster uses the same start/stop/monitor interface for all resources regardless of their internal details.
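For comparison, here is what the same two resources might look like on a Pacemaker cluster managed with `pcs` (a sketch; resource names, IPs, and paths are the pseudo-example's, and exact syntax varies by pcs version):

```shell
# Virtual IP using the OCF IPaddr2 agent
pcs resource create vip ocf:heartbeat:IPaddr2 \
    ip=10.0.0.100 cidr_netmask=24 \
    op monitor interval=30s

# Apache web server using the OCF apache agent
pcs resource create web ocf:heartbeat:apache \
    configfile=/etc/httpd/conf/httpd.conf \
    op monitor interval=30s
```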
Resource States and Lifecycle
Across cluster solutions, resources usually have a small set of states:
- `stopped` – not running on any node
- `starting` – cluster is starting the resource
- `started` – successfully running
- `stopping` – cluster is stopping it
- `failed` – the agent reports failure, or monitoring detects a problem
- `promoted`/`demoted` – for multi-state resources (e.g., database primary/secondary)
The cluster resource manager (CRM):
- Decides the target state (e.g., `started` on some node).
- Issues actions: start, stop, monitor, promote, demote, migrate.
- Watches results and updates its internal view.
- Reacts to failures according to configured policies.
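On a Pacemaker-based stack, you can inspect the CRM's current view of resources and recent actions, for example:

```shell
# One-shot snapshot of cluster, node, and resource state
crm_mon -1

# Or, with pcs
pcs status resources
```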
Resource Constraints
Constraints tell the cluster how resources should be arranged. They are the core of resource management.
Most systems have three fundamental constraint types:
Location Constraints
Answer: On which node(s) may this resource run?
Examples:
- Prefer `web` to run on node `node1`.
- Never allow `db` to run on `node3`.
- Only run `storage` on nodes with access to shared SAN.
Conceptually:
```text
location prefer_web_on_node1 web prefer node1
location avoid_db_on_node3 db ban node3
```

Location constraints can be:
- Hard: resource must or must not run on a node.
- Soft: resource should preferably run on some node but may move if needed.
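With `pcs`, the same ideas might look like this (node and resource names taken from the examples above; a sketch, not tied to a specific version):

```shell
# Soft preference: web prefers node1 with score 100
pcs constraint location web prefers node1=100

# Hard ban: db may never run on node3 (implicit -INFINITY score)
pcs constraint location db avoids node3
```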
Colocation Constraints
Answer: Which resources should run together (or apart)?
Examples:
- `web` must run on the same node as `vip` (so the IP and service are together).
- `drbd_primary` must run on the same node as `filesystem`.
- Do not place `db` and `backup` on the same node (to avoid contention).
Conceptually:
```text
colocate web_with_vip web with vip                     # Same node
colocate fs_with_drbd filesystem with drbd_primary
colocate never_db_with_backup db with backup score=-INFINITY
```

Colocation is often used to express resource stacking:
- Storage (DRBD, LVM, etc.)
- Filesystem mount
- Database
- Application
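Expressed with `pcs` (illustrative resource names; a sketch under the assumption of a Pacemaker/pcs stack):

```shell
# web must run where vip runs (mandatory colocation)
pcs constraint colocation add web with vip INFINITY

# db and backup must never share a node
pcs constraint colocation add db with backup -INFINITY
```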
Order Constraints
Answer: In what order should resources start or stop?
Examples:
- Start `drbd` before mounting `filesystem`.
- Start `db` before `web`.
- Stop in reverse order (web → db → fs → drbd).
Conceptually:
```text
order start_stack: drbd_primary then filesystem then db then web
```

Order and colocation often go together:
- Colocation: "They run on the same node."
- Order: "They must start and stop in this sequence."
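As a concrete sketch, ordering in `pcs` (resource names illustrative; stop order is derived automatically by reversing the start order):

```shell
# Start db before web
pcs constraint order db then web

# Promote the replicated storage before mounting the filesystem
pcs constraint order promote drbd_primary then start filesystem
```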
Resource Groups
A resource group is a simple way to say “these resources go together as a unit.”
Characteristics:
- Resources in a group:
- Are started in order (top to bottom in the group).
- Are stopped in reverse order.
- Are automatically colocated on the same node.
- You can move, start, or stop the whole group as one logical resource.
Example conceptual group:
```text
group web_stack \
    vip \
    filesystem \
    web
```

Instead of creating separate constraints for each pair, you define a group and apply constraints to the group as a whole.
When to use groups:
- Simple linear stacks (IP → filesystem → service).
- You don’t need complex per-resource rules inside.
When not to use groups:
- You need different locations or failure policies for members.
- You have complex cross-dependencies.
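On a pcs-managed cluster, a group like the one above could be created from existing resources and then handled as a single unit (names are the example's; a sketch only):

```shell
# Members start top-to-bottom, stop in reverse, and stay colocated
pcs resource group add web_stack vip filesystem web

# The whole stack can now be managed as one logical resource
pcs resource disable web_stack
pcs resource enable web_stack
```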
Multi-State Resources (Master/Slave)
Some resources can exist in multiple roles, typically:
- `Master` (primary)
- `Slave` (secondary)
Typical examples:
- Replicated storage (DRBD)
- Some database replication setups
Key points:
- A multi-state resource can run in `Slave` mode on multiple nodes, but only one node may be `Master`.
- Additional constraints:
  - Services depending on the master's data (e.g., a filesystem) must be colocated with the `Master` instance, not just any instance.
- Ordering is often:
- Promote storage master
- Mount filesystem
- Start dependent services
Conceptual example:
```text
ms drbd_resource drbd_agent meta master-max=1 clone-max=2

# Filesystem colocated with the *Master* instance
colocate fs_with_drbd_master filesystem with drbd_resource:Master
order start_drbd_before_fs: drbd_resource:promote then filesystem:start
```

Resource Stickiness and Scoring
The scheduler decides where to place resources based on a scoring system:
- Each node has scores for each resource (from location rules, failures, etc.).
- The node with the highest score wins, if other constraints allow.
Important concept: stickiness — how strongly a resource prefers to stay where it is.
- High stickiness: avoid unnecessary failovers.
- Low or zero stickiness: cluster may move the resource freely for balancing.
Typical use:
- Prevent "resource ping-pong" between nodes.
- Prefer stability unless there is a strong reason to move.
Conceptually:
```text
# Make web prefer to stay where it is
set_property resource_stickiness=100

# Or per resource
set_meta web resource-stickiness=200
```

Additionally, you may use:
- Preferred nodes (positive score)
- Banned nodes (negative infinite score)
to influence placement.
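On Pacemaker with `pcs`, stickiness can be set cluster-wide or per resource (a sketch; the defaults subcommand syntax differs between pcs versions):

```shell
# Cluster-wide default stickiness
pcs resource defaults resource-stickiness=100

# Per-resource stickiness (resource name illustrative)
pcs resource meta web resource-stickiness=200
```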
Resource Monitoring and Timeouts
Each resource should be monitored periodically:
- `monitor` operation: a health check performed by the agent.
- Interval: how often it runs (e.g., every 10s, 30s, 1m).
- Timeout: how long the cluster waits before assuming failure.
Good practice:
- Set realistic timeouts (long enough for normal operation, short enough to detect real failures).
- Different intervals for different roles (e.g., master vs slave).
Conceptual operation definitions:
```text
op monitor interval=30s timeout=10s on-fail=restart
op monitor role=Master interval=10s timeout=20s on-fail=demote
```

Monitoring policies affect:
- How quickly the cluster reacts to failures.
- Whether a failure triggers restart, demotion, or relocation.
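With `pcs`, a monitor operation can be attached to an existing resource like so (resource name and values illustrative; a sketch only):

```shell
# Check web's health every 30s; declare failure if the check
# takes longer than 20s, and restart on failure
pcs resource op add web monitor interval=30s timeout=20s on-fail=restart
```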
Failure Handling and Recovery Policies
When a resource fails, the cluster follows policies such as:
- Restart on the same node (default in many cases).
- Migrate to another node after certain attempts.
- Fence (reboot/power off) a node if it is suspected of being in a bad state (handled in the fencing/STONITH part of clustering).
Typical per-resource failure settings (conceptually):
- `migration-threshold`: the number of failures after which the resource is moved off the node.
- `failure-timeout`: how long until the failure count is forgotten.
Example:
```text
set_meta db migration-threshold=3 failure-timeout=60s
```

Interpretation:
- Allow up to 3 failures of `db` on a node; after 60 seconds without new failures, the failure count is reset.
- If failures reach the threshold before the timeout clears them, move `db` to another node.
Also common:
- `on-fail` behavior per operation: `restart`, `migrate`, `ignore`, `fence`
Careful tuning is important:
- Over-aggressive fencing or migrations can cause instability.
- Too lenient policies can lead to prolonged outages.
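On a pcs-managed cluster, these settings and the related fail-count tooling look roughly like this (resource name illustrative; a sketch only):

```shell
# Move db off a node after 3 failures; forget failures after 60s
pcs resource meta db migration-threshold=3 failure-timeout=60s

# Inspect and reset the accumulated failure count
pcs resource failcount show db
pcs resource cleanup db
```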
Clones and Distributed Resources
Some resources are intended to run on all or several nodes:
- Cluster filesystems that need helpers everywhere.
- Distributed lock managers.
- Background daemons used by other services.
A clone resource represents N identical instances of the same resource.
Properties:
- `clone-max`: the maximum number of clone instances (and thus nodes) the clone runs on.
- `clone-min`: the minimum number of running instances required before dependent resources may start.
Conceptually:
```text
clone dlm_clone dlm_agent meta clone-max=4 clone-min=2
```

For services that must be available on multiple nodes simultaneously, clones work in tandem with:
- Colocation constraints (to ensure other resources run only where the clone instance is also running).
- Order constraints (start shared infrastructure before dependent services).
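As a pcs sketch, an existing resource can be cloned across nodes like this (resource name and option values illustrative):

```shell
# Run up to 4 copies of dlm; interleave lets dependents start
# per-node as soon as the local instance is up
pcs resource clone dlm clone-max=4 interleave=true
```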
Maintenance Mode and Manual Control
You often need to temporarily override automatic behavior:
- For maintenance on a node (kernel upgrade, hardware change).
- For testing failover behavior.
- For debugging failing resources.
Key mechanisms:
- Maintenance mode: the cluster stops taking automatic actions on resources (but may still report state).
- Manual start/stop/migrate:
- You can tell the cluster to move a resource to a specific node.
- You can disable a resource temporarily (set its target role to `Stopped`).
Conceptually:
```text
# Stop the cluster from automatically managing resources
set_property maintenance-mode=true

# Stop a single resource
set_meta web target-role=Stopped

# Force-move a resource to another node
migrate web node2
```
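On a Pacemaker/pcs system, the equivalent operations look roughly like this (names illustrative; a sketch only):

```shell
# Stop the cluster from automatically managing resources
pcs property set maintenance-mode=true

# Stop a single resource
pcs resource disable web

# Force-move a resource, then remove the move's constraint afterwards
pcs resource move web node2
pcs resource clear web

# Resume normal operation
pcs property set maintenance-mode=false
```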
Always use the cluster's control commands rather than starting or stopping services directly with `systemctl`, unless your stack explicitly supports that; otherwise the cluster can become confused about a resource's actual state.
Designing a Resource Management Strategy
When setting up cluster resource management in practice, common steps are:
- Identify all required resources:
- IPs, storage, filesystems, applications, supporting daemons.
- Define resource agents and parameters:
- Choose the appropriate agent type (OCF, systemd, etc.).
- Configure paths, ports, config files.
- Model relationships with constraints:
- Use colocation + order for stacks.
- Use groups for simple linear stacks.
- Use multi-state resources where needed (replication).
- Plan placement rules:
- Location constraints for node preferences.
- Stickiness and scores to reduce churn.
- Configure monitoring and timeouts:
- Different monitor intervals per role if necessary.
- Ensure timeouts are realistic.
- Define failure and recovery policies:
- migration thresholds
- failure timeouts
- fencing behavior (handled in cluster-wide settings and fencing configuration).
- Test scenarios:
- Node failure.
- Resource failure.
- Manual migrations.
- Restart of cluster software itself.
The result is a cluster that behaves predictably under normal operations and failures, with resources placed, started, and recovered according to your explicit policies rather than ad-hoc behavior.