Why Configuration Management Matters
In any non-trivial environment, you will have more than one Linux system: development machines, test servers, production servers, CI runners, maybe cloud instances in multiple regions. Manually configuring each system (installing packages, editing config files, adding users) quickly becomes:
- Slow — repeating the same setup by hand.
- Inconsistent — “works on server A but not server B”.
- Opaque — nobody knows exactly what changed, when, or why.
- Fragile — a crashed server is hard to rebuild identically.
Configuration management (CM) solves this by treating your system configuration as code: version-controlled, repeatable, testable, and automatable.
Core Ideas in Configuration Management
Declarative vs. Imperative Configuration
Tools typically follow one of two styles:
- Imperative: “Do these steps in this order.”
- Example idea: install nginx, edit /etc/nginx/nginx.conf, restart nginx.
- You define how to get there.
- Declarative: “This is the state the system must be in.”
- Example idea: “Package nginx is installed, service nginx is enabled and running, file /etc/nginx/nginx.conf has this content.”
- You define what you want; the tool figures out how.
Most modern CM tools lean heavily toward declarative configuration, because it’s easier to reason about and re-apply safely. When you see a CM file (e.g., Ansible playbook, Puppet manifest), try to identify whether it describes state (“package is installed”) or procedures (“run this command”).
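To make the contrast concrete, here is a minimal Python sketch. It is not how any particular CM tool works internally; the `plan` function and the `package:nginx` / `service:nginx` keys are names made up for illustration. The imperative version is just a list of steps, while the declarative version is a description of the end state that a small planner compares against the current state.

```python
# Imperative style: an ordered list of steps; the author decides *how*.
imperative_steps = [
    "install nginx",
    "edit /etc/nginx/nginx.conf",
    "restart nginx",
]

# Declarative style: a description of the end state; a tool decides *how*.
desired_state = {
    "package:nginx": "installed",
    "service:nginx": "running",
}


def plan(desired, actual):
    """Return the actions needed to move `actual` to `desired` (hypothetical planner)."""
    return [
        f"set {resource} to {state}"
        for resource, state in desired.items()
        if actual.get(resource) != state
    ]


# A fresh machine needs both changes; a converged machine needs none.
fresh = {}
converged = {"package:nginx": "installed", "service:nginx": "running"}
print(plan(desired_state, fresh))
print(plan(desired_state, converged))
```

The imperative list would run all three steps every time, whereas the planner returns an empty list once the machine already matches the description.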
Idempotence
A key property in CM is idempotence:
- An operation is idempotent if running it once or a hundred times yields the same result and doesn’t cause harm by repetition.
CM tools strive for idempotence:
- If nginx is already installed and correctly configured, applying the configuration again should:
- Not reinstall it unnecessarily.
- Not restart services without reason.
- Not keep changing files that are already correct.
Idempotence matters because:
- You can safely re-apply your configuration as often as you want (e.g., via cron, CI, or orchestration).
- Drift (manual changes, partial failures) can be corrected automatically by re-applying the config.
When writing CM code, always think: “If this runs twice, will the second run be a no-op?”
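As a rough illustration of that question, the following Python sketch ensures a line is present in a file. The function name `ensure_line`, the file name, and the setting are assumptions chosen for the example. A naive "append this line" step would add a duplicate on every run; checking first makes the second run a no-op.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file; return True only if the file changed."""
    existing = []
    if os.path.exists(path):
        with open(path) as f:
            existing = f.read().splitlines()
    if line in existing:
        return False  # already in the desired state: do nothing
    with open(path, "a") as f:
        f.write(line + "\n")
    return True


# Demonstration in a temporary directory (the path and setting are illustrative).
workdir = tempfile.mkdtemp()
config_path = os.path.join(workdir, "sshd_config")
first_run = ensure_line(config_path, "PermitRootLogin no")
second_run = ensure_line(config_path, "PermitRootLogin no")
print(first_run, second_run)
```

The return value plays the role of the "changed" flag that CM tools report: the first run changes the file, the second run does not.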
Desired State vs. Configuration Drift
- Desired state: What your CM code describes — exactly how systems should look.
- Actual state: How systems look right now.
- Drift: The difference between actual and desired state, often caused by:
- Manual changes made directly on servers (“quick hotfixes”).
- Failed deployments or partial changes.
- Software updates made outside the CM tool.
CM helps by:
- Detecting drift (e.g., “file content changed from what is defined”).
- Correcting drift (restoring the desired state).
- Optionally reporting or failing when unsupported drift occurs.
A good practice is to avoid manual changes on managed systems and instead update the CM definitions and re-apply them.
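Conceptually, drift detection is a comparison between two descriptions of the same system. The sketch below models both as plain dictionaries; the `detect_drift` helper and the SSH settings are illustrative assumptions, and real tools inspect the system itself rather than a dictionary.

```python
def detect_drift(desired, actual):
    """Return {key: (desired_value, actual_value)} for every setting that differs."""
    return {
        key: (want, actual.get(key))
        for key, want in desired.items()
        if actual.get(key) != want
    }


desired = {"PermitRootLogin": "no", "PasswordAuthentication": "no"}
actual = {"PermitRootLogin": "yes", "PasswordAuthentication": "no"}  # someone hand-edited

drift = detect_drift(desired, actual)
print(drift)

# Correcting drift means overwriting the drifted keys with the desired values.
actual.update({key: want for key, (want, _) in drift.items()})
```

After the correction step, running `detect_drift` again returns an empty dictionary, which is the "no drift" case.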
Typical Use Cases for Configuration Management
Configuration management is used for more than “install a package.” Common scenarios include:
- Provisioning new servers
- Install base packages.
- Configure SSH, users, sudo rules.
- Set time zone, locale, NTP.
- Application configuration
- Set config files for web servers, application servers, databases.
- Template application config based on environment (dev, staging, prod).
- Security baseline
- Ensure firewall configuration.
- Enforce SSH settings.
- Disable unneeded services.
- Ensure specific permissions on sensitive files.
- User and access management
- Create users and groups.
- Control which SSH keys are authorized.
- Infra consistency across environments
- Keep dev/staging/prod aligned.
- Ensure test environments are reproducible.
- Rebuilding or scaling systems
- Quickly spawn identical instances behind a load balancer.
- Rebuild failed nodes from scratch.
Configuration Management vs Related Concepts
CM often appears alongside other DevOps topics; it’s worth distinguishing them:
- Configuration Management vs Provisioning
- Provisioning: “Create the resource” (VM, container, cloud instance).
- CM: “Configure what’s inside the resource” (packages, services, configs).
- In practice, tools like Terraform handle provisioning, while CM tools configure the OS and applications.
- Configuration Management vs Containers
- Containers package application + dependencies, often reducing the need for per-server configuration.
- But you still configure:
- The base images.
- The container host (e.g., Docker, Podman).
- Non-containerized services.
- CM is still useful in container-heavy environments, especially for the underlying hosts and shared services.
- Configuration Management vs Image Baking
- Image baking: pre-build VM/container images with everything pre-installed.
- CM: can be used inside image builds, and also after boot to finalize or maintain state.
Key Features of Modern Configuration Management Tools
Different tools (like Ansible, Puppet, and Chef, which are covered in the later subsections) share some common capabilities:
Modules / Resources / Providers
Most CM tools abstract system operations into resources (sometimes called modules, providers, or tasks), such as:
- A package resource: install or remove packages.
- A service resource: enable and start/stop services.
- A file or template resource: manage file content and permissions.
- A user resource: manage user accounts and groups.
Instead of writing apt install nginx, you define a resource such as “package nginx is present”. The tool then uses the system’s package manager under the hood.
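The idea of a resource hiding the package manager can be sketched in a few lines of Python. This is a simplified model, not the implementation of any real tool: `package_present` and the `INSTALL_COMMANDS` table are hypothetical, and the function only builds the command it would run instead of executing it.

```python
# Hypothetical mapping from package manager to its install command prefix.
INSTALL_COMMANDS = {
    "apt": ["apt-get", "install", "-y"],
    "dnf": ["dnf", "install", "-y"],
}


def package_present(name, manager, installed):
    """Return the command needed to make `name` present, or None if it already is."""
    if name in installed:
        return None
    return INSTALL_COMMANDS[manager] + [name]


print(package_present("nginx", "apt", installed=set()))
print(package_present("nginx", "dnf", installed=set()))
print(package_present("nginx", "apt", installed={"nginx"}))
```

The caller states the same intent ("nginx is present") on both distributions; only the lookup table knows which command that translates to.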
Inventory / Node Targeting
Configuration management is about multiple machines, not just one.
- You maintain an inventory (a list of hosts/nodes) grouped by roles, environments, or functions.
- Examples of groups: web_servers, db_servers, staging, production.
- CM code can be applied selectively:
- “Run this configuration on all web_servers.”
- “Apply this change only in staging.”
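An inventory can be thought of as a mapping from group names to sets of hosts. The following Python sketch uses the group names from above; the host names (`web1`, `db1`, and so on) and the `select_hosts` helper are made up for illustration, and real tools have their own inventory formats.

```python
inventory = {
    "web_servers": {"web1", "web2", "web3"},
    "db_servers": {"db1"},
    "staging": {"web3"},
    "production": {"web1", "web2", "db1"},
}


def select_hosts(inventory, *groups):
    """Return the sorted hosts that belong to every one of the named groups."""
    selected = set.intersection(*(inventory[group] for group in groups))
    return sorted(selected)


print(select_hosts(inventory, "web_servers"))
print(select_hosts(inventory, "web_servers", "staging"))
```

Naming one group targets all of its hosts; naming two narrows the selection to hosts that are in both, such as the web servers in staging.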
Variables and Parameterization
To avoid duplicating configuration for each environment, CM tools rely on variables:
- Example variables: env (e.g., dev, prod), db_host, max_connections.
- Variables can:
- Change values per host/group/environment.
- Be combined into structured data (dictionaries, lists) to drive complex configurations.
- This allows you to keep one general configuration that behaves differently based on where it’s applied.
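One common way to model this is as layers of variables where more specific layers win. The sketch below is an assumption about precedence (host over group over defaults), not the exact rules of any tool; the values and the `db1.internal` host name are hypothetical.

```python
from collections import ChainMap

defaults = {"env": "dev", "db_host": "localhost", "max_connections": 20}
group_vars = {"production": {"env": "prod", "max_connections": 200}}
host_vars = {"db1": {"db_host": "db1.internal"}}


def resolve_vars(host, group):
    """Merge variables so that host values override group values, which override defaults."""
    return dict(ChainMap(host_vars.get(host, {}), group_vars.get(group, {}), defaults))


print(resolve_vars("db1", "production"))
print(resolve_vars("web3", "staging"))
```

`ChainMap` looks keys up in the first mapping that contains them, so a host with no overrides simply falls through to the defaults.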
Templating
Many config files are similar between servers but differ in a few values. CM tools support templates, typically text files with placeholders for variables:
- Template example (conceptually):
listen {{ port }};
server_name {{ server_name }};
- At run time, the tool:
- Fills in variables for each host.
- Generates the correct config file.
- Optionally triggers a service restart when the file changes.
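The rendering step can be approximated with a small Python function. Real CM tools use full template engines (Ansible, for example, uses Jinja2); the regex-based `render` below is only a stand-in that handles simple `{{ name }}` placeholders, and the ports and server names are example values.

```python
import re

TEMPLATE = "listen {{ port }};\nserver_name {{ server_name }};\n"


def render(template, variables):
    """Replace each {{ name }} placeholder; raise KeyError if a variable is missing."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(variables[match.group(1)]),
        template,
    )


print(render(TEMPLATE, {"port": 8080, "server_name": "dev.example.com"}))
print(render(TEMPLATE, {"port": 80, "server_name": "www.example.com"}))
```

Raising an error for a missing variable is a deliberate choice in this sketch: a config file with an unfilled placeholder is usually worse than a failed run.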
Handlers / Notifications
When configuration changes, you often need to run follow-up actions (e.g., restart a service). CM tools provide handlers or similar mechanisms:
- A handler is triggered only when something actually changes.
- This avoids unnecessary restarts:
- If the config file is identical, the handler is not called; the service is not restarted.
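The change-then-notify pattern can be sketched as follows. This is a toy model under stated assumptions: files live in a dictionary rather than on disk, and `apply_file`, `run_handlers`, and the `"restart nginx"` handler name are hypothetical.

```python
def apply_file(files, path, content, notify, pending):
    """Update an in-memory 'file'; queue the handler only if the content changed."""
    if files.get(path) == content:
        return False
    files[path] = content
    if notify not in pending:
        pending.append(notify)
    return True


def run_handlers(pending, log):
    """Run each queued handler once, then clear the queue."""
    for handler in pending:
        log.append(handler)
    pending.clear()


files, pending, log = {}, [], []
for _ in range(2):  # two consecutive CM runs with the same desired content
    apply_file(files, "/etc/nginx/nginx.conf", "listen 80;", "restart nginx", pending)
    run_handlers(pending, log)
print(log)
```

Across the two runs the handler is recorded only once, because the second run finds the file already correct and queues nothing.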
Idempotent Execution Reports
When CM runs, you usually get a structured report, for example:
- Changed: Items that were modified.
- OK: Items already in desired state.
- Failed: Items that couldn’t be applied.
This helps you:
- See exactly what changed.
- Detect unexpected changes.
- Integrate with CI/CD (pipeline fails if configuration application fails).
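A report like this is easy to turn into a pass/fail signal for a pipeline. The sketch below assumes a hypothetical list of per-item results; the item names and the `summarize` helper are illustrative, and each real tool has its own report format.

```python
import collections

# Per-item results from a hypothetical run: (item, status).
results = [
    ("package nginx", "ok"),
    ("file /etc/nginx/nginx.conf", "changed"),
    ("service nginx", "changed"),
    ("user deploy", "failed"),
]


def summarize(results):
    """Count items per status and derive a process exit code for CI."""
    counts = collections.Counter(status for _, status in results)
    summary = {status: counts.get(status, 0) for status in ("ok", "changed", "failed")}
    exit_code = 1 if summary["failed"] else 0
    return summary, exit_code


summary, exit_code = summarize(results)
print(summary, exit_code)
```

A non-zero exit code is what lets a CI job stop the pipeline when any item could not be applied.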
Common Configuration Management Design Patterns
Roles and Reuse
To avoid repetition, CM code is organized into reusable units often called roles, cookbooks, or modules (terminology differs by tool).
Typical examples:
- base: Configuration for all servers (time syncing, basic packages, security policies).
- web: Common config for all web servers.
- db: Common config for database servers.
You can then compose these units:
- A cache server might have base + cache roles.
- A monolithic app server might have base + web + app roles.
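Composition of this kind amounts to concatenating the work each role contributes. In the Python sketch below, the `ROLES` table, its task descriptions, and the `tasks_for` helper are invented for illustration; real tools add features such as role dependencies and parameters.

```python
# Hypothetical roles, each a list of task descriptions.
ROLES = {
    "base": ["sync time", "install basic packages", "apply security policies"],
    "web": ["install web server", "deploy web server config"],
    "app": ["deploy application"],
    "cache": ["install cache service"],
}


def tasks_for(*role_names):
    """Concatenate the tasks of the named roles in the order given."""
    return [task for name in role_names for task in ROLES[name]]


cache_server = tasks_for("base", "cache")
app_server = tasks_for("base", "web", "app")
print(cache_server)
print(app_server)
```

Both server types start with the same base tasks, which is the point of factoring them into a shared role.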
Environment-Specific Overrides
You rarely want identical settings for dev and production. Typical patterns:
- Separate variables per environment:
- dev: debug mode on, low resource limits.
- prod: debug off, higher resource limits.
- Shared configuration code with different data inputs:
- Same “how” applied to all environments; “what values” differ.
This keeps behavior consistent while respecting environment differences.
Git-Driven Configuration
Since CM treats configuration as code, it fits naturally into Git workflows:
- Store all CM definitions in a repository.
- Use branches for:
- Feature work on infrastructure.
- Testing changes in non-prod environments.
- Use pull/merge requests:
- Peer review for infra changes.
- Audit trail of who changed what, when, and why.
- Tag or release versions of your configuration.
This idea is central to modern CM; applied more broadly to infrastructure and deployments, it is often called GitOps.
Integrating Configuration Management into a DevOps Workflow
With CI/CD
Configuration management does not have to be run manually from your laptop. A CI/CD pipeline can check CM code at several levels:
- Linting and syntax checks:
- Catch errors in configuration files early.
- Unit-style tests:
- Verify that templates render, variables exist, etc.
- Integration tests (often with tools like Testinfra or similar, depending on CM):
- Boot a test VM/container.
- Apply CM.
- Assert that services are running, ports are open, files exist.
You can then:
- Automatically apply CM to non-prod on successful tests.
- Use change approvals for production.
With Cloud Infrastructure as Code
When combined with tools like Terraform:
- Terraform:
- Creates infrastructure: VMs, networks, security groups, load balancers.
- Configuration management:
- Configures the OS and applications inside those VMs.
This separation keeps your CM code focused on system state rather than on cloud resource lifecycles.
Practical Considerations and Best Practices
Start with a Baseline
Before automating everything, define a baseline for all servers:
- Security hardening level.
- Logging settings.
- Monitoring/agent installation.
- Basic system packages.
Apply this baseline via CM to every machine. Then layer on more specific roles for individual services.
Avoid Manual Changes on Managed Nodes
Mixing manual changes with automated CM leads to surprises:
- CM may revert manual changes on the next run.
- Manual changes may temporarily “work” but aren’t preserved.
Preferred approach:
- Treat configuration as code: all changes go into your CM definitions.
- Re-apply CM to enforce the new desired state.
Define Clear Ownership
Configuration management introduces new “code” that someone must own:
- Who maintains base roles?
- Who reviews infra changes?
- How are emergencies handled (e.g., quick fixes)?
Clarifying ownership helps avoid drift and inconsistent styles.
Go Incrementally
You don’t need to automate everything at once:
- Start with easy, high-value items:
- Common package installs.
- Simple config files.
- User accounts and SSH keys.
- Gradually migrate more complex services.
- The eventual goal is to eliminate “snowflake servers” (unique, manually tweaked machines).
How This Ties into the Next Chapters
In the following chapters you will see specific tools:
- Ansible basics:
- Agentless, SSH-based, using YAML playbooks.
- Playbooks and inventories:
- How to define what to run and on which hosts.
- Puppet overview:
- Agent-based, declarative manifests applied continuously.
- Chef overview:
- Ruby-based cookbooks, similar in goals but different in style.
This chapter’s concepts — idempotence, desired state, variables, templates, roles, inventories — are common to all of them. As you learn each tool, look for how it implements these shared ideas.