6.3.3 Puppet overview

Where Puppet Fits in Configuration Management

Puppet is one of the oldest and most widely used configuration‑management tools. In the broader configuration‑management ecosystem (Ansible, Chef, Salt, etc.), Puppet is:

Model‑driven / declarative: you describe what the system should look like, not how to get there.
Primarily agent‑based: managed nodes run a Puppet agent that talks to a central Puppet server.
Resource‑oriented: configuration is expressed as resources (files, packages, services, users, etc.).

For DevOps work, Puppet is commonly used to:

Keep large fleets of servers consistent.
Enforce security and compliance baselines.
Automate recurring system configuration (packages, users, services, config files).

This chapter focuses on Puppet’s concepts, workflow, and how it differs from other tools, not on step‑by‑step installation.

Puppet Architecture in a Nutshell

Puppet’s “classic” architecture (Puppet Enterprise and open‑source Puppet) is built around a client‑server model:

Puppet Master / Server

Central service that compiles catalogs based on your manifests and modules.
Stores code in environments (e.g., production, development).
Usually also acts as the certificate authority (CA) that signs agent certificates.

Puppet Agent

Installed on each managed node.
Wakes up on a schedule (default ~30 minutes) or on demand.
Requests a catalog from the server, applies it locally, and reports back.

PuppetDB (optional but common)

Stores facts, catalogs, and reports from agents.
Enables exported resources and more advanced queries.

You can also run Puppet without a server, using puppet apply in a masterless setup:

puppet apply runs manifests locally, without talking to a Puppet server.
Handy for small environments, testing, or simple one‑off automation.
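
As a minimal sketch of masterless use, a manifest like the one below could be saved locally and applied with puppet apply. The file name motd.pp and the message text are illustrative assumptions, not part of any standard setup.

```puppet
# motd.pp (hypothetical file name): a one-resource manifest for local testing.
file { '/etc/motd':
  ensure  => file,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  content => "This node is managed by Puppet.\n",
}
```

Running puppet apply --noop motd.pp first reports what would change without touching the system, which is a low-risk way to try a manifest.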

Declarative Model and Resources

Puppet is declarative. Instead of writing step‑by‑step commands, you define desired state using resources.

A resource describes one component on a system:

File
Package
Service
User/group
Cron job
Etc.

Example resource declarations:

package { 'nginx':
  ensure => installed,
}
service { 'nginx':
  ensure  => running,
  enable  => true,
  require => Package['nginx'],
}

Key points:

ensure describes the target state (installed, present, absent, running, etc.).
Puppet decides what to execute to move from current state to desired state.
require is a dependency: the service resource waits for the package to be managed first.
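
The same ordering can also be written with a chaining arrow instead of the require attribute. The sketch below is an alternative form of the example above, not something to declare alongside it (declaring the same resource twice is an error).

```puppet
# The -> arrow means "manage the left-hand resource before the right-hand one".
package { 'nginx':
  ensure => installed,
}
-> service { 'nginx':
  ensure => running,
  enable => true,
}
```

Which form to use is largely a style choice: require keeps the dependency inside the resource that needs it, while arrows make the sequence visible at a glance.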

The Puppet Language (Puppet DSL)

Puppet has its own domain‑specific language (DSL). For beginners, you mainly deal with:

Resources: as shown above.
Classes: group related resources into reusable units.
Defines (defined resource types): parameterized resource blocks.
Variables and parameters: pass in values to make configurations reusable.

Example of a simple class:

class webserver (
  String $docroot = '/var/www/html',
) {
  package { 'nginx':
    ensure => installed,
  }
  file { $docroot:
    ensure  => directory,
    owner   => 'www-data',
    group   => 'www-data',
    mode    => '0755',
    require => Package['nginx'],
  }
  service { 'nginx':
    ensure    => running,
    enable    => true,
    subscribe => File[$docroot],
  }
}

Concepts visible here:

Class parameters: $docroot is configurable.
Relationships:

require ensures order (file after package).
subscribe implies the same ordering as require and also refreshes (typically restarts) the service when the file resource changes.

You include or assign classes to nodes (more on that in classification).
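
Defined types appear in the list above but not in the class example. As a rough sketch, a defined type looks like a class that can be declared many times, once per title. The name webserver::vhost_dir and its parameters are hypothetical, and the example assumes the parent directory /var/www already exists.

```puppet
# Hypothetical defined type; the name and parameters are assumptions for illustration.
define webserver::vhost_dir (
  String $owner = 'www-data',
  String $mode  = '0755',
) {
  # $title is the name given at declaration time, used here as the directory path.
  file { $title:
    ensure => directory,
    owner  => $owner,
    group  => $owner,
    mode   => $mode,
  }
}

# Unlike a class, a defined type can be declared repeatedly with different titles.
webserver::vhost_dir { '/var/www/site-a': }
webserver::vhost_dir { '/var/www/site-b':
  mode => '0750',
}
```

In a real module the definition would normally live in its own file (here, manifests/vhost_dir.pp in the webserver module) and the declarations would sit in a class or profile.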

Catalogs: From Code to Actions

Understanding catalogs is key to Puppet:

Agent gathers facts (system information like OS, IPs, CPU, memory) using Facter.
Agent sends facts to the Puppet server.
Puppet server compiles a catalog for that specific node:

Evaluates your manifests, modules, and Hiera data.
Resolves variables and templates.
Records every resource and the relationships between them, which determine the order in which the agent applies them.

Agent applies the catalog:

Checks each resource’s current state.
Changes only what’s necessary to match the catalog.

Agent sends reports back to the server/PuppetDB.

A catalog is:

Node‑specific.
A complete desired‑state description (resources plus relationships).
The basis of idempotent execution: rerunning the same catalog should not keep making changes once the system matches it.

Idempotence and Convergence

Puppet aims for idempotent behavior:

Applying the same manifest repeatedly leads to a stable, unchanged system once it’s in the desired state.
Resources should describe end‑state, not one‑time actions.

Examples:

Desired: file { '/etc/app.conf': ensure => file, content => '...' }

Puppet updates the file only when the current content differs.

Undesired pattern: exec resources with no guard (creates, onlyif, unless, or refreshonly) run their command on every agent run, which breaks idempotence.
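
A minimal sketch of the difference, assuming a hypothetical archive at /tmp/app.tar.gz whose top-level directory is app:

```puppet
# Not idempotent: with no guard, this command would run on every agent run.
# exec { 'extract-app':
#   command => 'tar -xzf /tmp/app.tar.gz -C /opt',
#   path    => ['/bin', '/usr/bin'],
# }

# Guarded version: Puppet skips the command once /opt/app exists.
exec { 'extract-app':
  command => 'tar -xzf /tmp/app.tar.gz -C /opt',
  path    => ['/bin', '/usr/bin'],
  creates => '/opt/app',
}
```

The creates attribute only checks that the path exists, so it suits one-time setup steps. When a native resource type (file, package, service) can express the goal, that is usually the better choice.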

Puppet converges systems toward the defined state at each agent run, useful for:

Automatically fixing drift (manual changes).
Enforcing policies continuously.

Node Classification

Node classification is how you decide which classes apply to which machines.

Common methods:

Site manifest (site.pp)

Central place in an environment for node definitions.
Example:

   node 'web01.example.com' {
     include webserver
   }
   node /^db\d+\.example\.com$/ {
     include database_server
   }

External Node Classifier (ENC)

External script/API that tells Puppet which classes and parameters to assign.
Often integrated with CMDBs, ticket systems, or Puppet Enterprise’s console.

Role and Profile pattern (very common modern pattern):

Profiles: implement specific technology stacks (profile::nginx, profile::postgresql).
Roles: business roles that group profiles (role::web, role::db).
Nodes are assigned only one role class, and that role pulls in profiles internally.

Example:

   class role::web {
     include profile::base
     include profile::nginx
     include profile::app
   }
   node 'web01.example.com' {
     include role::web
   }
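
The role above only includes profiles; the technology-specific decisions live in the profiles themselves. As a sketch, profile::nginx might wrap the webserver class from earlier in this chapter and pin a site-specific setting. The docroot value is an assumption chosen for illustration.

```puppet
# Hypothetical profile: configures the component class for this site.
class profile::nginx {
  class { 'webserver':
    docroot => '/srv/www',
  }
}
```

Keeping such choices in the profile means the component class stays generic and the role stays a plain list of includes.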

Puppet Modules

A module is Puppet’s unit of organization and reuse. A module typically includes:

manifests/: Puppet classes and defined types.
templates/: ERB or EPP templates.
files/: static files to deploy.
lib/: custom types and providers, functions, etc.
metadata.json: module metadata (name, version, dependencies).

Example simple module layout:

nginx/
  manifests/
    init.pp       # class nginx
    config.pp     # class nginx::config
  templates/
    nginx.conf.epp
  files/
    default_site.html

You usually place modules in:

/etc/puppetlabs/code/environments/production/modules/ (Puppet server)
or a similar path depending on installation.

You can download community modules from:

Puppet Forge: https://forge.puppet.com

For example, instead of writing your own Nginx logic, you might install puppet/nginx and use its classes.
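
To show how the pieces of the layout above connect, here is a sketch of what manifests/config.pp in that nginx module could contain. The worker_processes parameter is hypothetical, and the example assumes the template declares a matching parameter and that /var/www/html already exists.

```puppet
# manifests/config.pp in the example nginx module
class nginx::config (
  Integer $worker_processes = 2,
) {
  # 'nginx/nginx.conf.epp' resolves to templates/nginx.conf.epp inside the module.
  file { '/etc/nginx/nginx.conf':
    ensure  => file,
    content => epp('nginx/nginx.conf.epp', { 'worker_processes' => $worker_processes }),
  }

  # puppet:///modules/nginx/... resolves to the module's files/ directory.
  file { '/var/www/html/index.html':
    ensure => file,
    source => 'puppet:///modules/nginx/default_site.html',
  }
}
```

Templates are the usual choice when file content depends on parameters or facts; static files served through source fit content that never varies.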

Data Separation with Hiera

Puppet’s design encourages separating code from data:

Code: manifests, classes, modules that describe logic.
Data: site‑specific values (passwords, IPs, package versions, feature flags).

Hiera is Puppet’s hierarchical data lookup system:

Data is typically stored in YAML files.
You define a hierarchy, for example:

location/%{facts.location}.yaml
role/%{trusted.extensions.pp_role}.yaml
common.yaml

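Puppet code can retrieve values explicitly with lookup(). Class parameters are also looked up automatically by their full name (here webserver::docroot), so the explicit call is optional for them: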

class webserver (
  String $docroot = lookup('webserver::docroot'),
) {
  # ...
}

Benefits:

Reuse the same code across environments, roles, and locations.
Override values per environment, per node, or per group, without editing manifests.
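
A short sketch of both lookup styles side by side. The key webserver::max_clients and its default of 150 are hypothetical; the four-argument form of lookup() takes the key, the expected type, the merge behavior, and a default value.

```puppet
class webserver (
  # Automatic parameter lookup: Puppet checks Hiera for the key
  # 'webserver::docroot' before falling back to this default.
  String $docroot = '/var/www/html',
) {
  # Explicit lookup with a type check and a default; the key name is hypothetical.
  $max_clients = lookup('webserver::max_clients', Integer, 'first', 150)

  # Log the resolved values during catalog compilation.
  notice("docroot=${docroot} max_clients=${max_clients}")
}
```

With this arrangement, a value set in a more specific hierarchy level (for example a role file) overrides one in common.yaml without any change to the class.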

Facter and Facts

Facter is a tool that gathers facts about the system:

Core facts: OS, hostname, IP addresses, CPU, memory, etc.
Custom facts: site‑specific or application‑specific values you define.

Within Puppet manifests you can use facts directly, like:

$facts['os']['family']
$facts['networking']['ip']

Facts influence:

Conditional logic (if $facts['os']['family'] == 'Debian' { ... }).
Hiera lookup keys (%{facts.os.family}).

This enables writing cross‑platform modules that adapt to the underlying system.
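
As a sketch of such adaptation, a case statement on the OS family fact can select platform-specific values. The user names below are the ones nginx packages typically use on each family, stated here as an assumption, and the example presumes that user already exists on the node.

```puppet
# Select platform-specific values from the os.family fact.
case $facts['os']['family'] {
  'Debian': {
    $web_user = 'www-data'
  }
  'RedHat': {
    $web_user = 'nginx'
  }
  default: {
    # Fail compilation early on platforms this code was not written for.
    fail("Unsupported OS family: ${facts['os']['family']}")
  }
}

file { '/var/www/html':
  ensure => directory,
  owner  => $web_user,
  group  => $web_user,
}
```

Failing in the default branch is a deliberate choice: an explicit compilation error is easier to diagnose than a half-configured node.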

Typical Puppet Workflow

A common beginner‑level lifecycle with Puppet in a team looks like:

Design

Decide what you want to manage: services, users, configs, etc.
Plan your modules, profiles, and roles.

Develop

Write or modify Puppet code in a Git repository.
Use modules and Hiera for reuse and data separation.

Test

Use puppet apply locally or in a test environment.
Optionally use unit tests (e.g., rspec-puppet) and linters.

Deploy

Push code to a Puppet server environment (production, staging, etc.).
Agents pull catalogs and apply them on their next run or via a triggered run.

Monitor

Check Puppet reports, logs, and dashboards.
Ensure runs are successful and changes behave as expected.

Comparison to Other Configuration Management Tools

From a beginner’s perspective, Puppet differs from some other tools you may encounter:

Puppet vs Ansible

Puppet:

Primarily agent‑based (though masterless is possible).
Pull model by default (agents pull catalogs).
Strong focus on declarative, model‑driven configuration.

Ansible:

Typically agentless, using SSH.
Push model by default.
Mix of declarative and procedural styles.

Puppet vs Chef

Puppet:

Uses its own DSL.
Catalog‑based, resource graph, strong idempotency focus.

Chef:

Uses a Ruby‑based DSL (recipes are written in Ruby).
Also agent‑based, similar conceptual space but with different language and ecosystem.

Understanding these differences helps you decide when Puppet is a good fit:

Large, long‑lived server fleets.
Strict compliance and drift‑correction needs.
Teams who value strong separation of code/data and reusable modules.

Getting Hands‑On (Conceptual Overview)

Without going into full installation details, the basic commands and artifacts you’ll see in Puppet usage are:

Commands (agent side):

puppet agent -t – trigger an immediate agent run.
puppet apply manifest.pp – apply manifests locally (masterless or testing).

Commands (server / admin):

puppet module install author-modulename – install modules from Forge.
puppet parser validate file.pp – check syntax of Puppet manifests.

Key files/directories (varies slightly with installs):

site.pp – top‑level site manifest in an environment.
environments/ – separates production, dev, etc.
modules/ – where your Puppet modules live.
hiera.yaml and related data files – Hiera configuration and data.

For a beginner, the first steps usually are:

Learn to read and write basic resources and classes.
Use puppet apply to test manifests on a single VM.
Explore a simple module from Puppet Forge to see structure and style.

When to Consider Puppet in Your Toolset

Puppet is particularly appealing when you:

Manage many servers with similar roles.
Need continuous enforcement of configuration (automatic drift correction).
Want a mature ecosystem (Puppet Forge, Puppet Enterprise, integrations).
Prefer a strongly declarative style of infrastructure management.

Even if you ultimately use a different tool in production, understanding Puppet gives you:

A solid model of declarative configuration management.
Transferable ideas (modules, roles/profiles, facts, hierarchical data) that show up in many other tools and platforms.
