2.6.1 What is a process?


Understanding Processes in Linux

In Linux, almost everything that "runs" is a process. When you execute a command, start a program from your desktop, or run a background service, the system represents that activity internally as one or more processes.

This chapter focuses on what a process is conceptually, without yet going deep into how to view or control processes (that’s for later subsections).

Basic Definition

A process is an instance of a running program, together with all the information the operating system needs to manage it.

You can think of it as:

The program’s code (the instructions to execute)
The program’s data (variables, buffers, etc.)
The resources it uses (CPU time, memory, open files, network connections)
Its execution state (where it is in the code, what it’s waiting for)
Its identity information (who owns it, what IDs it has, which parent process started it)

The same program file on disk can correspond to many processes. For example, if you open three terminal windows, you are likely running three separate bash processes, all from the same binary.

Processes vs Programs vs Commands

These three words are often mixed up, but they are not the same:

Program: A file on disk containing executable code (e.g. /usr/bin/ls).
Command: What you type in the shell, which may run a program, built-in, alias, or function (e.g. ls -l /home).
Process: The running instance created when a program is executed.

So:

You type a command (ls).
The shell locates the program (/usr/bin/ls).
The shell asks the kernel to create a new process, and the kernel loads the program into it.

When the process exits, the program file still exists; only that running instance ends.
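The three steps above can be sketched in Python. This is only an illustration, not how a shell is implemented: shutil.which stands in for the shell's PATH lookup (it finds program files only, not built-ins, aliases, or functions), and the helper names locate_program and run_program are made up for this example.

```python
import shutil
import subprocess


def locate_program(command_name):
    """Return the path of the program file a command name resolves to via PATH, or None."""
    return shutil.which(command_name)


def run_program(path, *args):
    """Start a process from the program file, wait for it to exit, return its exit code."""
    completed = subprocess.run([path, *args], stdout=subprocess.DEVNULL)
    return completed.returncode


if __name__ == "__main__":
    program = locate_program("ls")  # the exact path depends on the system
    print("program file:", program)
    if program is not None:
        print("exit code:", run_program(program, "-l", "/"))
        # The process has ended, but the program file is still on disk.
        print("still on disk:", locate_program("ls") == program)
```

The last line makes the distinction concrete: the process is gone once run_program returns, while the file it was started from is unchanged.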

Key Properties of a Process

Each process has a set of key attributes the kernel tracks. Later chapters will show how to see these, but here’s what they represent.

Process ID (PID)

Every process has a unique Process ID (PID), a positive integer assigned by the kernel.

Tools use the PID to refer to a process (for example, when stopping or signaling it).
PIDs are reused after processes exit, but no two live processes share a PID at the same time.

Example PIDs you might see:

1 – a special system process (systemd or similar on modern Linux).
Larger numbers – regular user processes like shells, editors, browsers. PIDs are normally handed out in increasing order and wrap around after reaching a system limit.
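As a small illustration of PIDs, the sketch below starts three processes from the same program file and collects their PIDs. It assumes a Unix-like system with Python available; the function name spawn_three is invented for this example, and the Python interpreter stands in for any program.

```python
import os
import subprocess
import sys


def spawn_three():
    """Start three processes from the same program file and return their PIDs."""
    # Each child runs the same interpreter binary and just sleeps briefly.
    children = [
        subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.2)"])
        for _ in range(3)
    ]
    pids = [child.pid for child in children]
    for child in children:
        child.wait()  # collect each child's exit status so none is left behind
    return pids


if __name__ == "__main__":
    print("my own PID:", os.getpid())
    print("child PIDs:", spawn_three())
```

All three children exist at the same time, so their PIDs are guaranteed to differ from each other and from the parent's PID, even though they come from one binary.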

Parent and Child Processes (PPID)

Processes form a tree:

When a process creates another process (e.g., your shell running a command), the new one is a child.
The creator is the parent.

Key ideas:

Every process has a parent process ID (PPID). The very first process (PID 1) has no real parent and reports a PPID of 0.
When a parent exits before its child, the child gets “adopted” by a special system process (commonly PID 1).

This parent–child relationship is important for job control, cleanup, and how shells manage pipelines and background jobs.
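The parent–child link can be observed directly. In this minimal sketch (the function name child_reported_ppid is made up for illustration), a child process prints the PPID it sees, and the parent compares that value with its own PID.

```python
import os
import subprocess
import sys


def child_reported_ppid():
    """Start a child process and return the parent PID that the child observes."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getppid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("parent PID:     ", os.getpid())
    print("PPID from child:", child_reported_ppid())
```

The two numbers match because the parent is still alive, waiting for the child, when the child asks for its PPID.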

User and Group Ownership (UID, GID)

Each process runs on behalf of a user:

UID (User ID) – which account owns the process.
GID (Group ID) – which primary group the process belongs to.

This affects:

Which files the process can read/write/execute.
Which network ports it can bind (by default, ports below 1024 require elevated privileges).
Whether it can perform administrative actions (e.g., UID 0 for root).
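A process can ask the kernel for its own identity. The sketch below, with hypothetical helper names identity and child_identity, reads the current UID and GID and then shows that a child process inherits them from its parent. It assumes a Unix-like system.

```python
import os
import subprocess
import sys


def identity():
    """Return the user and group IDs the current process runs with."""
    return {
        "uid": os.getuid(),        # real user ID
        "euid": os.geteuid(),      # effective user ID, used for permission checks
        "gid": os.getgid(),        # real (primary) group ID
        "groups": os.getgroups(),  # supplementary group IDs
    }


def child_identity():
    """Start a child process and return the (uid, gid) it reports."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getuid(), os.getgid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    uid, gid = result.stdout.split()
    return int(uid), int(gid)


if __name__ == "__main__":
    me = identity()
    print(me)
    print("running as root:", me["euid"] == 0)
    print("child runs as:", child_identity())
```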

Process State

At any moment, a process is in some state, such as:

Running – currently executing on a CPU or ready to run.
Sleeping – waiting for something (disk I/O, user input, network, etc.).
Stopped – execution paused (e.g., by a signal).
Zombie – has finished executing but still has an entry in the process table until its parent has acknowledged its exit.

You’ll see abbreviations for these states when viewing processes (e.g., R, S, T, Z).
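On Linux, the kernel exposes each process's state letter through the /proc filesystem. The following sketch is Linux-specific and assumes /proc is mounted; the function names read_state and zombie_demo are invented for this example. It reads a state letter and then deliberately creates a short-lived zombie to show the Z state.

```python
import os


def read_state(pid):
    """Return the one-letter state code for a PID, read from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name sits in parentheses and may itself contain spaces or ")",
    # so split on the last ")" and take the first field after it.
    return stat.rsplit(")", 1)[1].split()[0]


def zombie_demo():
    """Create a child that exits, observe its state before reaping, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state_before_reap = read_state(pid)
    os.waitpid(pid, 0)  # the parent acknowledges the exit; the entry disappears
    return state_before_reap


if __name__ == "__main__":
    print("my state:", read_state(os.getpid()))
    print("exited but unreaped child:", zombie_demo())
```

The child stays a zombie only between its exit and the parent's waitpid call, which is the "parent has acknowledged its exit" step described above.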

Memory Layout and Address Space

Each process has its own address space in memory:

Code segment – the actual instructions.
Data segments – global and static data.
Heap – dynamically allocated memory (for example, via malloc in C or new in C++).
Stack – function calls, local variables, return addresses.

This separation means:

One process normally cannot read or write another process’s memory directly.
If a process crashes, it usually doesn’t break others (isolation).
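Isolation can be demonstrated with a short experiment. In this sketch (the name isolation_demo is made up, and a Unix-like system is assumed), a child process changes a variable and reports the new value through a pipe, while the parent's copy of the same variable stays untouched.

```python
import os


def isolation_demo():
    """Show that a child's change to a variable does not reach the parent's memory."""
    value = 0
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: it starts with a copy of the parent's memory and changes only that copy.
        value = 42
        os.write(write_end, str(value).encode())
        os._exit(0)
    # Parent: read what the child saw, then collect the child's exit status.
    os.close(write_end)
    child_value = int(os.read(read_end, 16).decode())
    os.close(read_end)
    os.waitpid(pid, 0)
    return value, child_value


if __name__ == "__main__":
    parent_value, child_value = isolation_demo()
    print("parent still sees:", parent_value)
    print("child saw:", child_value)
```

The pipe is needed precisely because the two processes do not share memory: the child has to send its value through a kernel-managed channel for the parent to learn it.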

Environment

When a process starts, it receives a set of environment variables (like PATH, HOME, LANG). These variables:

Are inherited from its parent.
Influence how the program behaves.
Can be changed by the shell before running the program.

For example, setting LANG=C might affect how sorting or text handling works for a process.
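The sketch below mimics what a shell does for a command such as NAME=value program: it copies its own environment, adds one variable, and hands the result to the child. The function name run_with_env and the variable DEMO_GREETING are hypothetical, chosen only for this illustration.

```python
import os
import subprocess
import sys


def run_with_env(name, value):
    """Run a child with one extra environment variable and return what the child sees."""
    child_env = dict(os.environ)  # start from a copy of our own environment
    child_env[name] = value       # change the copy only, not our own environment
    result = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.environ.get({name!r}))"],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("child sees:", run_with_env("DEMO_GREETING", "hello"))
    print("parent has it:", "DEMO_GREETING" in os.environ)
```

The change flows from parent to child only. The parent's own environment is not modified, and nothing the child does to its environment can reach back into the parent.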

File Descriptors and Resources

A process has a table of file descriptors – small integers that represent open resources:

Regular files
Pipes
Sockets
Terminals
Devices

Standard ones are:

0 – standard input (stdin)
1 – standard output (stdout)
2 – standard error (stderr)

Redirection in the shell (>, <, 2>, etc.) manipulates these for the processes you run.

The kernel also tracks other resource usage per process (CPU time, open files count, etc.).
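Output redirection can be approximated in a few lines. This is a sketch of the idea behind command > file, not the shell's actual implementation; the function name redirect_stdout_to_file is invented here. The child's standard output (descriptor 1) is connected to an open file instead of the terminal.

```python
import os
import subprocess
import sys
import tempfile


def redirect_stdout_to_file(path):
    """Run a child whose stdout points at a file, then return (descriptor, file contents)."""
    with open(path, "w") as out:
        # out.fileno() is the small integer the kernel uses for this open file
        # in the parent process.
        fd = out.fileno()
        subprocess.run(
            [sys.executable, "-c", "print('hello from the child')"],
            stdout=out,  # the child receives this file as its descriptor 1
            check=True,
        )
    with open(path) as f:
        return fd, f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fd, text = redirect_stdout_to_file(os.path.join(tmp, "out.txt"))
        print("descriptor in the parent:", fd)
        print("captured output:", text, end="")
```

The child program simply writes to standard output as usual. It does not know or care that descriptor 1 now refers to a file.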

How Processes Are Created

At a high level, most new processes on Linux are created using two core operations:

fork – the parent creates an (almost exact) copy of itself.
exec – the new process replaces its code and data with a different program.

For a typical command:

Your shell process calls fork() – now there are two nearly identical shell processes.
The child process calls execve() (or related) to load the program you requested (e.g., /usr/bin/ls).
The child process becomes ls, runs, then exits.
The parent shell waits, then shows a prompt again.

You don’t need to call fork or exec manually when using the shell; they’re done behind the scenes.
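For the curious, the four steps can be written out by hand. The sketch below is a simplified stand-in for what a shell does, assuming a Unix-like system; the function name run_command is made up, and os.execvp is used as one of the "related" exec calls (it searches PATH and then performs the same replacement as execve).

```python
import os
import sys


def run_command(argv):
    """Run argv the way a shell does: fork, exec in the child, wait in the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this copy of the parent with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # exec failed (e.g. program not found); shells use 127 for this
    # Parent: block until the child exits, then extract its exit code.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # the child was killed by a signal


if __name__ == "__main__":
    code = run_command([sys.executable, "-c", "print('hello from the new program')"])
    print("exit code:", code)
```

If execvp succeeds it never returns, because the child's code has been replaced. The lines after it run only when the exec fails.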

Foreground vs Background Processes

From the point of view of a terminal:

A foreground process is attached to that terminal:

It receives input from your keyboard.
It can write directly to your screen.
Your shell usually waits for it to finish before giving a prompt.

A background process keeps running without taking over the terminal’s keyboard input:

It runs while you keep using the shell.
It normally doesn’t read from the terminal.

The shell’s job-control features manage how your commands become foreground or background processes, which is covered in a later section.
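The difference between waiting and not waiting can be sketched as follows. This is only an analogy for command & in a shell: it shows that the parent keeps going while the child runs, but it does not involve a terminal or real job control. The function name run_in_background is hypothetical.

```python
import subprocess
import sys


def run_in_background():
    """Start a child without waiting for it, then collect its exit code later."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"])
    # Popen returns right away, so the caller is free to do other work here.
    still_running = child.poll() is None
    exit_code = child.wait()  # now block until the child finishes
    return still_running, exit_code


if __name__ == "__main__":
    running, code = run_in_background()
    print("child was still running after start:", running)
    print("exit code once it finished:", code)
```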

Processes vs Threads (High-Level)

Both processes and threads are ways to run code concurrently:

A process has its own address space and resources.
A thread shares the same address space and many resources with other threads in the same process.

From the user’s perspective on Linux:

Complex programs (browsers, IDEs, servers) often create multiple threads within one process.
Process-viewing tools can show both processes and threads, but conceptually:

Multiple processes = multiple separate running programs.
Multiple threads = multiple flows of control inside a single program.

Detailed thread handling is beyond this chapter, but it’s useful to know that not all concurrency is multiple processes.
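A short sketch makes the contrast visible. Here a thread changes data that the main flow of the program can see afterwards, and it reports the same PID as the rest of the program. The function name thread_demo is invented for this example.

```python
import os
import threading


def thread_demo():
    """Run a thread that changes shared data and records the PID it runs under."""
    shared = {"value": 0}

    def worker():
        shared["value"] = 42               # same address space: the change is visible
        shared["worker_pid"] = os.getpid()  # a thread has no separate PID of its own here

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return shared["value"], shared["worker_pid"] == os.getpid()


if __name__ == "__main__":
    value, same_pid = thread_demo()
    print("value after the thread ran:", value)
    print("thread ran inside the same process:", same_pid)
```

A separate process making the same assignment would have changed only its own copy of the data.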

Why Processes Matter for Everyday Linux Use

Understanding what a process is helps you:

Recognize what is actually running on your system.
Find and stop misbehaving programs.
Understand resource usage (which process is using CPU/RAM).
Manage long-running tasks (running commands in the background).
Understand why permissions, ownership, and environment variables affect how programs run.

The next chapters in “Working with Processes” will build on this by showing how to see, control, and manage these processes from the command line.
