Table of Contents
In graph theory, trees are a special kind of graph that capture the idea of branching without cycles. They are simple enough to understand visually, yet powerful enough to model many important structures in mathematics, computer science, and everyday situations.
In this chapter we assume you already know what a graph is (vertices, edges, paths, connectedness, cycles) from the general Graph Theory chapter. Here we focus on what is unique to trees.
What Is a Tree?
A (simple) tree is a graph with two key properties:
- It is connected: there is a path between any two vertices.
- It has no cycles: you cannot start at a vertex, follow edges, and come back to where you started without repeating an edge.
A tree is often described informally as “a connected graph with no loops (cycles).”
A graph that has no cycles but may not be connected is called a forest. Each connected component of a forest is itself a tree.
So:
- Tree: connected + acyclic.
- Forest: acyclic, but may be disconnected; a collection of trees.
Equivalent Characterizations of Trees
Trees have several different descriptions that all turn out to mean the same thing. For a finite graph, the following are equivalent ways to define a tree:
Let $G$ be a finite simple graph with $n$ vertices. Then the following statements are equivalent:
- $G$ is connected and has no cycles (the usual definition).
- $G$ is connected and has exactly $n - 1$ edges.
- $G$ has no cycles and has exactly $n - 1$ edges.
- Between any two distinct vertices of $G$ there is exactly one simple path.
- $G$ is connected, and removing any edge disconnects it (it is minimally connected).
- $G$ has no cycles, and adding any new edge creates a cycle (it is maximally acyclic).
You do not need to prove all of these equivalences now, but you should understand what each one is saying. They highlight different aspects of trees:
- “Unique path” emphasizes structure.
- “$n-1$ edges” emphasizes counting.
- “Minimally connected” and “maximally acyclic” emphasize how fragile the structure is: you cannot remove or add an edge without changing its essential nature.
When working with problems, it is often useful to choose whichever characterization fits best.
Basic Properties of Trees
Assume throughout this section that our trees are finite and simple (no loops, no multiple edges between the same pair of vertices).
Let $T$ be a tree with $n$ vertices and $m$ edges.
- Edges vs. vertices
A finite tree satisfies:
$$
m = n - 1.
$$
This is one of the most useful facts: if you are told you have a connected acyclic graph on $n$ vertices, you can immediately conclude it has $n-1$ edges, and conversely.
- Every nontrivial tree has at least two leaves
A leaf (also called an end-vertex) is a vertex with degree $1$ (one incident edge). Every tree with at least two vertices has at least two leaves.
Intuitively, if you picture a branching diagram, there must be “endpoints” at the outer edges.
- Removing leaves
If you remove a leaf and its incident edge from a tree with at least two vertices, the remaining graph is still a tree (with one fewer vertex). This gives a natural way to simplify a tree while preserving its structure type.
- Minimal connectivity
Because a tree has no cycles, every edge is essential for connectivity. Removing any edge disconnects the tree.
- Uniqueness of simple paths
Between any two vertices in a tree, there is exactly one simple path. If there were two different simple paths, together they would form a cycle, which is not allowed in a tree.
These properties are frequently used in proofs about trees and in algorithms that work on trees.
Leaves and Internal Vertices
Within a tree, vertices can be classified by degree:
- A leaf has degree $1$.
- A vertex of degree at least $2$ is often called an internal vertex (especially in rooted trees, discussed later).
Some simple consequences:
- A tree on $n \ge 2$ vertices has at least two leaves.
- If all vertices had degree at least $2$, the total degree would be at least $2n$, but each edge contributes $2$ to the total degree, so
$$
\sum \deg(v) = 2m = 2(n-1) = 2n - 2,
$$
which is less than n$ for $n \ge 2$. This contradiction shows that at least one vertex (and in fact at least two) must have degree $.
Knowing about leaves is useful for inductive arguments: many proofs about trees proceed by removing a leaf, applying induction to the smaller tree, and then adding the leaf back.
Rooted Trees and Hierarchical Structure
So far, a tree is an undirected graph. In many applications, we choose a special vertex and think of the tree as “growing” out from this point.
A rooted tree is a tree with one distinguished vertex called the root.
Once a root is chosen, we can think in terms of hierarchy:
- Parent and child:
- The root has no parent.
- For any other vertex, its parent is the adjacent vertex that is one step closer to the root.
- Any vertices directly connected to a vertex and further from the root are its children.
- Ancestors and descendants:
- If you can go from vertex $u$ to vertex $v$ by repeatedly following parent-child links downward, then $u$ is an ancestor of $v$, and $v$ is a descendant of $u$.
- Levels (or depth):
- The root is at level $0$.
- Its children are at level $1$, their children at level $2$, and so on.
- Subtrees:
- Given a vertex $v$, consider $v$ and all its descendants together with the edges between them. This forms a subtree rooted at $v$.
This hierarchical point of view is very common in computer science, for example in file system directories and organizational charts.
Note that the underlying graph is still the same tree; “rooted” just means we have added extra structure (a choice of a special vertex and a direction for how we think about it).
Special Types of Trees
Several specific families of trees are important both theoretically and in applications.
Paths
A path graph $P_n$ is a tree where all vertices have degree at most $2$, and exactly two vertices (the endpoints) have degree $1$.
Visually, this is just a straight line of $n$ vertices connected in a row. It is a tree because it is connected and has no cycles.
Stars
A star graph $S_n$ is a tree with one central vertex of degree $n-1$ and $n-1$ leaves. Every leaf is connected directly to the center, and there are no other edges.
- For $n = 1$, a single vertex can be considered a trivial star.
- For $n = 2$, a single edge is both a path and a star.
Stars are extreme in the sense that one vertex has very high degree and all others have degree $1$.
Binary Trees (informal introduction)
A binary tree is a rooted tree where each vertex has at most two children. Binary trees appear heavily in computer science (for example in searching, sorting, and representing expressions). Details about balanced binary trees or algorithms on them belong to other, more specialized chapters, but it is useful to recognize them here as a special case of rooted trees.
Spanning Trees
Given a connected graph that is not itself a tree (because it has cycles), we can often extract a tree that touches all its vertices.
Let $G$ be a connected graph. A spanning tree of $G$ is a subgraph $T$ such that:
- $T$ is a tree.
- $T$ contains all the vertices of $G$.
In other words, a spanning tree “spans” all the vertices but uses only some of the edges, just enough to keep the graph connected and acyclic.
Key points:
- Every connected finite graph has at least one spanning tree.
- A spanning tree of a graph with $n$ vertices has $n-1$ edges (since it is a tree).
- Removing any edge from a spanning tree disconnects it.
- Adding any extra edge from the original graph to a spanning tree creates exactly one cycle.
Spanning trees are important because they reduce a complex graph to a simpler structure while still connecting all vertices. This is a foundation for many algorithms and optimization problems.
Tree Counting and Degree Relations (Overview)
There are rich counting results about trees, especially in more advanced discrete mathematics. One classical result, for example, says that the number of different labeled trees on $n$ vertices (where each vertex is given a distinct label, say $1,2,\dots,n$) is:
$$
n^{\,n-2}.
$$
This is known as Cayley’s formula. The proof is not necessary at a beginner level, but you should be aware that trees can be systematically counted and studied.
A simple but useful degree relation in any tree $T$ with vertex degrees $d_1, d_2, \dots, d_n$ is:
$$
\sum_{i=1}^{n} d_i = 2(n-1).
$$
This follows from the general handshaking lemma for graphs and the fact that a tree has $n-1$ edges. This identity is often combined with knowing how many leaves a tree has, or with bounds on degrees, to get information about the structure.
Trees in Applications
Even without going into algorithmic detail, it is worth recognizing where trees naturally appear:
- Family trees: ancestors and descendants form a branching structure.
- Organizational charts: positions in a company with supervisors and subordinates.
- Decision trees: choices and possible outcomes forming a branching process.
- File systems: folders and subfolders can be modeled as a rooted tree.
- Communication networks: minimal ways to connect multiple locations without redundancy often form a tree.
In each of these, the essential features are:
- A sense of hierarchy or branching.
- No cycles in the structure (you do not have two different “supervisors” in a strict tree, for example).
- A unique simple path between any two points in the structure.
Recognizing that these are all instances of the same mathematical idea helps unify many seemingly different situations.
Summary
Trees are connected, acyclic graphs with a very rigid and useful structure. For finite trees:
- There are $n-1$ edges for $n$ vertices.
- Any two vertices are joined by exactly one simple path.
- Removing any edge disconnects the tree; adding an edge creates a cycle.
- There are always at least two leaves if there are at least two vertices.
Rooted trees introduce hierarchy, leading to concepts like parent, child, and levels. Spanning trees connect all vertices of a graph without cycles, and they lie at the heart of many applications and algorithms.