Chapter 2
DFS in Directed Graphs, Strong
Connected Components, and DAGs
CS 473: Fundamental Algorithms, Fall 2011
August 25, 2011
2.0.0.1 Strong Connected Components (SCCs)
Algorithmic Problem
Find all SCCs of a given directed graph.
Previous lecture:
Saw an O(n · (n + m)) time algorithm.
This lecture: O(n + m) time algorithm.
Basic Graph Theory
Breadth First search
Depth First Search
Directed Graphs
Digraphs and Connectivity
Digraph Representation
Searching
Directed Graphs
AB C
DE F
G H
Definition
A directed graph (also called a digraph) is G = (V , E ), where
V is a set of vertices or nodes
E V × V is set of ordered pairs of vertices called edges
Viswanathan CS473ug
2.0.0.2 Graph of SCCs
.
.
A
.
B
.
C
.
D
.
E
.
F
.
G
.
H
Graph G
.
.
B, E, F
.
G
.
H
.
A, C, D
Graph of SCCs G
SCC
Meta-graph of SCCs
Let S
1
, S
2
, . . . S
k
be the strong connected components (i.e., SCCs) of G. The graph of SCCs
is G
SCC
(A) Vertices are S
1
, S
2
, . . . S
k
(B) There is an edge (S
i
, S
j
) if there is some u S
i
and v S
j
such that (u, v) is an edge
in G.
1
2.0.0.3 Reversal and SCCs
Proposition 2.0.1 For any graph G, the graph of SCCs of G
rev
is the same as the reversal
of G
SCC
.
Proof : Exercise.
2.0.0.4 SCCs and DAGs
Proposition 2.0.2 For any graph G, the graph G
SCC
has no directed cycle.
Proof : If G
SCC
has a cycle S
1
, S
2
, . . . , S
k
then S
1
S
2
· · · S
k
is an SCC in G. Formal
details: exercise.
2.1 Directed Acyclic Graphs
2.1.0.5 Directed Acyclic Graphs
Definition 2.1.1 A directed graph G is a
directed acyclic graph (DAG) if there
is no directed cycle in G.
..
1
.
2
.
3
.
4
2.1.0.6 Sources and Sinks
source
sink
1
2 3
4
Definition 2.1.2 (A) A vertex u is a
source if it has no in-coming edges.
(B) A vertex u is a sink if it has no out-
going edges.
2
2.1.0.7 Simple DAG Properties
(A) Every DAG G has at least one source and at least one sink.
(B) If G is a DAG if and only if G
rev
is a DAG.
(C) G is a DAG if and only each node is in its own strong connected component.
Formal proofs: exercise.
2.1.0.8 Topological Ordering/Sorting
..
1
.
2
.
3
.
4
Graph G
..
1
.
2
.
3
.
4
Topological Ordering of G
Definition 2.1.3 A topological ordering/topological sorting of G = (V, E) is an or-
dering < on V such that if (u, v) E then u v.
Informal equivalent definition: One can order the vertices of the graph along a line
(say the x-axis) such that all edges are from left to right.
2.1.0.9 DAGs and Topological Sort
Lemma 2.1.4 A directed graph G can be topologically ordered iff it is a DAG.
Proof : =: Suppose G is not a DAG and has a topological ordering . G has a cycle
C = u
1
, u
2
, . . . , u
k
, u
1
.
Then u
1
u
2
. . . u
k
u
1
!
That is... u
1
u
1
.
A contradiction (to being an order).
Not possible to topologically order the vertices.
2.1.0.10 DAGs and Topological Sort
Lemma 2.1.5 A directed graph G can be topologically ordered iff it is a DAG.
Proof :[Continued] : Consider the following algorithm:
(A) Pick a source u, output it.
(B) Remove u and all edges out of u.
(C) Repeat until graph is empty.
(D) Exercise: prove this gives an ordering.
3
Exercise: show above algorithm can be implemented in O(m + n) time.
2.1.0.11 Topological Sort: An Exam-
ple
..
1
.
2
.
3
.
4
Output: 1 2 3 4
2.1.0.12 Topological Sort: Another
Example
a
b
c
d
e
f
g
h
2.1.0.13 DAGs and Topological Sort
Note: A DAG G may have many different topological sorts.
Question: What is a DAG with the most number of distinct topological sorts for a
given number n of vertices?
Question: What is a DAG with the least number of distinct topological sorts for a
given number n of vertices?
2.1.1 Using DFS...
2.1.1.1 ... to check for Acylicity and compute Topological Ordering
Question
Given G, is it a DAG? If it is, generate a topological sort.
DFS based algorithm:
(A) Compute DFS(G)
(B) If there is a back edge then G is not a DAG.
(C) Otherwise output nodes in decreasing post-visit order.
Correctness relies on the following:
Proposition 2.1.6 G is a DAG iff there is no back-edge in DFS(G).
Proposition 2.1.7 If G is a DAG and post( v) > post(u), then (u, v) is not in G.
Proof : There are several possibilities:
(A) [pre(v), post(v)] comes after [pre(u), post(u)] and they are disjoint. But then, u was
visited first by the DFS, if (u, v) E(G) then DFS will visit v during the recursive
call on u. But then, post(v) < post(u). A contradiction.
4
(B) [pre(v), post(v)] [pre(u), post(u)]: impossible as post(v) > post(u).
(C) [pre(u), post(u)] [pre(v), post(v)]. But then DFS visited v, and then visited u.
Namely there is a path in G from v to u. But then if (u, v) E(G) then there would
be a cycle in G, and it would not be a DAG. Contradiction.
(D) No other possibility - since “lifetime” intervals of DFS are either disjoint or contained
in each other.
2.1.1.2 Example
..
1
.
2
.
3
.
4
2.1.1.3 Back edge and Cycles
Proposition 2.1.8 G has a cycle iff there is a back-edge in DFS(G).
Proof : If: (u, v) is a back edge implies there is a cycle C consisting of the path from v to u
in DFS search tree and the edge (u, v).
Only if: Suppose there is a cycle C = v
1
v
2
. . . v
k
v
1
.
Let v
i
be first node in C visited in DFS.
All other nodes in C are descendants of v
i
since they are reachable from v
i
.
Therefore, (v
i1
, v
i
) (or (v
k
, v
1
) if i = 1) is a back edge.
2.1.1.4 DAGs and Partial Orders
Definition 2.1.9 A partially ordered set is a set S along with a binary relation such
that is
1. reflexive (a a for all a V ),
2. anti-symmetric (a b and a 6= b implies b 6 a), and
3. transitive (a b and b c implies a c).
Example: For numbers in the plane define (x, y) (x
0
, y
0
) iff x x
0
and y y
0
.
Observation: A finite partially ordered set is equivalent to a DAG. (No equal elements.)
Observation: A topological sort of a DAG corresponds to a complete (or total) ordering
of the underlying partial order.
5
2.1.2 What’s DAG but a sweet old fashioned notion
2.1.2.1 Who needs a DAG...
Example
(A) V : set of n products (say, n different types of tablets).
(B) Want to buy one of them, so you do market research...
(C) Online reviews compare only pairs of them.
...Not everything compared to everything.
(D) Given this partial information:
(A) Decide what is the best product.
(B) Decide what is the ordering of products from best to worst.
(C) ...
2.1.3 What DAGs got to do with it?
2.1.3.1 Or why we should care about DAGs
(A) DAGs enable us to represent partial ordering information we have about some set (very
common situation in the real world).
(B) Questions about DAGs:
(A) Is a graph G a DAG?
Is the partial ordering information we have so far is consistent?
(B) Compute a topological ordering of a DAG.
Find an a consistent ordering that agrees with our partial information.
(C) Find comparisons to do so DAG has a unique topological sort.
Which elements to compare so that we have a consistent ordering of the items.
2.2 Linear time algorithm for finding all strong con-
nected components of a directed graph
2.2.0.2 Finding all SCCs of a Directed Graph
Problem
Given a directed graph G = (V, E), output all its strong connected components.
6
Straightforward algorithm:
Mark all vertices in V as not visited.
for each vertex u V not visited yet do
find SCC(G, u) the strong component of u:
Compute rch(G, u) using DF S(G, u)
Compute rch(G
rev
, u) using DF S(G
rev
, u)
SCC(G, u) rch(G, u) rch(G
rev
, u)
u SCC(G, u): Mark u as visited.
Running time: O(n(n + m)) Is there an O(n + m) time algorithm?
2.2.0.3 Structure of a Directed Graph
.
.
A
.
B
.
C
.
D
.
E
.
F
.
G
.
H
Graph G
.
.
B, E, F
.
G
.
H
.
A, C, D
Graph of SCCs G
SCC
Reminder G
SCC
is created by collapsing every strong connected component to a single
vertex.
Proposition 2.2.1 For a directed graph G, its meta-graph G
SCC
is a DAG.
2.2.1 Linear-time Algorithm for SCCs: Ideas
2.2.1.1 Exploit structure of meta-graph...
Wishful Thinking Algorithm
(A) Let u be a vertex in a sink SCC of G
SCC
(B) Do DFS(u) to compute SCC(u)
(C) Remove SCC(u) and repeat
Justification
(A) DFS(u) only visits vertices (and edges) in SCC(u)
(B) DFS done only in G (not in G
rev
) to compute u strong connected component (SCC).
[Magic!]
(C) DFS(u) takes time proportional to size of SCC(u)
(D) Therefore, total time O(n + m)!
2.2.1.2 Big Challenge(s)
How do we find a vertex in the sink SCC of G
SCC
?
Can we obtain an implicit topological sort of G
SCC
without computing G
SCC
?
Answer: DFS(G) gives some information!
7
.
.
B, E, F
.
G
.
H
.
A, C, D
.11 .16
.
5
.
9
G
SCC
with post times
2.2.1.3 Post-visit times of SCCs
Definition 2.2.2 Given G and a SCC S of G, define post(S) = max
uS
post(u) where post
numbers are with respect to some DFS(G).
2.2.1.4 An Example
AB C
DE F
G H
Graph G
[1
,
16]
[2
,
11] [12
,
15]
[13
,
14][3
,
10] [6
,
7]
[4
,
5]
[8
,
9]
AB C
DE F
G H
Graph with pre-post times for DFS(A);
black edges in tree
2.2.2 Graph of strong connected components
2.2.2.1 ... and post-visit times
Proposition 2.2.3 If S and S
0
are SCCs in G and (S, S
0
) is an edge in G
SCC
then post(S) >
post(S
0
).
Proof : Let u be first vertex in S S
0
that is visited.
(A) If u S then all of S
0
will be explored before DFS(u) completes.
(B) If u S
0
then all of S
0
will be explored before any of S.
A False Statement: If S and S
0
are SCCs in G and (S, S
0
) is an edge in G
SCC
then for
every u S and u
0
S
0
, post(u) > post(u
0
).
2.2.2.2 Topological ordering of the strong components
Corollary 2.2.4 Ordering SCCs in decreasing order of post(S) gives a topological ordering
of G
SCC
Recall: for a DAG, ordering nodes in decreasing post-visit order gives a topological sort.
So...
DFS(G) gives some information on topological ordering of G
SCC
!
8
2.2.2.3 Finding Sources
Proposition 2.2.5 The vertex u with the highest post visit time belongs to a source SCC
in G
SCC
Proof :¡2-¿
(A) post(SCC(u)) = post(u)
(B) Thus, post(SCC(u)) is highest and will be output first in topological ordering of G
SCC
.
2.2.2.4 Finding Sinks
Proposition 2.2.6 The vertex u with highest post visit time in DFS(G
rev
) belongs to a sink
SCC of G.
Proof :¡2-¿
(A) u belongs to source SCC of G
rev
(B) Since graph of SCCs of G
rev
is the reverse of G
SCC
, SCC(u) is sink SCC of G.
2.2.3 Linear Time Algorithm
2.2.3.1 ...for computing the strong connected components in G
do DFS(G
rev
) and sort vertices in decreasing post order.
Mark all nodes as unvisited
for each u in the computed order do
if u is not visited then
DFS(u)
Let S
u
be the nodes reached by u
Output S
u
as a strong connected component
Remove S
u
from G
Analysis
Running time is O(n + m). (Exercise)
2.2.3.2 Linear Time Algorithm: An Example - Initial steps
Graph G:
G
FE
B C
D
H
A
=
Reverse graph G
rev
:
G
FE
B C
D
H
A
9
=
DFS of reverse graph:
G
FE
B C
D
H
A
=
Pre/Post DFS numbering of reverse
graph:
6][1
,
[7
,
12]
[9
,
10] [8
,
11]
[13
,
16]
[14
,
15]
[2
,
5]
[3
,
4]
G
FE
B C
D
H
A
2.2.4 Linear Time Algorithm: An Example
2.2.4.1 Removing connected components: 1
Original graph G with rev post numbers:
G
FE
B C
D
H
A
16
11
6
12
10
15
5
4
=
Do DFS from vertex G
remove it.
FE
B C
D
H
A
11
6
12
10
15
5
4
SCC computed:
{G}
2.2.5 Linear Time Algorithm: An Example
2.2.5.1 Removing connected components: 2
Do DFS from vertex G
remove it.
FE
B C
D
H
A
11
6
12
10
15
5
4
SCC computed:
{G}
=
Do DFS from vertex H, remove it.
FE
B C
D
A
11
6
12
10
5
4
SCC computed:
{G}, {H}
10
2.2.6 Linear Time Algorithm: An Example
2.2.6.1 Removing connected components: 3
Do DFS from vertex H, remove it.
FE
B C
D
A
11
6
12
10
5
4
SCC computed:
{G}, {H}
=
Do DFS from vertex F
Remove visited vertices:
{F, B, E}.
C
D
A
6
5
4
SCC computed:
{G}, {H}, {F, B, E}
2.2.7 Linear Time Algorithm: An Example
2.2.7.1 Removing connected components: 4
Do DFS from vertex F
Remove visited vertices:
{F, B, E}.
C
D
A
6
5
4
SCC computed:
{G}, {H}, {F, B, E}
=
Do DFS from vertex A
Remove visited vertices:
{A, C, D}.
SCC computed:
{G}, {H}, {F, B, E}, {A, C, D}
2.2.8 Linear Time Algorithm: An Example
2.2.8.1 Final result
G
FE
B C
D
H
A
11
SCC computed:
{G}, {H}, {F, B, E}, {A, C, D}
Which is the correct answer!
2.2.9 Obtaining the meta-graph...
2.2.9.1 Once the strong connected components are computed.
Exercise:
Given all the strong connected components of a directed graph G = (V, E) show that the
meta-graph G
SCC
can be obtained in O(m + n) time.
2.2.9.2 Correctness: more details
(A) let S
1
, S
2
, . . . , S
k
be strong components in G
(B) Strong components of G
rev
and G are same and meta-graph of G is reverse of meta-graph
of G
rev
.
(C) consider DFS(G
rev
) and let u
1
, u
2
, . . . , u
k
be such that post(u
i
) = post(S
i
) = max
vS
i
post(v).
(D) Assume without loss of generality that post(u
k
) > post(u
k1
) . . . post(u
1
) (renum-
ber otherwise). Then S
k
, S
k1
, . . . , S
1
is a topological sort of meta-graph of G
rev
and
hence S
1
, S
2
, . . . , S
k
is a topological sort of the meta-graph of G.
(E) u
k
has highest post number and DFS(u
k
) will explore all of S
k
which is a sink component
in G.
(F) After S
k
is removed u
k1
has highest post number and DFS(u
k1
) will explore all of
S
k1
which is a sink component in remaining graph G S
k
. Formal proof by induction.
2.3 An Application to make
2.3.1 make utility
2.3.1.1 make Utility [Feldman]
(A) Unix utility for automatically building large software applications
(B) A makefile specifies
(A) Object files to be created,
(B) Source/object files to be used in creation, and
(C) How to create them
12
.
.
project
.
main.o
.
utils.o
.
command.o
.
main.c
.
utils.c
.
defs.h
.
command.h
.
command.c
2.3.1.2 An Example makefile
project: main.o utils.o command.o
cc -o project main.o utils.o command.o
main.o: main.c defs.h
cc -c main.c
utils.o: utils.c defs.h command.h
cc -c utils.c
command.o: command.c defs.h command.h
cc -c command.c
2.3.1.3 makefile as a Digraph
2.3.2 Computational Problems
2.3.2.1 Computational Problems for make
(A) Is the makefile reasonable?
(B) If it is reasonable, in what order should the object files be created?
(C) If it is not reasonable, provide helpful debugging information.
(D) If some file is modified, find the fewest compilations needed to make application consis-
tent.
2.3.2.2 Algorithms for make
(A) Is the makefile reasonable? Is G a DAG?
(B) If it is reasonable, in what order should the object files be created? Find a topological
sort of a DAG.
(C) If it is not reasonable, provide helpful debugging information. Output a cycle. More
generally, output all strong connected components.
(D) If some file is modified, find the fewest compilations needed to make application consis-
tent.
(A) Find all vertices reachable (using DFS/BFS) from modified files in directed graph,
and recompile them in proper order. Verify that one can find the files to recompile
and the ordering in linear time.
13
2.3.2.3 Take away Points
(A) Given a directed graph G, its SCCs and the associated acyclic meta-graph G
SCC
give a
structural decomposition of G that should be kept in mind.
(B) There is a DFS based linear time algorithm to compute all the SCCs and the meta-
graph. Properties of DFS crucial for the algorithm.
(C) DAGs arise in many application and topological sort is a key property in algorithm de-
sign. Linear time algorithms to compute a topological sort (there can be many possible
orderings so not unique).
14