DFS and topological sort
CS340
Depth first search
breadth
depth
Search "deeper" whenever possible
*example shows discovery times
Depth first search
Input: G = (V,E), directed or undirected. No source vertex is given!
Output: 2 timestamps on each vertex:
v.d discovery time
v.f finishing time
These will be useful for other algorithms later on.
Can also compute v.π
3
Depth first search
Will methodically explore every edge.
Start over from different vertices as necessary.
As soon as we discover a vertex, explore from it.
Unlike BFS, which puts a vertex on a queue so that we explore from it later.
DFS may repeat from multiple source nodes
Unlike BFS, result is a forest of DFS trees
Depth first search
As DFS progresses, every vertex has a color:
WHITE = undiscovered
GRAY = discovered, but not finished (not done exploring from it)
BLACK = finished (have found everything reachable from it)
Discovery and finishing times:
Unique integers from 1 to 2|V|
For all v, v.d < v.f
In other words, 1 ≤ v.d < v.f ≤ 2|V|
DFS, recursive version
DFS
Time complexity
The procedure DFS-VISIT is called exactly once for each vertex v ∈ V , since the vertex u on which DFS-VISIT is invoked must be white and the first thing DFS-VISIT does is paint vertex u gray.
During an execution of DFS-VISIT, the loop on lines 4–7 executes Adj[E] times = Θ(E).
The running time of DFS is therefore Θ(V + E)
Notice that BFS was O(V + E) because it was not certain that every vertex would be visited.
Properties of DFS
Discovery and finishing times have a parenthesis structure
If we represent the discovery of vertex u with a left parenthesis “(u” and represent its finishing by a right parenthesis “u)”, then the history of discoveries and finishings makes a well-formed expression in the sense that the parentheses are properly nested.
Parenthesis Structure
For all u,v, exactly one of the following holds:
1. u.d < u.f < v.d < v.f or v.d < v.f < u.d < u.f (i.e., the intervals [u.d, u.f ] and [v.d, v.f] are disjoint) and neither of u and v is a descendant of the other.
( ) [ ]
2. u.d < v.d < v.f < u.f and v is a descendant of u.
[ ( ) ]
3. v.d < u.d < u.f < v.f and u is a descendant of v.
( [ ] )
10
Nesting of descendants
Vertex v is a proper descendant of vertex u in the depth-first forest for a (directed or undirected) graph G if and only if u.d < v.d < v.f < u.f.
White path theorem
v is a descendant of u if and only if at time u.d, there is a path from u to v consisting of only white vertices. (Except for u, which was just colored gray.)
Classification of edges
Tree edge: in the depth-first forest. Found by exploring (u, v).
Back edge: (u,v), where u is a descendant of v.
Forward edge: (u,v), where v is a descendant of u, but not a tree edge.
Cross edge: any other edge. Can go between vertices in same depth-first tree or in different depth-first trees.
Classification of edges
When we first explore an edge (u,v), the color of
vertex v tells us something about the edge:
1. WHITE indicates a tree edge,
2. GRAY indicates a back edge, and
3. BLACK indicates a forward or cross edge.
Classification of edges
Forward and cross edges never occur in a depth-first search of an undirected graph.
In a depth-first search of an undirected graph G, every edge of G is either a tree edge or a back edge.
Interview Questions
Show how depth-first search works. Assume that the for loop of lines 5–7 of the DFS procedure considers the vertices in alphabetical order, and assume that each adjacency list is ordered alphabetically. Show the discovery and finishing times for each vertex, and show the classification of each edge.
Solution
Interview Questions
Show that using a single bit to store each vertex color suffices by arguing that the DFS procedure would produce the same result if line 3 of DFS-VISIT was removed.
Rewrite the procedure DFS, using a stack to eliminate recursion.
18
Interview Questions
Let G = (V,E) be a connected, undirected graph. Give an O(V+E)-time algorithm to compute a path in G that traverses each edge in E exactly once in each direction.
Describe how you can find your way out of a maze if you are given a large supply of pennies.
19
Interview Questions
Depth first search will turn out to be an incredibly useful algorithm. Tell how DFS can help solve the following problems:
Minimum spanning tree and all pair shortest path tree (when will DFS work?).
Detecting a cycle in a graph
Path Finding (find a path between two given vertices u and z)
Topological Sorting
Testing if a graph is bipartite
Finding Strongly Connected Components
Solving puzzles with only one solution, such as mazes. (DFS can be adapted to find all solutions to a maze by only including nodes on the current path in the visited set.)
Topological Sort
Directed acyclic graph (dag)
A directed graph with no cycles.
A topological sort of a dag is a linear ordering of all its vertices such that if G contains an edge (u,v), then u appears before v in the ordering.
An ordering of its vertices along a horizontal line so that all directed edges go from left to right.
This is a different kind of sort than we have done in the past.
Dag of Cardiovascular Disease
A dag can be used to indicate causality.
Useful in social sciences, epidemiology, etc.
http://www.omicsonline.org/using-directed-acyclic-graphs-for-investigating-causal-paths-for-cardiovascular-disease-2155-6180.1000182.php?aid=20947
Dag of course prerequisites
Dags are used in scheduling, when one thing must happen before another
https://www.cs.northwestern.edu/academics/courses/311/html/graphs.html
Dag of making pancakes
Dags can model dependencies
Topological Sort
A directed edge (u,v) indicates that item u must be put on before item v
Topological Sort
What is the time complexity?
It is just a DFS with O(1) inserting to front of a linked list, so Θ(V+E)
27
Using DFS to detect a dag
A directed graph G is acyclic if and only if a depth-first search of G yields no back edges
Interview Questions
What is the ordering of vertices produced by a topological sort?
Interview Questions
Give an algorithm that determines whether or not a given undirected graph contains a cycle. Your algorithm should run in O(V) time, independent of |E|.
Another way to perform topological sorting on a directed acyclic graph is to repeatedly find a vertex of in-degree 0, output it, and remove it and all of its outgoing edges from the graph. Explain how to implement this idea so that it runs in time O(V+E). What happens to this algorithm if G has cycles?
30
Strongly Connected Components
A strongly connected component of a directed graph G = (V,E) is a maximal set of vertices C ⊆V such that for every pair of vertices u and v in C, we have both u↝v and v↝u; that is, vertices u and v are reachable from each other.
Strongly Connected Components
Component Graph
Component Graph
GSCC = (VSCC,ESCC).
VSCC has one vertex for each SCC in G.
ESCC has an edge if there’s an edge between the corresponding SCC’s in G.
GSCC is a dag.
Transpose of a directed graph
GT is the transpose of G.
GT is G with all edges reversed.
Can create GT in Θ(V+E) time if using adjacency lists.
Do G and GT have the same strongly connected components?
Compute Strongly Connected Components
Why does it work?
What does it mean that vertices are considered in order of decreasing finishing time?
decreasing finishing time = topological sort!
36
Interview Questions
How can the number of strongly connected components of a graph change if a new edge is added?
37