Skip to the content.

Notes of Algorithms, Part II

“In the practice of computing, where we have so much latitude for making a mess of it, mathematical elegance is not a dispensable luxury, but a matter of life and death.” — Edsger W. Dijkstra

Notes are taken from the course Algorithms, Part II by Robert Sedgewick and Kevin Wayne. The notes are for tracking learning progress and easy reference. Section numbers are used to distinguish different parts of the content, not the original chapter numbers.

1. Undirected Graph

1.1. Introduction

Graph: Set of vertices connected pairwise by edges.

Importance of graph algorithms:

Examples of graph applications:

The Internet

The Internet

1.1.1. Graph applications

graph vertex edge
communication telephone, computer fiber optic cable
circuit gate, register, processor wire
mechanical joint rod, beam, spring
financial stock, currency transactions
transportation street intersection, airport highway, airway route
internet class C network connection
game board position legal move
social relationship person, actor friendship, movie cast
neural network neuron synapse
protein network protein protein-protein interaction
molecule atom bond

1.1.2. Some graph-processing problems

1.1.3. Graph terminology

Two vertices are connected if there is a path between them.

A path in a graph is a sequence of vertices connected by edges. A simple path is one with no repeated vertices. A cycle is a path with at least one edge whose first and last vertices are the same. A simple cycle is a cycle with no repeated edges or vertices (except the requisite repetition of the first and last vertices). The length of a path or a cycle is its number of edges.

A graph is connected if there is a path from every vertex to every other vertex in the graph. A graph that is not connected consists of a set of connected components, which are maximal connected subgraphs.

An acyclic graph is a graph with no cycles.

A tree is an acyclic connected graph. A disjoint set of trees is called a forest. A spanning tree of a connected graph is a subgraph that contains all of that graph’s vertices and is a single tree. A spanning forest of a graph is the union of spanning trees of its connected components.

A graph G with V vertices is a tree if and only if it satisfies any of the following five conditions:

A bipartite graph is a graph whose vertices we can divide into two sets such that all edges connect a vertex in one set with a vertex in the other set.


↥ back to top


1.2. Graph API

public class Graph  
Graph(int V) create an empty graph with V vertices
Graph(In in) create a graph from input stream
void addEdge(int v, int w) add an edge v-w
Iterable<Integer> adj(int v) vertices adjacent to v
int V() number of vertices
int E() number of edges
String toString() string representation

Simple client:

In in = new In(args[0]);    // read graph from input stream
Graph G = new Graph(in);

for (int v = 0; v < G.V(); v++)
    for (int w : G.adj(v))
        StdOut.println(v + "-" + w);    // print out each edge (twice)

Typical graph-processing code:

// compute the degree of v
public static int degree(Graph G, int v)
{
    int degree = 0;
    for (int w : G.adj(v)) 
        degree++;
    return degree;
}

// compute maximum degree
public static int maxDegree(Graph G)
{
    int max = 0;
    for (int v = 0; v < G.V(); v++)
        if (degree(G, v) > max)
            max = degree(G, v);
    return max;
}

// compute average degree
public static double averageDegree(Graph G)
{ return 2.0 * G.E() / G.V(); }

// count self-loops
public static int numberOfSelfLoops(Graph G)
{
    int count = 0;
    for (int v = 0; v < G.V(); v++)
        for (int w : G.adj(v))
            if (v == w) count++;
    return count/2; // each edge counted twice

Graph representation:

In practice. Use adjacency-lists representation.

representation space add edge edge between v and w? iterate over vertices adjacent to v?
list of edges E 1 E E
adjacency matrix V2 1 1 V
adjacency lists E+V 1 degree(v) degree(v)

Adjacency-lists representation Java implementation

public class Graph
{
    private final int V;                    // number of vertices
    private int E;                          // number of edges
    private Bag<Integer>[] adj;             // adjacency lists

    public Graph(int V)
    {
        this.V = V; this.E = 0;
        adj = (Bag<Integer>[]) new Bag[V];  // Create array of lists.
        for (int v = 0; v < V; v++)         // Initialize all lists
            adj[v] = new Bag<Integer>();    // to empty.
    }

    public Graph(In in)
    {
        this(in.readInt());                 // Read V and construct this graph.
        int E = in.readInt();               // Read E.
        for (int i = 0; i < E; i++)
        {                                   // Add an edge.
            int v = in.readInt();           // Read a vertex,
            int w = in.readInt();           // read another vertex,
            addEdge(v, w);                  // and add edge connecting them.
        }
    }

    public int V() { return V; }
    public int E() { return E; }
    
    public void addEdge(int v, int w)
    {
        adj[v].add(w);                      // Add w to v’s list.
        adj[w].add(v);                      // Add v to w’s list.
        E++;
    }

    public Iterable<Integer> adj(int v)
    { return adj[v]; }
}


↥ back to top


Searching through a graph is equivalent to finding way through a maze. There are many methods in solving maze problem. Trémaux’s algorithm was discovered in the 19th century, has been used about a hundred years later as depth-first search.

Use depth-first search to solve problems:

Tremaux exploration is an intuitive starting point, but it differs in subtle ways from exploring a graph, so we now move on to searching in graphs. To search a graph, invoke a recursive method that visits vertices. To visit a vertex:

Algorithm

Data structures

Java implementation of depth-first search API is as follows. The recursive method marks the given vertex and calls itself for any unmarked vertices on its adjacency list. If the graph is connected, every adjacency-list entry is checked.

public class DepthFirstSearch
{
    private boolean[] marked;
    private int count;

    public DepthFirstSearch(Graph G, int s)
    {
        marked = new boolean[G.V()];
        dfs(G, s);
    }

    private void dfs(Graph G, int v)
    {
        marked[v] = true;
        count++;
        for (int w : G.adj(v))
            if (!marked[w]) dfs(G, w);
    }

    public boolean marked(int w)
    { return marked[w]; }

    public int count()
    { return count; }
}

Design pattern. Decouple graph data type from graph processing.

public class Paths  
Paths(Graph G, int s) find paths in G from source s
boolean hasPathTo(int v) is there a path from s to v?
Iterable<Integer> pathTo(int v) path from s to v; null if no such path

Depth-first search to find paths in a graph (recursive implementation) (ALGORITHM 4.1):

public class DepthFirstPaths
{
    private boolean[] marked;   // Has dfs() been called for this vertex?
    private int[] edgeTo;       // last vertex on known path to this vertex
    private final int s;        // source

    public DepthFirstPaths(Graph G, int s)
    {
        marked = new boolean[G.V()];
        edgeTo = new int[G.V()];
        this.s = s;
        dfs(G, s);
    }

    private void dfs(Graph G, int v)
    {
        marked[v] = true;
        for (int w : G.adj(v))
            if (!marked[w])
            {
                edgeTo[w] = v;
                dfs(G, w);
            }
    }

    public boolean hasPathTo(int v)
    { return marked[v]; }
    
    public Iterable<Integer> pathTo(int v)
    {
        if (!hasPathTo(v)) return null;
        Stack<Integer> path = new Stack<Integer>();
        for (int x = v; x != s; x = edgeTo[x])
            path.push(x);
        path.push(s);
        return path;
    }
}

This Graph client uses depth-first search to find paths to all the vertices in a graph that are connected to a given start vertex s. To save known paths to each vertex, this code maintains a vertex-indexed array edgeTo[] such that edgeTo[w] = v means that v-w was the edge used to access w for the first time. The edgeTo[] array is a parent-link representation of a tree rooted at s that contains all the vertices connected to s.

Depth-first search properties:


↥ back to top


Breadth-first search can be used to solve the following problem:
Single-source shortest paths. Given a graph and a source vertex s, support queries of the form Is there a path from s to a given target vertex v? If so, find a shortest such path (one with a minimal number of edges).

Repeat until queue is empty:

BFS and DFS differ only in the rule used to take the next vertex from the data structure (least recently added for BFS, most recently added for DFS). This difference leads to completely different views of the graph, even though all the vertices and edges connected to the source are examined no matter what rule is used.

Depth-first search. Put unvisited vertices on a stack.
Breadth-first search. Put unvisited vertices on a queue.

Breadth-first search properties:

Breadth-first search to find paths in a graph (non-recursive implementation) (ALGORITHM 4.2):

public class BreadthFirstPaths
{
    private boolean[] marked;           // Is a shortest path to this vertex known?
    private int[] edgeTo;               // last vertex on known path to this vertex
    private final int s;                // source

    public BreadthFirstPaths(Graph G, int s)
    {
        marked = new boolean[G.V()];
        edgeTo = new int[G.V()];
        this.s = s;
        bfs(G, s);
    }

    private void bfs(Graph G, int s)
    {
        Queue<Integer> queue = new Queue<Integer>();
        marked[s] = true;               // Mark the source
        queue.enqueue(s);               // and put it on the queue.
        while (!queue.isEmpty())
        {
            int v = queue.dequeue();    // Remove next vertex from the queue.
            for (int w : G.adj(v))
                if (!marked[w])             // For every unmarked adjacent vertex,
                {
                    edgeTo[w] = v;          // save last edge on a shortest path,
                    marked[w] = true;       // mark it because path is known,
                    queue.enqueue(w);       // and add it to the queue.
                }
        }
    }

    public boolean hasPathTo(int v)
    { return marked[v]; }

    public Iterable<Integer> pathTo(int v)
    // Same as for DFS
}

Breadth-first search application:


↥ back to top


1.5. Connected components

Definitions:

Goal:
Preprocess graph to answer queries of the form is v connected to w? in constant time.

public class CC  
CC(Graph G) preprocessing constructor
boolean connected(int v, int w) are v and w connected?
int count() number of connected components
int id(int v) component identifier for v ( between 0 and count()-1 )

How does the DFS-based solution for graph connectivity in CC compare with the union-find approach of Chapter 1?

Therefore, for example, we prefer union-find when determining connectivity is our only task or when we have a large number of queries intermixed with edge insertions but may find the DFS solution more appropriate for use in a graph ADT because it makes efficient use of existing infrastructure.

Depth-first search to find connected components in a graph (ALGORITHM 4.3):

public class CC
{
    private boolean[] marked;
    private int[] id;
    private int count;

    public CC(Graph G)
    {
        marked = new boolean[G.V()];
        id = new int[G.V()];
        for (int s = 0; s < G.V(); s++)
            if (!marked[s])
            {
                dfs(G, s);
                count++;
            }
    }

    private void dfs(Graph G, int v)
    {
        marked[v] = true;
        id[v] = count;
        for (int w : G.adj(v))
            if (!marked[w])
                dfs(G, w);
    }

    public boolean connected(int v, int w)
    { return id[v] == id[w]; }

    public int id(int v)
    { return id[v]; }

    public int count()
    { return count; }
}

Proposition C. DFS uses preprocessing time and space proportional to V + E to support constant-time connectivity queries in a graph.


↥ back to top


1.6. Cycle detection

Is a given graph acylic?

public class Cycle
{
    private boolean[] marked;
    private boolean hasCycle;

    public Cycle(Graph G)
    {
        marked = new boolean[G.V()];
        for (int s = 0; s < G.V(); s++)
            if (!marked[s])
                dfs(G, s, s);
    }

    private void dfs(Graph G, int v, int u)
    {
        marked[v] = true;
        for (int w : G.adj(v))
            if (!marked[w])
                dfs(G, w, v);
            else if (w != u) hasCycle = true;
    }

    public boolean hasCycle()
    { return hasCycle; }
}


↥ back to top


1.7. Two-colorability

Is the graph bipartite?

public class TwoColor
{
    private boolean[] marked;
    private boolean[] color;
    private boolean isTwoColorable = true;

    public TwoColor(Graph G)
    {
        marked = new boolean[G.V()];
        color = new boolean[G.V()];
        for (int s = 0; s < G.V(); s++)
            if (!marked[s])
                dfs(G, s);
    }

    private void dfs(Graph G, int v)
    {
        marked[v] = true;
        for (int w : G.adj(v))
            if (!marked[w])
            {
                color[w] = !color[v];
                dfs(G, w);
            }
            else if (color[w] == color[v]) isTwoColorable = false;
    }

    public boolean isBipartite()
    { return isTwoColorable; }
}


↥ back to top


1.8. Symbol graphs

Typical applications involve processing graphs defined in files or on web pages, using strings, not integer indices, to define and refer to vertices. To accommodate such applications, we define an input format with the following properties:

public class SymbolGraph  
SymbolGraph(String filename, String delim) build graph specified in filename using delim to separate vertex names
boolean contains(String key) is key a vertex?
int index(String key) index associated with key
String name(int v) key associated with index v
Graph G() underlying Graph

symbol-graph-data-structures

Symbol graph data type

public class SymbolGraph {
    private ST<String, Integer> st;                 // String -> index
    private String[] keys;                          // index -> String
    private Graph G;                                // the graph

    public SymbolGraph(String stream, String sp) {
        st = new ST<String, Integer>();
        In in = new In(stream);                     // First pass
        while (in.hasNextLine())                    // builds the index
        {
            String[] a = in.readLine().split(sp);   // by reading strings
            for (int i = 0; i < a.length; i++)      // to associate each
                if (!st.contains(a[i]))             // distinct string
                    st.put(a[i], st.size());        // with an index.
        }
        keys = new String[st.size()];               // Inverted index
        for (String name : st.keys())               // to get string keys
            keys[st.get(name)] = name;              // is an array.
        G = new Graph(st.size());
        in = new In(stream);                        // Second pass
        while (in.hasNextLine())                    // builds the graph
        {
            String[] a = in.readLine().split(sp);   // by connecting the
            int v = st.get(a[0]);                   // first vertex
            for (int i = 1; i < a.length; i++)      // on each line
                G.addEdge(v, st.get(a[i]));         // to all the others.
        }
    }

    public boolean contains(String s) {
        return st.contains(s);
    }

    public int index(String s) {
        return st.get(s);
    }

    public String name(int v) {
        return keys[v];
    }

    public Graph G() {
        return G;
    }
}


↥ back to top


1.9. Challenges

How difficult?

  1. Any programmer could do it.
  2. Typical diligent algorithms student could do it.
  3. Hire an expert.
  4. Intractable.
  5. No one knows.
  6. Impossible.

Problems


↥ back to top


2. Directed Graph

2.1. Introduction

2.1.1. Definition

A directed graph (or digraph) is a set of vertices and a collection of directed edges. Each directed edge connects an ordered pair of vertices.

2.1.2. Digraph applications

digraph vertex directed edge
transportation street intersection one-way street
web web page hyperlink
food web species predator-prey relationship
WordNet synset hypernym
scheduling task precedence constraint
financial bank transaction
cell phone person placed call
infectious disease person infection
game board position legal move
citation journal article citation
object graph object pointer
inheritance hierarchy class inherits from
control flow code block jump

2.1.3. Some digraph problems

2.2. Digraph API

public class Digraph  
Digraph(int V) create an empty digraph with V vertices
Digraph(In in) create a digraph from input stream
void addEdge(int v, int w) add a directed edge v→w
Iterable<Integer> adj(int v) vertices pointing from v
int V() number of vertices
int E() number of edges
Digraph reverse() reverse of this digraph
String toString() string representation

2.2.1. Adjacency-lists digraph representation

Adjacency-lists digraph representation

Adjacency-lists digraph representation: Java implementation

public class Digraph {
    private final int V;
    private int E;
    private Bag<Integer>[] adj;

    public Digraph(int V) {
        this.V = V;
        this.E = 0;
        adj = (Bag<Integer>[]) new Bag[V];
        for (int v = 0; v < V; v++)
            adj[v] = new Bag<Integer>();
    }

    public int V() {
        return V;
    }

    public int E() {
        return E;
    }

    public void addEdge(int v, int w) {
        adj[v].add(w);
        E++;
    }

    public Iterable<Integer> adj(int v) {
        return adj[v];
    }

    public Digraph reverse() {
        Digraph R = new Digraph(V);
        for (int v = 0; v < V; v++)
            for (int w : adj(v))
                R.addEdge(w, v);
        return R;
    }
}

In practice. Use adjacency-lists representation.

representation space add edge edge between v and w? iterate over vertices pointing from v?
list of edges E 1 E E
adjacency matrix V2 1 1 V
adjacency lists E+V 1 outdegree(v) outdegree(v)


↥ back to top


2.3.1. Reachability in digraphs

API for reachability in digraphs:

public class DirectedDFS  
DirectedDFS(Digraph G, int s) find vertices in G that are reachable from s
DirectedDFS(Digraph G, Iterable<Integer> sources) find vertices in G that are reachable from sources
boolean marked(int v) is v reachable?

2.3.2. Finding paths in digraphs

DepthFirstPaths (Algorithm 4.1) and BreadthFirstPaths (Algorithm 4.2) are also fundamentally digraph-processing algorithms.

Again, the identical APIs and code (with Graph changed to Digraph) effectively solve the following problems:

2.3.3. Depth-first search in digraphs

Same method as for undirected graphs.

Proposition D. DFS marks all the vertices in a digraph reachable from a given set of sources in time proportional to the sum of the outdegrees of the vertices marked.

public class DirectedDFS {
    private boolean[] marked;

    public DirectedDFS(Digraph G, int s) {
        marked = new boolean[G.V()];
        dfs(G, s);
    }

    public DirectedDFS(Digraph G, Iterable<Integer> sources) {
        marked = new boolean[G.V()];
        for (int s : sources)
            if (!marked[s])
                dfs(G, s);
    }

    private void dfs(Digraph G, int v) {
        marked[v] = true;
        for (int w : G.adj(v))
            if (!marked[w])
                dfs(G, w);
    }

    public boolean marked(int v) {
        return marked[v];
    }

    public static void main(String[] args) {
        Digraph G = new Digraph(new In(args[0]));
        Bag<Integer> sources = new Bag<Integer>();
        for (int i = 1; i < args.length; i++)
            sources.add(Integer.parseInt(args[i]));
        DirectedDFS reachable = new DirectedDFS(G, sources);
        for (int v = 0; v < G.V(); v++)
            if (reachable.marked(v))
                StdOut.print(v + " ");
        StdOut.println();
    }
}
2.3.3.1. Mark-and-sweep garbage collection

A mark-and-sweep garbage collection strategy reserves one bit per object for the purpose of garbage collection, then periodically marks the set of potentially accessible objects by running a digraph reachability algorithm like DirectedDFS and sweeps through all objects, collecting the unmarked ones for use for new objects.

2.3.4. Breadth-first search in digraphs

Same method as for undirected graphs.

Proposition. BFS computes shortest paths (fewest number of edges) from s to all other vertices in a digraph in time proportional to E + V.

Multiple-source shortest paths.

2.3.4.1. Breadth-first search in digraphs application: web crawler
Queue<String> queue = new Queue<String>();
SET<String> marked = new SET<String>();

String root = "http://www.princeton.edu";
queue.enqueue(root);
marked.add(root);

while (!queue.isEmpty())
{
    String v = queue.dequeue();
    StdOut.println(v);
    In in = new In(v);
    String input = in.readAll();

    String regexp = "http://(\\w+\\.)*(\\w+)";
    Pattern pattern = Pattern.compile(regexp);
    Matcher matcher = pattern.matcher(input);
    while (matcher.find())
    {
        String w = matcher.group();
        if (!marked.contains(w))
        {
            marked.add(w);
            queue.enqueue(w);
        }
    }
}


↥ back to top


2.4. Topological sort

Precedence-constrained scheduling. Given a set of jobs to be completed, with precedence constraints that specify that certain jobs have to be completed before certain other jobs are begun, how can we schedule the jobs such that they are all completed while still respecting the constraints?

Goal. Given a set of tasks to be completed with precedence constraints, in which order should we schedule the tasks?

Digraph model. vertex = task; edge = precedence constraint.

Topological sort. Given a digraph, put the vertices in order such that all its directed edges point from a vertex earlier in the order to a vertex later in the order (or report that doing so is not possible).

2.4.1. Typical topological-sort applications

application vertex edge
job schedule job precedence constraint
course schedule course prerequisite
inheritance Java class extends
spreadsheet cell formula
symbolic links file name link

2.4.2. DAG (directed acyclic graph)

Definition. A directed acyclic graph (DAG) is a digraph with no directed cycles.

Finding a directed cycle:

public class DirectedCycle {
    private boolean[] marked;
    private int[] edgeTo;
    private Stack<Integer> cycle; // vertices on a cycle (if one exists)
    private boolean[] onStack; // vertices on recursive call stack

    public DirectedCycle(Digraph G) {
        onStack = new boolean[G.V()];
        edgeTo = new int[G.V()];
        marked = new boolean[G.V()];
        for (int v = 0; v < G.V(); v++)
            if (!marked[v])
                dfs(G, v);
    }

    private void dfs(Digraph G, int v) {
        onStack[v] = true;
        marked[v] = true;
        for (int w : G.adj(v))
            if (this.hasCycle())
                return;
            else if (!marked[w]) {
                edgeTo[w] = v;
                dfs(G, w);
            } else if (onStack[w]) {
                cycle = new Stack<Integer>();
                for (int x = v; x != w; x = edgeTo[x])
                    cycle.push(x);
                cycle.push(w);
                cycle.push(v);
            }
        onStack[v] = false;
    }

    public boolean hasCycle() {
        return cycle != null;
    }

    public Iterable<Integer> cycle() {
        return cycle;
    }
}

Proposition E. A digraph has a topological order if and only if it is a DAG.

Proposition F. Reverse DFS postorder of a DAG is a topological order.

Depth-first search vertex ordering in a digraph:

public class DepthFirstOrder {
    private boolean[] marked;
    private Queue<Integer> pre;         // vertices in preorder
    private Queue<Integer> post;        // vertices in postorder
    private Stack<Integer> reversePost; // vertices in reverse postorder

    public DepthFirstOrder(Digraph G) {
        pre = new Queue<Integer>();     // preorder is order of dfs() calls
        post = new Queue<Integer>();    // postorder is order in which vertices are done
        reversePost = new Stack<Integer>();
        marked = new boolean[G.V()];
        for (int v = 0; v < G.V(); v++)
            if (!marked[v])
                dfs(G, v);
    }

    private void dfs(Digraph G, int v) {
        pre.enqueue(v);
        marked[v] = true;
        for (int w : G.adj(v))
            if (!marked[w])
                dfs(G, w);
        post.enqueue(v);
        reversePost.push(v);
    }

    public Iterable<Integer> pre() {
        return pre;
    }

    public Iterable<Integer> post() {
        return post;
    }

    public Iterable<Integer> reversePost() {
        return reversePost;
    }
}

Topological sort (ALGORITHM 4.5)

public class Topological {
    private Iterable<Integer> order; // topological order

    public Topological(Digraph G) {
        DirectedCycle cyclefinder = new DirectedCycle(G);
        if (!cyclefinder.hasCycle()) {
            DepthFirstOrder dfs = new DepthFirstOrder(G);
            order = dfs.reversePost();
        }
    }

    public Iterable<Integer> order() {
        return order;
    }

    public boolean isDAG() {
        return order == null;
    }

    public static void main(String[] args) {
        String filename = args[0];
        String separator = args[1];
        SymbolDigraph sg = new SymbolDigraph(filename, separator);
        Topological top = new Topological(sg.G());
        for (int v : top.order())
            StdOut.println(sg.name(v));
    }
}

Proposition G. With DFS, we can topologically sort a DAG in time proportional to V + E.

In practice, topological sorting and cycle detection go hand in hand, with cycle detection playing the role of a debugging tool. For example, in a job-scheduling application, a directed cycle in the underlying digraph represents a mistake that must be corrected, no matter how the schedule was formulated. Thus, a job-scheduling application is typically a three-step process:

  1. Specify the tasks and precedence constraints.
  2. Make sure that a feasible solution exists, by detecting and removing cycles in the underlying digraph until none exist.
  3. Solve the scheduling problem, using topological sort. Similarly, any changes in the schedule can be checked for cycles (using DirectedCycle), then a new schedule computed (using Topological).


↥ back to top


2.5. Strongly-connected components

Def. Vertices v and w are strongly connected if there is both a directed path from v to w and a directed path from w to v.

Key property. Strong connectivity is an equivalence relation:

Def. A strong component is a maximal subset of strongly-connected vertices.

2.5.1. Strong component application

Typical strong-component applications

application vertex edge
web page hyperlink
textbook topic reference
software module call
food web organism predator-prey relationship

2.5.2. API for strong components

public class SCC  
SCC(Digraph G) preprocessing constructor
boolean stronglyConnected(int v, int w) are v and w strongly connected?
int count() number of strong components
int id(int v) component identifier for v (between 0 and count()-1)

2.5.3. Kosaraju’s algorithm

Proposition H. In a DFS of a digraph G where marked vertices are considered in reverse postorder given by a DFS of the digraph’s reverse GR (Kosaraju’s algorithm), the vertices reached in each call of the recursive method from the constructor are in a strong component.

Proof. > First, we prove by contradiction that every vertex `v` that is strongly connected to `s` is reached by the call `dfs(G, s)` in the constructor. Suppose a vertex `v` that is strongly connected to `s` is not reached by `dfs(G, s)`. Since there is a path from `s` to `v`, `v` must have been previously marked. But then, since there is a path from `v` to `s`, `s` would have been marked during the call `dfs(G, v)` and the constructor would not call `dfs(G, s)`, a contradiction. > > Second, we prove that every vertex `v` reached by the call `dfs(G, s)` in the constructor is strongly connected to `s`. Let v be a vertex reached by the call `dfs(G, s)`. Then, there is a path from `s` to `v` in G, so it remains only to prove that there is a path from `v` to `s` in G. This statement is equivalent to saying that there is a path from `s` to `v` in GR, so it remains only to prove that there is a path from `s` to `v` in GR. > > The crux of the proof is that the reverse postorder construction implies that `dfs(G, v)` must have been done before `dfs(G, s)` during the DFS of GR, leaving just two cases to consider for `dfs(G, v)`: it might have been called > - before `dfs(G, s)` was called (and also done before `dfs(G, s)` was called) > - after `dfs(G, s)` was called (and done before `dfs(G, s)` was done) > > The first of these is not possible because there is a path from `v` to `s` in GR; the second implies that there is a path from `s` to `v` in GR, completing the proof.

Kosaraju’s algorithm for computing strong components (ALGORITHM 4.6). To find strong components, it does a depth-first search in the reverse digraph to produce a vertex order (reverse postorder of that search) for use in a depth-first search of the given digraph.

public class KosarajuSCC {
    private boolean[] marked;       // reached vertices
    private int[] id;               // component identifiers
    private int count;              // number of strong components

    public KosarajuSCC(Digraph G) {
        marked = new boolean[G.V()];
        id = new int[G.V()];
        DepthFirstOrder order = new DepthFirstOrder(G.reverse());
        for (int s : order.reversePost())
            if (!marked[s]) {
                dfs(G, s);
                count++;
            }
    }

    private void dfs(Digraph G, int v) {
        marked[v] = true;
        id[v] = count;
        for (int w : G.adj(v))
            if (!marked[w])
                dfs(G, w);
    }

    public boolean stronglyConnected(int v, int w) {
        return id[v] == id[w];
    }

    public int id(int v) {
        return id[v];
    }

    public int count() {
        return count;
    }
}

Proposition I. Kosaraju’s algorithm uses preprocessing time and space proportional to V + E to support constant-time strong connectivity queries in a digraph.


↥ back to top


3. Minimum spanning trees

Given an undirected graph G with positive edge weights (connected), A spanning tree of G is a subgraph T that is both a tree (connected and acyclic) and spanning (includes all of the vertices). A minimum spanning tree (MST) of an edge-weighted graph is a spanning tree whose weight (the sum of the weights of its edges) is no larger than the weight of any other spanning tree.

MST applications MST is fundamental problem with diverse applications.

3.1. Greedy algorithm

With the following simplified assumptions, MST exists and is unique.

Recall in section Graph terminology, two of the defining properties of a tree:

Definition. A cut of a graph is a partition of its vertices into two nonempty disjoint sets. A crossing edge of a cut is an edge that connects a vertex in one set with a vertex in the other.

Proposition J. (Cut property) Given any cut, the crossing edge of min weight is in the MST.

cut property

Proposition K. (Greedy MST algorithm) The following method colors black all edges in the the MST of any connected edgeweighted graph with V vertices: starting with all edges colored gray, find a cut with no black edges, color its minimum-weight edge black, and continue until V - 1 edges have been colored black.


↥ back to top


3.2. Edge-weighted graph API

Edge abstraction needed for weighted edges.

public class Edge implements Comparable<Edge>  
Edge(int v, int w, double weight) initializing constructor
double weight() weight of this edge
int either() either of this edge’s vertices
int other(int v) the other vertex
int compareTo(Edge that) compare this edge to e
String toString() string representation
public class Edge implements Comparable<Edge>
{
    private final int v, w;
    private final double weight;
    public Edge(int v, int w, double weight)
    {
        this.v = v;
        this.w = w;
        this.weight = weight;
    }
    public int either()
    { return v; }
    public int other(int vertex)
    {
        if (vertex == v) return w;
        else return v;
    }
    public int compareTo(Edge that)
    {
        if (this.weight < that.weight) return -1;
        else if (this.weight > that.weight) return +1;
        else return 0;
    }
}

Edge-weighted graph API.

public class EdgeWeightedGraph  
EdgeWeightedGraph(int V) create an empty graph with V vertices
EdgeWeightedGraph(In in) create a graph from input stream
void addEdge(Edge e) add weighted edge e to this graph
Iterable<Edge> adj(int v) edges incident to v
Iterable<Edge> edges() all edges in this graph
int V() number of vertices
int E() number of edges
String toString() string representation

This API is very similar to the API for Graph.

3.2.1. Edge-weighted graph: adjacency-lists representation

edge-weighted-graph-adjacency-lists-representation

public class EdgeWeightedGraph
{
    private final int V;
    private final Bag<Edge>[] adj;
    public EdgeWeightedGraph(int V)
    {
        this.V = V;
        adj = (Bag<Edge>[]) new Bag[V];
        for (int v = 0; v < V; v++)
            adj[v] = new Bag<Edge>();
    }
    public void addEdge(Edge e)
    {
        int v = e.either(), w = e.other(v);
        adj[v].add(e);
        adj[w].add(e);
    }
    public Iterable<Edge> adj(int v)
    { return adj[v]; }
}


↥ back to top


3.3. Minimum spanning tree API

As usual, for graph processing, define an API where the constructor takes an edge-weighted graph as argument and supports client query methods that return the MST and its weight. The MST of a graph G is a subgraph of G that is also a tree.

public class MST  
MST(EdgeWeightedGraph G) constructor
Iterable<Edge> edges() all of the MST edges
double weight() weight of MST

Test client.

public static void main(String[] args)
{
    In in = new In(args[0]);
    EdgeWeightedGraph G = new EdgeWeightedGraph(in);
    MST mst = new MST(G);

    for (Edge e : mst.edges())
        StdOut.println(e);
    StdOut.printf("%.2f\n", mst.weight());
}

3.4. Kruskal’s algorithm

Consider edges in ascending order of weight.

Proposition. [Kruskal 1956] Kruskal’s algorithm computes the MST.
Pf. (Kruskal’s algorithm is a special case of the greedy MST algorithm.)

Challenge. Would adding edge v–w to tree T create a cycle? If not, add it.

Efficient solution. Use the union-find data structure.

public class KruskalMST
{
    private Queue<Edge> mst = new Queue<Edge>();
    public KruskalMST(EdgeWeightedGraph G)
    {
        // build priority queue
        MinPQ<Edge> pq = new MinPQ<Edge>();
        for (Edge e : G.edges())
            pq.insert(e);

        UF uf = new UF(G.V());
        while (!pq.isEmpty() && mst.size() < G.V()-1)
        {
            // greedily add edges to MST
            Edge e = pq.delMin();
            int v = e.either(), w = e.other(v);
            
            // edge v–w does not create cycle
            if (!uf.connected(v, w))
            {
                uf.union(v, w);     // merge sets
                mst.enqueue(e);     // add edge to MST
            }
        }
    }

    public Iterable<Edge> edges()
    { return mst; }
}

Proposition. Kruskal’s algorithm computes MST in time proportional to E log E (in the worst case).

operation frequency time per op
build pq 1 E log E
delete-min E log E
union V log* V †
connected E log* V †

Remark. If edges are already sorted, order of growth is E log* V.

Actually we don’t have to always sort them. In real life situations, we can stop when we get the V - 1 edges in the MST.


↥ back to top


3.5. Prim’s algorithm

Attach a new edge to a single growing tree at each step.

Start with any vertex as a single-vertex tree; then add V - 1 edges to it, always taking next (coloring black) the minimum-weight edge that connects a vertex on the tree to a vertex not yet on the tree (a crossing edge for the cut defined by tree vertices).

Prim's algorithm

Proposition. [Jarník 1930, Dijkstra 1957, Prim 1959] Prim’s algorithm computes the MST.
Pf. (Prim’s algorithm is a special case of the greedy MST algorithm.)

Challenge. Find the min weight edge with exactly one endpoint in T.

3.5.1. Prim’s algorithm: lazy implementation

Lazy solution. Maintain a PQ of edges with (at least) one endpoint in T.

public class LazyPrimMST
{
    private boolean[] marked;   // MST vertices
    private Queue<Edge> mst;    // MST edges
    private MinPQ<Edge> pq;     // PQ of edges

    public LazyPrimMST(WeightedGraph G)
    {
        pq = new MinPQ<Edge>();
        mst = new Queue<Edge>();
        marked = new boolean[G.V()];
        visit(G, 0);

        while (!pq.isEmpty() && mst.size() < G.V() - 1)
        {
            Edge e = pq.delMin();
            int v = e.either(), w = e.other(v);
            if (marked[v] && marked[w]) continue;
            mst.enqueue(e);
            if (!marked[v]) visit(G, v);
            if (!marked[w]) visit(G, w);
        }
    }

    private void visit(WeightedGraph G, int v)
    {
        marked[v] = true;
        for (Edge e : G.adj(v))
            if (!marked[e.other(v)])
                pq.insert(e);
    }

    public Iterable<Edge> mst()
    { return mst; }
}

This implementation of Prim’s algorithm uses a priority queue to hold crossing edges, a vertex-indexed arrays to mark tree vertices, and a queue to hold MST edges. This implementation is a lazy approach where we leave ineligible edges in the priority queue.

Proposition. Lazy Prim’s algorithm computes the MST in time proportional to E log E and extra space proportional to E (in the worst case).
Pf. | operation | frequency | binary heap | |:–:|:–:|:–:| | delete min | E | log E | | insert | E | log E |

The bottleneck in the algorithm is the number of edge-weight comparisons in the priority-queue methods insert() and delMin(). The number of edges on the priority queue is at most E, which gives the space bound.

In practice, the upper bound on the running time is a bit conservative because the number of edges on the priority queue is typically much less than E.

3.5.2. Indexed priority queue

Associate an index between 0 and N - 1 with each key in a priority queue.

public class IndexMinPQ<Key extends Comparable<Key>>  
IndexMinPQ(int N) create indexed priority queue with indices 0, 1, …, N-1
void insert(int i, Key key) associate key with index i
void decreaseKey(int i, Key key) decrease the key associated with index i
boolean contains(int i) is i an index on the priority queue?
int delMin() remove a minimal key and return its associated index
boolean isEmpty() is the priority queue empty?
int size() number of entries in the priority queue

Implementation.

3.5.3. Prim’s algorithm: eager implementation

Eager solution. Maintain a PQ of vertices connected by an edge to T, where priority of vertex v = weight of shortest edge connecting v to T.

note: pq has at most one entry per vertex

Prim's algorithm eager implementation

public class PrimMST
{
    private Edge[] edgeTo;          // shortest edge from tree vertex
    private double[] distTo;        // distTo[w] = edgeTo[w].weight()
    private boolean[] marked;       // true if v on tree
    private IndexMinPQ<Double> pq;  // eligible crossing edges

    public PrimMST(EdgeWeightedGraph G)
    {
        edgeTo = new Edge[G.V()];
        distTo = new double[G.V()];
        marked = new boolean[G.V()];
        for (int v = 0; v < G.V(); v++)
            distTo[v] = Double.POSITIVE_INFINITY;
        pq = new IndexMinPQ<Double>(G.V());

        distTo[0] = 0.0;
        pq.insert(0, 0.0);          // Initialize pq with 0, weight 0.
        while (!pq.isEmpty())
            visit(G, pq.delMin());  // Add closest vertex to tree.
    }

    private void visit(EdgeWeightedGraph G, int v)
    { // Add v to tree; update data structures.
        marked[v] = true;
        for (Edge e : G.adj(v))
        {
            int w = e.other(v);
            if (marked[w]) continue; // v-w is ineligible.
            if (e.weight() < distTo[w])
            { // Edge e is new best connection from tree to w.
                edgeTo[w] = e;
                distTo[w] = e.weight();
                if (pq.contains(w)) pq.change(w, distTo[w]);
                else pq.insert(w, distTo[w]);
            }
        }
    }

    public Iterable<Edge> edges()   // See Exercise 4.3.21.
    public double weight()          // See Exercise 4.3.31.
}

3.5.4. Prim’s algorithm: running time

Prim’s algorithm: running time depends on PQ implementation: V insert, V delete-min, E decrease-key.

PQ implementation insert delete-min decrease-key total
array 1 V 1 V2
binary heap log V log V log V E log V
d-way heap (Johnson 1975) logd V d logd V logd V E logE/V V
Fibonacci heap (Fredman-Tarjan 1984) 1 † log V † 1 † E + V log V

† amortized

Bottom line.


↥ back to top


3.6. context

3.6.1. Euclidean MST

Given N points in the plane, find MST connecting them, where the distances between point pairs are their Euclidean distances.

Brute force. Compute ~ N2 / 2 distances and run Prim’s algorithm.

Ingenuity. Exploit geometry and do it in ~ c N log N.

3.6.2. Scientific application: clustering

k-clustering. Divide a set of objects classify into k coherent groups.

Distance function. Numeric value specifying “closeness” of two objects.

Goal. Divide into clusters so that objects in different clusters are far apart.

Applications:

Single link. Distance between two clusters equals the distance between the two closest objects (one in each cluster).

Single-link clustering. Given an integer k, find a k-clustering that maximizes the distance between two closest clusters.

“Well-known” algorithm in science literature for single-link clustering:

NOTE: This is Kruskal’s algorithm (stop when k connected components).
Alternate solution. Run Prim’s algorithm and delete k–1 max weight edges.


↥ back to top


4. Shortest Paths

Given an edge-weighted digraph, find the shortest path from s to t.

Typical shortest-paths applications:

application vertex edge
map intersection road
network router connection
schedule job precedence constraint
arbitrage currency exchange rate


4.1. Shortest path variants

Which vertices?

Restrictions on edge weights?

Cycles?

Simplifying assumption. Shortest paths from s to each vertex v exist.


↥ back to top


4.2. Weighted directed edge API

public class DirectedEdge  
DirectedEdge(int v, int w, double weight) weighted edge v→w
int from() vertex v
int to() vertex w
double weight() weight of this edge
String toString() string representation

Implementation in Java: Similar to Edge for undirected graphs, but a bit simpler.

public class DirectedEdge
{
    private final int v, w;
    private final double weight;

    public DirectedEdge(int v, int w, double weight)
    {
        this.v = v;
        this.w = w;
        this.weight = weight;
    }

    // from() and to() replace either() and other()
    public int from()
    { return v; }

    public int to()
    { return w; }

    public int weight()
    { return weight; }
}

4.2.1. Edge-weighted digraph: adjacency-lists implementation in Java

Same as EdgeWeightedGraph except replace Graph with Digraph.

public class EdgeWeightedDigraph
{
    private final int V;
    private final Bag<DirectedEdge>[] adj;
    public EdgeWeightedDigraph(int V)
    {
        this.V = V;
        adj = (Bag<DirectedEdge>[]) new Bag[V];
        for (int v = 0; v < V; v++)
            adj[v] = new Bag<DirectedEdge>();
    }
    public void addEdge(DirectedEdge e)
    {
        int v = e.from();
        adj[v].add(e);
    }
    public Iterable<DirectedEdge> adj(int v)
    { return adj[v]; }
}

4.2.2. Single-source shortest paths API

Goal. Find the shortest path from s to every other vertex.

public class SP  
SP(EdgeWeightedDigraph G, int s) shortest paths from s in graph G
double distTo(int v) length of shortest path from s to v
Iterable <DirectedEdge> pathTo(int v) shortest path from s to v
boolean hasPathTo(int v) is there a path from s to v?


↥ back to top


4.3. Shortest-paths properties

4.3.1. Data structures for single-source shortest paths

Can represent the shortest-paths tree (SPT) with two vertex-indexed arrays:

public double distTo(int v)
{ return distTo[v]; }

public Iterable<DirectedEdge> pathTo(int v)
{
    Stack<DirectedEdge> path = new Stack<DirectedEdge>();
    for (DirectedEdge e = edgeTo[v]; e != null; e = edgeTo[e.from()])
        path.push(e);
    return path;
}

4.3.2. Edge relaxation

Relax edge e = v→w.

private void relax(DirectedEdge e)
{
    int v = e.from(), w = e.to();
    if (distTo[w] > distTo[v] + e.weight())
    {
        distTo[w] = distTo[v] + e.weight();
        edgeTo[w] = e;
    }
}

Edge relaxation (two cases)

The term relaxation follows from the idea of a rubber band stretched tight on a path connecting two vertices: relaxing an edge is akin to relaxing the tension on the rubber band along a shorter path, if possible. We say that an edge e can be successfully relaxed if relax() would change the values of distTo[e.to()] and edgeTo[e.to()].

4.3.3. Shortest-paths optimality conditions

Proposition. Let G be an edge-weighted digraph. Then distTo[] are the shortest path distances from s iff:

Proof. > ⇐ [ necessary ] > - Suppose that `distTo[w] > distTo[v] + e.weight()` for some edge `e = v→w`. > - Then, `e` gives a path from `s` to `w` (through `v`) of length less than `distTo[w]`. > > ⇒ [ sufficient ] > - Suppose that s = v0 → v1 → v2 → … → vk = w is a shortest path from `s` to `w`. > - Then, > distTo[v1] ≤ distTo[v0] + e1.weight() > distTo[v2] ≤ distTo[v1] + e2.weight() > ... > distTo[vk] ≤ distTo[vk-1] + ek.weight() > - Add inequalities; simplify; and substitute distTo[v0] = distTo[s] = 0: > distTo[w] = distTo[vk] ≤ e1.weight() + e2.weight() + … + ek.weight() > - Thus, `distTo[w]` is the weight of shortest path to `w`.

4.4. Generic shortest-paths algorithm

Generic algorithm (to compute SPT from s)

Proposition. Generic algorithm computes SPT (if it exists) from s.

Proof sketch. > - Throughout algorithm, `distTo[v]` is the length of a simple path from `s` to `v` (and `edgeTo[v]` is last edge on path). > - Each successful relaxation decreases `distTo[v]` for some `v`. > - The entry `distTo[v]` can decrease at most a finite number of times.

Efficient implementations. How to choose which edge to relax?


↥ back to top


4.5. Dijkstra’s algorithm

“The tools we use have a profound and devious influence on our thinking habits, and therefore on our thinking abilities.” — Edsger W. Dijkstra

Dijkstra's algorithm animation

Wikipedia: Dijkstra’s algorithm

Proposition. Dijkstra’s algorithm solves the single-source shortest-paths problem in edge-weighted digraphs with nonnegative weights.

Proposition. Dijkstra’s algorithm uses extra space proportional to V and time proportional to E log V (in the worst case) to compute the SPT rooted at a given source in an edge-weighted digraph with E edges and V vertices.

Dijkstra's algorithm

public class DijkstraSP
{
    private DirectedEdge[] edgeTo;
    private double[] distTo;
    private IndexMinPQ<Double> pq;

    public DijkstraSP(EdgeWeightedDigraph G, int s)
    {
        edgeTo = new DirectedEdge[G.V()];
        distTo = new double[G.V()];
        pq = new IndexMinPQ<Double>(G.V());
        for (int v = 0; v < G.V(); v++)
            distTo[v] = Double.POSITIVE_INFINITY;
        distTo[s] = 0.0;
        pq.insert(s, 0.0);
        while (!pq.isEmpty())
            relax(G, pq.delMin())
    }

    private void relax(EdgeWeightedDigraph G, int v)
    {
        for(DirectedEdge e : G.adj(v))
        {
            int w = e.to();
            if (distTo[w] > distTo[v] + e.weight())
            {
                distTo[w] = distTo[v] + e.weight();
                edgeTo[w] = e;
                if (pq.contains(w)) pq.change(w, distTo[w]);
                else pq.insert(w, distTo[w]);
            }
        }
    }

    public double distTo(int v)
    { return distTo[v]; }

    public boolean hasPathTo(int v)
    { return distTo[v] < Double.POSITIVE_INFINITY; }

    public Iterable<DirectedEdge> pathTo(int v)
    {
        if (!hasPathTo(v)) return null;
        Stack<DirectedEdge> path = new Stack<DirectedEdge>();
        for (DirectedEdge e = edgeTo[v]; e != null; e = edgeTo[e.from()])
            path.push(e);
        return path;
    }
}

This implementation of Dijkstra’s algorithm grows the SPT by adding an edge at a time, always choosing the edge from a tree vertex to a non-tree vertex whose destination w is closest to s.

Dijkstra’s algorithm vs. Prim’s algorithm?

Note: DFS and BFS are also in this family of algorithms.

Main distinction: Rule used to choose next vertex for the tree. | algorithms | distinction | |:–:|:–:| | Prim’s | Closest vertex to the tree (via an undirected edge). | | Dijkstra’s | Closest vertex to the source (via a directed path). |

Dijkstra’s algorithm: which priority queue?

Depends on PQ implementation: V insert, V delete-min, E decrease-key. See section Prim’s algorithm runtime.

4.5.1. Dijkstra’s algorithm variants


↥ back to top


4.6. Edge-weighted DAGs (acyclic edge-weighted digraph)

It is easier to find shortest paths in an edge-weighted DAG than in a general digraph.

Initialize distTo[s] to 0 and all other distTo[] values to infinity, then relax the vertices, one by one, taking the vertices in topological order.

Proposition. Topological sort algorithm computes SPT in any edge-weighted DAG in time proportional to E + V.

Proof. > - Each edge `e = v→w` is relaxed exactly once (when `v` is relaxed), leaving `distTo[w] ≤ distTo[v] + e.weight()`. > - Inequality holds until algorithm terminates because: > - `distTo[w]` cannot increase (`distTo[]` values are monotone decreasing) > - `distTo[v]` will not change (because of topological order, no edge pointing to v will be relaxed after `v` is relaxed) > - Thus, upon termination, shortest-paths optimality conditions hold. > > NOTE: proof does not depend on the edge weights being nonnegative, so we can remove that restriction for edge-weighted DAGs.

ALGORITHM: Shortest paths in edge-weighted DAGs

public class AcyclicSP
{
    private DirectedEdge[] edgeTo;
    private double[] distTo;
    public AcyclicSP(EdgeWeightedDigraph G, int s)
    {
        edgeTo = new DirectedEdge[G.V()];
        distTo = new double[G.V()];
        for (int v = 0; v < G.V(); v++)
            distTo[v] = Double.POSITIVE_INFINITY;
        distTo[s] = 0.0;
        Topological top = new Topological(G);
        for (int v : top.order())
            relax(G, v);
    }

    private void relax(EdgeWeightedDigraph G, int v)
    {
        for(DirectedEdge e : G.adj(v))
        {
            int w = e.to();
            if (distTo[w] > distTo[v] + e.weight())
            {
                distTo[w] = distTo[v] + e.weight();
                edgeTo[w] = e;
            }
        }
    }

    public double distTo(int v)
    { return distTo[v]; }

    public boolean hasPathTo(int v)
    { return distTo[v] < Double.POSITIVE_INFINITY; }

    public Iterable<DirectedEdge> pathTo(int v)
    {
        if (!hasPathTo(v)) return null;
        Stack<DirectedEdge> path = new Stack<DirectedEdge>();
        for (DirectedEdge e = edgeTo[v]; e != null; e = edgeTo[e.from()])
            path.push(e);
        return path;
    }
}

Note that marked[] is not needed in this implementation: since vertices is processed in an acyclic digraph in topological order, a vertex that is already relaxed will not be re-encountered.

For shortest paths, the topological-sort-based method is faster than Dijkstra’s algorithm by a factor proportional to the cost of the priority-queue operations in Dijkstra’s algorithm.

4.6.1. Application: content-aware resizing

Seam carving. [Avidan and Shamir] Resize an image without distortion for display on cell phones and web browsers.

To find vertical seam:

4.6.2. Longest paths in edge-weighted DAGs

Key point. Topological sort algorithm works even with negative weights.

Single-source longest paths in edge-weighted DAGs. Given an edge-weighted DAG (with negative weights allowed) and a source vertex s, support queries of the form: Is there a directed path from s to a given target vertex v? If so, find a longest such path (one whose total weight is maximal).

Solution. > Can solve the longest-paths problem in edge-weighted DAGs in time proportional to E + V. Negate the edge weights.

4.6.3. Longest paths in edge-weighted DAGs: application

4.6.3.1. Parallel job scheduling

Parallel precedence-constrained scheduling. Given a set of jobs of specified duration to be completed, with precedence constraints that specify that certain jobs have to be completed before certain other jobs are begun, how can we schedule the jobs on identical processors (as many as needed) such that they are all completed in the minimum amount of time while still respecting the constraints?

Solution. The problem is equivalent to a longest-paths problem in an edge-weighted DAG. Use critical path method (a linear-time algorithm).

critical path method

CPM. To solve a parallel job-scheduling problem, create edge-weighted DAG:

Critical path method for parallel precedence-constrained job scheduling

public class CPM
{
    public static void main(String[] args)
    {
        int N = StdIn.readInt(); StdIn.readLine();
        EdgeWeightedDigraph G;
        G = new EdgeWeightedDigraph(2*N+2);
        int s = 2*N, t = 2*N+1;
        for (int i = 0; i < N; i++)
        {
            String[] a = StdIn.readLine().split("\\s+");
            double duration = Double.parseDouble(a[0]);
            G.addEdge(new DirectedEdge(i, i+N, duration));
            G.addEdge(new DirectedEdge(s, i, 0.0));
            G.addEdge(new DirectedEdge(i+N, t, 0.0));
            for (int j = 1; j < a.length; j++)
            {
                int successor = Integer.parseInt(a[j]);
                G.addEdge(new DirectedEdge(i+N, successor, 0.0));
            }
        }
        AcyclicLP lp = new AcyclicLP(G, s);
        StdOut.println("Start times:");
        for (int i = 0; i < N; i++)
            StdOut.printf("%4d: %5.1f\n", i, lp.distTo(i));
        StdOut.printf("Finish time: %5.1f\n", lp.distTo(t));
    }
}


↥ back to top


4.7. Negative weights

Shortest paths with negative weights: failed attempts

Dijkstra. Doesn’t work with negative edge weights. Re-weighting by add a constant to every edge weight doesn’t work.

Def. A negative cycle is a directed cycle whose sum of edge weights is negative.

Proposition. A SPT exists iff no negative cycles.

4.7.1. Bellman-Ford algorithm

Proposition. (Bellman-Ford algorithm) The following method solves the singlesource shortest-paths problem from a given source s for any edge-weighted digraph with V vertices and no negative cycles reachable from s: Initialize distTo[s] to 0 and all other distTo[] values to infinity. Then, considering the digraph’s edges in any order, relax all edges. Make V such passes.

Proof. > For any vertex `t` that is reachable from `s` consider a specific shortest path from s to t: v0->v1->...->vk, where v0 is s and vk is t. Since there are no negative cycles, such a path exists and `k` can be no larger than `V-1`. > > We show by induction on `i` that after the ith pass the algorithm computes a shortest path from s to vi. The base case (i = 0) is trivial. Assuming the claim to be true for `i`, v0->v1->...->vi is a shortest path from `s` to vi, and distTo[vi] is its length. Now, we relax every vertex in the ith pass, including vi, so distTo[vi+1] is no greater than distTo[vi] plus the weight of vi->vi+1. Now, after the ith pass, distTo[vi+1] must be equal to distTo[vi] plus the weight of vi->vi+1. It cannot be greater because we relax every vertex in the ith pass, in particular vi, and it cannot be less because that is the length of v0->v1->...->vi+1, a shortest path. Thus the algorithm computes a shortest path from s to vi+1 after the (i+1)st pass.

Initialize distTo[s] = 0 and distTo[v] = ∞ for all other vertices.
Repeat V times:

for (int i = 0; i < G.V(); i++)
    for (int v = 0; v < G.V(); v++)
        for (DirectedEdge e : G.adj(v))
            relax(e);

Proposition. The Bellman-Ford algorithm takes time proportional to EV and extra space proportional to V.

ALGORITHM: Bellman-Ford algorithm (queue-based)

public class BellmanFordSP
{
    private double[] distTo;                // length of path to v
    private DirectedEdge[] edgeTo;          // last edge on path to v
    private boolean[] onQ;                  // Is this vertex on the queue?
    private Queue<Integer> queue;           // vertices being relaxed
    private int cost;                       // number of calls to relax()
    private Iterable<DirectedEdge> cycle;   // negative cycle in edgeTo[]?

    public BellmanFordSP(EdgeWeightedDigraph G, int s)
    {
        distTo = new double[G.V()];
        edgeTo = new DirectedEdge[G.V()];
        onQ = new boolean[G.V()];
        queue = new Queue<Integer>();
        for (int v = 0; v < G.V(); v++)
            distTo[v] = Double.POSITIVE_INFINITY;
        distTo[s] = 0.0;
        queue.enqueue(s);
        onQ[s] = true;
        while (!queue.isEmpty() && !this.hasNegativeCycle())
        {
            int v = queue.dequeue();
            onQ[v] = false;
            relax(v);
        }
    }

    private void relax(EdgeWeightedDigraph G, int v)
    {
        for (DirectedEdge e : G.adj(v))
        {
            int w = e.to();
            if (distTo[w] > distTo[v] + e.weight())
            {
                distTo[w] = distTo[v] + e.weight();
                edgeTo[w] = e;
                if (!onQ[w])
                {
                    q.enqueue(w);
                    onQ[w] = true;
                }
            }
            if (cost++ % G.V() == 0)
                findNegativeCycle();
        }
    }

    public double distTo(int v)         // standard client query methods
    public boolean hasPathTo(int v)     // for SPT implementatations
    public Iterable<Edge> pathTo(int v) // as above

    private void findNegativeCycle()
    {
        int V = edgeTo.length;
        EdgeWeightedDigraph spt;
        spt = new EdgeWeightedDigraph(V);
        for (int v = 0; v < V; v++)
            if (edgeTo[v] != null)
                spt.addEdge(edgeTo[v]);
        EdgeWeightedCycleFinder cf;
        cf = new EdgeWeightedCycleFinder(spt);
        cycle = cf.cycle();
    }

    public boolean hasNegativeCycle()
    { return cycle != null; }

    public Iterable<Edge> negativeCycle()
    { return cycle; }
}

The algorithm is based on two additional data structures:

The data structures ensure that

Observation. If distTo[v] does not change during pass i, no need to relax any edge pointing from v in pass i+1.

FIFO implementation. Maintain queue of vertices whose distTo[] changed.

Note: be careful to keep at most one copy of each vertex on queue (why?)

4.7.2. Finding a negative cycle

Observation. If there is a negative cycle, Bellman-Ford gets stuck in loop, updating distTo[] and edgeTo[] entries of vertices in the cycle.

Proposition. If any vertex v is updated in phase V, there exists a negative cycle (and can trace back edgeTo[v] entries to find it).

In practice. Check for negative cycles more frequently.

4.7.3. Negative cycle application: arbitrage detection

Currency exchange graph.

Model as a negative cycle detection problem by taking logs.

Arbitrage in currency exchange

public class Arbitrage
{
    public static void main(String[] args)
    {
        int V = StdIn.readInt();
        String[] name = new String[V];
        EdgeWeightedDigraph G = new EdgeWeightedDigraph(V);
        for (int v = 0; v < V; v++)
        {
            name[v] = StdIn.readString();
            for (int w = 0; w < V; w++)
            {
                double rate = StdIn.readDouble();
                DirectedEdge e = new DirectedEdge(v, w, -Math.log(rate));
                G.addEdge(e);
            }
        }
        BellmanFordSP spt = new BellmanFordSP(G, 0);
        if (spt.hasNegativeCycle())
        {
            double stake = 1000.0;
            for (DirectedEdge e : spt.negativeCycle())
            {
                StdOut.printf("%10.5f %s ", stake, name[e.from()]);
                stake *= Math.exp(-e.weight());
                StdOut.printf("= %10.5f %s\n", stake, name[e.to()]);
            }
        }
        else StdOut.println("No arbitrage opportunity");
    }
}


↥ back to top


4.8. Single source shortest-paths implementation: summary

Cost:

algorithm restriction typical case worst case extra space
topological sort no directed cycles E + V E + V V
Dijkstra (binary heap) no negative weights E log V E log V V
Bellman-Ford no negative cycles E V E V V
Bellman-Ford (queue-based) no negative cycles E + V E V V

Remarks:

  1. Directed cycles make the problem harder.
  2. Negative weights make the problem harder.
  3. Negative cycles makes the problem intractable.

Summary:

topic remark
Dijkstra’s algorithm Nearly linear-time when weights are nonnegative.
Generalization encompasses DFS, BFS, and Prim.
Acyclic edge-weighted digraphs Arise in applications.
Faster than Dijkstra’s algorithm.
Negative weights are no problem.
Negative weights and negative cycles Arise in applications.
If no negative cycles, can find shortest paths via Bellman-Ford.
If negative cycles, can find one via Bellman-Ford.

Shortest-paths is a broadly useful problem-solving model.


↥ back to top


5. Maximum Flow and Minimum Cut

5.1. Mincut problem

Input. An edge-weighted digraph, source vertex s, and target vertex t.

Def. A st-cut (cut) is a partition of the vertices into two disjoint sets, with s in one set A and t in the other set B.

Def. Its capacity is the sum of the capacities of the edges from A to B.

Minimum st-cut (mincut) problem. Find a cut of minimum capacity.

5.2. Maxflow problem

Input. An edge-weighted digraph, source vertex s, and target vertex t. (each edge has a positive capacity)

Def. An st-flow (flow) is an assignment of values to the edges such that:

Def. The value of a flow is the inflow at t. (assume no edge points to s or from t)

Maximum st-flow (maxflow) problem. Find a flow of maximum value.

Maxflow Mincut

Remarkable fact. These two problems are dual.

5.3. Ford-Fulkerson algorithm

It is a generic method for increasing flows incrementally along paths from source to sink that serves as the basis for a family of algorithms. It is known as the Ford-Fulkerson algorithm in the classical literature; the more descriptive term augmenting-path algorithm is also widely used.

Initialization. Start with 0 flow.

Augmenting path. Find an undirected path from s to t such that:

Termination. All paths from s to t are blocked by either a

Ford-Fulkerson algorithm

Start with 0 flow. While there exists an augmenting path:


↥ back to top


5.4. Maxflow-mincut theorem

Def. The net flow across a cut (A, B) is the sum of the flows on its edges from A to B minus the sum of the flows on its edges from from B to A.

Flow-value lemma. Let f be any flow and let (A, B) be any cut. Then, the net flow across (A, B) equals the value of f.

Pf. By induction on the size of B.

Corollary. Outflow from s = inflow to t = value of flow.

Weak duality. Let f be any flow and let (A, B) be any cut. Then, the value of the flow ≤ the capacity of the cut.

Pf. Value of flow f = net flow across cut (A, B) ≤ capacity of cut (A, B).

Augmenting path theorem. A flow f is a maxflow iff no augmenting paths.

Maxflow-mincut theorem. Value of the maxflow = capacity of mincut.

Prove both theorems simultaneously by showing the following three conditions are equivalent for any flow f:

Proof. > [ i ⇒ ii ] > - Suppose that `(A, B)` is a cut with capacity equal to the value of `f`. > - Then, the value of any flow `f` ' ≤ capacity of `(A, B)` = value of `f`. > - Thus, `f` is a maxflow. > > [ ii ⇒ iii ] To prove contrapositive: `~iii ⇒ ~ii`. > - Suppose that there is an augmenting path with respect to `f`. > - Can improve flow `f` by sending flow along this path. > - Thus, `f` is not a maxflow. > > [ iii ⇒ i ] > Suppose that there is no augmenting path with respect to `f`. > - Let `(A, B)` be a cut where `A` is the set of vertices connected to `s` by an undirected path with no full forward or empty backward edges (`A` is the set of vertices reachable from `s` in *residual* graph). > - By above definition, `s` is in `A`; since no augmenting path, `t` is in `B`. > - Capacity of cut = net flow across cut (forward edges full; backward edges empty) > = value of flow `f`. (flow-value lemma)

5.4.1. Computing a mincut from a maxflow

To compute mincut (A, B) from maxflow f:

Computing a mincut from a maxflow

Questions.

5.4.2. Ford-Fulkerson algorithm with integer capacities

Important special case. Edge capacities are integers between 1 and U.

Invariant. The flow is integer-valued throughout Ford-Fulkerson.

Pf. [by induction]

Proposition. Number of augmentations ≤ the value of the maxflow.

Pf. Each augmentation increases the value by at least 1.

Integrality theorem. There exists an integer-valued maxflow.

Pf. Ford-Fulkerson terminates and maxflow that it finds is integer-valued.

Bad news. Even when edge capacities are integers, number of augmenting paths could be equal to the value of the maxflow.

Good news. This case is easily avoided. [use shortest/fattest path]

5.4.3. How to choose augmenting paths?

FF performance depends on choice of augmenting paths.

digraph with V vertices, E edges, and integer capacities between 1 and U | augmenting path | number of paths | implementation | | :–: | :–: | :–: | | shortest path | ≤ ½ E V | queue (BFS) | | fattest path | ≤ E ln(E U) | priority queue | | random path | ≤ E U | randomized queue | | DFS path | ≤ E U | stack (DFS) |


↥ back to top


5.5. Ford-Fulkerson: Java implementation

Flow edge data type. Associate flow fe and capacity ce with edge e = v→w.

Flow network data type. Need to process edge e = v→w in either direction:
Include e in both v and w’s adjacency lists.

Residual capacity.

Augment flow.

Residual network. A useful view of a flow network.

Residual network

Key point. Augmenting path in original network is equivalent to directed path in residual network.

Residual network

Flow edge: Java implementation

public class FlowEdge
{
    private final int v;            // edge source
    private final int w;            // edge target
    private final double capacity;  // capacity
    private double flow;            // flow

    public FlowEdge(int v, int w, double capacity)
    {
        this.v = v;
        this.w = w;
        this.capacity = capacity;
        this.flow = 0.0;
    }

    public int from() { return v; }
    public int to() { return w; }
    public double capacity() { return capacity; }
    public double flow() { return flow; }

    public int other(int vertex)
    // same as for Edge

    public double residualCapacityTo(int vertex)
    {
        if (vertex == v) return flow;
        else if (vertex == w) return cap - flow;
        else throw new RuntimeException("Inconsistent edge");
    }

    public void addResidualFlowTo(int vertex, double delta)
    {
        if (vertex == v) flow -= delta;
        else if (vertex == w) flow += delta;
        else throw new RuntimeException("Inconsistent edge");
    }

    public String toString()
    { return String.format("%d->%d %.2f %.2f", v, w, capacity, flow); }
}

Flow network: Java implementation

Same as EdgeWeightedGraph, but adjacency lists of FlowEdges instead of Edges.

public class FlowNetwork
{
    private final int V;
    private Bag<FlowEdge>[] adj;
    public FlowNetwork(int V)
    {
        this.V = V;
        adj = (Bag<FlowEdge>[]) new Bag[V];
        for (int v = 0; v < V; v++)
            adj[v] = new Bag<FlowEdge>();
    }
    public void addEdge(FlowEdge e)
    {
        int v = e.from();
        int w = e.to();
        adj[v].add(e);      // add forward edge
        adj[w].add(e);      // add backward edge
    }
    public Iterable<FlowEdge> adj(int v)
    { return adj[v]; }
}

Flow network

Ford-Fulkerson: Java implementation

ALGORITHM: Ford-Fulkerson shortest-augmenting path maxflow algorithm

public class FordFulkerson
{
    private boolean[] marked;   // Is s->v path in residual graph?
    private FlowEdge[] edgeTo;  // last edge on shortest s->v path
    private double value;       // current value of maxflow

    public FordFulkerson(FlowNetwork G, int s, int t)
    { // Find maxflow in flow network G from s to t.

        while (hasAugmentingPath(G, s, t))
        { // While there exists an augmenting path, use it.

            // Compute bottleneck capacity.
            double bottle = Double.POSITIVE_INFINITY;
            for (int v = t; v != s; v = edgeTo[v].other(v))
                bottle = Math.min(bottle, edgeTo[v].residualCapacityTo(v));

            // Augment flow.
            for (int v = t; v != s; v = edgeTo[v].other(v))
                edgeTo[v].addResidualFlowTo(v, bottle);
            value += bottle;
        }
    }

    private boolean hasAugmentingPath(FlowNetwork G, int s, int t)
    {
        marked = new boolean[G.V()];        // Is path to this vertex known?
        edgeTo = new FlowEdge[G.V()];       // last edge on path
        Queue<Integer> q = new Queue<Integer>();

        marked[s] = true;                   // Mark the source
        q.enqueue(s);                       // and put it on the queue.
        while (!q.isEmpty())
        {
            int v = q.dequeue();
            for (FlowEdge e : G.adj(v))
            {
                int w = e.other(v);
                if (e.residualCapacityTo(w) > 0 && !marked[w])
                { // For every edge to an unmarked vertex (in residual)
                    edgeTo[w] = e;          // Save the last edge on a path.
                    marked[w] = true;       // Mark w because a path is known
                    q.enqueue(w);           // and add it to the queue.
                }
            }
        }
        return marked[t];
    }

    public double value() { return value; }
    public boolean inCut(int v) { return marked[v]; }

    public static void main(String[] args)
    {
        FlowNetwork G = new FlowNetwork(new In(args[0]));
        int s = 0, t = G.V() - 1;
        FordFulkerson maxflow = new FordFulkerson(G, s, t);

        StdOut.println("Max flow from " + s + " to " + t);
        for (int v = 0; v < G.V(); v++)
            for (FlowEdge e : G.adj(v))
                if ((v == e.from()) && e.flow() > 0)
                    StdOut.println(" " + e);
        StdOut.println("Max flow value = " + maxflow.value());
    }
}


↥ back to top


5.6. Maxflow and mincut applications

Maxflow/mincut is a widely applicable problem-solving model.

Network flow formulation of bipartite matching

Create s, t, one vertex for each student, and one vertex for each job.

1-1 correspondence between perfect matchings in bipartite graph and integer-valued maxflows of value N.

When no perfect matching, mincut explains why.

Mincut. Consider mincut (A, B).

Maximum flow algorithms: theory

(Yet another) holy grail for theoretical computer scientists.

Maxflow algorithms for sparse digraphs with E edges, integer capacities between 1 and U | year | method | worst case | discovered by | | :–: | :–: | :–: | :–: | | 1951 | simplex | E3 U | Dantzig | | 1955 | augmenting path | E2 U | Ford-Fulkerson | | 1970 | shortest augmenting path | E3 | Dinitz, Edmonds-Karp | | 1970 | fattest augmenting path | E2 log E log( E U ) | Dinitz, Edmonds-Karp | | 1977 | blocking flow | E5/2 | Cherkasky | | 1978 | blocking flow | E7/3 | Galil | | 1983 | dynamic trees | E2 log E | Sleator-Tarjan | | 1985 | capacity scaling | E2 log U | Gabow | | 1997 | length function | E3/2 log E log U | Goldberg-Rao | | 2012 | compact network | E2 / log E | Orlin | | ? | ? | E | ? |


↥ back to top


6. Radix Sort

In computer science, radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distributing elements into buckets according to their radix. For elements with more than one significant digit, this bucketing process is repeated for each digit, while preserving the ordering of the prior step, until all digits have been considered. For this reason, radix sort has also been called bucket sort and digital sort.

Radix sort can be applied to data that can be sorted lexicographically, be they integers, words, punch cards, playing cards, or the mail.

via Wikipedia

6.1. Strings in Java

String. Sequence of characters.

Important fundamental abstraction.

C char data type. Typically an 8-bit integer.

Java char data type. A 16-bit unsigned integer.

6.1.1. String

String data type in Java. Sequence of characters (immutable).

String: Java implementation

public final class String implements Comparable<String> 
{
    private char[] value; // characters
    private int offset; // index of first char in array
    private int length; // length of string
    private int hash; // cache of hashCode()

    public int length() {
        return length;
    }

    public char charAt(int i) {
        return value[i + offset];
    }

    private String(int offset, int length, char[] value) {
        this.offset = offset;
        this.length = length;
        this.value = value;
    }

    public String substring(int from, int to) {
        return new String(offset + from, to - from, value);
    }
}

6.1.2. StringBuilder

The StringBuilder data type. Sequence of characters (mutable).

data type String String StringBuilder StringBuilder
operation guarantee extra space guarantee extra space
length() 1 1 1 1
charAt() 1 1 1 1
substring() 1 1 N N
concat() N N 1 * 1 *

6.1.3. String vs. StringBuilder

How to efficiently reverse a string?

// Algorithm A
public static String reverse(String s)
{
    String rev = "";
    for (int i = s.length() - 1; i >= 0; i--)
        rev += s.charAt(i);
    return rev;
}

// Algorithm B
public static String reverse(String s)
{
    StringBuilder rev = new StringBuilder();
    for (int i = s.length() - 1; i >= 0; i--)
        rev.append(s.charAt(i));
    return rev.toString();
}

How to efficiently form array of suffixes?

// Algorithm A
public static String[] suffixes(String s)
{
    int N = s.length();
    String[] suffixes = new String[N];
    for (int i = 0; i < N; i++)
        suffixes[i] = s.substring(i, N);
    return suffixes;
}

// Algorithm B
public static String[] suffixes(String s)
{
    int N = s.length();
    StringBuilder sb = new StringBuilder(s);
    String[] suffixes = new String[N];
    for (int i = 0; i < N; i++)
        suffixes[i] = sb.substring(i, N);
    return suffixes;
}

NOTE: The order of growth of the amount of memory used by Algorithm A to form the n suffixes of a string of length n is linear in Java 6 and quadratic in Java 7, Update 6.

Why? > Oracle and OpenJDK changed their representation of the **String** data type in Java 7, Update 6 so that the underlying `char[]` array is no longer shared! Now, it takes linear extra space and time to extract a substring (instead of constant extra space and time). > > See [this article](http://java-performance.info/changes-to-string-java-1-7-0_06/) for more details.

How long to compute length of longest common prefix?

public static int lcp(String s, String t)
{
    int N = Math.min(s.length(), t.length());

    // linear time (worst case)
    // sublinear time (typical case)
    for (int i = 0; i < N; i++)
        if (s.charAt(i) != t.charAt(i))
            return i;
    return N;
}

6.1.4. Alphabets

Some applications involve strings taken from a restricted alphabet. In such applications, it often makes sense to use an Alphabet class with the following API:

public class Alphabet  
Alphabet(String s) create a new alphabet from chars in s
char toChar(int index) convert index to corresponding alphabet char
int toIndex(char c) convert c to an index between 0 and R-1
boolean contains(char c) is c in the alphabet?
int R() radix (number of characters in alphabet)
int lgR() number of bits to represent an index
int[] toIndices(String s) convert s to base-R integer
String toChars(int[] indices) convert base-R integer to string over this alphabet

Standard alphabets:

name R() lgR() characters
BINARY 2 1 01
DNA 4 2 ACTG
OCTAL 8 3 01234567
DECIMAL 10 4 0123456789
HEXADECIMAL 16 4 0123456789ABCDEF
PROTEIN 20 5 ACDEFGHIKLMNPQRSTVWY
LOWERCASE 26 5 abcdefghijklmnopqrstuvwxyz
UPPERCASE 26 5 ABCDEFGHIJKLMNOPQRSTUVWXYZ
BASE64 64 6 ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
ASCII 128 7 ASCII characters
EXTENDED_ASCII 256 8 extended ASCII characters
UNICODE16 65536 16 Unicode characters


↥ back to top


6.2. Key-indexed counting

Review: summary of the performance of sorting algorithms.

Frequency of operations = key compares.

algorithm guarantee random extra space stable? operations on keys
insertion sort ½ N2 ¼ N2 1 yes compareTo()
mergesort N lg N N lg N N yes compareTo()
quicksort 1.39 N lg N * 1.39 N lg N c lg N no compareTo()
heapsort 2 N lg N 2 N lg N 1 no compareTo()

* probabilistic

Lower bound. ~ N lg N compares required by any compare-based algorithm. NOTE: can be better(despite the lower bound), if don’t depend on key compares.

Key-indexed counting: assumptions about keys

Assumption. Keys are integers between 0 and R - 1. Implication. Can use key as an array index.

Applications.

Remark. Keys may have associated data ⇒ can’t just count up number of keys of each value.

Goal. Sort an array a[] of N integers between 0 and R - 1.

cumulates, transforming counts to start indices

move items, distributing the records

int N = a.length;
int[] count = new int[R+1];

// count frequencies
for (int i = 0; i < N; i++)
    count[a[i]+1]++;

// compute cumulates, transforming counts to start indices.
for (int r = 0; r < R; r++)
    count[r+1] += count[r];

// move items, distributing the records
for (int i = 0; i < N; i++)
    aux[count[a[i]]++] = a[i];

// copy back
for (int i = 0; i < N; i++)
    a[i] = aux[i];

Proposition. Key-indexed counting uses 8N + 3R + 1 array accesses to stably sort N items whose keys are integers between 0 and R - 1.

Proof. > Immediate from the code. Initializing the arrays uses `N + R + 1` array accesses. The first loop increments a counter for each of the N items (`2N` array accesses); the second loop does R additions (`2R` array accesses); the third loop does N counter increments and N data moves (`3N` array accesses); and the fourth loop does N data moves (`2N` array accesses). **Both moves preserve the relative order of equal keys**.


↥ back to top


6.3. LSD radix sort

LSD string (radix) sort.

Proposition. LSD sorts fixed-length strings in ascending order.

Proof. > [by induction on `i`] > After pass `i`, strings are sorted by last `i` characters. > - If two strings differ on sort key, key-indexed sort puts them in proper relative order. > - If two strings agree on sort key, stability keeps them in proper relative order.

Proposition. LSD sort is stable.

public class LSD
{
    public static void sort(String[] a, int W)
    { // fixed-length W strings
        int R = 256;
        int N = a.length;
        String[] aux = new String[N];

        // do key-indexed counting for each digit from right to left
        for (int d = W-1; d >= 0; d--)
        {
            int[] count = new int[R+1];
            for (int i = 0; i < N; i++)
                count[a[i].charAt(d) + 1]++;

            for (int r = 0; r < R; r++)
                count[r+1] += count[r];

            for (int i = 0; i < N; i++)
                aux[count[a[i].charAt(d)]++] = a[i];

            for (int i = 0; i < N; i++)
                a[i] = aux[i];
        }
    }
}

Proposition. LSD string sort uses ~7WN + 3WR array accesses and extra space proportional to N + R to sort N items whose keys are W-character strings taken from an R-character alphabet.


↥ back to top


6.4. MSD radix sort

Most significant digit first string sort

MSD string (radix) sort.

MSD radix sort

Variable-length strings. Treat strings as if they had an extra char at end (smaller than any char).

C strings. Have extra char '\0' at end ⇒ no extra work needed.

6.4.1. MSD string sort: Java implementation

public class MSD
{
    private static int R = 256;         // radix
    private static final int M = 15;    // cutoff for small subarrays
    private static String[] aux;        // auxiliary array for distribution

    private static int charAt(String s, int d)
    { if (d < s.length()) return s.charAt(d); else return -1; }

    public static void sort(String[] a)
    {
        int N = a.length;
        aux = new String[N];            // can recycle aux[] array, but not count[] array
        sort(a, 0, N-1, 0);
    }

    private static void sort(String[] a, int lo, int hi, int d)
    { // Sort from a[lo] to a[hi], starting at the dth character.
        if (hi <= lo + M)
        { Insertion.sort(a, lo, hi, d); return; }
        int[] count = new int[R+2];     // Compute frequency counts.
        for (int i = lo; i <= hi; i++)
            count[charAt(a[i], d) + 2]++;

        for (int r = 0; r < R+1; r++)   // Transform counts to indices.
            count[r+1] += count[r];

        for (int i = lo; i <= hi; i++) // Distribute.
            aux[count[charAt(a[i], d) + 1]++] = a[i];

        for (int i = lo; i <= hi; i++) // Copy back.
            a[i] = aux[i - lo];
        
        // Recursively sort for each character value.
        for (int r = 0; r < R; r++)
            sort(a, lo + count[r], lo + count[r+1] - 1, d+1);
    }
}

Solution. Cutoff to insertion sort for small subarrays.

NOTE: in Java, forming and comparing substrings is faster than directly comparing chars with charAt()

public static void sort(String[] a, int lo, int hi, int d)
{ // Sort from a[lo] to a[hi], starting at the dth character.
    for (int i = lo; i <= hi; i++)
        for (int j = i; j > lo && less(a[j], a[j-1], d); j--)
            exch(a, j, j-1);
}

private static boolean less(String v, String w, int d)
{ return v.substring(d).compareTo(w.substring(d)) < 0; }

MSD string sort: performance

Number of characters examined.

6.4.2. MSD string sort vs. quicksort for strings

Disadvantages of MSD string sort.

Disadvantage of quicksort.

Goal. Combine advantages of MSD and quicksort.


↥ back to top


6.5. 3-way radix quicksort

3-way string quicksort (Bentley and Sedgewick, 1997)

Overview. Do 3-way partitioning on the dth character.

3-way radix quicksort

6.5.1. 3-way string quicksort: Java implementation

private static void sort(String[] a)
{ sort(a, 0, a.length - 1, 0); }

private static void sort(String[] a, int lo, int hi, int d)
{
    if (hi <= lo) return;
    int lt = lo, gt = hi;
    int v = charAt(a[lo], d);
    int i = lo + 1;
    while (i <= gt)
    {
        int t = charAt(a[i], d);
        if (t < v) exch(a, lt++, i++);
        else if (t > v) exch(a, i, gt--);
        else i++;
    }

    sort(a, lo, lt-1, d);
    if (v >= 0) sort(a, lt, gt, d+1);   // sort 3 subarrays recursively
    sort(a, gt+1, hi, d);
}

6.5.2. 3-way string quicksort vs. standard quicksort

algorithm property
Standard quicksort <ul><li>Uses ~ 2 N ln N string compares on average. </li><li>Costly for keys with long common prefixes (and this is a common case!)</li></ul>
3-way string (radix) quicksort <ul><li>Uses ~ 2 N ln N character compares on average for random strings.</li><li>Avoids re-comparing long common prefixes.</li></ul>

6.5.3. 3-way string quicksort vs. MSD string sort

algorithm property
MSD string sort <ul><li>Is cache-inefficient. </li><li>Too much memory storing count[]. </li><li>Too much overhead reinitializing count[] and aux[]</li></ul>
3-way string (radix) quicksort <ul><li>Has a short inner loop.</li><li>Is cache-friendly.</li><li>Is in-place.</li></ul>

The main reasons to use 3-way radix quicksort are speed and memory. 3-way radix quicksort is not stable and it uses more character compares than MSD radix sort. Both 3-way radix quicksort and MSD radix sort handle variable-length strings.


↥ back to top


6.6. Summary of the performance of sorting algorithms

Frequency of operations.

algorithm guarantee random extra space stable? operations on keys
insertion sort ½ N2 ¼ N2 1 yes compareTo()
mergesort N lg N N lg N N yes compareTo()
quicksort 1.39 N lg N * 1.39 N lg N c lg N no compareTo()
heapsort 2 N lg N 2 N lg N 1 no compareTo()
LSD †</sub> 2 N W 2 N W N + R yes charAt()
MSD ‡</sub> 2 N W N log R N N + D R yes charAt()
3-way string quicksort 1.39 W N lg R * 1.39 N lg N log N + W no charAt()

* probabilistic
† fixed-length W keys
‡ average-length W keys

6.7. Suffix arrays

6.7.1. Keyword-in-context search (KWIC)

Given a text of N characters, preprocess it to enable fast substring search (find all occurrences of query string context).

Applications. Linguistics, databases, web search, word processing, ….

Keyword-in-context search: suffix-sorting solution (sort suffixes to bring repeated substrings together)

6.7.2. Longest repeated substring

Given a string of N characters, find the longest repeated substring.

Applications. Bioinformatics, cryptanalysis, data compression, …

Visualize repetitions in music.

The diagrams in The Shape of Song display musical form as a sequence of translucent arches. Each arch connects two repeated, identical passages of a composition. By using repeated passages as signposts, the diagram illustrates the deep structure of the composition.

Visualize repetitions in music

Brute-force algorithm.

Analysis. Running time ≤ D N2, where D is length of longest match.

Longest repeated substring: a sorting solution

6.7.2.1. Longest repeated substring: Java implementation
public String lrs(String s)
{
    int N = s.length();

    // create suffixes (linear time and space)
    String[] suffixes = new String[N];
    for (int i = 0; i < N; i++)
        suffixes[i] = s.substring(i, N);

    // sort suffixes
    Arrays.sort(suffixes);

    // find LCP between adjacent suffixes in sorted order
    String lrs = "";
    for (int i = 0; i < N-1; i++)
    {
        int len = lcp(suffixes[i], suffixes[i+1]);
        if (len > lrs.length())
            lrs = suffixes[i].substring(0, len);
    }
    return lrs;
}

6.7.3. Suffix sorting: worst-case input

Bad input: longest repeated substring very long.

LRS needs at least 1 + 2 + 3 + ... + D character compares, where D = length of longest match.

Running time. Quadratic (or worse) in D for LRS (and also for sort).

Suffix sorting challenge. Suffix sort an arbitrary string of length N.

worst-case running time of best algorithm for problem:

6.7.4. Manber-Myers MSD algorithm

Suffix Arrays: A New Method for On-Line String Searches

Worst-case running time. N lg N.

The main reason to use the Manber-Myers algorithm instead of 3-way quicksort in order to suffix sort a string of length N:

NOTE:


↥ back to top


6.8. String sorting summary

We can develop linear-time sorts.

We can develop sublinear-time sorts.

3-way string quicksort is asymptotically optimal.

Long strings are rarely random in practice.

See also:


↥ back to top


7. Trie

Performance of symbol-table implementations

Order of growth of the frequency of operations.

implementation typical case
search
typical case
insert
typical case
delete
ordered operations operations on keys
red-black BST log N log N log N yes compareTo()
hash table 1 1 1 no equals()
hashCode()

† under uniform hashing assumption

As with sorting, we can take advantage of properties of strings to develop search methods (symbol-table implementations) that can be more efficient than the general-purpose methods (i.e., BST, hash tables) for typical applications where search keys are strings.

API for a symbol table with string keys

public class StringST< Value>  
StringST() create a symbol table
void put(String key, Value val) put key-value pair into the table (remove key if value is null)
Value get(String key) value paired with key (null if key is absent)
void delete(String key) remove key (and its value)
boolean contains(String key) is there a value paired with key?
boolean isEmpty() is the table empty?
String longestPrefixOf(String s) the longest key that is a prefix of s
Iterable<String> keysWithPrefix(String s) all the keys having s as a prefix
Iterable<String> keysThatMatch(String s) all the keys that match s (where . matches any character)
int size() number of key-value pairs
Iterable<String> keys() all the keys in the table

7.1. R-way tries

Tries. [from retrieval, but pronounced “try”]

Trie

ALGORITHM: Trie symbol table

public class TrieST<Value>
{
    private static int R = 256;     // radix, extended ASCII
    private Node root;              // root of trie

    private static class Node
    {
        private Object val;
        private Node[] next = new Node[R];
    }

    public Value get(String key)
    {
        Node x = get(root, key, 0);
        if (x == null) return null;
        return (Value) x.val;       // cast needed
    }

    private Node get(Node x, String key, int d)
    { // Return value associated with key in the subtrie rooted at x.
        if (x == null) return null;
        if (d == key.length()) return x;
        char c = key.charAt(d);     // Use dth key char to identify subtrie.
        return get(x.next[c], key, d+1);
    }

    public void put(String key, Value val)
    { root = put(root, key, val, 0); }

    private Node put(Node x, String key, Value val, int d)
    { // Change value associated with key if in subtrie rooted at x.
        if (x == null) x = new Node();
        if (d == key.length()) { x.val = val; return x; }
        char c = key.charAt(d);     // Use dth key char to identify subtrie.
        x.next[c] = put(x.next[c], key, val, d+1);
        return x;
    }
        
    public Iterable<String> keys()
    { return keysWithPrefix(""); }

    public Iterable<String> keysWithPrefix(String pre)
    {
        Queue<String> q = new Queue<String>();
        collect(get(root, pre, 0), pre, q);
        return q;
    }

    private void collect(Node x, String pre, Queue<String> q)
    {
        if (x == null) return;
        if (x.val != null) q.enqueue(pre);
        for (char c = 0; c < R; c++)
            collect(x.next[c], pre + c, q);
    }
}

NOTE: The value in Node has to be an Object because Java does not support arrays of generics; cast values back to Value in get().

Deletion in an R-way trie

To delete a key-value pair:

Collecting keys

To iterate through all keys in sorted order:

collecting keys

Prefix matches

Find all keys in a symbol table starting with a given prefix.

Ex. Autocomplete in a cell phone, search bar, text editor, or shell.

Interview question. Design a data structure to perform efficient spell checking.

Solution. > Build a 26-way trie, key = word, value = bit (indicating whether a string ending there is a word).

Longest prefix

Find longest key in symbol table that is a prefix of query string.

Ex. To send packet toward destination IP address, router chooses IP address in routing table that is longest prefix match.

public String longestPrefixOf(String query)
{
    int length = search(root, query, 0, 0);
    return query.substring(0, length);
}
private int search(Node x, String query, int d, int length)
{
    if (x == null) return length;
    if (x.val != null) length = d;
    if (d == query.length()) return length;
    char c = query.charAt(d);
    return search(x.next[c], query, d+1, length);
}

7.1.1. Trie performance

Search hit. Need to examine all L characters for equality.

Search miss.

Space. R null links at each leaf.
(but sublinear space possible if many short strings share common prefixes)

Bottom line. Fast search hit and even faster search miss, but wastes space.


↥ back to top


7.2. Ternary search trie

Ternary search tries: another data structure that responds to the problem of too much memory space used by tries.

Search hit. Node where search ends has a non-null value.

Search miss. Reach a null link or node where search ends has null value.

Ternary search trie

ALGORITHM: TST symbol table

public class TST<Value>
{
    private Node root;              // root of trie

    private class Node
    {
        char c;                     // character
        Node left, mid, right;      // left, middle, and right subtries
        Value val;                  // value associated with string
    }

    public Value get(String key)    // same as for tries (See page 737).

    private Node get(Node x, String key, int d)
    {
        if (x == null) return null;
        char c = key.charAt(d);
        if (c < x.c) return get(x.left, key, d);
        else if (c > x.c) return get(x.right, key, d);
        else if (d < key.length() - 1)
            return get(x.mid, key, d+1);
        else return x;
    }

    public void put(String key, Value val)
    { root = put(root, key, val, 0); }

    private Node put(Node x, String key, Value val, int d)
    {
        char c = key.charAt(d);
        if (x == null) { x = new Node(); x.c = c; }
        if (c < x.c) x.left = put(x.left, key, val, d);
        else if (c > x.c) x.right = put(x.right, key, val, d);
        else if (d < key.length() - 1)
            x.mid = put(x.mid, key, val, d+1);
        else x.val = val;
        return x;
    }
}

7.2.1. TST vs. hashing

algorithm property
Hashing <ul><li>Need to examine entire key.</li><li>Search hits and misses cost about the same.</li><li>Performance relies on hash function.</li><li>Does not support ordered symbol table operations.</li></ul>
TST <ul><li>Works only for strings (or digital keys).</li><li>Only examines just enough key characters.</li><li>Search miss may involve only a few characters.</li><li>Supports ordered symbol table operations (plus others!).</li></ul>

Bottom line. TSTs are:

7.2.2. String symbol table implementation cost summary

implementation character accesses
(typical case)
search hit
character accesses
(typical case)
search miss
character accesses
(typical case)
insert
character accesses
(typical case)
space (references)
moby.txt actors.txt
red-black BST L + c lg 2 N c lg 2 N c lg 2 N 4 N 1.40 97.4
hash table L L L 4 N to 16 N 0.76 40.6
R-way trie L log R N L (R + 1) N 1.12 out of memory
TST L + ln N ln N L + ln N 4 N 0.72 38.7
TST with R2 L + ln N ln N L + ln N 4 N + R2 0.51 32.7

If space is available, R-way tries provide the fastest search, essentially completing the job with a constant number of character compares. For large alphabets, where space may not be available for R-way tries, TSTs are preferable, since they use a logarithmic number of character compares, while BSTs use a logarithmic number of key compares. Hashing can be competitive, but, as usual, cannot support ordered symbol-table operations or extended character-based API operations such as prefix or wildcard match.

7.3. Patricia trie

Practical Algorithm to Retrieve Information Coded in Alphanumeric

Applications.

Also known as: crit-bit tree, radix tree.

7.4. Suffix tree

Suffix tree.

Applications.

7.5. String symbol tables summary

algorithm property
Red-black BST <ul><li>Performance guarantee: log N key compares.</li><li>Supports ordered symbol table API.</li></ul>
Hash tables <ul><li>Performance guarantee: constant number of probes.</li><li>Requires good hash function for key type.</li></ul>
Tries. R-way, TST <ul><li>Performance guarantee: log N characters accessed.</li><li>Supports character-based operations.</li></ul>

See also:


↥ back to top


Goal. Find pattern of length M in a text of length N.

Substring search applications

8.1. Brute force

Check for pattern starting at each text position.

8.1.1. Brute-force substring search: Java implementation

public static int search(String pat, String txt)
{
    int M = pat.length();
    int N = txt.length();
    for (int i = 0; i <= N - M; i++)
    {
        int j;
        for (j = 0; j < M; j++)
            if (txt.charAt(i+j) != pat.charAt(j))
                break;
        if (j == M) return i; // index in text where pattern starts
    }
    return N;   // not found
}

Brute-force algorithm can be slow if text and pattern are repetitive.

Worst case. ~ M N char compares.

In many applications, we want to avoid backup in text stream.

Brute-force algorithm needs backup for every mismatch.

brute force backup

Maintain buffer of last M characters.

8.1.2. Brute-force substring search: alternate implementation

Same sequence of char compares as previous implementation.

public static int search(String pat, String txt)
{
    int i, N = txt.length();
    int j, M = pat.length();
    for (i = 0, j = 0; i < N && j < M; i++)
    {
        if (txt.charAt(i) == pat.charAt(j)) j++;
        else { i -= j; j = 0; }     // explicit backup
    }
    if (j == M) return i - M;
    else return N;
}

Theoretical challenge. Linear-time guarantee.

Practical challenge. Avoid backup in text stream.


↥ back to top


8.2. Knuth-Morris-Pratt (KMP)

Intuition. Suppose we are searching in text for pattern BAAAAAAAAA.

8.2.1. Deterministic finite state automaton (DFA)

DFA is abstract string-searching machine.

Key differences from brute-force implementation.

public int search(String txt)
{
    int i, j, N = txt.length();
    for (i = 0, j = 0; i < N && j < M; i++)
        j = dfa[txt.charAt(i)][j];  // no backup

    if (j == M) return i - M;   // found (hit end of pattern)
    else return N;              // not found (hit end of text)
}

Running time.

KMP DFA

8.2.2. Build DFA from pattern

Match transition. If in state j and next char c == pat.charAt(j), go to j+1. Mismatch transition. If in state j and next char c != pat.charAt(j), then the last j-1 characters of input are pat[1..j-1], followed by c.

NOTE:

build DFA

Constructing the DFA for KMP substring search: Java implementation

For each state j:

public KMP(String pat)
{   // Build DFA from pattern.
    this.pat = pat;
    M = pat.length();
    dfa = new int[R][M];
    dfa[pat.charAt(0)][0] = 1;

    for (int X = 0, j = 1; j < M; j++)
    {
        for (int c = 0; c < R; c++)
            dfa[c][j] = dfa[c][X];      // copy mismatch cases
        dfa[pat.charAt(j)][j] = j+1;    // set match case
        X = dfa[pat.charAt(j)][X];      // update restart state
    }
}

Running time. M character accesses (but space/time proportional to R M).

KMP substring search analysis

Proposition. KMP substring search accesses no more than M + N chars to search for a pattern of length M in a text of length N.
Pf. Each pattern char accessed once when constructing the DFA; each text char accessed once (in the worst case) when simulating the DFA.

Proposition. KMP constructs dfa[][] in time and space proportional to R M.

Larger alphabets. Improved version of KMP constructs nfa[] in time and space proportional to M.

ALGORITHM: Knuth-Morris-Pratt substring search

public class KMP
{
    private String pat;
    private int[][] dfa;
    public KMP(String pat)
    { // Build DFA from pattern.
        this.pat = pat;
        int M = pat.length();
        int R = 256;
        dfa = new int[R][M];
        dfa[pat.charAt(0)][0] = 1;
        for (int X = 0, j = 1; j < M; j++)
        { // Compute dfa[][j].
            for (int c = 0; c < R; c++)
                dfa[c][j] = dfa[c][X]; // Copy mismatch cases.
            dfa[pat.charAt(j)][j] = j+1; // Set match case.
            X = dfa[pat.charAt(j)][X]; // Update restart state.
        }
    }
    
    public int search(String txt)
    { // Simulate operation of DFA on txt.
        int i, j, N = txt.length(), M = pat.length();
        for (i = 0, j = 0; i < N && j < M; i++)
            j = dfa[txt.charAt(i)][j];
        if (j == M) return i - M; // found (hit end of pattern)
        else return N; // not found (hit end of text)
    }
    
    public static void main(String[] args)
    {
        String pat = args[0];
        String txt = args[1];
        KMP kmp = new KMP(pat);
        StdOut.println("text: " + txt);
        int offset = kmp.search(txt);
        StdOut.print("pattern: ");
        for (int i = 0; i < offset; i++)
            StdOut.print(" ");
        StdOut.println(pat);
    }
}


↥ back to top


Intuition.

Q. How much to skip? A. Precompute index of rightmost occurrence of character c in pattern (-1 if character not in pattern).

Boyer-Moore skip table

ALGORITHM: Boyer-Moore substring search (mismatched character heuristic)

public class BoyerMoore
{
    private int[] right;
    private String pat;

    BoyerMoore(String pat)
    { // Compute skip table.
        this.pat = pat;
        int M = pat.length();
        int R = 256;
        right = new int[R];
        for (int c = 0; c < R; c++)
            right[c] = -1;              // -1 for chars not in pattern
        for (int j = 0; j < M; j++)     // rightmost position for
            right[pat.charAt(j)] = j;   // chars in pattern
    }

    public int search(String txt)
    { // Search for pattern in txt.
        int N = txt.length();
        int M = pat.length();
        int skip;
        for (int i = 0; i <= N-M; i += skip)
        { // Does the pattern match the text at position i ?
            skip = 0;
            for (int j = M-1; j >= 0; j--)
                if (pat.charAt(j) != txt.charAt(i+j))
                {
                    skip = j - right[txt.charAt(i+j)];
                    if (skip < 1) skip = 1;
                    break;
                }
            if (skip == 0) return i;    // found.
        }
        return N;                       // not found.
    }
}

The constructor in this substring search algorithm builds a table giving the rightmost occurrence in the pattern of each possible character. The search method scans from right to left in the pattern, skipping to align any character causing a mismatch with its rightmost occurrence in the pattern.

8.3.1. Boyer-Moore: analysis

Property. Substring search with the Boyer-Moore mismatched character heuristic takes about ~ N / M character compares to search for a pattern of length M in a text of length N.

Worst-case. Can be as bad as ~ M N.

Boyer-Moore variant. Can improve worst case to ~ 3 N character compares by adding a KMP-like rule to guard against repetitive patterns.


↥ back to top


Basic idea = modular hashing.

8.4.1. Efficiently computing the hash function

Modular hash function. Using the notation ti for txt.charAt(i), we wish to compute
xi = ti RM-1 + ti+1 RM-2 + … + ti+M-1 R0 (mod Q)

Intuition. M-digit, base-R integer, modulo Q.

Horner’s method. Linear-time method to evaluate degree-M polynomial.

// Compute hash for M-digit key
private long hash(String key, int M)
{
    long h = 0;
    for (int j = 0; j < M; j++)
        h = (R * h + key.charAt(j)) % Q;
    return h;
}

Challenge. How to efficiently compute xi+1 given that we know xi.
xi = ti RM-1 + ti+1 RM-2 + … + ti+M-1 R0
xi+1 = ti+1 RM-1 + ti+2 RM-2 + … + ti+M R0

Key property. Can update hash function in constant time.
xi+1 = (xi – ti RM-1) R + ti+M

ALGORITHM: Rabin-Karp fingerprint substring search

public class RabinKarp
{
    private String pat;         // pattern (only needed for Las Vegas)
    private long patHash;       // pattern hash value
    private int M;              // pattern length
    private long Q;             // a large prime
    private int R = 256;        // alphabet size
    private long RM;            // R^(M-1) % Q
    public RabinKarp(String pat)
    {
        this.pat = pat;         // save pattern (only needed for Las Vegas)
        this.M = pat.length();

        Q = longRandomPrime();  // See Exercise 5.3.33 (Book)

        RM = 1;
        for (int i = 1; i <= M-1; i++)  // Compute R^(M-1) % Q for use
            RM = (R * RM) % Q;          // in removing leading digit.
        patHash = hash(pat, M);
    }

    public boolean check(int i) // Monte Carlo
    { return true; }            // For Las Vegas, check pat vs txt(i..i-M+1).
    
    private long hash(String key, int M)
    {
        long h = 0;
        for (int j = 0; j < M; j++)
            h = (R * h + key.charAt(j)) % Q;
        return h;
    }

    private int search(String txt)
    { // Search for hash match in text.
        int N = txt.length();
        long txtHash = hash(txt, M);
        if (patHash == txtHash) return 0; // Match at beginning.

        for (int i = M; i < N; i++)
        { // Remove leading digit, add trailing digit, check for match.
            txtHash = (txtHash + Q - RM*txt.charAt(i-M) % Q) % Q;
            txtHash = (txtHash*R + txt.charAt(i)) % Q;
            if (patHash == txtHash)
                if (check(i - M + 1)) return i - M + 1; // match
        }
        return N; // no match found
    }
}

Monte Carlo version. Return match if hash match.

Las Vegas version. Check for substring match if hash match; continue search if false collision.

Rabin-Karp analysis

Theory. If Q is a sufficiently large random prime (about M N2), then the probability of a false collision is about 1 / N.

Practice. Choose Q to be a large prime (but not so large to cause overflow). Under reasonable assumptions, probability of a collision is about 1 / Q.

Advantages.

Disadvantages.

8.5. Cost summary for substring search implementations

Cost summary for substring search


↥ back to top


9. Regular Expressions

9.1. Pattern matching

Examples.

Applications.

Test if a string matches some pattern.

Parse text files.


↥ back to top


9.2. Regular expressions

A regular expression is a notation to specify a set of strings (possibly infinite).

operations:

Ex. [A-E]+ is shorthand for (A|B|C|D|E)(A|B|C|D|E)*

RE notation is surprisingly expressive.

REs play a well-understood role in the theory of computation.

Regular expression caveat

Writing a RE is like writing a program. REs are amazingly powerful and expressive, but using them in applications can be amazingly complex and error-prone.


↥ back to top


9.3. Duality between REs and DFAs

RE. Concise way to describe a set of strings.

DFA. Machine to recognize whether a given string is in a given set.

Kleene’s theorem.

RE (number of 1’s is a multiple of 3) DFA (number of 1’s is a multiple of 3)
0* \| (0*10*10*10*)* DFA

9.3.1. Pattern matching implementation

To apply Kleene’s theorem by building Deterministic finite state automata (DFA) from RE is infeasible because DFA may have exponential # of states.

Instead, apply Kleene’s theorem by building Nondeterministic finite state automata (NFA) from RE and simulating NFA with text as input.

9.3.1.1. Nondeterministic finite-state automata

Regular-expression-matching NFA.

Nondeterminism.

NFA

Q. How to determine whether a string is matched by an automaton?

Q. How to simulate NFA?
A. Systematically consider all possible transition sequences.

Q. How to efficiently simulate an NFA?
A. Maintain set of all possible states that NFA could be in after reading in the first i text characters.

Digraph reachability. Find all vertices reachable from a given source or set of vertices.

NFA simulation: Java implementation

public class NFA
{
    private char[] re;  // match transitions
    private Digraph G;  // epsilon transition digraph
    private int M;      // number of states

    public NFA(String regexp)
    {
        M = regexp.length();
        re = regexp.toCharArray();
        G = buildEpsilonTransitionDigraph();
    }

    public boolean recognizes(String txt)
    {
        Bag<Integer> pc = new Bag<Integer>();
        DirectedDFS dfs = new DirectedDFS(G, 0);
        for (int v = 0; v < G.V(); v++)
            if (dfs.marked(v)) pc.add(v);
        for (int i = 0; i < txt.length(); i++)
        {
            Bag<Integer> states = new Bag<Integer>();
            for (int v : pc)
            {
                if (v == M) continue;
                if ((re[v] == txt.charAt(i)) || re[v] == '.')
                    states.add(v+1);
            }
            dfs = new DirectedDFS(G, states);
            pc = new Bag<Integer>();
            for (int v = 0; v < G.V(); v++)
                if (dfs.marked(v)) pc.add(v);
        }
        for (int v : pc)
            if (v == M) return true;
        return false;
    }

    public Digraph buildEpsilonTransitionDigraph()
    { /* stay tuned */ }
}

NFA simulation: analysis

Proposition. Determining whether an N-character text is recognized by the NFA corresponding to an M-character pattern takes time proportional to M N in the worst case.

Proof. >For each of the N text characters, we iterate through a set of states of size no more than M and run DFS on the graph of ε-transitions. [The NFA construction we will consider ensures the number of edges ≤ 3M.]
9.3.1.2. NFA construction: implementation

States. Include a state for each symbol in the RE, plus an accept state.

Concatenation. Add match-transition edge from state corresponding to characters in the alphabet to next state.

Metacharacters. ( ) . * |

Goal. Write a program to build the ε-transition digraph.

Challenges. Remember left parentheses to implement closure and or; remember | to implement or.

Solution. Maintain a stack.

private Digraph buildEpsilonTransitionDigraph() {
    Digraph G = new Digraph(M+1);
    Stack<Integer> ops = new Stack<Integer>();
    for (int i = 0; i < M; i++) {
        int lp = i;
        if (re[i] == '(' || re[i] == '|') ops.push(i);  // left parentheses and |
        else if (re[i] == ')') {
            int or = ops.pop();
            if (re[or] == '|') {
                lp = ops.pop();     // 2-way or
                G.addEdge(lp, or+1);
                G.addEdge(or, i);
            }
            else lp = or;
        }
        if (i < M-1 && re[i+1] == '*') {    // closure (needs 1-character lookahead)
            G.addEdge(lp, i+1);
            G.addEdge(i+1, lp);
        }
        if (re[i] == '(' || re[i] == '*' || re[i] == ')')   // metasymbols
            G.addEdge(i, i+1);
    }
    return G;
}
9.3.1.3. NFA construction: analysis

Proposition. Building the NFA corresponding to an M-character RE takes time and space proportional to M.

Proof. For each of the M characters in the RE, we add at most three ε-transitions and execute at most two stack operations.


↥ back to top


9.4. Generalized regular expression print (grep)

Grep. Take a RE as a command-line argument and print the lines from standard input having some substring that is matched by the RE.

public class GREP
{
    public static void main(String[] args)
    {
        String re = "(.*" + args[0] + ".*)";    // contains RE as a substring
        NFA nfa = new NFA(re);
        while (StdIn.hasNextLine())
        {
            String line = StdIn.readLine();
            if (nfa.recognizes(line))
                StdOut.println(line);
        }
    }
}

Industrial-strength grep implementation

To complete the implementation:

Worst-case for grep (proportional to M N) is the same as for brute-force substring search.

Regular expressions in Java

Harvesting information

Goal. Print all substrings of input that match a RE.

RE pattern matching is implemented in Java’s java.util.regexp.Pattern and java.util.regexp.Matcher classes.

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class Harvester {
    public static void main(String[] args) {
        String regexp = args[0];
        In in = new In(args[1]);
        String input = in.readAll();

        Pattern pattern = Pattern.compile(regexp);  // compile() creates a Pattern (NFA) from RE
        Matcher matcher = pattern.matcher(input);   // matcher() creates a Matcher (NFA simulator) from NFA and text

        while (matcher.find()) {                    // find() looks for the next match
            StdOut.println(matcher.group());        // group() returns the substring most recently found by find()
        }
    }
}

Algorithmic complexity attacks

Warning. Typical implementations do not guarantee performance!

Context

Abstract machines, languages, and nondeterminism.

Compiler. A program that translates a program to machine code.

  KMP grep Java
pattern string RE program
parser unnecessary check if legal check if legal
compiler output DFA NFA byte code
simulator DFA simulator NFA simulator JVM


↥ back to top


9.5. Summary of pattern-matching algorithms

Programmer.

Theoretician.


↥ back to top


10. Data Compression

API for static methods that read from a bitstream on standard input: | public class BinaryStdIn || | :–: | :–: | | boolean readBoolean() | read 1 bit of data and return as a boolean value| | char readChar() | read 8 bits of data and return as a char value| | char readChar(int r) | read r (between 1 and 16) bits of data and return as a char value [similar methods for byte (8 bits); short (16 bits); int (32 bits); long and double (64 bits)]| | boolean isEmpty() | is the bitstream empty?| | void close() | close the bitstream |

API for static methods that write to a bitstream on standard output: | public class BinaryStdOut || | :–: | :–: | | void write(boolean b) | write the specified bit| | void write(char c) | write the specified 8-bit char| | void write(char c, int r) | write the r (between 1 and 16) least significant bits of the specified char [similar methods for byte (8 bits); short (16 bits); int (32 bits); long and double (64 bits)]| | void close() | close the bitstream|

Example. Suppose that you have a nn-byte input stream consisting of n 7-bit ASCII characters. Approximately what compression ratio does Huffman compression achieve if each ASCII character appears with equal frequency?

answer. 7/8. The Huffman trie will be a complete binary trie of height 7 (with the 128 characters in the leaves). Therefore, the encoding of each character will require 7 bits.

Proposition. No algorithm can compress every bitstream.

10.1. Run-length encoding

Simple type of redundancy in a bitstream. Long runs of repeated bits.

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1

Representation. 4-bit counts to represent alternating runs of 0s and 1s: 15 0s, then 7 1s, then 7 0s, then 11 1s.

public class RunLength
{
    private final static int R = 256;
    private final static int lgR = 8;

    public static void expand()
    {
        boolean b = false;
        while (!BinaryStdIn.isEmpty())
        {
            char cnt = BinaryStdIn.readChar();
            for (int i = 0; i < cnt; i++)
                BinaryStdOut.write(b);
            b = !b;
        }
        BinaryStdOut.close();
    }

    public static void compress()
    {
        char cnt = 0;
        boolean b, old = false;
        while (!BinaryStdIn.isEmpty())
        {
            b = BinaryStdIn.readBoolean();
            if (b != old)
            {
                BinaryStdOut.write(cnt);
                cnt = 0;
                old = !old;
            }
            else
            {
                if (cnt == 255)
                {
                    BinaryStdOut.write(cnt);
                    cnt = 0;
                    BinaryStdOut.write(cnt);
                }
            }
            cnt++;
        }
        BinaryStdOut.write(cnt);
        BinaryStdOut.close();
    }
}


↥ back to top


10.2. Huffman compression

Use different number of bits to encode different chars.

Q. How do we avoid ambiguity?
A. Ensure that no codeword is a prefix of another.

Prefix-free codes: trie representation

Q. How to represent the prefix-free code?
A. A binary trie!

Prefix-free codes: compression and expansion

Compression.

Expansion.

Huffman trie node data type

private static class Node implements Comparable<Node>
{
    private final char ch; // used only for leaf nodes
    private final int freq; // used only for compress
    private final Node left, right;
    
    public Node(char ch, int freq, Node left, Node right)
    {
        this.ch = ch;
        this.freq = freq;
        this.left = left;
        this.right = right;
    }
    
    public boolean isLeaf()
    { return left == null && right == null; }
    
    public int compareTo(Node that)
    { return this.freq - that.freq; }
}

Running time. Linear in input size N.

Q. How to write the trie?
A. Write preorder traversal of trie; mark leaf and internal nodes with a bit.

Q. How to read in the trie?
A. Reconstruct from preorder traversal of trie.

Constructing a Huffman encoding trie: Java implementation

private static Node buildTrie(int[] freq)
{
    // Initialize priority queue with singleton trees.
    MinPQ<Node> pq = new MinPQ<Node>();
    for (char i = 0; i < R; i++)
        if (freq[i] > 0)
            pq.insert(new Node(i, freq[i], null, null));

    while (pq.size() > 1)
    {   // Merge two smallest trees.
        Node x = pq.delMin();
        Node y = pq.delMin();
        Node parent = new Node('\0', x.freq + y.freq, x, y);
        pq.insert(parent);
    }
    return pq.delMin();
}

Proposition. [Huffman 1950s] Huffman algorithm produces an optimal prefix-free code.

ALGORITHM Huffman compression

Huffman algorithm:

public class Huffman
{
    private static int R = 256; // ASCII alphabet

    public static void compress()
    {
        // Read input.
        String s = BinaryStdIn.readString();
        char[] input = s.toCharArray();
        // Tabulate frequency counts.
        int[] freq = new int[R];

        for (int i = 0; i < input.length; i++)
            freq[input[i]]++;

        // Build Huffman code trie.
        Node root = buildTrie(freq);
        // Build code table (recursive).
        String[] st = new String[R];

        buildCode(st, root, "");
        // Print trie for decoder (recursive).
        writeTrie(root);
        // Print number of chars.
        BinaryStdOut.write(input.length);
        // Use Huffman code to encode input.
        
        for (int i = 0; i < input.length; i++)
        {
            String code = st[input[i]];
            for (int j = 0; j < code.length(); j++)
                if (code.charAt(j) == '1')
                    BinaryStdOut.write(true);
                else BinaryStdOut.write(false);
        }
        BinaryStdOut.close();
    }

    private static String[] buildCode(Node root)
    { // Make a lookup table from trie.
        String[] st = new String[R];
        buildCode(st, root, "");
        return st;
    }

    private static void buildCode(String[] st, Node x, String s)
    { // Make a lookup table from trie (recursive).
        if (x.isLeaf())
        { st[x.ch] = s; return; }
        buildCode(st, x.left, s + '0');
        buildCode(st, x.right, s + '1');
    }

    public void expand()
    {
        Node root = readTrie();         // read in encoding trie
        int N = BinaryStdIn.readInt();  // read in number of chars
        for (int i = 0; i < N; i++)
        {
            Node x = root;
            while (!x.isLeaf())         // expand codeword for ith char
            {
                if (!BinaryStdIn.readBoolean())
                    x = x.left;
                else
                    x = x.right;
            }
            BinaryStdOut.write(x.ch, 8);
        }
        BinaryStdOut.close();
    }

    private static void writeTrie(Node x)
    {
        if (x.isLeaf())
        {
            BinaryStdOut.write(true);
            BinaryStdOut.write(x.ch, 8);
            return;
        }
        BinaryStdOut.write(false);
        writeTrie(x.left);
        writeTrie(x.right);
    }

    private static Node readTrie()
    {
        if (BinaryStdIn.readBoolean())
        {
            char c = BinaryStdIn.readChar(8);
            return new Node(c, 0, null, null);
        }
        Node x = readTrie();
        Node y = readTrie();
        return new Node('\0', 0, x, y);
    }

    private static Node buildTrie(int[] freq)
    {
        // Initialize priority queue with singleton trees.
        MinPQ<Node> pq = new MinPQ<Node>();
        for (char i = 0; i < R; i++)
            if (freq[i] > 0)
                pq.insert(new Node(i, freq[i], null, null));

        while (pq.size() > 1)
        {   // Merge two smallest trees.
            Node x = pq.delMin();
            Node y = pq.delMin();
            Node parent = new Node('\0', x.freq + y.freq, x, y);
            pq.insert(parent);
        }
        return pq.delMin();
    }
}

Implementation.

Running time. Using a binary heap ⇒ N + R log R.


↥ back to top


10.3. LZW compression

Static model. Same model for all texts.

Dynamic model. Generate model based on text.

Adaptive model. Progressively learn and update model as you read text.

10.3.1. LZW compression

LZW compression.

Q. How to represent LZW compression code table?
A. A trie to support longest prefix match.

10.3.2. LZW expansion

LZW expansion.

Q. How to represent LZW expansion code table?
A. An array of size 2W.

LZW compression: Java implementation


public class LZW
{
    private static final int R = 256;   // number of input chars
    private static final int L = 4096;  // number of codewords = 2^12
    private static final int W = 12;    // codeword width

    public static void compress()
    {
        String input = BinaryStdIn.readString();
        TST<Integer> st = new TST<Integer>();
        for (int i = 0; i < R; i++)
            st.put("" + (char) i, i);
        int code = R+1;                 // R is codeword for EOF.
        while (input.length() > 0)
        {
            String s = st. l ongestPrefixOf(input); // Find max prefix match.
            BinaryStdOut.write(st.get(s), W);       // Print s's encoding.
            int t = s.length();
            if (t < input.length() && code < L)     // Add s to symbol table.
                st.put(input.substring(0, t + 1), code++);
            input = input.substring(t);             // Scan past s in input.
        }
        BinaryStdOut.write(R, W);                   // Write EOF.
        BinaryStdOut.close();
    }

    public static void expand()
    {
        String[] st = new String[L];
        int i;                                  // next available codeword value
        for (i = 0; i < R; i++)                 // Initialize table for chars.
            st[i] = "" + (char) i;
        st[i++] = " ";                          // (unused) lookahead for EOF
        int codeword = BinaryStdIn.readInt(W);
        String val = st[codeword];
        while (true)
        {
            BinaryStdOut.write(val);            // Write current substring.
            codeword = BinaryStdIn.readInt(W);
            if (codeword == R) break;
            String s = st[codeword];            // Get next codeword.
            if (i == codeword)                  // If lookahead is invalid,
                s = val + val.charAt(0);        // make codeword from last one.
            if (i < L)
                st[i++] = val + s.charAt(0);    // Add new entry to code table.
            val = s;                            // Update current codeword.
        }
        BinaryStdOut.close();
    }
}

LZW implementation details

How big to make ST?

What to do when ST fills up?

Why not put longer substrings in ST?

Lempel-Ziv and friends.

Lossless data compression benchmarks

year scheme bits/char
1967 ASCII 7.00
1950 Huffman 4.70
1977 LZ77 3.94
1984 LZMW 3.32
1987 LZH 3.30
1987 move-to-front 3.24
1987 LZB 3.18
1987 gzip 2.71
1988 PPMC 2.48
1994 SAKDC 2.47
1994 PPM 2.34
1995 Burrows-Wheeler 2.29
1997 BOA 1.99
1999 RK 1.89

Note: data compression using Calgary corpus


↥ back to top


10.4. Data compression summary

Lossless compression.

Lossy compression.

Theoretical limits on compression.

Practical compression. Use extra knowledge whenever possible.


↥ back to top


11. Advanced topics

Main topics.

Shifting gears.

Goals.

11.1. Reductions

Desiderata. Classify problems according to computational requirements.

Def. Problem X reduces to problem Y if you can use an algorithm that solves Y to help solve X.

Cost of solving X = total cost of solving Y (perhaps many calls to Y on problems of different sizes) + cost of reduction (preprocessing and postprocessing).

Ex 1. [finding the median reduces to sorting]
To find the median of N items:

Cost of solving finding the median. N log N + 1. (cost of sorting + cost of reduction)

Ex 2. [element distinctness reduces to sorting] To solve element distinctness on N items:

Cost of solving element distinctness. N log N + N.

11.1.1. designing algorithms

Establishing lower bounds through reduction is an important tool in guiding algorithm design efforts.

Design algorithm. Given algorithm for Y, can also solve X.

Ex.

11.1.1.1. Convex hull reduces to sorting

Sorting. Given N distinct integers, rearrange them in ascending order.

Convex hull. Given N points in the plane, identify the extreme points of the convex hull (in counterclockwise order).

Proposition. Convex hull reduces to sorting.
Pf. Graham scan algorithm.

Cost of convex hull. N log N + N. (cost of sorting + cost of reduction)

Graham scan algorithm.

11.1.1.2. Shortest paths on edge-weighted graphs and digraphs

Proposition. Undirected shortest paths (with nonnegative weights) reduces to directed shortest path.
Pf. Replace each undirected edge by two directed edges.

Cost of undirected shortest paths. E log V + E. (cost of shortest paths in digraph + cost of reduction)

Caveat. Reduction is invalid for edge-weighted graphs with negative weights (even if no negative cycles).

11.1.1.3. Linear-time reductions involving familiar problems

sorting:

linear programming:

11.1.2. establishing lower bounds

Goal. Prove that a problem requires a certain number of steps.

Ex. In decision tree model, any compare-based sorting algorithm requires Ω(N log N) compares in the worst case.

11.1.2.1. Linear-time reductions

Def. Problem X linear-time reduces to problem Y if X can be solved with:

Ex. Almost all of the reductions we’ve seen so far. [Which ones weren’t?]

Establish lower bound:

Mentality.

Proposition. In quadratic decision tree model, any algorithm for sorting N integers requires Ω(N log N) steps.

Proposition. Sorting linear-time reduces to convex hull.

Implication. Any ccw-based convex hull algorithm requires Ω(N log N) ops.

Q. How to convince yourself no linear-time convex hull algorithm exists?
A1. [hard way] Long futile search for a linear-time algorithm.
A2. [easy way] Linear-time reduction from sorting.


↥ back to top


11.1.3. Classifying problems

Desiderata. Problem with algorithm that matches lower bound.
Ex. Sorting and convex hull have complexity N log N.

Desiderata. Prove that two problems X and Y have the same complexity.

Integer multiplication. Given two N-bit integers, compute their product.
Brute force. N2 bit operations.

number of bit operations to multiply two N-bit integers

year algorithm order of growth
? brute force N2
1962 Karatsuba-Ofman N1.585
1963 Toom-3, Toom-4 N1.465, N1.404
1966 Toom-Cook N1 + ε
1971 Schönhage–Strassen N log N log log N
2007 Fürer N log N 2log*N
? ? N

Linear algebra reductions

Matrix multiplication. Given two N-by-N matrices, compute their product.
Brute force. N3 flops.

year algorithm order of growth
? brute force N3
1969 Strassen N2.808
1978 Pan N2.796
1979 Bini N2.780
1981 Schönhage N2.522
1982 Romani N2.517
1982 Coppersmith-Winograd N2.496
1986 Strassen N2.479
1989 Coppersmith-Winograd N2.376
2010 Strother N2.3737
2011 Williams N2.3727
? ? N2 + ε

Desiderata. Classify problems according to computational requirements.

complexity order of growth examples
linear N min, max, median,
linearithmic N log N sorting, convex hull,
M(N) ? integer multiplication, division, square root, …
MM(N) ? matrix multiplication, Ax = b, least square, determinant, …
NP-complete probably not Nb SAT, IND-SET, ILP, …

Summary

Reductions are important in theory to:

Reductions are important in practice to:


↥ back to top


11.2. Linear Programming

What is it? Problem-solving model for optimal allocation of scarce resources, among a number of competing activities that encompasses:

Why significant?

11.2.1. Applications

11.2.2. Standard form linear program

Goal. Maximize linear objective function of n nonnegative variables, subject to m linear equations.

Caveat. No widely agreed notion of “standard form.”

matrix version

Standard form.

Geometry

Inequalities define halfspaces; feasible region is a convex polyhedron.

A set is convex if for any two points a and b in the set, so is ½ (a + b).

An extreme point of a set is a point in the set that can’t be written as ½ (a + b), where a and b are two distinct points in the set.

Extreme point property. If there exists an optimal solution to (P), then there exists one that is an extreme point.

Greedy property. Extreme point optimal iff no better adjacent extreme point.

11.2.3. Simplex algorithm

Simplex algorithm. [George Dantzig, 1947]

Generic algorithm.

A basis is a subset of m of the n variables.

Basic feasible solution (BFS).

pivot

Q. Why pivot on column 2 (corresponding to variable B)?

Q. Why pivot on row 2?

Q. When to stop pivoting? A. When no objective function coefficient is positive.

Q. Why is resulting solution optimal? A. Any feasible solution satisfies current system of equations.


↥ back to top


11.2.4. Implementations

Construct the initial simplex tableau.

dimension n m 1
m A I b
1 c 0 0
public class Simplex {
    private double[][] a;       // simplex tableaux
    private int m, n;           // M constraints, N variables

    public Simplex(double[][] A, double[] b, double[] c) {
        m = b.length;
        n = c.length;
        a = new double[m + 1][m + n + 1];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++)
                a[i][j] = A[i][j];                          // put A[][] into tableau
        
        for (int j = n; j < m + n; j++) a[j - n][j] = 1.0;  // put I[][] into tableau
        for (int j = 0; j < n; j++) a[m][j] = c[j];         // put c[] into tableau
        for (int i = 0; i < m; i++) a[i][m + n] = b[i];     // put b[] into tableau
    }

    private int bland() {
        for (int q = 0; q < m + n; q++)
            if (a[M][q] > 0) return q;      // entering column q has positive objective function coefficient
        return -1;                          // optimal
    }

    private int minRatioRule(int q) {
        int p = -1;                         // leaving row
        for (int i = 0; i < m; i++) {
            if (a[i][q] <= 0) continue;     // consider only positive entries
            else if (p == -1) p = i;
            else if (a[i][m + n] / a[i][q] < a[p][m + n] / a[p][q])
                p = i;                      // row p has min ratio so far
        }
        return p;
    }

    public void pivot(int p, int q) {
        for (int i = 0; i <= m; i++)
            for (int j = 0; j <= m + n; j++)
                if (i != p && j != q)
                    a[i][j] -= a[p][j] * a[i][q] / a[p][q]; // scale all entries but row p and column q

        for (int i = 0; i <= m; i++)
            if (i != p) a[i][q] = 0.0;          // zero out column q

        for (int j = 0; j <= m + n; j++)
            if (j != q) a[p][j] /= a[p][q];     // scale row p
        a[p][q] = 1.0;
    }

    public void solve() {
        while (true) {
            int q = bland();            // entering column q (optimal if -1)
            if (q == -1) break;
            int p = minRatioRule(q);    // leaving row p (unbounded if -1)
            if (p == -1) ...
            pivot(p, q);                // pivot on row p, column q
        }
    }
}

Remarkable property. In typical practical applications, simplex algorithm terminates after at most 2(m + n) pivots.

Pivoting rules. Carefully balance the cost of finding an entering variable with the number of pivots needed.

Cycling. Get stuck by cycling through different bases that all correspond to same extreme point.

To improve the bare-bones implementation.

Best practice. Don’t implement it yourself!

Basic implementations. Available in many programming environments.
Industrial-strength solvers. Routinely solve LPs with millions of variables.
Modeling languages. Simplify task of modeling problem as LP.

Advantages of an industrial-strength linear programming solver over the one we implemented:

Brief history


↥ back to top


11.2.5. Reductions to standard form

Linear “programming” (1950s term) = reduction to LP (modern term).

Modeling.

  1. Identify variables.
  2. Define constraints (inequalities and equations).
  3. Define objective function.
  4. Convert to standard form. (software usually performs this step automatically)

Examples.


↥ back to top


11.3. Intractability

Church-Turing thesis (1936)

Turing machines can compute any function that can be computed by a physically harnessable process of the natural world.

Turing machine is a simple and universal model of computation.

Q. Which algorithms are useful in practice?

Ex 1. Sorting N items takes N log N compares using mergesort.
Ex 2. Finding best TSP tour on N points takes N! steps using brute search.

Theory. Definition is broad and robust.
Practice. Poly-time algorithms scale to huge problems.

Def. A problem is intractable if it can’t be solved in polynomial time.
Desiderata. Prove that a problem is intractable.

11.3.1. Four fundamental problems

Q. Which of these problems have poly-time algorithms?

11.3.2. Search problems

Search problem. Given an instance I of a problem, find a solution S.
Requirement. Must be able to efficiently check that S is a solution.

11.3.3. P vs. NP

  Def Significance
NP NP is the class of all search problems. What scientists and engineers aspire to compute feasibly.
P P is the class of search problems solvable in poly-time. What scientists and engineers do compute feasibly.

Nondeterminism

Nondeterministic machine can guess the desired solution.

Ex. int[] a = new int[N];

ILP. Given a system of linear inequalities, guess a 0-1 solution.

Ex. Turing machine.

NP. Search problems solvable in poly time on a nondeterministic TM.

Extended Church-Turing thesis

P = search problems solvable in poly-time in the natural world.

Evidence supporting thesis. True for all physical computers. Natural computers? No successful attempts (yet).

Does P = NP ? [Can you always avoid brute-force searching and do better]

Overwhelming consensus. P ≠ NP.


↥ back to top


11.3.4. classifying problems

11.3.4.1. A key problem: satisfiability

SAT. Given a system of boolean equations, find a solution.

Key applications.

Exhaustive search

Q. How to solve an instance of SAT with n variables?
A. Exhaustive search: try all 2n truth assignments.
Q. Can we do anything substantially more clever?
Conjecture. No poly-time algorithm for SAT.

Q. Which search problems are in P?
A. No easy answers (we don’t even know whether P = NP).

Problem X poly-time reduces to problem Y if X can be solved with:

Consequence. If SAT poly-time reduces to Y, then we conclude that Y is (probably) intractable.

11.3.4.2. SAT poly-time reduces to ILP

SAT. Given a system of boolean equations, find a solution.

(can to reduce any SAT problem to this form)

ILP. Given a system of linear inequalities, find a 0-1 solution.

(solution to this ILP instance gives solution to original SAT instance)

Still more reductions from SAT


↥ back to top


11.3.5. NP-completeness

Def. An NP problem is NP-complete if every problem in NP poly-time reduce to it.

Proposition. [Cook 1971, Levin 1973] SAT is NP-complete. (every NP problem is a SAT problem in disguise)

Extremely brief proof sketch:

Corollary. Poly-time algorithm for SAT iff P = NP.

Implication. [SAT captures difficulty of whole class NP]

Remark. Can replace SAT with any of Karp’s problems.

Proving a problem NP-complete guides scientific inquiry.

Summary

P. Class of search problems solvable in poly-time.
NP. Class of all search problems, some of which seem wickedly hard.
NP-complete. Hardest problems in NP.
Intractable. Problem with no poly-time algorithm.

Many fundamental problems are NP-complete.

Use theory a guide:

Exploiting intractability

Modern cryptography.

RSA cryptosystem.

FACTOR. Given an n-bit integer x, find a nontrivial factor.

Q. What is complexity of FACTOR?
A. In NP, but not known (or believed) to be in P or NP-complete.

Q. What if P = NP?
A. Poly-time algorithm for factoring; modern e-conomy collapses.

Proposition. [Shor 1994] Can factor an n-bit integer in n3 steps on a “quantum computer.”

Q. Do we still believe the extended Church-Turing thesis???

Coping with intractability

Relax one of desired features.

Special cases may be tractable.

Develop a heuristic, and hope it produces a good solution.

Approximation algorithm. Find solution of provably good quality.

Complexity theory deals with worst case behavior.

Hamilton path

Goal. Find a simple path that visits every vertex exactly once.

Remark. Euler path easy, but Hamilton path is NP-complete.

public class HamiltonPath
{
    private boolean[] marked;   // vertices on current path
    private int count = 0;      // number of Hamiltonian paths

    public HamiltonPath(Graph G)
    {
        marked = new boolean[G.V()];
        for (int v = 0; v < G.V(); v++)
            dfs(G, v, 1);
    }

    private void dfs(Graph G, int v, int depth) // length of current path (depth of recursion)
    {
        marked[v] = true;
        if (depth == G.V()) count++; // found one
        for (int w : G.adj(v))
            if (!marked[w]) dfs(G, w, depth+1); // backtrack if w is already part of path
        marked[v] = false;  // clean up
    }
}


↥ back to top


Longest-path, Daniel Barrett

Original song The longest Time
Original artist Billy Joel
--------------------------

Woh, oh-oh-oh
Find the Longest Path
Woh oh-oh
Find the Longest Path
If you said P is NP tonight
There would still be papers left to write
I have a weakness
I'm addicted to completeness
And I keep searching for the longest Path

The algorithm I would like to see
Is of Polynomial Degree
Buts its elusive,
Nobody has found conclusive
Evidence that we can find the Longest Path

I have been hard
Working for so long
I swear its right,
And he marks it wrong
Somehow I'll feel sorry when its done
GPA 2.1,
Is more than I hoped for

Garey, Johnson, Karp and other Men (and Women)
Try to make it Order n log n.
Am I a math fool
If I spend my life in Grad School
Forever following the Longest Path.

Woh oh-oh-oh
Find the longest path
Woh oh-oh-oh
Find the longest path

This concludes Algorithm Part II.


↥ back to top


12. Dynamic Programming (DP)

This part is extra content from MIT open course INTRODUCTION TO ALGORITHMS.

Invented by Richard Bellman (as a precursor to the Bellman-Ford algorithm)

Dynamic programming can be thought as a kind of exhaustive search, which is usually a bad thing to do because it leads to exponential time. But if you do it in a clever way, via dynamic programming, you typically get polynomial time.

In computing, memoization is used to speed up computer programs by eliminating the repetitive computation of results, and by avoiding repeated calls to functions that process the same input.

Memoization is a specific form of caching that is used in dynamic programming. The purpose of caching is to improve the performance of our programs and keep data accessible that can be used later. It basically stores the previously calculated result of the subproblem and uses the stored result for the same subproblem. This removes the extra effort to calculate again and again for the same problem. And we already know that if the same problem occurs again and again, then that problem is recursive in nature.

via what is memoization

Dynamic Programming:

12.1. fibonacci

DP is similar to recursion + memoization

Bottom-up DP algorithm for fibonacci

def fibo(n):
    fib = [0] * (n + 1)
    for k in range(n + 1):
        if k == 2 or k == 1: fib[k] = 1
        else:
            fib[k] = fib[k-1] + fib[k-2]
    return fib[n]
graph LR;
    Z[...] --> A[F_n-3];
    A --> B[F_n-2];
    Z --> B;
    B --> C[F_n-1];
    A --> C;
    C --> D[F_n];
    B --> D;
def fibo2(n):
    """
    constant space
    """
    fib = [0] * 2
    for k in range(1,n+1):
        if k == 2 or k == 1: fib[k%2] = 1
        else:
            fib[k%2] = fib[(k-1)%2] + fib[(k-2)%2]
    return fib[n%2]

12.2. Shortest paths

graph LR;
    A((s)) --> B((_));
    A --> X((...));
    B --> C((_));
    C --> D((u));
    D --> E((v));
graph TD;
    s --> b;
    b --> a;
    a --> v;
    a --> s;
    v --> b;

5 “easy” steps to Dynamic Programming:

  1. define subproblems. count #subproblems
  2. guess (part of solution). count #choices for guess
  3. relate subproblem solutions. compute time/subproblem
  4. recurse + memoize. time = time/subproblem * #subproblems
    • or build DP table bottom-up
    • check subproblems acyclic/topological order
  5. solve original problem: = a subproblem or by combining subproblem solutions. extra time


↥ back to top