Greedy algorithm is a type of algorithm that builds up a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion. One can often design many different greedy algorithms for the same problem, each one locally, incrementally optimizing some different measure on its way to a solution.
The interval Scheduling Problem consists of finding the compatible set of maximum size from a set of requests, in which each request has a starting time $s_i$ and finishing time $f_i$.
To design a greedy algorithm to approach the problem, the basic idea is to use a simple rule to select the first request $i_1$. Once a request $i_1$ is accepted, all requests that are not compatible with $i_1$ are rejected. The algorithm terminates when all requests are either accepted or rejected.
Potential approaches:
Always select the available request that starts the earliest
Always select the intervals that requires the smallest interval of time
Doesn't work so well neither. Accepting the short interval in the middle might prevent us from accepting the other two in the middle that forms an optimal solution
For each request, count the number of other requests that are not compatible, and accept the request that has the fewest conflicts
TLDR: Doesn't work.
The unique optimal solution is to accept the four requests in the top row. The greedy method suggested here accepts the middle request in the second row and thereby ensures a solution of size no greater than three.
First accept the requests that finishes the first
R = set(all_requests)
A = set()
while R is not empty:
i = the smallest finishing time request in R
A.add(i)
delete all requests in R that are not compatible with i
return A
We know that the result A we obtained is compatible, but we want to show that it's optimal. Let O be the optimal solution, if |A| = |O|, A will also be proved to be optimal.
Let $i_1,...,i_k$ be the set of requests in A and $j_1,...,j_m$ be the set of requests in O. Want to show k = m.
Assume the request in O are also ordered.
Proof by Induction
Base case: Since the request in O are also ordered, our greedy algorithm guaranteed that $f(i_1)$ ≤ $f(j_1)$
Induction Hypothesis: for r > 1, $f(i_{r-1})$ ≤ $f(j_{r-1})$
Since $f(j_{r-1})$ ≤ $s(j_{r})$, $f(i_{r-1})$ ≤ $s(j_{r})$
Thus the interval $j_r$ **is in the set R of available intervals at the time when the greedy algorithm selects $i_r$. The greedy algorithm selects the available interval with smallest finish time, thus $f(i_{r})$ ≤ $f(j_{r})$
$O(nlogn)$. We can first sort the request array in O(nlogn), and for an additional loop through the sorted array, we construct an array S[1...n] with the property that S[i] contains the value s(i). Then, we traverse through the sorted request time and add subsequent intervals in.