Interval Scheduling via examples
In which we derive an algorithm that solves the Interval Scheduling problem via a sequence of examples.
The problem
In these notes we will solve the following problem:
Interval Scheduling Problem
Input: An input of $n$ intervals $[s(i), f(i))$, or in other words, {$s(i)$, $...$ , $f(i)-1$} for $1 ≤ i ≤ n $ where $i$ represents the intervals, $s(i)$ represents the start time, and $f(i)$ represents the finish time.
Output: A schedule $S$ of $n$ intervals where no two intervals in $S$ conflict, and the total number of intervals in $S$ is maximized.
Sample Input and Output
Input:
Output: [Task 2, Task 3, Task 4]
is an optimal solution because these tasks have no conflicts with each other and any set with 4 tasks will have at least two intervals in conflict.
Next we go through a sequence of examples to build towards an algorithm that will solve the above problem. We encourage you to work on each question first and then toggle for the answer.
But if you must:
Example 1
Assumption: Let's first assume that the input intervals do not overlap each other:
Or a more generally:
In this case, the output solution set is the same as the input solution set since neither Tasks conflict with each other. So we have [Task 1, Task 2]
.
Question 1
What would the pseudocode for the above look like?
Algorithm draft 1
R: set of requests
Initialize S to be the empty set
While R is not empty
Choose i in R
Add i to S
Return S* = S
Example 2
Assumption: Let's now assume that the maximum amount of conflicts that any Task can have is 1:
Or a more generally:
Question 2
How would you modify Algorithm draft 1 to handle this case?
Answer 2
In this case it doesn’t matter which Task we choose. We can have a final set of [Task 1, Task 2]
or [Task 2, Task 3]
and both solutions have a total of two tasks in them so both of them are considered optimal. The only update to put in the algorithm is to remove any conflicts when we iteratively add Tasks to the solution set.
Algorithm draft 2
R: set of requests
Initialize S to be the empty set
While R is not empty
Choose i in R
Add i to S
Remove all requests that conflict with i from R
Return S* = S
But what if a Task can have an arbitrary number of conflicts?
Example 3
Task 1
, we can choose Task 2
or Task 5
to add to the solution set since those two are the only Tasks which do not conflict with Task 1
. If we choose to add Task 4
to the solution set we can add Task 3
and Task 5
which will be a more optimal solution.
Question 3
How should we update Algorithm draft 2 to handle this case?
Answer 3
This problem can be solved using the greedy approach of choosing the next element of the Task list based on some property the Tasks have and iteratively building up a solution.
Generic Algorithm
For each interval $i$, compute a value $v(i)$
R: set of requests
Initialize S to be the empty set
While R is not empty
Choose i in R where v(i) is minimized
Add i to S
Remove all requests that conflict with i from R
Return S* = S
Shortest Duration
A natural solution would be to select Tasks based on their duration. What this means is run algorithm draft 2 but instead of arbitrarily choosing a Task, choose one with the shortest duration.
Algorithm draft 3
Recall from the problem statement that $i$ is an interval, let $s(i)$ be the start time of the interval, and let $f(i)$ be the finish time of the interval. From these definitions, duration of an interval can be represented as $f(i) - s(i)$, and shortest duration would be the case where $f(i) - s(i)$ is the smallest.
R: set of requests
Initialize S to be the empty set
While R is not empty
Choose i in R where f(i) - s(i) is minimized
Add i to S
Remove all requests that conflict with i from R
Return S* = S
Question 4
If we run the above algorithm with Example 3 what is the result?
Answer 4
1. Choose Task 2
2. Remove Task 4
and Task 5
since they conflict with Task 2
3. Choose Task 3
4. Remove Task 1
since it conflicts with Task 3
5. Exit algorithm
An optimal solution for this example can be found by observation. [Task 3, Task 4, Task 5]
has the most amount of intervals that do not conflict with each other. [Task 2, Task 3]
is not an optimal solution so this is not the correct way to greedily solve this problem.
Earliest Start Time
Since shortest duration does not work, we need to find a new parameter to use for choosing the Tasks.
Algorithm draft 4
From our definitions, start time of an interval is represented as $s(i)$, so the earliest start time is when $s(i)$ is the smallest.
R: set of requests
Initialize S to be the empty set
While R is not empty
Choose i in R where s(i) is minimized
Add i to S
Remove all requests that conflict with i from R
Return S* = S
If our algorithm broke ties correctly (Task 1
and Task 3
in this example) then an optimal solution would emerge:
1. Choose Task 3
2. Remove 1 since it is in conflict
3. Choose Task 4
4. Remove Task 2
5. Choose Task 5
Solution set will then be [Task 3, Task 4, Task 5]
which reflects the above.
Question 5
Unfortunately, the above algorithm does not work. Can you think of a counter example for the above?
Example 4
Answer 5
Task 6
and remove all others that are in conflict with it which happens to be everything else in the input set.
Minimum Number of Conflicts
Example 4 could work if we updated the algorithm to select Tasks that have the minimum number of conflicts across the entire input Task set. If the algorithm picks Task 3
or Task 5
first, it would remove all conflicts and result in the correct solution of [Task 3, Task 4, Task 5]
Algorithm draft 5
If $i$ and $j$ are two distinct intervals, there is a conflict if $s(j) < f(i) < f(j)$ or $s(i) < f(j) < f(i)$.
R: set of requests
Initialize S to be the empty set
While R is not empty
Choose i in R with the minimum number of conflicts
Add i to S
Remove all requests that conflict with i from R
Return S* = S
Note
Keep in mind that when the algorithm removes all conflicts for a given $i$ it may also change the number of conflicts for other intervals.
Question 6
Can you think of a counter example for the above?
Example 5
Answer 6
[Task 3, Task 4, Task 5, Task 8, Task 9, Task 10, Task 11]
.
A simple run of the algorithm draft on this example:
1. Choose Task 13
since it only has 2 conflicts
2. Remove Task 9
and Task 10
3. Choose Task 8
(Arbitrarily since all remaining Tasks have the same # of conflicts)
4. Remove Task 12
, Task 15
, Task 17
5. Choose Task 11
(Arbitrarily since all remaining Tasks have the same # of conflicts)
...
Since the algorithm already did not place Task 9
and Task 10
in the solution set, we know this will not be an optimal solution.
Home Exercise
The above is an extension of an example found on pg. 117 of the textbook. Can you find an example with strictly less than 11 intervals?
Earliest End Time
As it turns out, the correct solution is to select the Task based on the earliest end time. The result on Example 5 is then:
1. Choose Task 3
2. Remove 1, 6, and 7 since they are in conflict
3. Choose Task 4
4. Remove Task 2
5. Choose Task 5
6. Choose Task 8
7. Remove 12, 15, 17
8. Choose 9
9. Remove 13
10. Choose 10
11. Remove 14, 16, 18
12. Choose 11
Solution set will then be [Task 3, Task 4, Task 5, Task 8, Task 9, Task 10, Task 11]
which is what we want.
Algorithm Idea
The greedy solution to this problem is to remove an interval from the input set with the earliest finish time, add it to the solution set, and remove all other intervals that conflict with it from the input set. Our algorithm will continue to run these steps until the input set is empty. As a result, the solution set will have an optimal solution and can be returned.
Final Algorithm (Algorithm Details)
R: set of requests
Initialize S to be the empty set
While R is not empty
Choose i in R where f(i) is the smallest
Add i to S
Remove all requests that conflict with i from R
Return S* = S
Question 7
Can you prove that the Final Algorithm will always work?
Answer 7
We will prove this in class.
Question 8
Why does the Final Algorithm terminate?
Answer 8
We will show that this algorithm runs in time $O(n^2)$. In particular, it terminates.