Recitation 4
Recitation notes for the week of September 18, 2017.
Homework reminders
Here are some gentle reminders about HW policies:
- Make sure you submit your HW as a PDF only. Otherwise your submission will not be graded.
- Remember to list collaborators and sources with your submission. Write "None" if appropriate.
- You can collaborate with at most 2 other people, and it must be the same group for both Q2 and Q3.
- NO collaboration is allowed on Q1. Discussion of Q1 (outside of Piazza) is not allowed.
- We will be running MOSS to check for cheating on Q1.
- Proof Idea and Proof Details must be clearly separated and labeled. Not doing so will result in a ZERO for the problem.
Q1 on HW 2
Understanding the problem statement
Input to the problem
$m$ hospitals, each with a preference list of all $n$ students.
$n$ students, each with a preference list of all $m$ hospitals.
Difference from Stable Matching Problem in class
- The number of hospitals and the number of students are different.
- One hospital can have more than one open slot, but each student is only assigned to one hospital.
- There are more students than the total number of available slots across all $m$ hospitals.
Overall Goal
Given $m$ hospitals and $n$ students, output one stable assignment of students to hospitals.
As in the Stable Matching problem, a stable assignment is one that does not have any instability. However, unlike the earlier case there can be two kinds of instability, which we talk about next.
Instability (kind 1)
There are students $s$ and $s'$, and a hospital $h$, so that:
- $s$ is assigned to $h$, and
- $s'$ is assigned to no hospital, and
- $h$ prefers $s'$ to $s$.
For example, consider the following input: \[h_1 (2 \text{ spots}): s_1 > s_2 > s_3 > s_4~~~~ s_1: h_2 > h_1~~~~ s_3: h_1 > h_2\] \[h_2 (1 \text{ spot}):~ s_2 > s_4 > s_3 > s_1~~~~ s_2: h_2 > h_1~~~~ s_4: h_1 > h_2\] and the following assignment: \[\{(h_1, s_3), (h_1, s_4), (h_2, s_2)\}.\]
The above assignment is unstable because $s_1$ was not assigned to anyone, but is highest on $h_1$'s preference list.
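The check above can be sketched in a few lines of Python. This is a minimal sketch with my own variable names and data layout (dictionaries of preference lists), using the example input above; it reports every pair witnessing a kind-1 instability:

```python
# Sketch: find kind-1 instabilities -- an unmatched student s' whom some
# hospital h prefers to one of its currently assigned students s.
# Preference lists and assignment are from the example above.
hospital_prefs = {
    "h1": ["s1", "s2", "s3", "s4"],
    "h2": ["s2", "s4", "s3", "s1"],
}
assignment = {"h1": ["s3", "s4"], "h2": ["s2"]}
all_students = {"s1", "s2", "s3", "s4"}

matched = {s for students in assignment.values() for s in students}
unmatched = all_students - matched  # here: {"s1"}

found = []
for h, students in assignment.items():
    rank = {s: i for i, s in enumerate(hospital_prefs[h])}  # lower = preferred
    for s in students:
        for s_prime in unmatched:
            if rank[s_prime] < rank[s]:  # h prefers the unmatched s'
                found.append((h, s_prime, s))
print(found)  # → [('h1', 's1', 's3'), ('h1', 's1', 's4')]
```

Both of $h_1$'s assigned students are flagged, since $s_1$ ranks above each of them on $h_1$'s list.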
Instability (kind 2)
There are students $s$ and $s'$, and hospitals $h$ and $h'$, so that:
- $s$ is assigned to $h$ and
- $s'$ is assigned to $h'$, and
- $h$ prefers $s'$ to $s$, and
- $s'$ prefers $h$ to $h'$.
For example, consider the following input: \[h_1 (2 \text{ spots}):~~~~ s_1 > s_2 > s_3 > s_4~~~~ s_1: h_2 > h_1~~~~ s_3: h_1 > h_2\] \[h_2 (1 \text{ spot}):~~~~~ s_2 > s_4 > s_3 > s_1~~~~ s_2: h_2 > h_1~~~~ s_4: h_1 > h_2\] and the following assignment: \[\{(h_1, s_1), (h_1, s_2), (h_2, s_4)\}.\]
The above assignment is unstable because $h_2$ prefers $s_2$ to current ($s_4$) and $s_2$ prefers $h_2$ to current ($h_1$).
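The kind-2 check can be sketched similarly. This is a minimal sketch under my own representation (a `student -> hospital` dictionary for the assignment), again on the example input above:

```python
# Sketch: find kind-2 instabilities -- a hospital h and a student s'
# assigned elsewhere who would both rather be matched with each other.
hospital_prefs = {
    "h1": ["s1", "s2", "s3", "s4"],
    "h2": ["s2", "s4", "s3", "s1"],
}
student_prefs = {
    "s1": ["h2", "h1"], "s2": ["h2", "h1"],
    "s3": ["h1", "h2"], "s4": ["h1", "h2"],
}
assigned_to = {"s1": "h1", "s2": "h1", "s4": "h2"}  # assignment from the example

found = []
for s, h in assigned_to.items():
    h_rank = {x: i for i, x in enumerate(hospital_prefs[h])}
    for s_prime, h_prime in assigned_to.items():
        if h_prime == h:
            continue  # same hospital: not a kind-2 candidate
        s_rank = {x: i for i, x in enumerate(student_prefs[s_prime])}
        # h prefers s' to s, and s' prefers h to its own hospital h'
        if h_rank[s_prime] < h_rank[s] and s_rank[h] < s_rank[h_prime]:
            found.append((h, s, s_prime, h_prime))
print(found)  # → [('h2', 's4', 's2', 'h1')]
```

The single instability found is exactly the one described above: $h_2$ and $s_2$ each prefer the other to their current match.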
An example (input and output and why this is stable)
We will now go over the example given in HW 2.
Input:
3 <- Number of hospitals
5 <- Number of students
1 2 3 5 1 4 <- 1st hospital: the first number is its number of available slots, followed by its preference list (most preferred first)
1 5 1 2 4 3 <- 2nd hospital: number of slots, then preference list (most preferred first)
2 5 2 3 1 4 <- 3rd hospital: number of slots, then preference list (most preferred first)
2 1 3 <- Preference of the 1st student (most preferred first)
3 2 1 <- Preference of the 2nd student (most preferred first)
3 1 2 <- Preference of the 3rd student (most preferred first)
1 2 3 <- Preference of the 4th student (most preferred first)
1 2 3 <- Preference of the 5th student (most preferred first)
We claim that the following is a stable assignment:
(1, 5) <- Pairing of the form (h, s)
(2, 1)
(3, 2)
(3, 3)
Let’s take a look at why this output is stable:
- $(1,5)$: Look at students $2$ and $3$, who rank higher than $5$ on $h_1$'s preference list. There would be an instability only if $2$ or $3$ were unmatched or preferred $h_1$ to their current hospital. Neither holds: both are matched to $h_3$, their top choice.
- $(2,1)$: Only $5$ ranks higher than $1$ on $h_2$'s preference list. $s_5$ is matched to $h_1$, its top choice, so it does not prefer $h_2$; no instability.
- Repeat for $(3,2)$ and $(3,3)$.
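The whole case analysis above can be automated. Below is a minimal sketch (my own function and variable names) that checks both kinds of instability against the HW 2 example input and confirms the claimed assignment is stable:

```python
# Sketch: verify a hospital/student assignment is stable by checking
# both kinds of instability. Data is the HW 2 example above.
slots = {1: 1, 2: 1, 3: 2}  # capacities (not needed for the stability check itself)
hospital_prefs = {
    1: [2, 3, 5, 1, 4],
    2: [5, 1, 2, 4, 3],
    3: [5, 2, 3, 1, 4],
}
student_prefs = {
    1: [2, 1, 3], 2: [3, 2, 1], 3: [3, 1, 2],
    4: [1, 2, 3], 5: [1, 2, 3],
}
assigned_to = {5: 1, 1: 2, 2: 3, 3: 3}  # student -> hospital, the claimed output

def is_stable(assigned_to):
    h_rank = {h: {s: i for i, s in enumerate(p)} for h, p in hospital_prefs.items()}
    s_rank = {s: {h: i for i, h in enumerate(p)} for s, p in student_prefs.items()}
    unmatched = set(student_prefs) - set(assigned_to)
    for s, h in assigned_to.items():
        # kind 1: an unmatched student whom h prefers to its assigned s
        for s_prime in unmatched:
            if h_rank[h][s_prime] < h_rank[h][s]:
                return False
        # kind 2: a student at another hospital; both prefer each other
        for s_prime, h_prime in assigned_to.items():
            if h_prime != h and h_rank[h][s_prime] < h_rank[h][s] \
                    and s_rank[s_prime][h] < s_rank[s_prime][h_prime]:
                return False
    return True

print(is_stable(assigned_to))  # → True
```

Swapping student $3$ for student $4$ at hospital $3$ makes `is_stable` return `False`, since $h_3$ then prefers the unmatched $s_3$ to its assigned $s_4$ (a kind-1 instability).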
Asymptotic Notation
Overview of "linear" bounds on runtime
- $O(n)$ - upper bound for worst case run time
- an upper bound, not the exact growth rate of the function
- $\Omega(n)$ - lower bound for worst case run time (NOT THE BEST CASE RUNTIME)
- a lower bound, not the exact growth rate
- $\Theta(n)$ - exact run time - if you can show something is $O(n)$ and $\Omega(n)$.
Simple example with search
The search problem
Given a string s, search for a character c in the string.
Consider the most basic way to solve the above problem, i.e., iterating over the string:
def search(s, c):
    for a in s:
        if a == c:
            return a  # found the character
    return -1  # not found
This is $O(n)$: in the worst case we iterate over every character in the string. As the string grows, the worst-case running time grows linearly.
Another example
Calling the len function
The length is stored together with the string when the string is created. For example, if we store $s$ = cat, then len(s) $= 3$ is recorded at that point.
What are the asymptotic bounds on calling len?
Calling len takes the same amount of time no matter how big the string is, so it runs in constant time:
- $O(1)$ since worst case we find the length of the string in constant time.
- $\Omega(1)$ since we cannot take less than constant time to find the length of the string.
- Since big $O$ and big $\Omega$ are the same, this solution runs in $\Theta(1)$.
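The idea behind the $\Theta(1)$ bound can be made concrete with a toy container (this is a sketch of the principle, not a claim about how any particular runtime is implemented): the length is maintained as items are added, so reading it back never scans the data.

```python
# Toy sketch: keep a running length so that asking for it never
# requires scanning the stored items.
class CountedBag:
    def __init__(self):
        self._items = []
        self._length = 0  # updated on every insert, O(1) to read back

    def add(self, item):
        self._items.append(item)
        self._length += 1

    def __len__(self):
        return self._length  # constant time, independent of size

bag = CountedBag()
for ch in "cat":
    bag.add(ch)
print(len(bag))  # → 3
```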
Common runtimes of algorithms
$O(n)$ - linear
Examples: computing the maximum in a list, merging two sorted lists.
$O(n^2)$ - quadratic
Arises naturally from two nested for loops, why?
Examples: brute-force check of whether two numbers in a list add up to a given target, brute-force substring search.
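The first example above can be sketched directly, and it shows where the two nested loops (and hence the $O(n^2)$) come from. Function and variable names here are my own:

```python
# Sketch of the brute-force "two numbers sum to a target" check:
# the two nested loops are exactly where the O(n^2) comes from.
def has_pair_with_sum(nums, target):
    n = len(nums)
    for i in range(n):             # n iterations ...
        for j in range(i + 1, n):  # ... times up to n - 1 each: O(n^2)
            if nums[i] + nums[j] == target:
                return True
    return False

print(has_pair_with_sum([3, 1, 4, 1, 5], 9))   # → True  (4 + 5)
print(has_pair_with_sum([3, 1, 4, 1, 5], 11))  # → False
```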
$O(\log n)$ - logarithmic
Example in class with binary search. Recap of why this is true:
[1,2,3,4] → takes 2 steps with a list of size 4
  [1,2] [3,4]
  [1] [2] [3] [4]
[1,2,3,4,5,6,7,8] → takes 3 steps with a list of size 8 (double the size)
  [1,2,3,4] [5,6,7,8]
  [1,2] [3,4] [5,6] [7,8]
  [1] [2] [3] [4] [5] [6] [7] [8]
[1,2,3,4,5] → best case: searching for 3; worst case: searching for 5
  [3,4,5]
  [4,5]
  [5]
So binary search is $O(\log n)$ and $\Omega(1)$ (the $\Omega(\log{n})$ lower bound can also be shown -- left as an exercise!).
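The halving pictured above can be written as a short loop. This is a standard binary-search sketch over a sorted list (not code from class; names are my own):

```python
# Sketch of binary search on a sorted list: each step halves the
# remaining range, giving the O(log n) bound discussed above.
def binary_search(a, target):
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1   # target is in the right half
        else:
            hi = mid - 1   # target is in the left half
    return -1  # not found

print(binary_search([1, 2, 3, 4, 5], 3))  # → 2  (best case: found in one step)
print(binary_search([1, 2, 3, 4, 5], 5))  # → 4  (worst case for this list)
```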
$O(n \log n)$
Common with any algorithm that splits the input into two and solves each piece recursively and combines the two pieces in linear time.
For example: Mergesort, so also common with any algorithm that sorts first then performs a linear operation on the input since $O(n \log n) + O(n) = O(n \log n)$.
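Mergesort itself follows exactly that pattern. A minimal sketch (my own naming), with the linear-time merge called out:

```python
# Sketch of mergesort: split in half, sort each half recursively,
# merge in linear time -- the classic O(n log n) pattern.
def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    # merge the two sorted halves in O(n)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 4, 1, 3]))  # → [1, 2, 3, 4, 5]
```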
Further Reading
- Make sure you carefully read the policy documents: Syllabus and Homework policies.
- Support page on Asymptotic Analysis.
- Support page on lower bounds.