CS 61B: Selected Solutions (Practice for the Final Exam)
[1] [a] insertion sort
71808294
17808294
01788294
01278894
01247889
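    For reference, a Java sketch (not the official course code) of the
    in-place insertion sort that generates the rows above; a row is shown
    only when the array changes.

        static void insertionSort(int[] a) {
          for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
              a[j + 1] = a[j];  // shift a larger key one slot right
              j--;
            }
            a[j + 1] = key;     // drop the key into the gap
          }
        }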
[b] selection sort
71808294
01878294
01278894
01248897
01247898
01247889
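    A sketch of selection sort in the same spirit: each pass swaps the
    smallest key of the unsorted suffix into the next position.

        static void selectionSort(int[] a) {
          for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
              if (a[j] < a[min]) {
                min = j;       // index of the smallest key seen so far
              }
            }
            int tmp = a[i];    // swap the minimum into position i
            a[i] = a[min];
            a[min] = tmp;
          }
        }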
[c] mergesort
71808294
17,80,8294
17,08,8294
17,08,28,94
17,08,28,49
0178,28,49
0178,2489
01247889
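    The trace merges adjacent runs bottom-up: runs of length one, then
    two, then four. A sketch of the merge step it relies on:

        // Merge the sorted runs a[lo..mid] and a[mid+1..hi], using a
        // scratch array and copying the result back.
        static void merge(int[] a, int lo, int mid, int hi) {
          int[] tmp = new int[hi - lo + 1];
          int i = lo, j = mid + 1, k = 0;
          while (i <= mid && j <= hi) {
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];  // take the smaller
          }
          while (i <= mid) { tmp[k++] = a[i++]; }  // drain the leftovers
          while (j <= hi)  { tmp[k++] = a[j++]; }
          System.arraycopy(tmp, 0, a, lo, tmp.length);
        }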
[d] quicksort
71808294
21808794
21088794
210|4|8798
0|12|4|8798
012|4|7898
012|4|7|8|98
01247889
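    The intermediate arrays in a quicksort trace depend on how the pivot
    is chosen and how equal keys are handled, so this generic Java sketch
    will not necessarily reproduce the swaps shown above.

        static void quicksort(int[] a, int lo, int hi) {
          if (lo >= hi) { return; }
          int pivot = a[(lo + hi) / 2];  // one of many possible pivot choices
          int i = lo, j = hi;
          while (i <= j) {               // two-pointer partition
            while (a[i] < pivot) { i++; }
            while (a[j] > pivot) { j--; }
            if (i <= j) {
              int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
              i++; j--;
            }
          }
          quicksort(a, lo, j);           // recurse on both sides
          quicksort(a, i, hi);
        }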
[e] heapsort
71808294 \
81807294 |
82807194 | bottomUpHeap()
82897104 |
82897140 /
08897241
01897842
01298874
01249887
01247898
01247898
01247889
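    In the trace above, bottomUpHeap() builds a min-heap stored backward
    in the tail of the array; each later row removes the minimum and grows
    the sorted prefix at the front. (The repeated row is a swap of two
    equal 8s, so nothing visibly moves.) A more textbook variant keeps a
    max-heap at the front of the array; this Java sketch produces
    different intermediate arrays, but the structure is the same.

        static void heapsort(int[] a) {
          int n = a.length;
          for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n);  // bottom-up max-heap construction
          }
          for (int end = n - 1; end > 0; end--) {
            int tmp = a[0]; a[0] = a[end]; a[end] = tmp;  // max to the end
            siftDown(a, 0, end);  // restore the heap over the prefix
          }
        }

        static void siftDown(int[] a, int i, int n) {
          while (2 * i + 1 < n) {
            int child = 2 * i + 1;
            if (child + 1 < n && a[child + 1] > a[child]) { child++; }
            if (a[i] >= a[child]) { break; }
            int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
            i = child;
          }
        }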
[2] Extend each item so that it has a "secondary key," which is the index of
the item in the initial array/list. If two items have the same primary
key, the tie is broken using the secondary key, so no two items are ever
considered equal.
5 8 7 5 8 8 3 7 => 5/0 8/1 7/2 5/3 8/4 8/5 3/6 7/7
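    One way to sketch this in Java (the class name KeyedItem is
    illustrative, not from the exam):

        // Wrap each item with its index in the initial array. Compare by
        // primary key; break ties with the secondary key, so no two items
        // ever compare as equal.
        class KeyedItem implements Comparable<KeyedItem> {
          int primary;    // the item's original key
          int secondary;  // the item's index in the initial array

          KeyedItem(int primary, int secondary) {
            this.primary = primary;
            this.secondary = secondary;
          }

          public int compareTo(KeyedItem other) {
            if (primary != other.primary) {
              return Integer.compare(primary, other.primary);
            }
            return Integer.compare(secondary, other.secondary);
          }
        }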
[3] [a] DFS: abdcegfhi
BFS: abcdegifh
[b] DFS: efihgcdba
[c] gh, ac, hi, ab, cd, cg, fg, ce (cg may come before cd instead)
[4] [a] [i] O(|V|^2) [ii] O(|V|)
[b] Visited in preorder. Finished in postorder.
[c] With BFS, it's done exactly the same way as with DFS.
(See Homework 9 for a description of how it's done with DFS.)
[d] for (each vertex v in the graph) {
if (v has not been visited) {
increment the count of connected components
perform DFS on v, thereby marking all vertices in its
connected component as having been visited
}
}
This algorithm requires that you don't "unmark" the marked vertices
between calls to DFS.
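    A minimal Java rendering of this pseudocode, assuming the graph is
    given as an adjacency list int[][] adj, where adj[v] lists v's
    neighbors (these names are illustrative):

        static int countComponents(int[][] adj) {
          boolean[] visited = new boolean[adj.length];
          int components = 0;
          for (int v = 0; v < adj.length; v++) {
            if (!visited[v]) {
              components++;
              dfs(adj, v, visited);  // marks v's whole connected component
            }
          }
          return components;         // visited[] is never reset between calls
        }

        static void dfs(int[][] adj, int v, boolean[] visited) {
          visited[v] = true;
          for (int w : adj[v]) {
            if (!visited[w]) {
              dfs(adj, w, visited);
            }
          }
        }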
[5] [a] 3 12
/ \ /
1 12 3
/\ / / \
0 2 11 1 11
/ /\ /
=> 5 => 0 2 5
/ \ / \
4 7 4 7
/ \ / \
6 9 6 9
/ \ / \
8 10 8 10
[b] 4.5
/ \
3 5
/ \ \
1 4 11
/ \ / \
0 2 7 12
/ \
6 9
/ \
8 10
[c] 3 3 10
/ \ / \ / \
1 5 1 10 3 11
/\ /\ /\ /\ / \ \
0 2 4 11 0 2 5 11 1 5 12
/ \ / \ \ /\ /\
=> 10 12 => 4 9 12 => 0 2 4 9
/ / /
9 7 7
/ / \ / \
7 6 8 6 8
/ \
6 8
[d] 3 3 10
/ \ / \ / \
1 5 1 5 5 11
/\ /\ /\ /\ / \ \
0 2 4 11 0 2 4 10 3 7 12
=> / \ => / \ => / \ / \
7 12 7 11 1 4 6 8
/ \ / \ \ / \
6 10 6 8 12 0 2
/
8
[6] [a] If the find operations are done first, then each find takes O(1)
time, because every item is the root of its own tree. No item
has a parent, so finding the set an item is in takes a fixed
number of operations.
Union operations always take O(1) time. Hence, a sequence of n
operations with all the finds before the unions takes O(n) time.
[b] This question requires amortized analysis. Find operations can be
expensive, but an expensive find operation is balanced out by lots
of cheap union operations. The accounting is as follows.
Union operations always take O(1) time, so let's say they have an
actual cost of $1. Assign each union operation an amortized cost of
$2, so every union operation puts $1 in the bank.
Each union operation creates a new child. (Some node that was not
a child of any other node before is a child now.) When all the
union operations are done, there is $1 in the bank for every child,
or in other words, for every node with a depth of one or greater.
Let's say that a find(u) operation costs $1 if u is a root. For any
other node, the find operation costs an additional $1 for each parent
pointer the find operation traverses. So the actual cost is
$(1 + d), where d is the depth of u.
Assign each find operation an amortized cost of $2. This covers
the case where u is a root or a child of a root. For each additional
parent pointer traversed, $1 is withdrawn from the bank to pay for
it.
Fortunately, path compression changes the parent pointers of all the
nodes we pay $1 to traverse, so these nodes become children of the
root. All of the traversed nodes whose depths are 2 or greater move
up, so their depths are now 1. We will never have to pay to traverse
these nodes again.
Say that a node is a grandchild if its depth is 2 or greater. Every
time find(u) visits a grandchild, $1 is withdrawn from the bank, but
the grandchild is no longer a grandchild. So the maximum number of
dollars that can ever be withdrawn from the bank is the number of
grandchildren. But we initially put $1 in the bank for every child,
and every grandchild is a child, so the bank balance will never drop
below zero. Therefore, the amortization works out.
Union and find operations both have amortized costs of $2, so any
sequence of n operations where all the unions are done first takes
O(n) time.
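    A sketch of the disjoint-set structure this analysis assumes: union
    links one root beneath another in O(1) time, and find compresses the
    path it traverses. (The field and method names here are illustrative,
    not prescribed by the problem.)

        class DisjointSets {
          int[] parent;  // parent[i] == i means i is a root

          DisjointSets(int n) {
            parent = new int[n];
            for (int i = 0; i < n; i++) { parent[i] = i; }
          }

          void union(int root1, int root2) {  // both arguments must be roots
            parent[root2] = root1;            // root2 gains a parent: O(1)
          }

          int find(int u) {                   // path compression: every node
            if (parent[u] == u) { return u; } // traversed becomes a child
            parent[u] = find(parent[u]);      // of the root
            return parent[u];
          }
        }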
[7] [a] Insert 4, 2, 1, 3, and 6 in that order.
[b] Insert 4, 5, 6, 1, 3, and 2 in that order.
[8] [a] If you need inexact matches. For example, if you want to find the
item less than or equal to 5 that's closest to 5. Hash tables can
only do exact matches. (If exact matches are all you need, however,
hash tables are faster.)
[b] If each single operation absolutely must run in O(log n) time. OR
If most operations are find()s, and the data access patterns are
uniformly random. (2-3-4 trees are faster for these operations
because they don't restructure the tree. But splay trees do better
if a small proportion of the items are targets of most of the finds.)
[c] If memory use is the primary consideration (especially if a 2-3-4
tree holding all the items won't fit in memory).
[d] None. find() and remove() on a heap take worst-case Theta(n) time,
and they're more complicated than in an unordered array. insert()
on a heap takes worst-case Theta(log n) time, versus Theta(1) for an
unordered array.
[e] When you don't need to find the minimum key.
[10] [a] A
A E
C S R T
U D T U T R
[b] A
A E
C S R R (Note that two nodes are different than in [a].)
U D T U T T
[12] DATASTRUCTURE
DACASTRUTTURE
DACA|E|TRUTTURS
DACA|E|RRUTTUTS
DACA|E|RR|S|TTUTU
DACA|E|R|R|S|TTUTU
^
[13] Radix sort takes b/log2 r passes, so the overall running time of radix
sort is

    t = b (ln 2) (n + r) / (ln r).

To find the value of r that minimizes t, set dt/dr to zero.

    dt/dr = b (ln 2) [ln r - (n + r)/r] / (ln r)^2 = 0

Therefore, ln r = (n + r)/r. Given that n = 493, with a calculator and
some trial-and-error you can determine that r = 128 is the optimal radix
(ln 128 and (493 + 128)/128 both come to about 4.85).
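    If you'd rather search than differentiate, here is a short Java
    sketch (not part of the original solution) that minimizes
    (n + r)/ln r, the quantity proportional to t, over powers of two:

        static int bestRadix(int n) {
          int best = 2;
          double bestCost = Double.MAX_VALUE;
          for (int r = 2; r <= (1 << 20); r *= 2) {
            double cost = (n + r) / Math.log(r);  // proportional to t
            if (cost < bestCost) {
              bestCost = cost;
              best = r;
            }
          }
          return best;  // returns 128 when n = 493
        }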
[14] [a] z: Consider the case where, halfway through the sort, the last key
in the "sorted" list is changed to a very low number.
1 3 5 7 9|8 4 2 6 10
zap! 1 3 5 7 0|8 4 2 6 10
If the in-place insertion sort implementation searches backward
from the end of the sorted portion, the remaining keys will
never get past the zero.
1 3 5 7 0 2 4 6 8 10
Note that if a key in the "unsorted list" is zapped, no harm is
done at all.
[b] y: If an item in the "sorted list" is zapped, only that one item
is affected. If an item in the "unsorted list" is zapped to
a value lower than the last item in the sorted list, that item
will be out of place, but the other items are still sorted.
[c] z: Consider merging two lists, where the first item in one of the
lists gets zapped to a very high value. You'll wind up with two
consecutive sorted portions. (After further merge operations,
there will still be two consecutive sorted portions.)
/== 100 3 5 7 9 11
merge \== 2 4 6 8 10 12
[d] y: Radix sort uses no comparisons at all, so the zapped item
doesn't affect how the others are ordered.