forked from rahuldeve/Diversification-Dataset
-
Notifications
You must be signed in to change notification settings - Fork 0
/
24-spanning.txt
389 lines (265 loc) · 10 KB
/
24-spanning.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
Lecture Notes on
Spanning Trees
15-122: Principles of Imperative Computation
Frank Pfenning
Lecture 24
November 18, 2010
1
Introduction
In this lecture we introduce graphs. Graphs provide a uniform model for
many structures, for example, maps with distances or Facebook relationships. Algorithms on graphs are therefore important to many applications.
They will be a central subject in the algorithms courses later in the curriculum; here we only provide a very small sample of graph algorithms.
2
Spanning Trees
We start with undirected graphs which consist of a set V of vertices (also
called nodes) and a set E of edges, each connecting two different vertices.
A graph is connected if we can reach any vertex from any other vertex
by following edges in either direction. In a directed graph edges provide a
connection from one node to another, but not necessarily in the opposite
direction. More mathematically, we say that the edge relation between vertices is symmetric for undirected graphs. In this lecture we only discuss
undirected graphs, although directed graphs also play an important role in
many applications.
The following is a simple example of a connected, undirected graph
L ECTURE N OTES
N OVEMBER 18, 2010
Spanning Trees
L24.2
with 5 vertices (A, B, C, D, E) and 6 edges (AB, BC, CD, AE, BE, CE).
A
D
E
B
C
In this lecture we are particularly interested in the problem of computing
a spanning tree for a connected graph. What is a tree here? They are a
bit different than the binary search trees we considered early. One simple
definition is that a tree is a connected graph with no cycles, where a cycle let’s
you go from a node to itself without repeating an edge. A spanning tree for
a connected graph G is a tree containing all the vertices of G. Below are
two examples of spanning trees for our original example graph.
A
D
A
E
E
B
D
C
B
C
When dealing with a new kind of data structure, it is a good strategy to
try to think of as many different characterization as we can. This is somewhat similar to the problem of coming up with good representations of the
data; different ones may be appropriate for different purposes. Here are
some alternative characterizations the class came up with:
1. Connected graph with no cycle (original).
2. Connected graph where no two neighbors are otherwise connected.
Neighbors are vertices connected directly by an edge, otherwise connected means connected without the connecting edge.
L ECTURE N OTES
N OVEMBER 18, 2010
Spanning Trees
L24.3
3. Two trees connected by a single edge. This is a recursive characterization. The based case is a single node, with the empty tree (no vertices)
as a possible special case.
4. A connected graph with exactly n − 1 edges, where n is the number
of vertices.
5. A graph with exactly one path between any two distinct vertices,
where a path is a sequence of distinct vertices where each is connected
to the next by an edge.
When considering the asymptotic complexity it is often useful to categorize graphs as dense or sparse. Dense graphs have a lot of edges compared
to the number of vertices. Writing n = |V | for the number of vertices (which
will be our notation in the rest of the lecture) know there can be at most
n ∗ (n − 1)/2: every node is connected to any other node (n ∗ (n − 1)), but in
an undirected way (n ∗ (n − 1)/2). If we write e for the number of edges, we
have e = O(n2 ). By comparison, a tree is sparse because e = n − 1 = O(n).
3
Computing a Spanning Tree
There are many algorithms to compute a spanning tree for a connected
graph. The first is an example of a vertex-centric algorithm.
1. Pick an arbitrary node and mark it as being in the tree.
2. Repeat until all nodes are marked as in the tree:
(a) Pick an arbitrary node u in the tree with an edge e to a node w
not in the tree. Add e to the spanning tree and mark w as in the
tree.
We iterate n − 1 times in Step 2, because there are n − 1 vertices that have to
be added to the tree. The efficiency of the algorithm is determined by how
efficiently we can find a qualifying w.
The second algorithm is edge-centric.
1. Start with the collection of singleton trees, each with exactly one node.
2. As long as we have more than one tree, connect two trees together
with an edge in the graph.
L ECTURE N OTES
N OVEMBER 18, 2010
Spanning Trees
L24.4
This second algorithm also performs n steps, because it has to add n − 1
edges to the trees until we have a spanning tree. Its efficiency is determined
by how quickly we can tell if an edge would connect two trees or would
connect two nodes already in the same tree, a question we come back to in
the next lecture.
Let’s try this algorithm on our first graph, considering edges in the
listed order: (AB, BC, CD, AE, BE, CE).
A
D
A
E
E
B
C
A
D
A
B
E
B
D
A
D
A
E
C
B
D
E
C
B
C
D
E
C
B
C
The first graph is the given graph, the completley disconnected graph is the
starting point for this algorithm. At the bottom right we have computed the
spanning tree, which we know because we have added n − 1 = 4 edges. If
we tried to continue, the next edge BE could not be added because it does
not connect two trees, and neither can CE. The spanning tree is complete.
4
Creating a Random Maze
We can use the algorithm to compute a spanning tree for creating a random
maze. We start with the graph where the vertices are the cells and the
edges represent the neighbors we can move to in the maze. In the graph,
all potential neighbors are connected. A spanning tree will be defined by a
subset of the edges in which all cells in the maze are still connected by some
(unique) path. Because a spanning tree connects all cells, we can arbitrarily
decide on the starting point and end point after we have computed it.
L ECTURE N OTES
N OVEMBER 18, 2010
Spanning Trees
L24.5
How would we ensure that the maze is random? The idea is to generate a random permutation (see Exercise 1) of the edges and then consider
the edges in the fixed order. Each edge is either added (if it connects two
disconnected parts of the maze) or not (if the two vertices are already connected).
5
Minimum Weight Spanning Trees
In many applications of graphs, there is some measure associated with the
edges. For example, when the vertices are locations then the edge weights
could be distances. We might then be interested in not any spanning tree,
but one whose total edge weight is minimal among all the possible spanning trees, a so-called minimum weight spanning tree (MST). An MST is not
necessarily unique. For example, all the edge weights could be identical in
which case any spanning tree will be minimal.
We annotate the edges in our running example with edge weights as
shown on the left below. On the right is the minimum weight spanning
tree, which has weight 9.
A
3
B
2
D
E
2
2
3
A
3
C
2
2
B
D
E
2
3
C
Before we develop a refinement of our edge-centric algorithm for spanning trees to take edge weights into account, we discuss a basic property it
is based on.
Cycle Property. Let C ⊆ E be a cycle, and e be an edge of maximal weight
in C. The e does not need to be in an MST.
How do we convince ourselves of this property? Assume we have a
spanning tree, and edge e from the cycle property connects vertices u and
w. If e is not in the spanning tree, then, indeed, we don’t need it. If e is in
the spanning tree, we will construct another MST without e. Edge e splits
L ECTURE N OTES
N OVEMBER 18, 2010
Spanning Trees
L24.6
the spanning tree into two subtrees. There must be another edge e0 from
C connecting the two subtrees. Removing e and adding e0 instead yields
another spanning tree, and one which does not contain e. It has equal or
lower weight to the first MST, since e0 must have less or equal weight than
e.
The cycle property is the basis for Kruskal’s algorithm.
1. Sort all edges in increasing weight order.
2. Consider the edges in order. If the edge does not create a cycle, add
it to the spanning tree. Otherwise discard it. Stop, when n − 1 edges
have been added, because then we must have spanning tree.
Why does this create a minimum-weight spanning tree? It is a straightforward application of the cycle property (see Exercise 2).
Sorting the edges will take O(e ∗ log(e)) steps with most appropriate
sorting algorithms. The complexity of the second part of the algorithm
depends on how efficiently we can check if adding an edge will create a
cycle or not. As we will see in Lecture 26, this can be O(n ∗ log(n)) or even
more efficient if we use a so-called union-find data structure.
Illustrating the algorithm on our example
A
3
B
2
D
E
2
2
3
3
C
we first sort the edges. There is some ambiguity—say we obtain the following list
AE 2
BE 2
CE 2
BC 3
CD 3
AB 3
L ECTURE N OTES
N OVEMBER 18, 2010
Spanning Trees
L24.7
We now add the edges in order, making sure we do not create a cycle. After
AE, BE, CE, we have
A
2
D
E
2
2
B
C
At this point we consider BC. However, this edge would create a cycle
BCE since it connects two vertices in the same tree instead of two different trees. We therefore do not add it to the spanning tree. Next we consider
CD, which does connect two trees. At this point we have a minimum spanning tree
A
2
2
B
D
E
2
3
C
We do not consider the last edge, AB, because we have already added n −
1 = 4 edges.
In the next lecture we will analyze the problem of incrementally adding
edges to a tree in a way that allows us to quickly determine if an edge
would create a cycle.
L ECTURE N OTES
N OVEMBER 18, 2010
Spanning Trees
L24.8
Exercises
Exercise 1 Write a function to generate a random permutation of a given array, using a random number generator with the interface in rand.h0. What is
the asymptotic complexity of your function?
Exercise 2 Prove that the cycle property implies the correctness of Kruskal’s algorithm.
L ECTURE N OTES
N OVEMBER 18, 2010