An O(n 3 log log n/ log 2 n) Time Algorithm for All Pairs Shortest Paths

Please download to get full document.

View again

of 11
All materials on our website are shared by users. If you have any questions about copyright issues, please report us to resolve them. We are always happy to assist you.
Information Report
Category:

Bills

Published:

Views: 0 | Pages: 11

Extension: PDF | Download: 0

Share
Related documents
Description
An O(n 3 log log n/ log 2 n) Time Algorithm for All Pairs Shortest Paths Yijie Han 1 and Tadao Takaoka 2 1 School of Computing and Engineering, University of Missouri at Kansas City, Kansas City, MO 64110,
Transcript
An O(n 3 log log n/ log 2 n) Time Algorithm for All Pairs Shortest Paths Yijie Han 1 and Tadao Takaoka 2 1 School of Computing and Engineering, University of Missouri at Kansas City, Kansas City, MO 64110, USA 2 Department of Computer Science and Software Engineering, University of Canterbury, Christchurch, New Zealand Abstract. We present an O(n 3 log log n/ log 2 n) time algorithm for all pairs shortest paths. This algorithm improves on the best previous result of O(n 3 (log log n) 3 / log 2 n) time. Keywords: Algorithms, all pairs shortest paths, graph algorithms, upper bounds. 1 Introduction Given an input directed graph G = (V, E), the all pairs shortest path problem (APSP) is to compute the shortest paths between all pairs of vertices of G assuming that edge costs are real values. The APSP problem is a fundamental problem in computer science and has received considerable attention. Early algorithms such as Floyd s algorithm ([2], pp ) computes all pairs shortest paths in O(n 3 ) time, where n is the number of vertices of the graph. Improved results show that all pairs shortest paths can be computed in O(mn+n 2 log n) time [9], where m is the number of edges of the graph. Pettie showed [14] an algorithm with time complexity O(mn + n 2 log log n). See [15] for recent development. There are also results for all pairs shortest paths for graphs with integer weights[10, 16, 17, 21 23]. Fredman gave the first subcubic algorithm [8] for all pairs shortest paths. His algorithm runs in O(n 3 (log log n/ log n) 1/3 ) time. Fredman s algorithm can also run in O(n 2.5 ) time nonuniformly. Later Takaoka improved the upper bound for all pairs shortest paths to O(n 3 (log log n/ log n) 1/2 ) [19]. Dobosiewicz [7] gave an upper bound of O(n 3 /(log n) 1/2 ) with extended operations such as normalization capability of floating point numbers in O(1) time. Earlier Han obtained an algorithm with time complexity O(n 3 (log log n/ log n) 5/7 ) [12]. Later Takaoka obtained an algorithm with time O(n 3 log log n/ log n) [20] and Zwick gave an algorithm with time O(n 3 log log n/ log n) [24]. Chan gave an algorithm with time complexity of O(n 3 / log n) [6]. Chan s algorithm does not use tabulation and bit-wise parallelism. His algorithm also runs on a pointer machine. What subsequently happening was very interesting. Takaoka thought that O(n 3 / log n) could be the ultimate goal and raised the question [20] whether O(n 3 / log n) can be achieved. Chan first achieved O(n 3 / log n) time and also thought that O(n 3 / log n) is a natural bound [6]. However, Han showed an algorithm with O(n 3 (log log n/ log n) 5/4 ) time [11]. Because in [11] Han exhausted Takaoka s technique [19] Han thought that this result should be it and did not expect any improvements in the near future (see the reasoning Han gave in the paper [11] explaining why it should be difficult to further improve). However, Chan came up with an algorithm with time complexity O(n 3 (log log n) 3 / log 2 n) [5]. Chan [5] believes that O(n 3 / log 2 n) is probably the final chapter. Our experience indicates that Chan may be correct. Here we present an algorithm with time complexity O(n 3 log log n/ log 2 n). Thus we further remove a factor of (log log n) 2 from the time complexity of the best previous result due to Chan. We would like to point out the previous results which influenced the formation of our ideas presented in this paper. They are: Floyd s algorithm [2], Fredman s algorithm [8], Takaoka s algorithm [19], Han s algorithm [11], Chan s algorithm [5]. 2 Preparation Since it is well known that the all pairs shortest paths computation has the same time as computing the distance product of two matrices [1] (C = AB), we will concentrate on the computation of distance product. We divide the first n n matrix A into t 1 submatrices A 1, A 2,, A t1 each having dimension n n/t 1, where t 1 = n 1 r 1 and r 1 is a constant to be determined later. We divide the second n n matrix B into t 1 submatrices B 1, B 2,..., B t1 each having dimension n/t 1 n. Therefore C = AB = A 1 B 1 + A 2 B A t1 B t1, where is addition and + is the minimum operation. In the following we consider the computation of the distance product of an n n/t 1 matrix E with a n/t 1 n matrix F. The reason we need to do the division for this level will be understood later in this paper. We then divide the n n/t 1 matrix E into t 2 = (n/t 1 )/(r 2 log n/ log log n) submatrices E 1, E 2,..., E t2 each having dimension n (r 2 log n/ log log n), where r 2 is a constant to be determined later. Similarly we divide the n/t 1 n matrix F into t 2 submatrices each having dimension (r 2 log n/ log log n) n. Now EF = E 1 F 1 + E 2 F E t2 F t2. In the following we will first consider the computation of E 1 F 1, and then the computation of EF (or A 1 B 1 ). After that it takes O(n 2 t 1 ) time straightforwardly to get the all-pairs shortest path of the input graph. Let E 1 = (e ij ) and F 1 = (f ij ). We will first, for each 1 i, j r 2 log n/ log log n, sort f ik f jk, k = 1, 2,..., n. After sorting each f ik f jk has a rank in [1, n]. We then give f ik f jk a label l f (i, j, k) = l if f ik f jk has rank in (l n/ log 9 n, (l + 1) n/ log 9 n]. l f (i, j, k) uses 9 log log n bits. For each e kj e ki we will give it label l e (k, i, j) = l if f ik1 f jk1 e kj e ki f ik2 f jk2, where f ik1 f jk1 has rank (not label) l n/ log 9 n and f ik2 f jk2 has rank (l + 1) n/ log 9 n. l e (k, i, j) also uses 9 log log n bits. According to Fredman [8] and Takaoka [19], if the labels of e k1 j e k1 i and f ik2 f jk2 are different then we can determine either e k1 i + f ik2 e k1 j + f jk2 or e k1i +f ik2 e k1j +f jk2. Say e k1j e k1i has label l e and f ik2 f jk2 has label l f. If l e l f (l f l e ) then e k1j e k1i f ik2 f jk2 (e k1j e k1i f ik2 f jk2 ) and therefore e k1j +f jk2 e k1i+f ik2 (e k1j +f jk2 e k1i+f ik2 ). However, when their labels are the same then we cannot determine this. However, only a fraction 1/ log 9 n of the total number of (e k1 j e k1 i) s (one out of all lables) are undetermined for each f ik2 f jk2 and therefore overall a fraction of 1/ log 9 n of all pairs of e k1 j e k1 i and f ik2 f jk2 are undetermined. In case of indeterminacy for two indices i, j we will pick i over j (to include i in the computation) when i j and leave the j-th position (or index) to be computed separately. This separated computation can be done in brute force and it takes O(n 3 / log 8 n) time for the whole computation, i.e. the computation of AB. The actual counting of complexity of this separate computation is as this: There are w = O(n 3 log n/ log log n) pairs of e k1j e k1i and f jk2 f ik2 (k 1 and k 2 each take value in [0..n 1] and thus have a factor of x = n 2, there are y = n log log n/(r 2 log n) pairs of E and F for each pair of k 1 and k 2 and for each pair of E and F there are z = O((r 2 log n/ log log n) 2 ) pairs of e k1 j e k1 i and f jk2 f ik2. w = xyz. Because 1/ log 9 n of these pairs are in the separate computation the complexity of the separate computation takes O((n 3 log n/ log log n) (1/ log 9 n)) = O(n 3 / log 8 n) time. 3 The Further Details Now for fixed i and k we pack l e (k, i, j), j = 1, 2,..., r 2 log n/ log log n, into one word and call it l e (k, i). This can be done when r 2 is small. Also for fixed i and k we pack l f (i, j, k), j = 1, 2,..., r 2 log n/ log log n, into one word and call it l f (i, k). By comparing l e (k 1, i) and l f (i, k 2 ) we can let the result be 1 if index i should be chosen over all the other r 2 log n/ log log n 1 indices, and let it be 0 otherwise. This computation of comparing one index with all the other r 2 log n/ log log n indices is done in constant time by concatenating l e (k 1, i) and l f (i, k 2 ) into one word of less than log n bits and then index into a precomputed table to get the result of either 0 or 1. Note that since l e (k, i) has 9r 2 log n bits, and we will pick r 2 later such that 9r 2 log n is much smaller than log n and therefore l e (k, i), k = 1, 2,..., n can be sorted into t 3 = 2 9r2 log n = n 9r2 blocks such that each block has the same word (l e (k, i)) value. For the purpose of computing E 1 F 1, there are r 2 log n/ log log n i s (columns in E 1 ) and for each (column) of them we have n l e (k, i) s (one on each row) and these l e (k, i) s (0 k n) form t 3 blocks and for each of these blocks we get a 1 n vector of bits (0 s and 1 s) that are the result of the above computation (i.e. indexing into a table to get a 0 or a 1). We need to compute these (a vector of n) 0 s and 1 s for one l e (k, i) in a block because all l e (k, i) s in a block have the same value. Thus we need to compute r 2 (log n/ log log n)t 3 vectors (of n bits each), and this can be done in O(r 2 (log n/ log log n)t 3 n) time (one step gets one bit by the above table lookup computation). On the average for each such an n bit vector v there are only n/(r 2 log n/ log log n) bits that are 1 s and the remaining bits are 0 s. This is so because of the way we formed l e (k 1, i) and l f (i, k 2 ) and the way that we stipulate that the result of comparison of l e (k 1, i) and l f (i, k 2 ) is 0 or 1. We take v and first divide it into n/((r 2 log n/ log log n)(r 3 log n/ log log n)) small vectors each with dimension 1 ((r 2 log n/ log log n)(r 3 log n/ log log n)), where r 3 is a constant to be determined later. Now, on the average, each small vector has r 3 log n/ log log n bits which are 1 s. If a small vector v has between (t 1)r 3 log n/ log log n + 1 and tr 3 log n/ log log n bits of 1 s we will make a set V of t small vectors each having ((r 2 log n/ log log n)(r 3 log n/ log log n)) bits and containing a subset of no more than r 3 log n/ log log n 1 s from v. Because the multiplication of each row of E 1 with all columns of F 1 results in r 2 log n/ log log n vectors having a total of n bits of 1 s, they will result in 2n(r 2 log n/ log log n)/((r 2 log n/ log log n)(r 3 log n/ log log n))= 2n/(r 3 log n/ log log n) small vectors, where factor 2 is because of some small vectors may have less than r 3 log n/ log log n bits of 1 s. For fixed i (a row of F 1 ) (and therefore l f (i, k), 0 k n) and fixed value of l e (k, i) s (a block) we formed 2n/((r 2 log n/ log log n)(r 3 log n/ log log n)) small vectors each having (r 2 log n/ log log n)(r 3 log n/ log log n) bits with no more than r 3 log n/ log log n bits are 1 s. Therefore each small vector can be represented by a word (with no more than log n bits) when r 3 is small. This is so because r3 log n/ log log n ( (r2 log n/ log log n)(r 3 ) log n/ log log n) t=0 t n. We first form these 2n/((r 2 log n/ log log n)(r 3 log n/ log log n)) words for each vector (on the average) and then duplicate these words for all rows in the block because all rows in the same block has the same l e (k, i) value. The reason we do this duplicating is to save time because small vectors with the same value need not to be computed into words repeatedly. Thus for the multiplication of E 1 F 1 we obtained 2n 2 /(r 3 log n/ log log n) words. And for the multiplication of A 1 B 1 we obtained 2n 2+r 1 /((r 2 log n/ log log n)(r 3 log n/ log log n)) words. And therefore for the multiplication of AB we have obtained O(n 3 (log log n/ log n) 2 ) words and computation thus far takes O(n 3 (log log n/ log n) 2 ) time. However these O(n 3 (log log n/ log n) 2 ) words contain more than O(n 3 (log log n/ log n) 2 ) indices because multiple indices are coded into one word. Thus we shall combined these words to cut the number of indices. To further combine these words we need to consider only the words formed by E i [1, 1..(r 2 log n/ log log n)] F i [1..(r 2 log n/ log log n), 1..(r 2 log n/ log log n)(r 3 log n/ log log n)] (there are r 2 log n/ log log n resulting words out of this multiplication as we explained above), i = 1, 2,..., n r1 /(r 2 log n/ log log n). Thus there are n r1 words. We need to reduce them to n r1 /(r 3 log n/ log log n) words in O(n r1 ) time and thereafter we can simply disassemble indices (for minimum) out of packed words and finish the remaining computation straightforwardly. Because each word w contains a set S w of no more than r 3 log n/ log log n columns (these columns have 1 s and the other columns have 0 s) in F i there are r 3 log n/ log log n ( ((r2 log n/ log log n)(r 3 ) log n/ log log n)) l=1 l n cr 3 choices, where c is a constant. When r 3 is much smaller than r 1 there are many words among the n r1 words having the same S w sets. This is the fact we can take advantage of. In the following we will refer two of these words with the same S w sets as w 1 and w 2, i.e., the two small vectors represented by w 1 and w 2 are the same (equal or identical). 4 Combining Words The scheme for combining words is a little complicated. The basic idea is that, since we have indices gathered in O(n 3 (log log n/ log n) 2 ) words, we just need to do pair-wise comparisons between pairs of indices (paths) to reduce the number of indices by half. If we do this for 2 log log n rounds we can reduce the number of indices to n 3 /(log n) 2 and then we can just disassemble indices from words and finish the computation straightforwardly in O(n 3 /(log n) 2 ) time. Note that because we do pairing-off the time complexity will remain to be O(n 3 (log log n/ log n) 2 ). The complication of our algorithm comes from the fact that indices are encoded in O(n 3 (log log n/ log n) 2 ) words. To deal with this encoding we have to design an algorithm that utilizes the special characteristics of the encoding. We use a different labeling for the matrix B 1 = (b ij ) and A 1 = (a ij ). We will, for each 1 i, j n r1, sort b ik b jk together with a kj a ki, k = 1, 2,..., n. For A and B the total time for sorting is O(n 2+r 1 log n). This gives the rank of b ik b jk (a kj a ki ) which we denote by l b1 (i, j, k) (l a1 (k, i, j)). These ranks take value from 1 to 2n and have log n + 1 bits. There are n 2r 1 choices of i, j pairs and for each of these choices (each pair of i and j) and for each set U t = {t(r 2 log n/ log log n)(r 3 log n/ log log n) + 1, t(r 2 log n/ log log n)(r 3 log n/ log log n) + 2,..., (t + 1)(r 2 log n/ log log n)(r 3 log n/ log log n)}, t = 1, 2,..., n/((r 2 log n/ log log n)(r 3 log n/ log log n)), where values of k are taken, choose any subset of U t containing no more than r 3 log n/ log log n elements and there are no more than (n/((r 2 log n/ log log n)(r 3 log n/ log log n)) r 3 log n/ log log n ( ((r2 log n/ log log n)(r 3 ) log n/ log log n)) l=1 l n 1+cr 3 choices, where c is a constant, and thus there are a total of n 1+2r1+cr3 choices for all pairs of i and j. For each of these choices we use a word w to represent the total of (a) a choice of the pair i, j, (b) a choice (referred to as d) of the subset of U t (n 2r 1+cr 3 choices) and (c) the packing of no more than r 3 log n/ log log n ranks (l b1 (i, j, k) s) (where k take values over elements in the subset of U t ). Note that straightforward packing will not work because it will take O(log 2 n/ log log n) bits (a subset of U t has up to O(log n/ log log n) elements and each element corresponds to O(log n) bits of an l b1 (i, j, k).) and cannot be stored in a word of log n bits. What we do is first to build a trie for the r 3 log n/ log log n ranks. An example of such a trie is shown in Fig. 1(a). This trie is a binary tree with a node having two children when there are ranks with the most significant bits differ at the node s level. Next we build a packed trie by removing nodes v with only one child except the root. The edge connecting this removed node and its child is also removed. This is shown in Fig. 1(b). Thus let v 1, v 2,..., v t be such a chain with v i being v i+1 s parent, v 1 and v t having two children and v i, i 1, t, having one child, and we will remove v 2, v 3,..., v t 1. Edges (v 2, v 3 ), (v 3, v 4 ),..., (v t 1, v t ) are also removed. The edge originally connecting v 1 and v 2 are now made to connect v 1 and v t. We will record on edge (v 1, v t ) that t 2 edges (bits) are removed. Also at leaves, we store only relative address of k (having value between 1 and (r 2 log n/ log log n)(r 3 log n/ log log n)) instead of the value of l b1 (i, j, k) (having value between 1 and 2n). Such a packed trie is shown in Fig. 1(c) and it can be stored in a word w with c log n bits, where c is a constant less than 1. Now with l a1 (1, i, j), we can follow this packed trie in word w and reach a leaf l of the trie (by starting at the root, repeatedly deleting corresponding bits which has been removed from the packed trie and following the path in the packed trie). In Fig. 1(d) we show such situations. Therefore we can precompute a table T 1 and use the values of l a1 (1, i, j) and w to index into T 1 to get leaf l. (Note that l a1 (1, i, j) has log n + 1 bits but this can be reduced to c log n bits with c 1 by using multiple tables replacing T 1 and taking care of a fraction of log n + 1 bits at a time). Here we will use l to represent both the leaf and the relative address of k mentioned in immediate previous paragraph. From l we get the value of k and we can then compare l a1 (1, i, j) and l b1 (i, j, k) to find the most significant bit where l a1 (1, i, j) and l b1 (i, j, k) differ (this can be done by exclusive-oring the two values and then finding the most significant bit which is 1 by indexing into a precomputed table). Say this bit is the b-th most significant bit. By using the values of b, l a1 (1, i, j) and w (indexing into another precomputed table T 2 ) we can then figure out at which chain C of the original trie l a1 (1, i, j) branches out. Note that we need NOT to know the value of C. We need know only within C the branch happens and whether it branches to left or to right (these information can be figured out with b, l a1 (1, i, j) and w.) Thus the output of indexing into T 2 can be the columns where index i should be taken over index j. We can store w as an array element in an array W as W [i][j][t][d], here, i, j, t, d are the ones mentioned in the first paragraph of this section. We need not to pack l a1 (k, i, j) s because only one l a1 (k, i, j) is considered at a time. Now for the above mentioned words w 1 and w 2 (last paragraph of Section 3) we can get t and d value (both w 1 and w 2 are associated with the same t value because we put them together as explained above, w 1 and w 2 are associated with the same d value because as we mentioned above that we can get this by setting r 3 much smaller than r 1 ). We can also get i from w 1 and j from w 2. Thus we can get W [i][j][t][d]. Now we use the values of W [i][j][t][d] and l a1 (1, i, j) to index into precomputed table T 1, T 2 to get rid of half of indices in w 1 plus w 2. Repeating this process we can then remove a factor of r 3 log n/ log log n from the total number of indices. Thereafter we disassemble the indices from packed !% &(' ) * ' +, ' ) + % - +, ) +. ) /+ 01 ' 2, + %!% - 43 /' ) ' ' )! % !! # $!% 5 /' ) % 61* ) + 3 ) ' + + ) ' ) ' * ) + ' + 7) 28 / % - * ) ' ' 2, + ) ) ' 2,4 ) +8+ * + ' ' )! 9,% : );) 0 )! ) + ' ) ', ) 2, = ' ' ' + 9,' ) ' ) 2, ' + ' ' ) 9,' ) ' 4 % : );) 0 ) % : );) 0 ) % : );) 0 ) % : );) 0 )!% ' % :; % % such that one word contains one index and the remaining computation can be carried out easily. The precomputation of tables T 1, T 2 is not difficult and its complexity cannot dominate the overall computation. The reason is because all these computations have polynomial complexity and we can reduce the table size to n ɛ with ɛ being an arbitrarily small constant. The choice of r 1, r 2, r 3 should not be difficult now. The complexity of the algorithm as we described above is O(n 3 (log log n) 2 / log 2 n). 5 Removing Another Factor of log log n From Complexity In our algorithm we partitioned the number of rows of E 1 and the number of
Recommended
View more...
We Need Your Support
Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

Thanks to everyone for your continued support.

No, Thanks