## Posts Tagged ‘algorithm’

### Minimum Spanning Tree

December 7, 2010

Well,

The other day I came across this problem and ramprasad gave a brilliant solution to it.

Problem : Given a connected weighted graph G, find a minimum spanning tree. But wait, before you jump to any conclusion: the weights of the edges are not real numbers, but quadratic functions of time. So given a time interval $[t_1,t_2]$, find the minimum spanning tree T and the time instant $t_m$ at which it attains its minimum value. That is, T attains the minimum at time $t_m$: $\forall t \in [t_1,t_2], \forall TR \in \{\text{spanning trees of } G\},\ W(T_{t_m}) \leq W(TR_t)$.

Solution : Well, the first solution that popped up in my mind was a brute force approach. Consider all spanning trees. As the weights of edges are quadratic functions of time, the weight of each spanning tree is also a quadratic function. Differentiate these functions to find their minima, and then take the absolute minimum over all of them. But the number of spanning trees of a graph can be as high as $n^{n-2}$, thanks to Cayley’s formula (which can be derived via Kirchhoff’s matrix-tree theorem). So obviously, this does not fit into our perception of an “efficient algorithm”.

Ramprasad gave a nice solution which is definitely polynomial time; I am not sure whether better solutions exist. Note that Prim’s algorithm and Kruskal’s algorithm take into account only the order of the edge weights and do not really care what the actual weights are. So as long as the order of the weights remains unchanged, you can apply one of these algorithms. The time interval $[t_1,t_2]$ gives rise to infinitely many instances of the weighted graph, but as the weight functions of the edges are quadratic (parabolas), the parabolas corresponding to any two edges intersect in at most 2 points. Hence there are at most $2{|E| \choose 2} = O(|E|^2)$ intersection points, and these intersection points are exactly the points in time where the weight of one edge goes beyond the weight of another edge.

So, sort these intersection points by their time co-ordinate. Note that between two consecutive intersection points the order of the weights remains unchanged, and hence one of the standard algorithms can be applied. So we need to run Prim’s or Kruskal’s algorithm ($O(|E| \log |V|)$) for each such interval, giving an $O(|E|^3 \log |V|)$ algorithm overall. Clearly, a major improvement over the brute force method.
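To make the idea concrete, here is a sketch in Python. The interface is my own invention, not from the post: `min_spanning_tree_over_time` takes edges as `(u, v, (a, b, c))` triples, the weight of an edge at time `t` being `a*t**2 + b*t + c`. It collects the intersection times, runs an ordinary Kruskal once per sub-interval (where the edge order is fixed), and minimizes the resulting quadratic tree weight over each sub-interval.

```python
from itertools import combinations

def min_spanning_tree_over_time(n, edges, t1, t2):
    """Vertices are 0..n-1; edges is a list of (u, v, (a, b, c)) where
    the weight of the edge at time t is a*t**2 + b*t + c.
    Returns (best_weight, best_time, best_tree_edges)."""

    def w(coef, t):
        a, b, c = coef
        return a * t * t + b * t + c

    # 1. Candidate breakpoints: the interval ends plus every time at
    #    which two edge-weight parabolas intersect.
    times = {t1, t2}
    for (_, _, (a1, b1, c1)), (_, _, (a2, b2, c2)) in combinations(edges, 2):
        A, B, C = a1 - a2, b1 - b2, c1 - c2
        if A == 0:
            if B != 0:
                times.add(-C / B)
        else:
            D = B * B - 4 * A * C
            if D >= 0:
                times.add((-B + D ** 0.5) / (2 * A))
                times.add((-B - D ** 0.5) / (2 * A))
    times = sorted(t for t in times if t1 <= t <= t2)

    # 2. Ordinary Kruskal's algorithm with the weights frozen at time t.
    def kruskal(t):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        tree = []
        for u, v, coef in sorted(edges, key=lambda e: w(e[2], t)):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((u, v, coef))
        return tree

    # 3. Between consecutive breakpoints the edge order is fixed, so one
    #    Kruskal run per sub-interval; then minimize the tree's quadratic.
    best = None
    for lo, hi in list(zip(times, times[1:])) or [(t1, t1)]:
        tree = kruskal((lo + hi) / 2)
        A = sum(c[0] for _, _, c in tree)
        B = sum(c[1] for _, _, c in tree)
        C = sum(c[2] for _, _, c in tree)
        cand = [lo, hi]
        if A > 0 and lo < -B / (2 * A) < hi:
            cand.append(-B / (2 * A))   # vertex of the tree's parabola
        for t in cand:
            val = A * t * t + B * t + C
            if best is None or val < best[0]:
                best = (val, t, tree)
    return best
```

This is a sketch under the stated assumptions, not a tuned implementation; in particular it re-sorts the edges inside each interval instead of updating the order incrementally.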

### Tournament graph and Hamiltonian Path

November 11, 2010

Well,

Those of you who are unaware of tournament graphs and Hamiltonian paths can look here and here. Ramprasad gave me the following interesting problem to think about.

Problem : Given a tournament ( I will omit the suffix “graph” as it is obvious in the context of graph theory ), construct a hamiltonian path in polynomial time.

Solution : When I first heard the problem, I was wondering how to even prove that every tournament always has a Hamiltonian path. And on top of that, we need to find it in polynomial time… phew! But I believe that if you come up with an algorithm, that itself gives you a proof!!

He gave me a hint : given that you already have explored some path $v_1,\dots,v_k$ how can you extend it for a given vertex $v$? That is, of course, if the path has not covered every vertex :-).

If there is a directed edge from $v_k$ to $v$, there is a natural extension: you just append $v$ to the path. If not, find two consecutive vertices $v_i, v_{i+1}$ in the path such that the tournament has edges $(v_i,v)$ and $(v,v_{i+1})$, in which case the modified path will be $v_1,\dots , v_i, v ,v_{i+1},\dots, v_k$.

Suppose no such pair exists!!!! Let’s see why this can’t happen. Look at $v_1,v_2$. It has to be the case that the tournament contains either the edges $(v_1,v),(v_2,v)$ or $(v,v_1),(v_2,v)$. In the second case, $v$ can just be added as a prefix to the path. In the first case, note that the tournament must contain $(v_3,v)$, because otherwise we could have $\dots, v_2,v,v_3,\dots$. Similarly, we can argue for $v_4,\dots$, giving us a contradiction, since we had assumed that the edge $(v_k,v)$ is absent from the tournament.

Hence, we can always extend the path as long as there is some vertex we have not yet covered.
Ramprasad gave a simple proof of why a pair $v_i,v_{i+1}$ must exist such that we can extend the path to contain $v_i,v,v_{i+1}$. Suppose we cannot add $v$ as a prefix or a suffix to the path. Then look at the least index $j$ such that $(v,v_j)$ is an edge. By the definition of a tournament, one of the two edges $(v_{j-1},v), (v,v_{j-1})$ must be present. But because $j$ is the least index receiving an edge from $v$, the second edge cannot be present. Hence we can extend the path to contain $v_{j-1},v,v_j$.

It is easy to see that the algorithm runs in $O(n^2)$ time, as we need to examine $O(n)$ vertices to extend the path, and we repeat this until all vertices are covered.
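The insertion argument above translates almost line for line into code. Here is a sketch in Python, assuming (my choice, purely for illustration) that the tournament is given as a boolean adjacency matrix `adj`.

```python
def hamiltonian_path(adj):
    """adj[u][v] is True iff the tournament has the directed edge (u, v);
    exactly one of adj[u][v], adj[v][u] holds for u != v.
    Builds a Hamiltonian path by repeated insertion, O(n^2) overall."""
    n = len(adj)
    path = [0]
    for v in range(1, n):
        if adj[path[-1]][v]:
            path.append(v)            # natural extension: append v
        elif adj[v][path[0]]:
            path.insert(0, v)         # or add v as a prefix
        else:
            # Least j with edge (v, path[j]); minimality of j forces the
            # edge (path[j-1], v), so v fits between positions j-1 and j.
            j = next(i for i, u in enumerate(path) if adj[v][u])
            path.insert(j, v)
    return path
```

For a transitive tournament (all edges pointing from larger to smaller labels) this produces the path n-1, n-2, …, 0, as expected.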

–Saurabh Joshi

### Puzzle : Power game

October 13, 2010

Well,

I like the dramatic titles :-). Anyways, here goes the problem.

Problem : Given an integer n, determine whether $n=a^b$ for some integers a and b in poly-logarithmic time.

Solution : It is obvious that for certain n there could be multiple pairs a and b such that $n=a^b$. We only need to find one such pair, or declare that no such integers exist. Needless to say, we are ruling out b=1 :-).

Given any n, the number of bits required to represent it is $\log n$. Now, let us fix b = 2. For this b, a must have about $\lceil \frac{\log n}{2} \rceil$ bits, so we know the range of possible a’s that might satisfy $a^2=n$. We do a binary search on this range in $O(\log n)$ steps. If we do not find a suitable a, we fix b=3 and repeat. There can be only logarithmically many such b’s, since $b \leq \log n$. Multiplication and computing powers of a number can both be done in polylog time. Hence, we can determine whether the given number n is of the form $a^b$ or not.
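Here is a sketch of this in Python (the function name and interface are mine). It uses `bit_length` in place of $\log n$, and returns a pair (a, b) or None.

```python
def perfect_power(n):
    """Return (a, b) with a**b == n and b >= 2, or None if no such pair
    exists.  For each candidate b (at most log2 n of them), binary-search
    for a; every step involves only polylog-bit arithmetic."""
    if n < 4:
        return None
    for b in range(2, n.bit_length() + 1):
        # a has about (log2 n) / b bits, so this range surely contains it
        lo, hi = 2, 1 << (n.bit_length() // b + 1)
        while lo <= hi:
            a = (lo + hi) // 2
            p = a ** b
            if p == n:
                return (a, b)
            if p < n:
                lo = a + 1
            else:
                hi = a - 1
    return None
```

For example, `perfect_power(27)` reports (3, 3), while `perfect_power(12)` reports None.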

Cool, isn’t it?

–Saurabh Joshi

### Puzzle : Find a duplicate

October 12, 2010

Well,

Needless to say that the puzzle was given by Ramprasad. After thinking for a while, I could figure out the solution using an old trick from one of the typical technical interview questions.

Problem : Given n elements in an array (indexed, say, 0 to n-1), where the elements come from the set {1,…,n-1}. Clearly, by the pigeonhole principle, there will be a duplicate element; there can be more than one element occurring twice or more. In O(n) time, find one duplicate element. The additional space you use should not exceed O(log n).

Solution : At first, I thought one could just keep a bit array storing whether we have seen an element or not. But for elements coming from 1,…,n-1 the number of bits required is O(n). Even if we consider a number representation taking O(log n) bits each, it would still require O(n / log n) numbers to be stored. So clearly, that cannot lead to a solution.

What works is as follows. Let us say you are given a singly linked list. You need to detect if the linked list has a cycle or not. You can only store constant many elements.

What you can do is have two pointers, p1 and p2. At each step, you increment p1 by 1 (by increment we mean: go to the next element in the linked list), and you increment p2 by 2. If p1 and p2 ever become equal, the linked list has a cycle; otherwise, p2 will hit the end of the linked list.

It is worth noting that in this scenario, p1 and p2 will meet before p1 can finish traversing the entire cycle. Also, for any two relatively prime numbers (n1, n2), if you increment p1 by n1 and p2 by n2, they will still meet, irrespective of the length of the cycle.

This seemed like a little diversion from our original problem, but we are going to use it :-). What we will do is look at the element at index 0; call it e1. Then we look at A[e1], then A[A[e1]], and so on. We can think of this mechanism as forming a singly linked list. Because the elements range over 1,…,n-1, and there are n array cells (indices 0 to n-1), there has to be a cycle in this list. The starting point of the cycle, call it B, is an element which is repeated: B has two distinct predecessors x and y such that A[x] = A[y] = B.

So what you do is the following. As mentioned earlier, have two pointers p1 and p2; p1 moves by 1 and p2 moves by 2. Call the starting point A (index 0), and say the pointers meet at a point C inside the cycle. Let the distances be AB = p, BC = q, and CDB = r (going around the rest of the cycle, through a point D, back to B). How much has p1 traversed? It is p + q, whereas p2 has traversed p + q + r + q = 2(p + q), double that of p1.
So, p = r.

So once we reach C, we move p1 back to A (that is, start from index 0 again) and keep p2 at C. Remember that AB = p = r = CDB. Now we move both p1 and p2 by 1. They will meet at B, and that is our duplicate element :-).
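The two phases above are exactly Floyd’s tortoise-and-hare cycle detection. A minimal sketch in Python (the function name is mine):

```python
def find_duplicate(A):
    """A has n entries drawn from {1, ..., n-1}, so some value repeats.
    Follow the implicit linked list i -> A[i] starting at index 0 and
    run the two-phase pointer chase; O(n) time, O(1) extra pointers."""
    p1 = p2 = 0
    while True:
        p1 = A[p1]          # moves by 1
        p2 = A[A[p2]]       # moves by 2
        if p1 == p2:
            break           # the pointers meet at point C in the cycle
    p1 = 0                  # phase 2: restart p1 at A (index 0)
    while p1 != p2:
        p1 = A[p1]
        p2 = A[p2]
    return p1               # meeting point B, the duplicated value
```

For instance, on A = [1, 3, 4, 2, 2] the pointers meet, restart, and converge on the duplicated value 2.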

Instead of moving p1 by 1 and p2 by 2 to find the point C, what if we move them by (n1, n2) where n1 and n2 are relatively prime? Would it work? 🙂

–Saurabh Joshi

### Puzzle : Nice puzzle on sorting!

August 11, 2010

Well,

The title is directly taken from a mail by Ramprasad, so is the puzzle :-). He in turn gives credit to Prof Meena for this nice puzzle.

Problem : Pasting the problem verbatim here from Ramprasad’s email.

Meena had given us this puzzle during this workshop and I found it
incredibly interesting.

Suppose we have 13 cards arranged in descending order: K Q J 10 9 8 7
6 5 4 3 2 1
At any move, you can take a substring of cards and move it elsewhere.
For example, 10 9 8 could be move to the right by 2 steps to get K Q
J 7 6 10 9 8 5 4 3 2 1. The goal is to minimize the number of such
moves.

It must be absolutely obvious that you can do it with 12 moves but the
very fact that I’m asking you this question means that you can do
better. How many moves do you need? 🙂

Now the obvious extension to this problem is to generalize this technique for any n.

Solution : Well, I got the solution myself, but not without the hint. And the hint is as follows.

See the snipped mailing conversation between me and Ramprasad.

> Is it the case that number of inversions should strictly decrease?

Actually inversions was my first guess towards proving the lower bound. It is a very good guess but I do not know if it is absolutely necessary that inversions must go down at each step. But perhaps your guess is right… maybe it must go down monotonically at each step.

[For those who do not know what an inversion is, a pair of indices (i,j) is called an inversion if i < j and a_i > a_j.]

My first attempt towards the lower bound was that initially there are n-choose-2 inversions and finally there are none. If we can show that each move can remove only n inversions then we’d sort of have a good lower bound. But unfortunately this is not true: take n n-1 n-2 … 1 and shift the first half and put it after the 2nd half to get n/2 n/2-1 … 1 n n-1 n-2 … n/2+1. This process actually gets rid of O(n^2) inversions and my proof attempt broke.

However there is a very similar parameter that can be used to argue about the lower bound. And this parameter is called ‘descents’. These are indices i such that a_i > a_{i+1}. Now clearly to begin with there are (n-1) descents.

Claim: At every step, you can reduce the number of descents by at most 2. [prove it]

This immediately gives a (n-1)/2 lower bound. But do all steps actually reduce the number of descents by 2? No! Take for example, the first move. It can reduce the descents only by 1! And similarly the last step. Thus you get the required lower bound of (n+1)/2.

And, this lower bound can be achieved!

---

Well, there you go! It was sufficient for me to get the trick after this hint. The claim is easy to prove, hence I am leaving it for interested readers ( if any 🙂 ).

The sequence for 13 cards is as follows :

K Q J 10 9 8 7 6 5 4 3 2 1
K 6 5 4 3 2 1 Q J 10 9 8 7
1 Q K 6 5 4 3 2 J 10 9 8 7
1 2 J Q K 6 5 4 3 10 9 8 7
1 2 3 10 J Q K 6 5 4 9 8 7
1 2 3 4 9 10 J Q K 6 5 8 7
1 2 3 4 5 8 9 10 J Q K 6 7
1 2 3 4 5 6 7 8 9 10 J Q K

In seven steps!! Only seven moves are required, as mentioned earlier by Ramprasad.
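As a quick sanity check of the descent argument, the sequence above can be fed to a little descent counter (my own snippet, writing K=13, Q=12, J=11): the first and last moves remove only one descent each, while every middle move removes two, exactly matching the (n+1)/2 bound.

```python
def descents(seq):
    """Number of indices i with seq[i] > seq[i+1]."""
    return sum(a > b for a, b in zip(seq, seq[1:]))

# The eight configurations above, writing K=13, Q=12, J=11:
steps = [
    [13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
    [13, 6, 5, 4, 3, 2, 1, 12, 11, 10, 9, 8, 7],
    [1, 12, 13, 6, 5, 4, 3, 2, 11, 10, 9, 8, 7],
    [1, 2, 11, 12, 13, 6, 5, 4, 3, 10, 9, 8, 7],
    [1, 2, 3, 10, 11, 12, 13, 6, 5, 4, 9, 8, 7],
    [1, 2, 3, 4, 9, 10, 11, 12, 13, 6, 5, 8, 7],
    [1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 6, 7],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
]
print([descents(s) for s in steps])  # [12, 11, 9, 7, 5, 3, 1, 0]
```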

In general, for any decreasing sequence

n n-1 n-2 …. 2 1

the first move is to take the block n-1, n-2, …, n/2 and move it to the end of the array:

n n/2-1 n/2-2 …. 1 n-1 n-2 … n/2

Now, we maintain the invariant that the subarray up to and including n does not have a descent. On the right of n, find an element a_i such that a_i < a_{i+1} ( that is, a position which is not a descent ). Move these two elements to the left of n so as to maintain the invariant. In fact, one can put a_i at the a_i-th place from the left. Again, it is easy to prove the correctness of this strategy.
Copying verbatim interesting notes from Ramprasad :
Some general notes on this problem. The question of finding the
shortest set of moves to sort any given permutation is not known to be
in P, and not known to be NP-hard or anything either. There are
approximation algorithms for this known and that is how Meena asked me
this question. One of the participants of the workshop had worked on
this problem and we started discussing this when we spoke about his
problem.

One question I asked Meena was: is the decreasing sequence the worst sequence? And it turns out, surprisingly, the answer is no! People ran some computer analysis and found a permutation on 15 elements and another on 17 which took more than 8 and 9 moves respectively! But people are making the conjecture that for sufficiently large n, the worst sequence is the decreasing sequence [and some others also believe that ‘large n’ is any n > 17 🙂 ].

— Saurabh Joshi

### Puzzle : Goof ups by GOD

September 25, 2009

Well,

Coming directly to the point: today’s puzzle was posted by manoj and the answer was given by Saket. I am just copying it verbatim, assuming that things are under a Creative Commons license :D.

Problem :

You are given n balls. There are two types of information God has given to you:
Same(i,j) if balls “i” and “j” are same.
Different(i,j) if balls “i” and “j” are different.

Since the world is imperfect, obviously God has goofed up somewhere. For example,
Same(1,2), Same(2,3), Different(1,3)
is obviously not consistent. But
Same(1,2), Different(2,3)
is consistent, as you “believe” that God will never assert “Same(1,3)”.

But as I said God has goofed up somewhere and you have to pay the
price. And that too a computing price.
Given “m” such assertions, can you find out if they are consistent?

Can you do it in O($m^2$) time?

Can you do it in O($m logn$) time?
Can you do it in O(m) time?

Solution :

I think O($m^2$) solution is quite easy to achieve for this problem.

Every ball is represented by a vertex; when we process a statement same(i,j), we add an edge between the two corresponding vertices. We process all same(x,y) statements like this. Now, for the set to be consistent, there should not be a statement different(x’,y’) such that x’ and y’ are connected. There can be O(m) such statements, and we can check each by a simple DFS, which takes O(m) time. This gives an O($m^2$) solution.

For the O(m log n) solution we can use the same graph, but additionally make use of the well-known union-find data structure, which gives O(log n) time per union / find operation. So we can insert the O(m) edges for all same(x,y) statements into the structure in O(m log n) time, and for each different(x’,y’) use find to check whether x’ and y’ belong to the same component (there can be O(m) such statements, making that phase O(m log n) as well). The total time for the whole operation is thus O(m log n).

For the O(m) solution,

I think this problem is in fact the same as the one addressed by union-find, because an O(m) solution to this problem would also give a constant-amortized-time union-find. But since union-find has a proven lower bound involving the inverse Ackermann function (which, although not constant, is extremely slowly growing), this problem will also have a lower bound of O(m α(n)).
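For completeness, here is a sketch of the union-find solution in Python (the interface is my own choosing). With path halving and union by rank, the m operations in fact take O(m α(n)) time, in line with the remark above.

```python
def consistent(n, same, different):
    """Balls are 0..n-1; same and different are lists of (i, j) pairs.
    Union all 'same' pairs, then check that no 'different' pair ended
    up in the same component."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx                     # union by rank
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    for i, j in same:
        union(i, j)
    return all(find(i) != find(j) for i, j in different)
```

On the examples from the problem statement, `consistent(3, [(0,1),(1,2)], [(0,2)])` is False while `consistent(3, [(0,1)], [(1,2)])` is True (shifting the ball labels to start at 0).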

### Puzzle : Balls and Binary

November 17, 2008

Well,

I know that the name does not give any clue what the puzzle is about, but I could not think of any better name than this. Anyways, without any beating around the bush, let me directly come to the point.

Oh, for the record, this nice puzzle was given to me by manoj.

Problem : There are boxes numbered from 0 to $\infty$. There is a ball placed in box 0. In addition to this, you are given k balls. You can add or remove a ball from the box numbered j only if there is a ball in the box numbered j-1. Now, using only the given k balls and the valid moves ( add/remove ), how would you place a ball in the box numbered $2^k - 1$?

Solution : I have to get back to a very powerful mathematical tool called principle of mathematical induction for this problem. Of course, the trick lies in coming up with correct induction hypothesis.

Claim : Using k-balls I can reach a configuration where balls are placed in box numbered

$2^{k-1},2^{k-1} + 2^{k-2}, 2^{k-1}+2^{k-2} + 2^{k-3}, \dots ,2^k - 2, 2^k - 1$. Note that, according to this claim, i-th ball will be $2^{k-i}$ distance away from (i-1)-th ball.

Base case : k = 1. I have only one ball and I place it in box 1. The configuration looks like the following ( a box numbered n containing a ball is written as nB ):

0B 1B

So, the claim is trivially true here. Note that box 0 contains a ball by default.

Induction Hypothesis : Assume that with k = p-1 balls, balls will be positioned in box numbered :

$2^{p-2}, 2^{p-2}+2^{p-3}, \dots , 2^{p-1} - 2 , 2^{p-1} - 1$

Induction Step : We have k-balls. Using k-1 balls we can reach a configuration ( due to induction hypothesis ) where these k-1 balls are placed in boxes numbered :

$2^{k-2}, 2^{k-2}+2^{k-3}, \dots , 2^{k-1} - 1$.

But, I still have one additional ball, I can put it in box $2^{k-1}$ as there is a ball in box $2^{k-1}-1$. Now, I can remove a ball from $2^{k-1}-1$ because there is a ball in $2^{k-1}-2$.

So now I have one ball in hand, and there is a ball in $2^{k-1}-4$. I can place a ball at $2^{k-1}-3$, then remove the ball from $2^{k-1}-2$, and then the one from $2^{k-1}-3$.

I have two balls now, and there is a ball in the box numbered $2^{k-1}-8$. Using two balls, I can reach up to distance 3 from a ball, so I can reach up to $2^{k-1}-5$ and recover the ball at $2^{k-1}-4$, and so on. It is easy to see that in a similar fashion I can recover all k-1 balls, keeping one ball in $2^{k-1}$.

From 0 with k-1 balls in hand, I can reach up to $2^{k-1}-1$. So from $2^{k-1}$ with k-1 balls in hand, we can reach up to $2^k - 1$. Also, it is easy to see that the balls will now be placed in boxes:

$2^{k-1}, 2^{k-1} + 2^{k-2}, \dots, 2^k - 2, 2^k - 1$. If you don’t see why this happens, just subtract $2^{k-1}$ from the above terms and you get the configuration mentioned in the induction hypothesis.

For those who are a bit confused by the bad writing as well as the mathematical notation, here is an example.

Example : 4 balls.

0B -> 0B 1B -> 0B 1B 2B -> 0B 2B -> 0B 2B 3B -> 0B 2B 3B 4B -> 0B 2B 4B -> 0B 1B 2B 4B -> 0B 1B 4B -> 0B 4B — using the steps shown above, just assume that 4B acts like a 0B -> 0B 4B 6B 7B -> 0B 4B 6B 7B 8B -> 0B 4B 5B 6B 8B -> 0B 4B 5B 8B -> 0B 4B 8B -> 0B 1B 4B 8B -> 0B 1B 2B 4B 8B -> 0B 2B 3B 4B 8B -> 0B 1B 2B 3B 8B -> 0B 1B 2B 8B -> 0B 1B 8B -> 0B 8B ( Now you have 3 balls in your hand; assume 8B is 0B and you can reach up to 15B ).
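One can also verify the claim mechanically for small k. The following brute-force breadth-first search (entirely my own construction, not part of the original argument) explores every legal add/remove move and confirms that box $2^k - 1$ is reachable:

```python
from collections import deque

def reachable(k):
    """Breadth-first search over game states to check that a ball can be
    placed in box 2**k - 1 using k spare balls (box 0 starts occupied).
    A state is (set of occupied boxes, balls in hand); boxes beyond the
    target are not explored, which is enough for a positive answer."""
    target = 2 ** k - 1
    start = (frozenset([0]), k)
    seen = {start}
    queue = deque([start])
    while queue:
        occ, hand = queue.popleft()
        if target in occ:
            return True
        for j in range(1, target + 1):
            if j - 1 not in occ:
                continue                      # moves at j need a ball at j-1
            if j in occ:
                nxt = (occ - {j}, hand + 1)   # remove the ball from box j
            elif hand > 0:
                nxt = (occ | {j}, hand - 1)   # place a ball in box j
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

The state space blows up quickly, so this is only practical for small k, but it does confirm the construction for k up to 3 in well under a second.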

A very nice puzzle indeed, I must say 🙂 One can see its connection with the binary representation of a number.

–Saurabh Joshi

### Largest sum sub-sequence and sub-matrix!

November 15, 2008

Well,

I happened to bump into two old problems which I had solved some time back. However, I never bothered to write them down anywhere, which I intend to do now.

Problem 1 : Given a sequence of $n$ integers ( +ve and/or -ve ), find a contiguous subsequence with the largest sum. By contiguous subsequence I mean: if the i-th and j-th elements of the original sequence are in the subsequence ( for $i \leq j$ ), then $\forall k, i \leq k \leq j$, the k-th element is in the subsequence.

Example :  sequence :  1 5 -9 2 4 3 8

desired subsequence : 2 4 3 8

Problem 2 : Given a m x n matrix A, find a submatrix of A such that sum of all the elements of the submatrix is the largest amongst all possible submatrices.

Example :

1 2 -1
-3 -1 4
1 -5 -2

Desired sub matrix :

2 -1
-1 4

Solution 1 : Well, actually, solution for this is very easy and many people know about it. Still, for those who don’t know, let me describe it over here.

Initialize max_sum = 0.

Scan the sequence from the first element, maintaining a running (cumulative) sum. At each step, if the running sum is greater than max_sum, store it in max_sum. If at any point the running sum goes below zero, reset it to zero. At the end, max_sum holds the largest sum. What about the subsequence itself? Well, you can keep start_index and end_index to track which contiguous subsequence contributed to max_sum.

Running time is $O(n)$ and I guess this is the best possible.
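For reference, here is a sketch of the solution in Python (names are my own). It returns the best sum together with the start and end indices; an empty subsequence counts as sum 0, matching the max_sum = 0 initialization above.

```python
def max_subarray(seq):
    """Largest-sum contiguous subsequence.  Returns (best_sum, start, end)
    with the subsequence being seq[start:end+1]; the empty subsequence
    (sum 0, signalled by end = -1) is allowed."""
    best, best_start, best_end = 0, 0, -1
    run, run_start = 0, 0
    for i, x in enumerate(seq):
        if run < 0:                  # running sum went negative: reset
            run, run_start = 0, i
        run += x
        if run > best:
            best, best_start, best_end = run, run_start, i
    return best, best_start, best_end
```

On the example sequence 1 5 -9 2 4 3 8 it reports sum 17 for the subsequence 2 4 3 8.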

Solution 2  : Second one is a bit tricky but not too tough. In fact, we will use Solution 1 to calculate this one.

First, for each row we calculate a cumulative sum array: array[i][j] will contain the sum of elements 1 to j of the i-th row, with array[i][0] = 0 for all i. Time required is $O(mn)$.

Next, let us pick any two columns, say p and q (with $p \leq q$). Now we form a new array: sum_array[k] = array[k][q] - array[k][p-1]. That is, sum_array[k] contains the sum of the elements from column p to column q in row k. Now apply Solution 1 to find max_sum; the indexes now give you start_row and end_row, while p and q are start_column and end_column respectively. But we did this only for one particular pair of columns. To find the submatrix with the largest sum, we have to apply it for all ${n \choose 2}$ column pairs. The running time is $O(n^2 m)$.
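A self-contained sketch of Solution 2 in Python (names are mine; it returns only the sum, not the submatrix boundaries):

```python
def max_submatrix_sum(A):
    """Largest sum over all submatrices of A, in O(n^2 * m) time:
    row-prefix sums, then a 1-D largest-sum scan per column pair."""
    m, n = len(A), len(A[0])
    # prefix[i][j] = sum of A[i][0 .. j-1]
    prefix = [[0] * (n + 1) for _ in range(m)]
    for i in range(m):
        for j in range(n):
            prefix[i][j + 1] = prefix[i][j] + A[i][j]
    best = 0
    for p in range(n):
        for q in range(p, n):
            run = 0
            for i in range(m):
                # sum of row i restricted to columns p..q
                run += prefix[i][q + 1] - prefix[i][p]
                best = max(best, run)
                if run < 0:
                    run = 0          # reset, as in the 1-D solution
    return best
```

On the 3×3 example above it finds the submatrix of sum 4 (the top-right 2×2 block).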

I don’t know whether this is the best possible solution, or whether there exists a solution with complexity better than $O(n^2 m)$.

Do let me know if someone has a better idea.

— Saurabh Joshi