Topic: Coding Contest Byte: The Square Root Trick (Read 15481 times)
Cosmin
« on: July 20, 2012, 07:54:53 »

http://infoarena.ro/blog/square-root-trick
a_h1926
« Reply #1 on: July 20, 2012, 14:50:40 »

For the Josephus problem: we may consider a binary array in which a[i] = 0 means that the i-th player has been eliminated and a[i] = 1 means that the i-th player is still in the game. Given a configuration of 0's and 1's where the last eliminated player is P and we are at step k, we need to count k ones to the right of P (treating the array as circular), and that is the position of the next eliminated player. We can do this efficiently by keeping a structure Sum in which Sum[i] is the sum of the i-th interval of length sqrt(n). Therefore, when we are at position P and want to count k ones to the right, we can find which sqrt(n)-length interval contains the next position by simply iterating through the intervals to the right until their total sum reaches k. The cost is O(sqrt(n)) per eliminated player: we only iterate through the elements of the interval containing the current player, the elements of the interval containing the next eliminated player, and at most sqrt(n) whole intervals in between. (If k is greater than the number of remaining players, we can just count k % (number of remaining players) ones to the right, because the extra laps around the array lead to the exact same position.) Final complexity is O(n*sqrt(n)).
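This counting scheme can be sketched in Python (my own illustration, not code from the post); the block-skipping branch is what makes each elimination O(sqrt(n)):

```python
import math

def josephus_order(n, k):
    """Elimination order when every k-th surviving player is removed,
    found by counting ones using sqrt(n)-sized block sums."""
    b = max(1, math.isqrt(n))              # block length ~ sqrt(n)
    alive = [1] * n
    bsum = [0] * ((n + b - 1) // b)
    for i in range(n):
        bsum[i // b] += 1
    order, pos = [], n - 1                 # start just before player 0
    for remaining in range(n, 0, -1):
        steps = (k - 1) % remaining + 1    # reduce k modulo the survivors
        i = (pos + 1) % n
        while True:
            if i % b == 0 and bsum[i // b] < steps:
                steps -= bsum[i // b]      # skip a whole block of survivors
                i = (i // b + 1) * b       # jump to the next block's start
                if i >= n:
                    i = 0
            else:
                steps -= alive[i]          # scan element by element
                if steps == 0:
                    break
                i = (i + 1) % n
        alive[i] = 0
        bsum[i // b] -= 1
        order.append(i)
        pos = i
    return order
```

A naive rotating-list simulation gives the same elimination order, which makes the sketch easy to check.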

For the Level Ancestor problem: we can keep an array Ancestor in which we remember the sqrt(n)-th ancestor of each node. We may compute this array in O(n) with a single DFS: keeping the current root-to-node path on a stack, the sqrt(n)-th ancestor of each node can be read off in constant time. Having this array computed, when we receive a query we repeatedly jump to the sqrt(n)-th ancestor as long as the remaining distance is at least sqrt(n), and then walk up level by level (through the father array) until we reach the k-th ancestor. The complexity is O(n) precalculation + O(sqrt(n)) per query.
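A possible Python sketch of this approach (function and variable names are mine): an iterative DFS keeps the root-to-node path explicitly, so the sqrt(n)-th ancestor is just an index into it:

```python
import math
from collections import defaultdict

def build_jumps(parent):
    """parent[v] is v's father, parent[root] == -1.
    Returns (b, jump) where jump[v] is v's b-th ancestor, or -1."""
    n = len(parent)
    b = max(1, math.isqrt(n))
    children, root = defaultdict(list), 0
    for v, p in enumerate(parent):
        if p == -1:
            root = v
        else:
            children[p].append(v)
    jump = [-1] * n
    path, stack = [], [(root, False)]      # iterative DFS with a path stack
    while stack:
        v, leaving = stack.pop()
        if leaving:
            path.pop()
            continue
        path.append(v)
        if len(path) > b:
            jump[v] = path[-b - 1]         # the node b levels above v
        stack.append((v, True))
        for c in children[v]:
            stack.append((c, False))
    return b, jump

def kth_ancestor(v, k, parent, b, jump):
    while k >= b and jump[v] != -1:        # sqrt-sized jumps first
        v, k = jump[v], k - b
    while k > 0 and v != -1:               # then one level at a time
        v, k = parent[v], k - 1
    return v                               # -1 if no such ancestor
```

On a chain 0 → 1 → … → 9, for example, the 4th ancestor of node 9 is node 5.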

For the Range Median problem: we can split the array into sqrt(n) intervals and, for each interval, keep its elements in increasing order, for instance in a binary search tree per interval. When we execute an update, we can tell in constant time which interval the element belongs to; we delete its previous value and insert its new one, which takes log(sqrt(n)) (approx. log(n)) time. When we have a query, we binary search for the value of the median. For a fixed candidate value X, we can count how many elements in the range [l..r] are less than or equal to X as follows: we iterate one by one through the elements in [l..k*sqrt(n)) (where k is the least number such that this interval boundary exists); then, for each whole interval between l and r, we binary search in its tree to count the elements less than or equal to X; finally, we iterate one by one through the last elements which don't fill a whole interval, until we reach r. If the number of elements less than or equal to X is greater than [(r-l)/2] we need to search for a smaller X, and if it is less we need to search for a greater X. If we search for the smallest X which meets the requirement, it is guaranteed that we will find the median. Analyzing the complexity: the binary search over the median value contributes log(n) (we can keep all elements of the array in another binary search tree so that this is log(n) instead of log(maximum value)), and for each candidate value we iterate through at most 2*sqrt(n) elements which do not fit in a whole interval of [l..r] and at most sqrt(n) whole intervals of [l..r], each with a log(sqrt(n)) (approx. log(n)) binary search.
The time complexity for an update is log(sqrt(n)), and for a query log(n)*sqrt(n)*log(sqrt(n)).
Since implementing balanced binary search trees during a programming contest can be difficult, we can replace them with a trie into which we insert the binary representation of each element of the array, but that changes the time complexity to log(MaxValue) for an update and log(MaxValue)*log(MaxValue)*sqrt(n) for a query.
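A Python sketch of this solution (my own illustration, for integer arrays), with a sorted copy of each block standing in for the per-interval binary search trees — an equivalent structure for the counting step. Note that `median` here finds its value bounds naively with min/max; the post's trick of a global value structure would make that a log(n)-step search too:

```python
import bisect
import math

class RangeMedian:
    """sqrt-decomposition with a sorted copy of each block."""
    def __init__(self, a):
        self.a = list(a)
        self.b = max(1, math.isqrt(len(a)))
        self.blocks = [sorted(self.a[i:i + self.b])
                       for i in range(0, len(a), self.b)]

    def update(self, i, x):
        blk = self.blocks[i // self.b]
        blk.pop(bisect.bisect_left(blk, self.a[i]))  # delete old value
        bisect.insort(blk, x)                        # insert new value
        self.a[i] = x

    def _count_le(self, l, r, x):
        # number of elements of a[l..r] that are <= x
        cnt, i = 0, l
        while i <= r:
            if i % self.b == 0 and i + self.b - 1 <= r:
                cnt += bisect.bisect_right(self.blocks[i // self.b], x)
                i += self.b                          # whole block at once
            else:
                cnt += self.a[i] <= x                # stray element
                i += 1
        return cnt

    def median(self, l, r):
        # lower median: smallest X with at least (r-l)//2 + 1 elements <= X
        need = (r - l) // 2 + 1
        lo, hi = min(self.a[l:r + 1]), max(self.a[l:r + 1])
        while lo < hi:
            mid = (lo + hi) // 2
            if self._count_le(l, r, mid) >= need:
                hi = mid
            else:
                lo = mid + 1
        return lo
```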
« Last edit: July 20, 2012, 22:41:01 by Heidelbacher Andrei »
Cosmin
« Reply #2 on: July 20, 2012, 22:02:28 »

I made a mistake in the median problem. I updated the query complexity to be O(sqrt(n)log(n)log(U)).

@Andrei, your solutions look correct.

Here are two other problems:
1. You are given 2 identical eggs and access to a 100-story building. An egg can be very hard or very fragile: it may break when dropped from the first floor, or it may survive a drop from the 100th floor. You need to figure out the highest floor of the building from which an egg can be dropped without breaking. The question is how many drops you need in the worst case. You are allowed to break both eggs in the process.
2. Compute the number of digits of n!, for n <= 1000000000.
« Last edit: July 21, 2012, 01:01:47 by Cosmin Negruseri »
claudiumihail
« Reply #3 on: July 21, 2012, 04:08:12 »

[Addressing the 1st problem in the second set]

My idea is to drop the first egg from a floor N (where 2 <= N <= 100) and see if it breaks. We have two possible outcomes:

a) The egg breaks at story N, so we have 1 egg left and we know the threshold lies somewhere below N. Since we only have 1 egg left, we simply try each of the N-1 lower stories in order and see where it breaks. Total drops: N in the worst case.
b) The egg does not break, so we can apply the same scheme again with story N as our new "base" level: we drop the egg next from floor N + (N - 1), i.e. one floor fewer above the base than last time, and apply case a) if it breaks or repeat b) if it does not.

The idea is that each increment is one story less than the previous one, so if the egg happens to break on our t-th try, we only need N - t more drops with the second egg to find the exact floor, keeping the worst-case total at N. Say we drop it from the 10th floor initially: if the egg doesn't break, we drop it next from the 19th, then the 27th, etc.
So for N=10 we can cover 10+9+...+2+1 = 55 floors this way. Now the trick is to find the smallest N for which we can cover 100 floors: 1+2+...+N >= 100 => N*(N+1)/2 >= 100. Solving this gives N=14. So we need a minimum of 14 tries to cover all 100 stories (it actually covers 105).
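This can be verified by brute force (my own sketch; `total_drops` simulates the exact number of drops the strategy makes for each possible breaking threshold):

```python
def min_first_floor(floors=100):
    """Smallest N with N*(N+1)/2 >= floors."""
    n = 1
    while n * (n + 1) // 2 < floors:
        n += 1
    return n

def total_drops(t, first, top=100):
    """Drops needed to pin down threshold t (the egg breaks strictly above
    floor t) when starting at `first` with decreasing increments."""
    drops, prev, step = 0, 0, first
    while True:
        test = min(prev + step, top)
        drops += 1
        if test > t:                      # first egg breaks: linear scan
            for f in range(prev + 1, test):
                drops += 1
                if f > t:
                    break
            return drops
        if test == top:                   # egg survives the top floor
            return drops
        prev, step = test, max(1, step - 1)
```

Taking the maximum of `total_drops(t, 14)` over all thresholds t confirms the worst case is exactly 14 drops.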

Hope this is good...
Cosmin
« Reply #4 on: July 21, 2012, 07:35:49 »

@Claudiu, sounds good.

Here's another one:
We start with an array A of distinct integers from 0 to n - 1, then we are given n <= 30000 updates and queries:
- reverse(i, j): reverses the order of the elements in A[i..j]
- get(i): returns A[i]
Come up with an algorithm faster than O(n^2) that can run these operations.
claudiumihail
« Reply #5 on: July 21, 2012, 08:34:58 »

[Addressing the second problem in the second set]

My idea is rather simplistic: find the maximum number of the form 10^x <= n!. Taking log10 of both sides gives x <= log10(n!) = log10(1) + log10(2) + ... + log10(n). Playing around with a small program (http://codepad.org/IwdMLjcr) I noticed the answer always seems to be floor(log10(1) + log10(2) + ... + log10(n)) + 1. I don't have a proof of correctness for this, but it seems to work :). Am I at least close?
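The observation checks out against exact factorials for small n (a quick sketch of my own; for very large n the floating-point sum could in principle round the wrong way when log10(n!) is extremely close to an integer, so this stays heuristic):

```python
import math

def digits_of_factorial(n):
    # number of digits of n! = floor(log10(1) + ... + log10(n)) + 1
    return math.floor(sum(math.log10(i) for i in range(2, n + 1))) + 1
```

Comparing with `len(str(math.factorial(n)))` confirms the formula for every n up to a few hundred.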
Cosmin
« Reply #6 on: July 21, 2012, 09:04:26 »

You can solve the problem by using Stirling's approximation of n! and taking log10 to get the number of digits.

The way I solved it more than 10 years ago was to precompute the cumulative sums of log10 and store every k-th sum. Then, for a query, either go up from [n/k] * k or down from [n/k + 1] * k. This way I could trade off binary size against algorithm speed :).
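For n up to 10^9 the log-sum is far too slow, but Stirling's series gives log(n!) in O(1). A sketch (mine, not from the post; correction terms beyond 1/(12n) are dropped, which only matters if log10(n!) lands extraordinarily close to an integer):

```python
import math

def digits_by_stirling(n):
    """Number of digits of n! via Stirling's series, in O(1)."""
    if n < 2:
        return 1
    # ln(n!) ~ n ln n - n + 0.5 ln(2 pi n) + 1/(12 n)
    ln_fact = (n * math.log(n) - n
               + 0.5 * math.log(2 * math.pi * n)
               + 1 / (12 * n))
    return int(ln_fact / math.log(10)) + 1
```

The same result is also available directly as `math.lgamma(n + 1)`, which computes ln(n!) accurately.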
claudiumihail
« Reply #7 on: July 22, 2012, 19:29:46 »

Cosmin,

Any hint on the problem with the reverse and get operations? I'm thinking it could be done with an interval tree (similar to how other problems, like the maximum sub-sequence per interval, are done), but I'm having trouble figuring out how the update can be done on the interval in less than O(n) time.
Cosmin
« Reply #8 on: July 22, 2012, 19:37:54 »

Queries and updates are given offline, in the input.

You can split the queries/updates into batches of length sqrt(n).
claudiumihail
« Reply #9 on: July 23, 2012, 18:39:02 »

Thinking about this problem I'm still a bit stuck on how you can beat O(n) for the update operation. This is what I've understood so far (indexing from 0):
- reverse(i,j): reverses the order of elements in the interval [i,j]. Ex: A = [1,2,3,4,5,6,7], and reverse(1,4) yields [1,5,4,3,2,6,7]
- query(i): returns the value of A[i]. Ex: A = [1,2,3,4,5,6,7], query(2) yields 3

My dilemma is that even if I split the updates/queries into batches of sqrt(n), just applying a reverse is O(n) in the worst case. And as far as I can see, the result of each reverse depends on the previous reverses, and the order of the reverse operations affects the overall state of A. Any more hints on this, please? :)
Cosmin
« Reply #10 on: July 23, 2012, 20:12:15 »

Hint: If you have x updates and then one query you can respond to the query in O(x) by going through the updates backwards.
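The hint can be sketched in Python (0-indexed, inclusive ranges; my own illustration): instead of applying the reverses, map the queried index backwards through them:

```python
def get_after_reverses(a, updates, i):
    """Value of A[i] after applying reverse(l, r) updates, answered in
    O(len(updates)) without touching the array: walk the updates backwards,
    mirroring i inside every range that covers it."""
    for l, r in reversed(updates):
        if l <= i <= r:
            i = l + r - i        # i was mirrored by this reverse
    return a[i]
```

Each reverse maps position i inside [l, r] to l + r - i, so undoing them in reverse order recovers where the value originally came from.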
gogu
« Reply #11 on: July 24, 2012, 13:29:19 »

Good ol' sqrt(N) solutions, they bring back some memories.
I wouldn't even call it a trick, more of a general balancing technique, where you try to find optimal trade-offs.
The median problem, for instance, can get O(sqrt(N) * sqrt(log N) * log U) per operation if you pick your block length as sqrt(N log N).
If your block length is B, every operation spends O(B) time outside full blocks, plus O(N/B * log N) across all the blocks. Minimizing the sum of the two terms over B gives B = sqrt(N log N).
In practice though, this is one of those problems where you just have to try multiple constant factors, and see which ones work best.
Also, a much better replacement for the binary trees Andrei was thinking of for the blocks would be plain sorted arrays. Lots of binary searches and some inserts with a memmove are very well suited for modern CPUs.
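The balancing argument can be sanity-checked numerically (a toy sketch of mine; the cost model is the one described above, with unit constants):

```python
import math

N = 1_000_000
logN = math.log2(N)

def per_op_cost(B):
    # O(B) work outside full blocks + O((N / B) * log N) over the blocks
    return B + (N / B) * logN

plain = per_op_cost(math.sqrt(N))             # classic B = sqrt(N)
balanced = per_op_cost(math.sqrt(N * logN))   # B = sqrt(N log N)
```

With B = sqrt(N log N) the two terms are equal and the total is 2*sqrt(N log N), noticeably below the cost at B = sqrt(N).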

A nice problem I proposed here that was meant to be solved with a sqrt(N) trick is http://infoarena.ro/problema/rutier
Since I don't think anybody solved it entirely, and the topic is live, I thought I'd do some shameless promoting.
The task: given a graph whose edges are inserted one by one, find the cost of the minimum spanning tree after each new edge.
Suggested complexity per operation: O(sqrt(N log*N))
freak93
« Reply #12 on: July 24, 2012, 22:08:43 »

@Gogu Here's what I was thinking.
First we split the queries into sqrt(M) batches (each of size sqrt(M)) and we calculate the spanning tree after each such group of queries.

Suppose we have the spanning tree from before the current group and its sqrt(M) queries. We sort the queries by value, merge them with the edges of that spanning tree, and start Kruskal's algorithm. When we add a query edge, we check the already added query edges, and for each one that comes from a query after the current one, the current query edge receives an open slot. If the edge was added to the spanning tree, we add open slots to all other query edges (other than the ones that counted as open slots for the current one). When we get to an edge that can't go into the spanning tree, we check whether it fits an open slot (if it's from the last spanning tree it can; if it's a query edge, it can only fill open slots of query edges that appear after it).

After we complete the spanning tree, we take each query edge and count how many query edges are in the spanning tree but appear after it in the query list. From the total spanning tree cost we subtract the costs of those query edges and add the costs of the edges filling the slots.
Complexity O(sqrt(N) * log*(N)) per query. I can't seem to get O(sqrt(N log*N)).
« Last edit: July 24, 2012, 22:37:59 by Budau Adrian »
a_h1926
« Reply #13 on: July 24, 2012, 22:59:02 »

@Gogu
The binary trees can be replaced with sorted arrays and binary searches and the query complexity will remain unchanged, but any update will require O(N) time, which is a pretty significant increase in complexity, even for modern CPUs. Let's say we have about 100,000 updates. Although a little harder to implement, the binary trees will deal well with the updates and the queries in a few seconds (maybe less), but the sorted arrays will take much longer to perform the updates.
gogu
« Reply #14 on: July 25, 2012, 01:27:56 »

@Adrian
Your solution might be correct, but I'm not sure how you determine the exact new cost, since I don't understand how you keep track of which edge gets removed after a newly inserted one. From what I can tell you remove them all together, and you can have a query edge inserted and then bumped out within the same batch. My solution was somewhat different, but based on the same splitting into sqrt(N) chunks of queries. You can get O(sqrt(N log*N)) the way I described in the post before: pick the B that minimizes M * O(B) + M/B * O(N log*N), and that is your optimal chunk size, in this case sqrt(N log*N).

What I did was see which edges don't change after the B insertions and merge the nodes they connect, to get a tree with at most B nodes (it's the number of removed edges + 1). After that, I inserted the query edges into this newly reduced graph in linear time. Your solution might work, you can try implementing it and seeing if it's fine ;)

@Andrei
I think you misunderstood me. I wasn't suggesting you should keep the entire array sorted, but that for each sqrt(N) sized bucket you should keep a sorted array with those values. Pretty much replace each binary tree you have for a section with just keeping a sorted copy of the values there.
You'll have to do a binary search in every bucket, and every time you change a value in a bucket you'll have to shift all the elements between the last value and the new one, thus making an insert cost O(B), where B is your bucket size.
I was saying this has a significant advantage over keeping binary trees because it simplifies the implementation (I don't think the STL has a data structure for "how many elements are less than k?") and also because it is very fast due to excellent cache behaviour. The update is just a memory shift by 4 bytes of a contiguous section of, on average, B/2 integers.
The binary search may seem like it will produce cache misses like the binary tree, but it's actually remarkably well behaved: there are several hot spots in the array (the middle of the sequence, the 3/4 points, etc.) which are almost always in cache, the final comparisons happen among closely packed elements, and only a few accesses into the array are truly random. Binary trees keep some of the hot-spot property, but still need more RAM accesses. One real-life benchmark result that shocked me, and that I always remember in these situations, is that reading single bytes randomly from memory is about as slow as reading them sequentially from disk.

Since I've rambled a bit beside the point, I have to point out a neat "bulaneala" (dubious solution) for the range median problem: pick a few (say 32-64) elements from your query interval and select the 4-5 values in the middle as a starting interval for your median binary search, instead of the whole value range. Most of the time the median will be in this sampled range, and in the worst case you've just made an extra meta-comparison. This way you'll get much fewer median tests overall (maybe even by a factor of 4-5).
claudiumihail
« Reply #15 on: July 26, 2012, 17:57:55 »

[Bringing the discussion back to an earlier problem]

After some thinking on this problem I'm still a bit puzzled how one can beat the worst case O(n^2) for the range reverse/point query problem Cosmin posted earlier. I figured out how to step through the queries backwards and find the value at a certain index in O(number of updates made). So if we split the operations into batches of sqrt(n), within a batch a query can be answered in at most O(sqrt(n)). Problem is, I can't see how to efficiently precompute, for every batch of size sqrt(n), the cumulative effect of the updates in that batch. I'm thinking this is necessary, since otherwise you end up with O(n) for every query. Anyway, if someone feels like posting any more hints, or just pointing me to the solution directly (publicly or in private), I'd be ever so grateful.
Cosmin
« Reply #16 on: July 26, 2012, 22:10:10 »

k updates in O(k^2 + n) time:
Work with ranges of numbers instead of one number at a time. Each new update will increase the number of disjoint ranges by 2. Performing a reverse over k ranges will take O(k) time. Then you can get the modified array in O(k + n) time.

Did I answer your question?
« Last edit: July 27, 2012, 00:45:11 by Cosmin Negruseri »
claudiumihail
« Reply #17 on: July 27, 2012, 01:35:24 »

Thanks for the extra hint. From your method I'm assuming that by using ranges (I'll think about how to do that) I can get a linear time instead of quadratic (the "k updates in O(k^2 + n) time" part of your post sort of suggests the contrary). FWIW I was thinking there might be some range related idea to reducing the complexity, but I couldn't figure it out. I'll follow up when I have it solved (unless someone beats me to it).
Cosmin
« Reply #18 on: July 27, 2012, 02:00:41 »

Let's see what happens for
n = 10 and
reverse(3,7)
reverse(5,10)

first you have the whole range
[1..10]

[1,2] [3,4,5,6,7] [8,9,10]
reverse(3,7)
[1,2], [7, 6, 5, 4, 3] [8, 9, 10]

[1,2] [7, 6] [5, 4, 3] [8, 9] [10]
reverse(5, 10)
[1, 2] [7, 6] [10] [9, 8] [3, 4, 5]

So the idea is that you play with compact lists of consecutive indexes. Each update increases the number of lists by at most 2 and reverses the order of the lists it covers.
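This list-of-ranges idea, sketched in Python (my own code, not from the post): each range is a (start, end) pair of indices into the original array, with start > end meaning the run is read backwards.

```python
def apply_reverses(a, updates):
    """Apply reverse(l, r) updates (0-indexed, inclusive) and return the
    resulting array.  Each update costs O(current number of ranges);
    materializing at the end costs O(n + number of ranges)."""
    segs = [(0, len(a) - 1)]          # (start, end); start > end = reversed

    def seg_len(s):
        return abs(s[1] - s[0]) + 1

    def split(pos):
        # ensure a range boundary falls exactly before position `pos`
        new, off = [], 0
        for s in segs:
            length = seg_len(s)
            if off < pos < off + length:
                cut = pos - off
                lo, hi = s
                step = 1 if lo <= hi else -1
                new.append((lo, lo + step * (cut - 1)))
                new.append((lo + step * cut, hi))
            else:
                new.append(s)
            off += length
        segs[:] = new

    for l, r in updates:
        split(l)
        split(r + 1)
        keep, mid, off, ins = [], [], 0, None
        for s in segs:
            length = seg_len(s)
            if l <= off and off + length - 1 <= r:
                if ins is None:
                    ins = len(keep)        # where the reversed run goes back
                mid.append((s[1], s[0]))   # flip this range's direction
            else:
                keep.append(s)
            off += length
        if ins is not None:
            mid.reverse()                  # reverse the order of the ranges
            segs[:] = keep[:ins] + mid + keep[ins:]

    out = []
    for lo, hi in segs:                    # materialize the final array
        out.extend(a[lo:hi + 1] if lo <= hi else a[hi:lo + 1][::-1])
    return out
```

On the example above (0-indexed: reverse(2,6) then reverse(4,9) on 1..10) this produces [1, 2, 7, 6, 10, 9, 8, 3, 4, 5], matching the range picture.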
« Last edit: July 27, 2012, 02:16:19 by Cosmin Negruseri »
claudiumihail
« Reply #19 on: July 27, 2012, 02:17:12 »

That's pretty neat! Unless I miss my guess, this idea has the downside that at some point the lists could, depending on the updates, degenerate into single numbers. But if we work only in batches of sqrt(n) that won't be a problem. Have I understood correctly?
« Last edit: July 27, 2012, 02:24:15 by Claudiu Mihail »
Cosmin
« Reply #20 on: July 27, 2012, 02:26:26 »

Yes, basically you work with 2k + 1 lists instead of working with n numbers. First you have one list, then three and so on.

For each list you know the start index and the end index. Reversing a list means swapping its boundaries.

Reversing a new range involves creating 2 new lists and reversing all the lists that are within the current range. If you have x lists, doing an update takes O(x).

So doing sqrt(n) offline reverse updates takes O(sqrt(n) * sqrt(n) + n) = O(n) time.
claudiumihail
« Reply #21 on: July 27, 2012, 02:42:39 »

And because there are sqrt(n) batches, the cost is O(n*sqrt(n)) for the overall preprocessing, and queries cost sqrt(n) each, since we only need to go through at most sqrt(n) individual update ops before we reach a known "checkpoint" and then the actual value. So overall O(n*sqrt(n) + n*sqrt(n)) = O(n*sqrt(n))... I get it now. Nice problem.
« Last edit: July 27, 2012, 03:06:01 by Claudiu Mihail »
Cosmin
« Reply #22 on: July 27, 2012, 03:15:27 »

You can try it here http://infoarena.ro/problema/rev

There's a more efficient solution using treaps (treaps have fast split and merge operations).

Petru told me another one where the sqrt n trick is pretty handy:
http://infoarena.ro/problema/drumuri4

Implement an efficient data structure:
You start from n lists of one element each. You can do updates: break a list in two, or join two lists. You can also do index queries.
« Last edit: July 27, 2012, 03:21:53 by Cosmin Negruseri »