infoarena

Community - feedback, projects and fun => Blog => Topic started by: Cosmin Negruseri on October 29, 2013, 02:35:04



Title: Transpose
Posted by: Cosmin Negruseri on October 29, 2013, 02:35:04
http://www.infoarena.ro/blog/transpose


Title: Re: Transpose
Posted by: Giurgea Mihnea on October 29, 2013, 13:15:26
One solution is to split the initial matrix into 100 smaller sub-matrices, each stored in its own file on disk. Since each sub-matrix is 1 GB, we can load it into memory, transpose it, then write it back to disk. The final step is to assemble all 100 transposed sub-matrices, in the correct order, into another 100 GB file.

This works because:
(A B)^T = (A^T C^T)
(C D)     (B^T D^T)

(assuming A, B, C and D are square sub-matrices).
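
A minimal Python sketch of the block idea, under some assumptions that are not in the post: the matrix is square (n x n), stored row-major in a binary file, with one byte per element. Instead of keeping each sub-matrix in its own intermediate file, this variant writes every transposed block straight to its final position in a preallocated output file. The name `transpose_by_tiles` and its parameters are hypothetical.

```python
def transpose_by_tiles(src_path, dst_path, n, tile):
    """Transpose an n x n matrix of single bytes, one tile x tile block at a time."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.truncate(n * n)  # preallocate the output file
        for bi in range(0, n, tile):        # top row of the source block
            for bj in range(0, n, tile):    # left column of the source block
                h = min(tile, n - bi)       # block height (edge blocks may be smaller)
                w = min(tile, n - bj)       # block width
                # Load the block into memory: h rows of w bytes each.
                block = []
                for r in range(h):
                    src.seek((bi + r) * n + bj)
                    block.append(src.read(w))
                # Column c of the block becomes a row of the transposed block,
                # which belongs at block position (bj, bi) in the output.
                for c in range(w):
                    row = bytes(block[r][c] for r in range(h))
                    dst.seek((bj + c) * n + bi)
                    dst.write(row)
```

Only one block is in memory at a time, so `tile` would be chosen so that `tile * tile` bytes fit in the available RAM.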


Title: Re: Transpose
Posted by: Balan Radu Cosmin on October 29, 2013, 16:27:16
Is the file in binary or text format? It wouldn't be that hard to convert between the two, but I am curious.

Later edit: with a binary format you could seek within the file, which would even let you transpose it in place, without significant memory or extra disk space, even for larger files. It would be slow, though, because of all the disk read/write/seek operations.
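
To make the in-place idea concrete, here is a rough Python sketch. It assumes a square n x n matrix stored row-major with one byte per element (none of which is stated in the thread), and the function name `transpose_in_place` is made up for illustration.

```python
def transpose_in_place(path, n):
    """Transpose an n x n matrix of single bytes stored row-major in `path`, in place."""
    with open(path, "r+b") as f:
        for i in range(n):
            for j in range(i + 1, n):
                # Swap element (i, j) with element (j, i).
                f.seek(i * n + j)
                a = f.read(1)
                f.seek(j * n + i)
                b = f.read(1)
                f.seek(j * n + i)
                f.write(a)
                f.seek(i * n + j)
                f.write(b)
```

This uses almost no memory, but it does n * (n - 1) / 2 swaps with four seeks each, which is why it would be very slow on a file of this size.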


Title: Re: Transpose
Posted by: Petcu Marius on October 29, 2013, 21:35:56
Read the first few rows of the matrix and start building the rows of the transposed matrix, left to right, in memory. When memory fills up, dump them to disk: if the format is binary, write into a preallocated file by seeking to the right position in each row; if it's text, write each row to its own file and cat the files together at the end. In the binary case you'll then only have about sqrt(100G) * 100 = 31600000 seeks to do.
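
A small Python sketch of the binary variant, again assuming a square n x n matrix with one byte per element (an assumption, as is the name `transpose_by_strips`). It reads a strip of rows sequentially, then does one seek per output row to write that strip's piece of it, and returns the number of seeks so the estimate can be checked.

```python
def transpose_by_strips(src_path, dst_path, n, rows_per_strip):
    """Transpose an n x n byte matrix, reading rows_per_strip rows at a time.

    Returns the number of seeks done on the output file.
    """
    seeks = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.truncate(n * n)  # preallocate the output file
        for top in range(0, n, rows_per_strip):
            h = min(rows_per_strip, n - top)
            strip = src.read(h * n)  # h full rows, read sequentially
            for c in range(n):
                # Column c of the strip fills columns [top, top + h) of output row c.
                piece = strip[c::n]
                dst.seek(c * n + top)
                dst.write(piece)
                seeks += 1
    return seeks
```

The seek count is (number of strips) * n. With 100 strips and n = 316228 (roughly sqrt(100G)), that is 100 * 316228 = 31622800, in line with the estimate in the post.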