Timeline for How much can matrix multiplication algorithm be parallelized?
Current License: CC BY-SA 4.0
3 events
| when | what | by | license | comment |
|---|---|---|---|---|
| Oct 24, 2019 at 12:06 | comment added | SonneXo | | The position being updated is fully determined by the outer two loops: the location C[i,j] does not depend on k at all. Now, remember what a += b expands to: a = a + b. This means that when a processor executes this statement, it first has to fetch the current value of a, then add b to it, and after that send the sum back to memory. If multiple threads execute such a statement on one location, these reads and writes can interleave such that multiple threads read the old value, each add their local value, and send their result back, overwriting the results from the other threads. |
| Oct 24, 2019 at 8:59 | comment added | Simone C. | | Why would multiple threads update the same location? If the inner loop is parallel, don't the threads each update only their own isolated set of positions? |
| Oct 23, 2019 at 20:30 | history answered | SonneXo | CC BY-SA 4.0 | |
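
The lost update described in SonneXo's comment can be replayed by hand. The following is only a minimal sketch, not code from the answer: it uses no real threads, the function names `racy_dot` and `reduced_dot` are made up for illustration, and the interleaving (every "thread" fetches before any of them writes) is an assumed worst case. It computes a single entry C[i,j] as the dot product of one row and one column, once with the racy `+=` and once with private partial sums that are combined at the end.

```python
def products(a_row, b_col):
    """One term A[i,k] * B[k,j] per value of k."""
    return [x * y for x, y in zip(a_row, b_col)]


def racy_dot(a_row, b_col):
    """Simulate one thread per k doing C[i,j] += term with an unlucky interleaving."""
    c = 0.0                                   # shared location C[i,j]
    terms = products(a_row, b_col)
    fetched = [c] * len(terms)                # step 1: every thread fetches the old value
    results = [old + t for old, t in zip(fetched, terms)]  # step 2: local addition
    for r in results:                         # step 3: write-back, each overwriting the last
        c = r
    return c                                  # only the last writer's term survives


def reduced_dot(a_row, b_col, n_threads=2):
    """Each simulated thread sums its own share of k privately; combine once at the end."""
    terms = products(a_row, b_col)
    partials = [sum(terms[t::n_threads]) for t in range(n_threads)]
    return sum(partials)


if __name__ == "__main__":
    a_row, b_col = [1, 2, 3], [4, 5, 6]
    print("racy:   ", racy_dot(a_row, b_col))
    print("reduced:", reduced_dot(a_row, b_col))
```

With the row `[1, 2, 3]` and column `[4, 5, 6]` the correct entry is 32; the simulated race keeps only the last term written. Real threads would hit this interleaving only some of the time, which is what makes the bug hard to spot. Parallelizing over i and j avoids the problem entirely, since each thread then owns its own C[i,j].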