Abstract: Matrix multiplication is arguably one of the most important operations in many scientific applications. Efficient implementation of this operation on ubiquitous multi-core processors is ...