* Program re-ordering for improved L2 cache hit rate.
* Automatic performance tuning.

# Motivations

Matrix multiplications are a key building block of most modern high-performance computing systems.
Many songs stay with us regardless of when they were released. We may have heard them as children or seen them on TV once, but that was enough to make us put them on repeat.