As someone whose Ph.D. work was on auto-parallelization and who spent five years developing the auto-SIMDization feature of the xlc compiler for POWER processors, my view on auto-SIMDization has turned 180 degrees over the years. Yes, my five years on auto-SIMD were among my most productive and the best collaboration experience I have ever had. But nowadays, every time someone brings up auto-SIMD as the solution to the programmability problem of SIMD, I can’t help but push back.
Auto-SIMD is the holy grail of SIMD programming models. Everybody wants it: programmers, executives, program managers, academics, compiler designers (myself included). The problem is that the perceived capability of auto-SIMD is quite different from the realistic capability of an auto-SIMD compiler. To put it bluntly, auto-SIMD compilers rarely work when applied to real code. Compiler users often came to us with a piece of code that the compiler could not parallelize. Sometimes the code is simple to human eyes but complicated for the compiler, because of possible aliasing and unknown side effects through function calls. Sometimes the loop is parallel but gets “messed up” by internal compiler transformations that confuse the SIMD analysis. This is no fault of the compiler. Extracting a parallel loop from a sequential program is simply the wrong task to give a compiler. Think about it: how often do you rely on auto-parallelization to produce parallel code? Probably never. The same is true for auto-SIMD.
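To make the aliasing and function-call problem concrete, here is a minimal sketch in C. The function names (`scale_may_alias`, `scale_no_alias`, `map_opaque`, `halve`) are made up for illustration and do not come from any real code base. All three loops look trivially parallel to a human, yet the compiler sees them quite differently.

```c
#include <stddef.h>

/* Case 1: as far as the compiler can tell, dst and src may overlap.
 * To SIMDize this loop it has to prove they do not, guard the vector
 * code with a runtime overlap check, or give up. */
void scale_may_alias(float *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = 2.0f * src[i];
}

/* Case 2: the same loop, but the programmer promises (C99 restrict)
 * that dst and src do not overlap, so no alias analysis is needed. */
void scale_no_alias(float *restrict dst, const float *restrict src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = 2.0f * src[i];
}

/* Case 3: the loop body calls through a function pointer. Unless the
 * compiler can see the callee, it has to assume that f may read or
 * write any memory, including dst and src. */
void map_opaque(float *dst, const float *src, size_t n, float (*f)(float))
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = f(src[i]);
}

/* Hypothetical callee, only here so that map_opaque can be exercised. */
float halve(float x)
{
    return 0.5f * x;
}
```

Whether a given compiler actually SIMDizes the first and third loops depends on the compiler, its version, and the optimization flags, which is exactly the point: the programmer knows the answer and the compiler has to guess.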
There are times when the compiler can indeed SIMDize a loop, but the loop is often so simple (e.g., matmul with all global side effects known) and the amount of compiler analysis required so enormous (e.g., inter-procedural analysis) that it is much easier for the programmer to point out the SIMDizable region to the compiler through some programming interface (e.g., OpenMP SIMD directives).
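As a rough sketch of what that interface looks like, the two loops below use the OpenMP `simd` construct (available since OpenMP 4.0). The function names `saxpy` and `dot` are my own choices for illustration. With the directive, the programmer asserts that the iterations can be executed in SIMD fashion, so the compiler does not have to prove it.

```c
#include <stddef.h>

/* y = a * x + y. The directive tells the compiler that the iterations
 * are safe to execute with SIMD instructions. The assertion that x and
 * y do not overlap is now the programmer's responsibility. */
void saxpy(float a, const float *x, float *y, size_t n)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Dot product. The reduction clause tells the compiler that sum may be
 * accumulated in per-lane partial sums and combined at the end. */
float dot(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

With GCC or Clang, the directives take effect when compiling with `-fopenmp-simd` (or `-fopenmp`); other compilers have their own switches. Without such a flag the pragmas are ignored and the code still compiles and runs as an ordinary sequential loop. Note that the reduction may change the order of the floating-point additions, so results can differ slightly from the sequential loop.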
I have seen it so many times: decision makers embrace auto-SIMD because it sounds like the best solution to the problem; compiler practitioners declare victory after SIMDizing a few self-selected kernels and publishing the initial results; and users give up on the feature after a few frustrating tries.
I will never forget the three questions that my boss often asks about a new research idea: 1) Does it work? 2) What does it do for me? 3) When is it available? Auto-SIMD fails the very first test.