Reproducibility of SNDC’s output

A great deal of thought and attention went into making sure that the same file will compile into the same sound on the same system, and as close as possible across different systems. We’ll discuss some aspects of this design here.

Testing

Every SNDC module has its own unit test in tests/unit. The expected MD5 hashes are stored in tests/manifest. Running ./run_tests.sh generates the output of every unit test, hashes it and checks the hash against the expected value.

The result has three possibilities:

More complex assemblies in tests/assemblies are tested the same way and will catch any subtle modification across a large portion of the code base.

Reference system:

Noise and other random sources

The most important source of potential differences in output signal when running the same file is of course noise (specifically the noise module) and other random sources like the $r parameter in the func module.

The C standard rand() function may be implemented differently across systems. Perhaps more importantly, its output is bounded by RAND_MAX, which the C standard only requires to be at least 32767 and which some platforms define as exactly that, far too low to produce quality noise. For both reasons, we have implemented a built-in random generator using the Mersenne Twister algorithm.

We also always seed it with the same default value, so if the user doesn’t specify a seed, it will always produce the same output on any platform.
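As an illustration of the idea, here is a minimal sketch of a 32-bit Mersenne Twister (MT19937) with a fixed default seed. It is not SNDC's actual code: the names mt_state, mt_seed, mt_next, noise_sample and SKETCH_DEFAULT_SEED are hypothetical, and the seed value 5489 is the reference MT19937 default, assumed here only for the example.

```c
#include <stdint.h>

#define MT_N 624
#define MT_M 397
/* Assumption: 5489 is the reference MT19937 default, not necessarily SNDC's. */
#define SKETCH_DEFAULT_SEED 5489u

typedef struct {
    uint32_t mt[MT_N]; /* generator state */
    int idx;           /* next word to output; MT_N means "regenerate" */
} mt_state;

/* Initialise the state from a single 32-bit seed. */
static void mt_seed(mt_state *s, uint32_t seed)
{
    s->mt[0] = seed;
    for (int i = 1; i < MT_N; i++) {
        uint32_t prev = s->mt[i - 1];
        s->mt[i] = 1812433253u * (prev ^ (prev >> 30)) + (uint32_t)i;
    }
    s->idx = MT_N;
}

/* Return the next 32-bit value; only integer operations are involved,
 * so the sequence does not depend on the platform's libc. */
static uint32_t mt_next(mt_state *s)
{
    if (s->idx >= MT_N) {
        /* Regenerate all MT_N words in place. */
        for (int i = 0; i < MT_N; i++) {
            uint32_t y = (s->mt[i] & 0x80000000u)
                       | (s->mt[(i + 1) % MT_N] & 0x7fffffffu);
            uint32_t v = s->mt[(i + MT_M) % MT_N] ^ (y >> 1);
            if (y & 1u)
                v ^= 0x9908b0dfu;
            s->mt[i] = v;
        }
        s->idx = 0;
    }
    uint32_t y = s->mt[s->idx++];
    /* Tempering */
    y ^= y >> 11;
    y ^= (y << 7) & 0x9d2c5680u;
    y ^= (y << 15) & 0xefc60000u;
    y ^= y >> 18;
    return y;
}

/* Map a 32-bit value to a noise sample in [-1, 1). */
static double noise_sample(mt_state *s)
{
    return (double)mt_next(s) / 2147483648.0 - 1.0;
}
```

Two generators seeded with the same value produce the same sequence, which is the property that makes unseeded runs repeatable. A caller would fall back to SKETCH_DEFAULT_SEED whenever the user gives no seed.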

Noise and randomness are replicable across all systems and platforms.

Mathematical functions

The C standard math library (libm) defines quite a few useful mathematical functions such as cos, sqrt and so on. They are usually well implemented and fast, but unfortunately their exact implementations differ between libc implementations.

Thus, if you are using a different operating system such as macOS, those functions will likely not return exactly the same results as the reference used in the testing infrastructure (glibc).

So on macOS, it is expected to see a few CHANGED tests even when running simple tests like osc. However, the differences are extremely small and come down to rounding.
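One way to quantify such rounding differences is to count how many representable floating-point values lie between two results (the distance in ULPs). The sketch below is an illustration only and is not part of SNDC's test suite. The names ordered_bits and ulp_distance are hypothetical, and it assumes samples are IEEE 754 doubles and that neither input is NaN.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as an integer whose ordering matches the
 * numeric ordering of the doubles (NaN is not handled). */
static uint64_t ordered_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u); /* well-defined way to read the bit pattern */
    if (u & 0x8000000000000000ull)
        return ~u;                      /* negative values: reverse the order */
    return u | 0x8000000000000000ull;   /* positive values: upper half */
}

/* Number of representable doubles separating a and b; 0 means the
 * two values are bit-identical. */
static uint64_t ulp_distance(double a, double b)
{
    uint64_t ua = ordered_bits(a);
    uint64_t ub = ordered_bits(b);
    return ua > ub ? ua - ub : ub - ua;
}
```

If a CHANGED test is caused purely by libm rounding, the distance between corresponding samples of the two outputs should stay very small, whereas a real regression would be expected to show much larger distances.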

Mathematical functions are replicable on the same system, and often across systems sharing the same libc implementation, but not necessarily across operating systems and different hardware.

FFTW

The last, rather tricky source of differences between systems is the output of FFTW.

FFTW has quite an advanced algorithm selection process: given a fixed buffer, it may select different algorithms based on the alignment of the buffer and other hardware factors. At some point during SNDC’s development, unit tests using FFT filters and other FFT-dependent processing would sometimes return a different value because the buffer given to FFTW would differ in alignment across multiple runs on the same machine.
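To make the alignment issue concrete, the sketch below reports how far a buffer's address is from the previous 16-byte boundary. It is only an illustration: the helper names are hypothetical, and the 16-byte boundary is an assumption (a common SIMD alignment); the alignment that matters in practice depends on how FFTW was built.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16 bytes is the alignment of interest for this example. */
#define SKETCH_ALIGNMENT 16u

/* Offset of p from the previous SKETCH_ALIGNMENT boundary (0 = aligned). */
static size_t alignment_offset(const void *p)
{
    return (size_t)((uintptr_t)p % SKETCH_ALIGNMENT);
}

/* Two buffers "look the same" to an alignment-sensitive planner only if
 * their offsets match. */
static int same_alignment(const void *a, const void *b)
{
    return alignment_offset(a) == alignment_offset(b);
}
```

A buffer that starts at an arbitrary offset inside a larger allocation can land on a different offset from one run to the next, which is the kind of variation described above.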

This was solved by passing FFTW a flag that tells it to do only minimal guessing and select “the most likely fastest algorithm” for that particular system.

However, the algorithm still sometimes differs on different hardware and thus leads to bit differences and tests flagging CHANGED.

FFTs are only reproducible bit for bit on the same hardware and OS.

Take-away

Bit for bit reproducibility is only guaranteed on the same machine. However, the differences encountered between systems are always far below the audible threshold.

You can see whether your system matches the “reference” system by running ./run_tests.sh and checking for CHANGED tests.