Using cargo-fuzz to Transfer Code Review of Simple Safe Code to Complex Code that Uses unsafe

encoding_rs::mem is a Rust module for performing conversions between different in-RAM text representations that are relevant to Gecko. Specifically, it converts between potentially invalid UTF-16, Latin1 (in the sense that unsigned byte value equals the Unicode scalar value), potentially invalid UTF-8, and guaranteed-valid UTF-8, and provides some operations on buffers in these encodings, such as checking if a UTF-16 or UTF-8 buffer only has code points in the ASCII range or only has code points in the Latin1 range. (You can read more about encoding_rs::mem in a write-up about encoding_rs as a whole.)

The whole point of this module is to make things very fast using Rust’s (not-yet-stable) portable SIMD features. The code was written before slices in the standard library had the align_to method or the chunks_exact method. Moreover, to get speed competitive with the instruction set-specific and manually loop-unrolled C++ code that the Rust code replaced, some loop unrolling is necessary, but Rust does not yet support directives for the compiler that would allow the programmer to request specific loop unrolling from the compiler.

As a result, the code is a relatively unreviewable combination of manual alignment calculations, manual loop unrolling and manual raw pointer handling. This indeed achieves high speed, but by looking at the code, it isn’t at all clear whether the code is actually safe or otherwise correct.

To validate the correctness of the rather unreviewable code, I used model-based testing with cargo-fuzz. cargo-fuzz provides Rust integration for LLVM’s libFuzzer coverage-guided fuzzer. That is, the fuzzer varies the inputs it tries based on observing how the inputs affect the branches taken inside the code being fuzzed. The fuzzer runs with one of LLVM’s sanitizers enabled. By default, the Address Sanitizer (ASAN) is used. (Even though the sanitizers should never find bugs in safe Rust code, the sanitizers are relevant to bugs in Rust code that uses unsafe.)

I wrote a second implementation (the “model”) of the same API in the most obvious way possible using Rust standard-library facilities and without unsafe, except where required to be able to write into an &mut str. I also used the second implementation to validate the speed of the complex implementation. Obviously, there’d be no point in having a complex implementation if it wasn’t faster than the simple and obvious one. (The complex implementation is, indeed, faster.)

For example, the function for checking if a buffer of potentially invalid UTF-16 only contains characters in the Latin1 range is 8 lines (including the function name and the closing brace) in the safe version. In the fast version, it’s 3 lines that just call to another function expanded from a macro, where the expansion is either generated using either a 76-line SIMD-using macro or a 71-line ALU-using macro depending on whether the code was compiled with SIMD enabled. Of these macros, the SIMD calls another (tiny) function that has a specialized implementation for aarch64 and a portable implementation.

To use cargo-fuzz, you create a “fuzzer script”, which is a Rust function that gets a slice of bytes from the fuzzer and exercises the code being fuzzed. In the case of fuzzing encoding_rs::mem, the first byte is used to decide which function to exercise and the rest of the slice is used as the input to the function. When the function being called takes a slice of u16, a suitably aligned u16 subslice of the input is taken.

For each function, the fuzzer script calls both the complex implementation and the corresponding simple implementation with the same input and checks that the outputs match. The fuzzer finds a bug if the outputs don’t match, if there is a panic, or if the LLVM Address Sanitizer notices bad memory access, which could arise from the use of unsafe.

Once the fuzzer fails to find problems after having run for a few days, we can have high confidence that the complex implementation is correct in the sense that its observable behavior, ignoring speed, matches the observable behavior of the simple implementation. Therefore, a code review for the correctness of the simple implementation can, with high confidence, be considered to apply to the complex implementation as well.