eBACS: ECRYPT Benchmarking of Cryptographic Systems
Download and unpack SUPERCOP on your development machine, and switch to the top supercop-... directory.
During development you won't want to try every possible compiler option. Edit okcompilers/c to remove every line after the first. Currently the first line is gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv. If this doesn't work on your machine, try just gcc -O3 -fomit-frame-pointer. You may also want to add -Wall -g for debugging.
Similarly edit okcompilers/cpp.
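The two edits above can be scripted. The following is a minimal sketch, run from the top supercop-... directory; keep_first_line is a hypothetical helper (not part of SUPERCOP), and the .orig backup name is likewise an assumption made here so that the full compiler list can be restored later.

```shell
# keep_first_line is a hypothetical helper, not part of SUPERCOP:
# save a backup as FILE.orig, then cut FILE down to its first line.
keep_first_line() {
  cp "$1" "$1.orig" && head -n 1 "$1.orig" > "$1"
}

# Trim both compiler lists, skipping any that are missing.
for f in okcompilers/c okcompilers/cpp; do
  if [ -f "$f" ]; then
    keep_first_line "$f"
  fi
done
```

Running the helper a second time on the same file overwrites the backup with the already trimmed list, so restore from FILE.orig before repeating it.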
Inside the appropriate directory for the operation that you want to implement (e.g., crypto_hash or crypto_kem or crypto_sign), create a new subdirectory for your primitive: e.g., crypto_sign/square2048. The subdirectory name should consist solely of digits (0123456789) and lowercase ASCII letters (abcdefghijklmnopqrstuvwxyz); please omit dashes, dots, slashes, and other punctuation marks.
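The naming rule is easy to check mechanically before creating the directory. A minimal sketch in POSIX shell, where check_name is a hypothetical helper and not something SUPERCOP provides:

```shell
# check_name is a hypothetical helper, not part of SUPERCOP.
# It succeeds only if its argument is non-empty and consists solely of
# digits and lowercase ASCII letters.
check_name() {
  case "$1" in
    '' | *[!0123456789abcdefghijklmnopqrstuvwxyz]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Try one acceptable name and two names that break the rule.
for name in square2048 square-2048 Square2048; do
  if check_name "$name"; then
    echo "$name: acceptable"
  else
    echo "$name: rejected"
  fi
done
```

The allowed characters are spelled out in full rather than written as ranges such as 0-9 and a-z, since range matching can depend on the locale.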
Try compiling and testing an existing primitive to see what a successful test looks like:
./do-part used
./do-part crypto_sign ed25519
less bench/*/data
./do-part used takes about 60 minutes on a typical machine (3GHz Skylake using one core) to compile various subroutines: basic modules that are required for the benchmarking framework; the gmp, ntl, and cryptopp libraries; and cryptographic subroutines marked as "used", such as crypto_hash_sha512. (For comparison, compiling and measuring everything with one compiler option takes roughly two days on one core.) If 60 minutes is too long to wait, you can save most of the time by doing something like
./do-part init
./do-part crypto_verify 32
./do-part crypto_hash sha512
./do-part crypto_stream chacha20
./do-part crypto_rng chacha20
./do-part crypto_sign ed25519
but this requires you to know the whole chain of relevant subroutines.
Running ./do-part crypto_sign ed25519 takes only 10 seconds. Also look at the resulting data file.
Now try testing your own primitive:
./do-part crypto_sign square2048
This will run instantaneously but will produce a "measure: not found" error message, since you don't have any implementations.
Make a crypto_sign/square2048/ref directory with empty files api.h and sign.c, and try the test again. You'll again see "measure: not found", and more error messages coming from the fact that api.h doesn't define CRYPTO_PUBLICKEYBYTES etc.
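A minimal sketch of this step, run from the top supercop-... directory. The guard around ./do-part is an addition made here so that the snippet does nothing outside a SUPERCOP tree:

```shell
# Create the reference-implementation directory with two empty files.
mkdir -p crypto_sign/square2048/ref
touch crypto_sign/square2048/ref/api.h crypto_sign/square2048/ref/sign.c

# Re-run the test, but only if we really are in a SUPERCOP tree.
if [ -x ./do-part ]; then
  ./do-part crypto_sign square2048
fi
```

touch leaves existing files alone, so repeating the snippet later does not wipe out work in progress.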
The basic development loop at this point is to edit files in crypto_sign/square2048/ref, run ./do-part crypto_sign square2048 again, and repeat until you're happy with the results. Then put a tarball of crypto_sign/square2048 on the web and submit the URL to the eBATS mailing list.
To generate random bytes in public-key software:
#include "randombytes.h"
...
randombytes(buf,len);
randombytes is a strong random-number generator (RNG) provided by SUPERCOP.
Don't use the standard rand or random functions: they aren't cryptographically strong. If possible, don't use RNGs from other libraries, such as the OpenSSL RNG: SUPERCOP's automatic testing (see "Checksums" below) relies on all randomness coming from randombytes.
SUPERCOP versions starting 20200816 track which subroutines are declared to follow the constant-time coding discipline. Warnings regarding all other subroutines will appear on the eBACS web pages. To declare that your implementation is constant-time, create the files goal-constbranch and goal-constindex for that implementation.
Of course, you shouldn't do this if you aren't sure that the implementation is constant-time. If you're sure that the implementation isn't constant-time, you can create warning-varbranch or warning-varindex or both.
For example, crypto_sort/int32/radix256ml is marked goal-constbranch but warning-varindex. Radix sort uses input-dependent indices. The type of radix sort used in this implementation has input-independent branches, but that's not enough for a constant-time implementation.
If you've used TIMECOP (see below) to try to find violations of goal-constbranch and goal-constindex, copy the timecop_pass line into goal-constbranch and into goal-constindex. The word "reviewed" indicates a manual code review. Either way, you're taking responsibility for your goal-constbranch and goal-constindex declarations, just like any other security claims that you make.
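As a sketch, recording a manual review for the hypothetical crypto_sign/square2048/ref implementation could look as follows. This assumes that the goal files live in the implementation directory and that a file containing just the word "reviewed" is enough. Check an existing implementation in your SUPERCOP version before relying on either assumption.

```shell
impl=crypto_sign/square2048/ref   # hypothetical implementation directory
mkdir -p "$impl"

# Assumption: one word per file records a manual code review.
# Only do this after the review has actually happened.
echo reviewed > "$impl/goal-constbranch"
echo reviewed > "$impl/goal-constindex"
```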
If a primitive has a constant-time implementation and a faster variable-time implementation, SUPERCOP will measure both. Both measurements will appear on the eBACS web pages, one with a warning and one without.
SUPERCOP versions starting 20200816 include TIMECOP, which tries to find deviations from the constant-time coding discipline. To run TIMECOP on an implementation marked goal-constbranch and goal-constindex, use ./do-part as above, but set the environment variable TIMECOP to 1:
env TIMECOP=1 ./do-part crypto_hash sha512
You can also set TIMECOP to a larger number to carry out that number of TIMECOP tests; one test catches most deviations, but not all. The database will say timecop_pass for implementations where TIMECOP did not detect problems, and timecop_fail for implementations where TIMECOP detected problems. The timecop_fail details will often let you rapidly pinpoint the code causing problems, especially if you add -g to the compiler options.
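For instance, the following sketch runs five TIMECOP tests and then pulls the verdict lines out of the results. It assumes that the database in question is the bench/*/data file mentioned earlier; timecop_lines is a hypothetical helper, and the count of 5 is arbitrary.

```shell
# timecop_lines is a hypothetical helper, not part of SUPERCOP: print
# every line mentioning a TIMECOP verdict (timecop_pass, timecop_fail,
# ...) in the given files.
timecop_lines() {
  grep -h 'timecop_' "$@"
}

# Only meaningful inside a SUPERCOP tree.
if [ -x ./do-part ]; then
  env TIMECOP=5 ./do-part crypto_hash sha512
  timecop_lines bench/*/data || echo "no TIMECOP lines found"
fi
```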
TIMECOP can have false negatives: timecop_pass for code that is not actually constant-time. The most common reason for this is that the message lengths tried by TIMECOP take constant time but other message lengths take variable time.
TIMECOP can also have false positives: timecop_fail for code that is actually constant-time. The most common reasons for false positives are (1) early aborts, such as authenticated encryption checking the authenticator first and returning immediately if authentication fails; and (2) rejection sampling, such as RSA key generation repeatedly generating secret integers until it finds a prime. To make TIMECOP happy, call
#include "crypto_declassify.h"
...
crypto_declassify(&condition,sizeof condition);
to mark an early-abort condition or a rejection-sampling condition (before the branch) as being public information. Of course, you shouldn't do this unless you're sure that the condition really is independent of secrets.
TIMECOP relies on Valgrind, version 3.4.0 (from 2009) or later. To check whether you have Valgrind installed, type valgrind --version. If you see command not found, install valgrind as root: for example, apt install valgrind on Debian or Ubuntu. If you are compiling with -m32 on a 64-bit Intel/AMD machine, you will also need apt install libc6-dbg:i386.
TIMECOP will produce timecop_error if Valgrind fails in a way that is not clearly attributable to a branch or memory address based on secret data. This can indicate an implementation bug, or a lack of instruction support in Valgrind. Instructions not supported by Valgrind include AMD XOP instructions; 32-bit AVX2 instructions; setend on ARM; cycle-counting coprocessor instructions on ARM; and rdpmc for cycle counting on Intel/AMD. If you run into cycle-counting problems, try disabling the problematic cycle counter: e.g., move cpucycles/armv8.c to cpucycles/armv8.c.disabled or move cpucycles/amd64rdpmc.c to cpucycles/amd64rdpmc.c.disabled, and then start over with ./do-part init. In SUPERCOP versions starting 20210125, TIMECOP automatically skips cycle counters that don't work under Valgrind.
The benchmarks allow cryptographic implementations to call subroutines that are listed earlier in OPERATIONS and marked as "used". Examples:
Including crypto_hash_sha256.h gives you access not only to the crypto_hash_sha256() function but also to a crypto_hash_sha256_BYTES macro, defined the same way as CRYPTO_BYTES in api.h in the implementation. Similar comments apply to other api.h macros.
When you are implementing crypto_hash, including crypto_hash.h lets you use a crypto_hash_BYTES macro. For the moment SUPERCOP also lets you include api.h and use the CRYPTO_BYTES macro, but this is not guaranteed to continue to work.
When crypto_X calls subroutines crypto_Y and crypto_Z, the goal-constbranch and goal-constindex files for X say that X takes constant time given constant-time implementations of Y and Z. Variable branches and indices in an implementation of Y or Z are contrary to goal-constbranch and goal-constindex for that implementation of Y or Z, but are not contrary to goal-constbranch and goal-constindex for the implementation of X.
For example, some crypto_stream_aes128ctr implementations are variable-time, and on some platforms these are the fastest crypto_stream_aes128ctr implementations. An implementation of X that is otherwise constant-time can declare goal-constbranch and goal-constindex even if it calls crypto_stream_aes128ctr. SUPERCOP will then report the fastest constant-time X implementation (using the fastest constant-time crypto_stream_aes128ctr implementation) and, with a warning, the fastest X implementation (using the fastest crypto_stream_aes128ctr implementation), if that is faster.
When crypto_X calls subroutines from outside SUPERCOP, the goal-constbranch and goal-constindex files for X do say that those subroutines are also constant-time. For example, do not claim goal-constindex if you are directly calling OpenSSL's AES subroutines: OpenSSL does not promise, and on some platforms does not provide, constant-time AES.
If you are calling crypto_stream_aes128ctr on public inputs, you can instead use crypto_stream_aes128ctr_publicinputs.h and crypto_stream_aes128ctr_publicinputs(). SUPERCOP will then allow your constant-time code to use a faster variable-time AES implementation without a warning.
You can also write your own subroutines: for example, you might write a new crypto_stream_aey to be used inside a new crypto_aead_aey. This has advantages even when the subroutine has no other callers. The subroutine has its own automatic SUPERCOP tests; it has its own automatic SUPERCOP optimizations; and, since it is simpler, it is more likely to attract attention from people interested in further optimization and verification. (Also, this often reduces the time needed for benchmarking.)
You need to create crypto_stream/aey/used to mark the subroutine as "used". You also need to be careful to have the crypto_stream/aey/ref implementation stay within an implementation-specific namespace, as explained in the following paragraphs, so that the implementation doesn't bump into other subroutines.
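A sketch of the marker step for the hypothetical aey subroutine, run from the top supercop-... directory. It assumes that an empty marker file is sufficient; compare with the used file of an existing subroutine before relying on that.

```shell
# Create the subroutine's reference implementation directory and mark
# the subroutine as "used" (assumption: an empty file is enough).
mkdir -p crypto_stream/aey/ref
touch crypto_stream/aey/used

# List every subroutine currently marked as "used", for comparison.
ls crypto_*/*/used
```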
For example, if your decrypt.c calls an internal function decoder declared in your decoder.h and defined in your decoder.c, then you're stepping outside your namespace, and crashing into anybody else defining another decoder function (or a decoder constant). With SUPERCOP versions starting 20200816, your implementation can even crash into itself, because SUPERCOP can compile a constant-time version of the implementation (if the implementation is marked as constant-time) and a variable-time version of the implementation (if a subroutine used in your implementation has a faster variable-time version).
This problem does not arise for the SUPERCOP interface functions such as crypto_aead_decrypt, because SUPERCOP creates crypto_aead.h to automatically move those functions to an implementation-specific namespace. This problem also does not arise for static functions and static constants, since those functions and constants are not visible outside the files that define them.
To fix, e.g., the decoder namespacing violation, insert a line
#define decoder CRYPTO_NAMESPACE(decoder)
into decoder.h. SUPERCOP automatically compiles your code with an appropriate CRYPTO_NAMESPACE macro, plus CRYPTO_NAMESPACETOP if you need to refer to the top of the namespace.
Proper namespacing is also important for people considering deploying your code in applications. There are other workarounds (e.g., "hidden visibility" in shared libraries), but namespacing is more portable and often more robust.
In older SUPERCOP versions, the namespace for the ref implementation of crypto_stream_aey was guaranteed to be crypto_stream_aey_ref. SUPERCOP versions starting 20200816 have changed the namespace (for example, indicating constbranchindex or timingleaks), could make further changes to the namespace, and no longer allow implementations to make assumptions about the namespace.
If you want a 32-bit unsigned integer type, you can
#include "crypto_uint32.h"
and then use crypto_uint32. This is somewhat more portable than using stdint.h or inttypes.h. SUPERCOP provides similar facilities for uint8, uint16, uint32, uint64, int8, int16, int32, int64.
Submitting a new implementation of an existing primitive is just like submitting a new primitive. You simply have to put the new implementation into a new third-level directory under the same second-level directory.
For example, SUPERCOP already includes several AES-256-GCM implementations. One implementation, crypto_aead/aes256gcmv1/ref, is a (very slow) reference implementation. Another implementation, crypto_aead/aes256gcmv1/openssl, calls the OpenSSL library. You can submit another AES-256-GCM implementation such as crypto_aead/aes256gcmv1/smith/m4 in the same way that the designer of AEY can submit crypto_aead/aey/ref; by using the existing name aes256gcmv1 you indicate that your implementation computes the same AES-256-GCM cipher.
Your implementation is allowed to be unportable. If it doesn't compile on a particular computer, SUPERCOP will skip it and select a different implementation for that computer. But if you're submitting a new primitive then you should start by submitting a portable reference implementation designed to be as easy as possible to read, even if you're also submitting an unportable (or portable) optimized implementation. Other people verifying your software and optimizing it for other platforms will want to start with the reference implementation.
You can write internal subroutines in assembly language, using filenames *.S or *.s. *.S lets you use the preprocessor, including (in SUPERCOP versions starting 20200816) CRYPTO_NAMESPACE.
If you write an implementation using, e.g., AVX vector intrinsics or AVX assembly language, then you should create a file architectures in the implementation directory with two lines: amd64 and x86. This saves time in benchmarking: it tells SUPERCOP to skip trying to compile the implementation on, e.g., ARM.
Similarly, if you write an implementation that's ARM-specific, you should create a file architectures with three lines: aarch64, armeabi, and arm.
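Both architectures files can be written with one printf each. In this sketch the implementation directories crypto_aead/aey/avx and crypto_aead/aey/neon are hypothetical names chosen for illustration:

```shell
impl=crypto_aead/aey/avx        # hypothetical AVX implementation
mkdir -p "$impl"
# Two lines: amd64 and x86.
printf '%s\n' amd64 x86 > "$impl/architectures"

impl_arm=crypto_aead/aey/neon   # hypothetical ARM-specific implementation
mkdir -p "$impl_arm"
# Three lines: aarch64, armeabi, and arm.
printf '%s\n' aarch64 armeabi arm > "$impl_arm/architectures"
```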
You can write your implementation in C++ instead of C: simply use filenames *.cc or *.cpp instead of *.c.
For used subroutines, SUPERCOP will ignore C++ implementations and use only C implementations. C++ compilers generally expect to be in control of program startup, causing problems when C++ libraries are called from other languages.
SUPERCOP automatically generates a deterministic list of inputs for your software, and hashes together the outputs into two checksums: a "small" checksum meant to very quickly weed out most bad implementations, and a "big" checksum that tries more inputs.
These checksums appear (separated by slashes) on "try" lines in the SUPERCOP database. The word after the checksums is "ok" or "fails" or "unknown"; "ok" means that the checksums match the files checksumsmall and checksumbig in the primitive directory, "fails" means that the checksums don't match (and then SUPERCOP will discard the implementation without benchmarking it), and "unknown" means that the files are absent.
Of course it's possible for a bug to slip past both checksums, but many bugs are caught by checksums, so you should include checksums.
There's a file crypto_hash/sha1/description saying "SHA-1 with 160-bit output", and a file crypto_hash/sha1/designers saying "NSA". This information goes into the online list of primitives. Typically designers is a list of names, one per line.
There's also a file crypto_hash/sha1/openssl/implementors saying "Daniel J. Bernstein (wrapper around OpenSSL)". This information goes into the online list of implementations. Typically implementors is a list of names, one per line.
If these files don't exist, they're treated as blank.
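For a new primitive, the three files could be created as sketched below. The primitive crypto_sign/square2048, the description text, and the names are all placeholders invented for this example:

```shell
prim=crypto_sign/square2048     # hypothetical primitive
mkdir -p "$prim/ref"

# Primitive-level files: a one-line description and the designers,
# one name per line.
echo 'Placeholder description of square2048' > "$prim/description"
printf '%s\n' 'Alice Example' 'Bob Example' > "$prim/designers"

# Implementation-level file: the implementors, one name per line.
printf '%s\n' 'Alice Example' > "$prim/ref/implementors"
```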
You can add a line
#define CRYPTO_VERSION "3.01a"
to api.h indicating that this is version 3.01a of your software. SUPERCOP will report this information in its database of measurements.
You are encouraged to include additional files such as README or documentation.pdf with references, intellectual-property information, descriptions of the software, etc. These files do not interact with SUPERCOP's benchmarking but are often of interest for human readers.
In particular, you are encouraged to clearly specify one of the following levels of copyright protection:
You are also encouraged to clearly specify one of the following levels of patent protection:
No matter what the copyright status is, and no matter what the patent status is, all software included in SUPERCOP will be distributed to the public to ensure verifiability of benchmark results. You must ensure before submission that publication is legal.