Colin Percival, Daemonic Dispatches, Zeroing buffers is insufficient, here.
Now, some parts of the stack are easy to zero (assuming a cooperative compiler): The parts which contain objects which we have declared explicitly. Sensitive data may be stored in other places on the stack, however: Compilers are free to make copies of data, rearranging it for faster access. One of the worst culprits in this regard is GCC: Because its register allocator does not apply any backpressure to the common subexpression elimination routines, GCC can decide to load values from memory into “registers”, only to end up spilling those values onto the stack when it discovers that it does not have enough physical registers (this is one of the reasons why gcc -O3 sometimes produces slower code than gcc -O2). Even without register allocation bugs, however, all compilers will store temporary values on the stack from time to time, and there is no legal way to sanitize these from within C. (I know that at least one developer, when confronted by this problem, decided to sanitize his stack by zeroing until he triggered a page fault — but that is an extreme solution, and is both non-portable and very clear C “undefined behaviour”.)
One might expect that the situation with sensitive data left behind in registers is less problematic, since registers are liable to be reused more quickly; but in fact this can be even worse. Consider the “XMM” registers on the x86 architecture: They will only be used by the SSE family of instructions, which is not widely used in most applications — so once a value is stored in one of those registers, it may remain there for a long time. One of the rare instances those registers are used by cryptographic code, however, is for AES computations, using the “AESNI” instruction set.
It gets worse. Nearly every AES implementation using AESNI will leave two values in registers: The final block of output, and the final round key. For encryption operations these aren’t catastrophic things to leak — the final block of output is ciphertext, and the final AES round key, while theoretically dangerous, is not enough on its own to permit an attack on AES — but the situation is very different for decryption operations: The final block of output is plaintext, and the final AES round is the AES key itself (or the first 128 bits of the key for AES-192 and AES-256). I am absolutely certain that there is software out there which inadvertantly keeps an AES key sitting in an XMM register long after it has been wiped from memory. As with “anonymous” temporary space allocated on the stack, there is no way to sanitize the complete CPU register set from within portable C code — which should probably come as no surprise, since C, being designed to be a portable language, is deliberately agnostic about the registers and even the instruction set of the target machine.
Let me say that again: It is impossible to safely implement any cryptosystem providing forward secrecy in C.