Today I was asked for some quick help with additions to a customer’s C++ code running as a user exit in one of our systems. This bit of code used sprintf to format an error message into a buffer passed in by our system. Pretty pedestrian stuff, but it held a few surprises, one being that the code often had no regard for buffer lengths and risked overflowing them.
The buffer was only 256 bytes because it’s used for a summary message (if there’s more to say, it should be said elsewhere, such as a log file). 256 bytes is pretty easy to overflow, particularly with our additions, so we needed some crash protection. snprintf would have been a quick alternative, but it was not available on this platform. So I added a precision (which caps the number of characters written) to the %s format specifier, changing something like this:
sprintf(pszMessage, "SQL error in whatever, rc=%04d, SQL=%s", rc, pSql->Stmt);
to this:
sprintf(pszMessage, "SQL error in whatever, rc=%04d, SQL=%.215s", rc, pSql->Stmt);
To avoid magic numbers, I could have done something like:
sprintf(pszMessage, "SQL error in whatever, rc=%04d, SQL=%.*s", rc, (BUFFER_SIZE - 40), pSql->Stmt);
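For what it’s worth, the 40 above is still a hand-counted allowance for the fixed text and the rc field. Here is a minimal sketch of letting the compiler do that counting. The names FIXED_TEXT, MAX_RC_CHARS, SQL_PRECISION, and format_sql_error are made up for illustration, BUFFER_SIZE is assumed to be 256, and int is assumed to be 32 bits. Note that %04d sets a minimum width, not a maximum, so the sketch budgets for the longest possible rc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BUFFER_SIZE 256   /* assumption: size of the buffer passed in by the host system */

/* The literal text of the format string with the two conversions removed. */
#define FIXED_TEXT "SQL error in whatever, rc=, SQL="

/* Assumption: 32-bit int, so the longest rc is "-2147483648" (11 characters). */
#define MAX_RC_CHARS 11

/* Characters left for the SQL text after the fixed text, the rc, and the null terminator. */
#define SQL_PRECISION ((int)(BUFFER_SIZE - (sizeof FIXED_TEXT - 1) - MAX_RC_CHARS - 1))

/* Hypothetical helper: formats the summary message without exceeding BUFFER_SIZE bytes. */
static void format_sql_error(char *pszMessage, int rc, const char *pszStmt)
{
    assert(pszMessage != NULL && pszStmt != NULL);
    /* The argument for ".*" must be an int, hence the cast in SQL_PRECISION. */
    sprintf(pszMessage, "SQL error in whatever, rc=%04d, SQL=%.*s",
            rc, SQL_PRECISION, pszStmt);
}
```

The trade-off is that the fixed text now lives in two places (FIXED_TEXT and the format string), so they have to be kept in sync by hand.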
And had snprintf been available, it would have been even cleaner:
snprintf(pszMessage, BUFFER_SIZE, "SQL error in whatever, rc=%04d, SQL=%s", rc, pSql->Stmt);
Truncating a message is not ideal, but it sure beats tossing a GPF. In this case, losing the end of the SQL statement would not be a big deal, but the message could be rearranged or additional logging added if it were.
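Where a C99-conforming snprintf is available, its return value tells you whether truncation happened, which is the hook for that additional logging. A rough sketch follows; format_sql_error_checked is a hypothetical helper name, and the 1/0 return convention is just one choice. Some pre-C99 implementations report truncation differently, so check the platform documentation before relying on this.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the whole message fit, 0 if it was
 * truncated or formatting failed. The buffer is null-terminated either way. */
static int format_sql_error_checked(char *pszMessage, size_t cbMessage,
                                    int rc, const char *pszStmt)
{
    int n;

    assert(pszMessage != NULL && cbMessage > 0 && pszStmt != NULL);

    /* C99 snprintf writes at most cbMessage - 1 characters plus the terminator
     * and returns the length the full message would have had. */
    n = snprintf(pszMessage, cbMessage,
                 "SQL error in whatever, rc=%04d, SQL=%s", rc, pszStmt);
    if (n < 0) {
        pszMessage[0] = '\0';   /* formatting error: hand back an empty message */
        return 0;
    }
    return (size_t)n < cbMessage;
}
```

A caller that gets 0 back could write the full statement to the log file and leave the truncated summary in the buffer.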
The point is that too much C and C++ code in the wild shows too little concern for bounds checking. Fortunately, I now code mainly in OO languages where this is not a concern, because strings and collections grow as needed. But since C and C++ (taken together) are still arguably the most pervasive programming languages (with new programmers picking them up every day), the problem is still very much relevant. The advice is simple:
Avoid unbounded string functions. Avoid sprintf, strcpy, strcat, strlen, and gets. Use snprintf, strlcpy, strlcat, strncpy, strncat, fgets, and similar functions depending on availability. If these alternatives aren’t available, use techniques like the above to avoid overflowing buffers. When using the strn functions, remember to leave room at the end for the null terminator, and with strncpy remember to add it yourself, since it does not write one when the source fills the buffer.
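As a sketch of that last point, here is one way to wrap strncpy and strncat so the terminator is never forgotten. The helper names copy_bounded and append_bounded are invented for this example. Both take the full size of the destination buffer, and append_bounded assumes the destination already holds a terminated string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies at most cbDest - 1 characters and always terminates.
 * strncpy alone leaves the buffer unterminated when the source is too long. */
static void copy_bounded(char *pszDest, size_t cbDest, const char *pszSrc)
{
    assert(pszDest != NULL && cbDest > 0 && pszSrc != NULL);
    strncpy(pszDest, pszSrc, cbDest - 1);
    pszDest[cbDest - 1] = '\0';
}

/* Hypothetical helper: appends as much of pszSrc as fits in the space remaining.
 * strncat always writes a terminator, so its count must exclude that byte. */
static void append_bounded(char *pszDest, size_t cbDest, const char *pszSrc)
{
    size_t used;

    assert(pszDest != NULL && cbDest > 0 && pszSrc != NULL);
    used = strlen(pszDest);   /* assumes pszDest is already null-terminated */
    if (used < cbDest - 1) {
        strncat(pszDest, pszSrc, cbDest - 1 - used);
    }
}
```

With an 8-byte buffer, copy_bounded(buf, sizeof buf, "abcdefghijkl") leaves "abcdefg" in buf, silently truncated but safely terminated.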
You can find more advice along these lines in Henry Spencer’s The Ten Commandments for C Programmers.
Back in the day, when I coded mostly in C and C++, we augmented our lint process to scan for things like strcpy, strcat, and sprintf. Richer compiler warnings have largely taken the place of lint (with extensions lost along the way), but it’s easy enough to simply grep for these. And the time spent staying between the lines is a small price to pay for some crash insurance.