Developers of libraries have to be more diligent than «classic» application programmers. Why? You never know where and when the library will be used: Platforms; Compilers; Optimizations; Usage scenarios.
5. About me
• Karpov Andrey Nikolaevich
• On Habr as Andrey2008
habr.com/users/andrey2008/
• CTO in ООО «Program Verification Systems»
• Microsoft MVP
• Intel Black Belt Software Developer
• One of the founders of the PVS-Studio project
5
6. The main point
• Developers of libraries have to be more diligent than «classic»
application programmers
6
7. Why?
• You never know where and when the library will be used
• Platforms
• Compilers
• Optimizations
• Usage scenarios
7
9. Say hi to the problem of 2038 year
time_t
32-bit type
is still used
time_t
9
10. 2038
• Especially relevant to:
• Embedded systems
• libraries
• Jonathan Corbet. 2038: only 21 19 years away
• https://www.viva64.com/en/b/0505/
10
12. Let's take the example of C4201
• C4201: unnamed structures or unions. Level 4.
• Issued with /Wall.
• If you write default, the warning won’t be issued any more, as it’s
disabled by default.
12
14. Correct: push, pop
#pragma warning(push)
#pragma warning(disable: NNNN)
....
// Correct code, for which an NNNN warning is issued....
#pragma warning(pop)
14
15. #pragma warning(default: NNN)
• It is ok in your *.cpp file
• Mustn’t be in a header file of a library:
• Useful warnings may disappear from the user’s code
• Redundant warnings may appear in the user’s code
15
16. 64-bit errors
• I’ve had many talks and notes on this topic
• Details by these links:
• Lessons on development of 64-bit C/C++ applications
https://www.viva64.com/en/l/full/
• DevGAMM 2018. Patterns of 64-bit errors in games
https://youtu.be/uzmdelgQYuc
16
17. 64-bit errors
• Important: what will work for your code, isn’t going to fly for your
library
void Init(double *array, size_t n)
{
for (int i = 0; i < n; i++)
array[i] = 1.0;
}
17
18. A bit of 64-bit humor
• StackOverflow:
C and static Code analysis: Is this safer than memcpy?
https://stackoverflow.com/questions/55239222/c-and-
static-code-analysis-is-this-safer-than-memcpy
• The analyzer issues a warning for the memset function;
• Let’s «solve the problem» by writing our own copy
function!
18
19. A bit of 64-bit humor
void myMemCpy(void *dest, void *src, size_t n)
{
char *csrc = (char *)src;
char *cdest = (char *)dest;
for (int i=0; i<n; i++)
cdest[i] = csrc[i];
}
19
21. malloc, realloc You must check
what memory
allocation functions
returned!
21
22. Null pointer dereference= UB
• You cannot hope that a crash will happen immediately;
• You may taint some data, which is handled in parallel threads.
void *memset(void *dest, int c, size_t n)
{
size_t k;
if (!n) return dest;
dest[0] = dest[n-1] = c;
if (n <= 2) return dest;
//....
} 22
23. Who said that a null offset would happen?
EFL library
static void
st_collections_group_parts_part_description_filter_data(void)
{
//....
filter->data_count++;
array = realloc(filter->data,
sizeof(Edje_Part_Description_Spec_Filter_Data) *
filter->data_count);
array[filter->data_count - 1].name = name;
array[filter->data_count - 1].value = value;
filter->data = array;
}
23
24. Null pointer dereference
is a potential vulnerability
• A crash with memory shortage is a denial of service;
• Such behavior is inexcusable for libraries’ code;
• An exception has to be thrown or one has to return the error status.
24
25. xmalloc for libraries is not a solution
• xmalloc terminates the application if memory allocation fails;
• It can be called in your own code;
• You MUSTN’T call it in the code of libraries.
25
26. Undefined behavior
• Code of libraries has more chances that undefined behavior will show
itself
• Let’s see an interesting function, which seized working due to
«faulty GCC 8.0»
https://www.linux.org.ru/forum/development/14422428
26
27. int foo(const unsigned char *s)
{
int r = 0;
while(*s) {
r += ((r * 20891 + *s *200) | *s ^ 4 | *s ^ 3) ^ (r >> 1);
s++;
}
return r & 0x7fffffff;
}
27
28. Dangerous usage of printf
char buffer[1001];
int len;
while ((len = pBIO_read(bio, buffer, 1000)) > 0)
{
buffer[len] = 0;
fprintf(file, buffer);
}
28
30. Don’t create macros/functions with names,
matching standard ones
• The first code example to demonstrate the point, not from a library.
static int EatWhitespace (FILE * InFile)
{
int c;
for (c = getc (InFile);
isspace (c) && ('n' != c);
c = getc (InFile))
;
return (c);
}
Midnight Commander.
Where is the error?
30
31. You won’t find the error until you look in the
header file
#ifdef isspace
#undef isspace
#endif
....
#define isspace(c) ((c)==' ' || (c) == 't')
31
32. What’s wrong with this code?
bool set_value_convert(char_t*& dest, uintptr_t& header,
uintptr_t header_mask, int value)
{
char buf[128];
sprintf(buf, "%d", value);
return set_value_buffer(dest, header, header_mask, buf);
}
32
35. Uninitialized data
• When it comes to libraries, it makes sense to be a big nerd and
initialize just in case.
35
36. Gods may do what cattle may not
• Remember, that from now on this != nullptr.
void CWnd::AttachControlSite(CHandleMap* pMap)
{
if (this != NULL && m_pCtrlSite == NULL)
{
//....
}
}
36
37. Fun fact
• At the time of C++ 1983 (Cfront) you could do like this:
this = p;
37
38. Make a clear distinction in the interfaces
BSTR and wchar_t *
• Some people sometimes forget about differences between BSTR and
wchar_t *;
• Which leads to usage of incorrect types with all the consequences in
interfaces.
38
39. BSTR and wchar_t *
• wchar_t * – null-terminated string;
• BSTR is a string data type used in COM, Automation and Interop
functions;
• «Different physical sense», but in terms of C and C++, these are equal
types:
typedef wchar_t OLECHAR;
typedef OLECHAR * BSTR;
39
40. BSTR
• Length prefix (4 bytes);
• String of symbols in the Unicode encoding;
• Terminal null;
• The BSTR type is a pointer, which points at the first symbol of a string,
not the length prefix.
A B C D 0Length prefix
BSTR 40
42. BSTR
• It is incorrect to work with BSTR the same way as with wchar_t *;
• Clearly distinguish these types in the interfaces;
• Don’t confuse users.
BSTR str = foo();
str += 3; // Теперь str это испорченная BSTR строка
42
43. Be careful with data truncation
• Classic ways to lose significant bits:
int i = fgetc(stream);
char c = i;
void *ptr = foo();
int n = (int)ptr;
ptr = (void *)n;
43
44. Truncation of string length
std::string str(foo());
short len = str.length();
size_t len = str.length();
std::string::size_type len = str.length();
auto len = str.length();
Gives a way to
potential
vulnerabilities.
It is unforgivable for
libraries.
44
47. Truncation: EOF
char c;
printf("%s is already in *.base_fs format, just copying ....);
rewind(blk_alloc_file);
while ((c = fgetc(blk_alloc_file)) != EOF) {
fputc(c, base_fs_file);
}
Android, C
47
48. More about EOF
while (!cin.eof())
cin >> x;
while (!cin.eof() && !cin.fail())
cin >> x;
while (cin)
cin >> x;
48
50. memset and CWE-14
• Safe Clearing of Private Data
• https://www.viva64.com/en/b/0388/
• CWE-14: Compiler Removal of Code to Clear Buffers
https://cwe.mitre.org/data/definitions/14.html
50
52. Quick response to vulnerabilities
• Many apps depend on one library;
• It is reliability, which often wins over, rather than functionality;
• Don’t hesitate to charge for this;
• Users buy expensive paid libraries for creating PDF, even though there
are plenty of free ones;
• To save money do things right!
52
53. Code Review
• Error handlers (usually tested badly);
• correctness;
• no exit, abort and so on.
• malloc, realloc;
• Names of functions and macros;
• Namespace was made up for a reason;
• Special attention to UB;
• Special attention to truncation of high bits;
• Initialization and memory clearing.
53
54. Recommendations
• Much attention to Code Review;
• Use static code analyzers;
• Google opened a system for creating sandbox-environments for
C/C++ libraries:
https://security.googleblog.com/2019/03/open-sourcing-
sandboxed-api.html
54