SlideShare a Scribd company logo
1 of 11
Download to read offline
Software	diseases:	memset
Author: Andrey Karpov
Date: 23.07.2012
While analyzing the source codes of various programs I can't help creating associations that each
program has a tendency to certain diseases. In many projects you can easily make out patterns of
incorrect code that can be found in different project files. In some programs these are Copy-Paste
errors, while in others it's "unsigned_integer < 0"-like checks. Each project has its own disease. The sore
of the next project (called MAME) we have checked is the memset() function.
MAME is an emulator application designed to recreate the hardware of arcade game systems in
software to preserve gaming history by preventing vintage games from being lost or forgotten [1].
Although almost all the project files have the ".c" extension, MAME is actually a C++ project. The source
code's size is rather large - 110 Mbytes.
Checking MAME with PVS-Studio was impossible before because it is built with MinGW on Windows.
MinGW is a native software port of the GNU Compiler Collection (GCC) under Microsoft Windows [2]. It
means that PVS-Studio has to provide correct support of the special features of the GCC syntax and
special keywords.
Support of MinGW has been available in PVS-Studio since version 4.70. It is not full yet, but it is enough
to check most projects. MAME was one of the first projects to be analyzed.
Note. While performing the analysis, there will be a lot of similar false reports. The odd code fragments
are located in several macros widely used in various project parts. It seems at first that there are only
false positives - scattered useful messages just get lost among them. However, you can easily fix it by
adding just a few comments to suppress the warnings triggered by the macros. See the "Suppression of
false alarms" section in the documentation to find out how to do that.
Now let's study the errors we have detected.
Incompletely cleared arrays
As we have already said, you can find a lot of fragments in the MAME project where the memset
function is used incorrectly. A typical mistake is filling only a part of an array. Consider a simple example:
UINT32 m_pstars_regs[16];
static DRIVER_INIT( pstar )
{
...
memset(state->m_pstars_regs, 0, 16);
...
}
PVS-Studio: V512 A call of the 'memset' function will lead to underflow of the buffer 'state-
>m_pstars_regs'. pgm.c 4458
Number 16 means the number of items in the "m_pstars_regs" array. But it is the number of bytes being
filled in the buffer that should be passed into the memset function. As a result, only a part of the array is
filled with zeroes.
This is the correct code:
memset(state->m_pstars_regs, 0, 16 * sizeof(UINT32));
The mistake is trivial. Programmers often think that there are few trivial errors in their programs (see
the second myth [3]). It's not so. It is very simple and silly mistakes that make up the largest part of
errors found in programs.
Do you think the error shown above is a single one? No. Here you are at least 8 other fragments where
instances of the same mistake can be found:
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_kb_regs'.
pgm.c 4975
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_kb_regs'.
pgm.c 4996
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_kb_regs'.
pgm.c 5056
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state-
>m_oldsplus_ram'. pgm.c 5780
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state-
>m_oldsplus_regs'. pgm.c 5781
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_sysreg'.
rungun.c 399
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_ttl_vram'.
rungun.c 400
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state-
>m_playfield_code'. malzak.c 392
In the example above, the number of items was defined by an absolute number. It is bad. You'd better
calculate the array size instead of using constants. Unfortunately, it doesn't help to avoid the error we're
discussing.
UINT16 m_control_0[8];
#define ARRAY_LENGTH(x) (sizeof(x) / sizeof(x[0]))
static MACHINE_RESET( tumbleb )
{
...
memset(state->m_control_0, 0,
ARRAY_LENGTH(state->m_control_0));
}
PVS-Studio: V512 A call of the 'memset' function will lead to underflow of the buffer 'state-
>m_control_0'. tumbleb.c 2065
The ARRAY_LENGTH macro is used to calculate the number of array items. Again, it's incorrect. The
programmer should have calculated the array size, not number of items it contains.
There are two ways to fix it.
The first:
memset(state->m_control_0, 0, sizeof(state->m_control_0));
The second:
memset(state->m_control_0, 0,
ARRAY_LENGTH(state->m_control_0) * sizeof(UINT16));
These are some other fragments where arrays are not filled correctly in the same way:
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_pmac_read'.
megadriv.c 7156
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_pmac_write'.
megadriv.c 7157
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state-
>m_cart_is_genesis'. megatech.c 426
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_vol_ctrl'.
nycaptor.c 841
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_rotate_ctrl'.
wgp.c 949
• V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_vreg'.
othldrby.c 237
Misfortunes with the memset() function are over here, though I may have missed some mistakes. But
it's time for another, as scary, function memcpy().
Incorrect use of the memcpy() function
Let's look at a code that causes an array overrun:
#define CHD_SHA1_BYTES 20
#define CHD_V4_HEADER_SIZE 108
#define CHD_MAX_HEADER_SIZE CHD_V4_HEADER_SIZE
static chd_error header_read(...., chd_header *header)
{
UINT8 rawheader[CHD_MAX_HEADER_SIZE];
...
memcpy(header->parentsha1, &rawheader[100], CHD_SHA1_BYTES);
...
}
PVS-Studio: V512 A call of the 'memcpy' function will lead to the '& rawheader[100]' buffer becoming
out of range. chd.c 1870
The 'rawheader' array consists of 108 bytes. We want to copy its contents from byte 100 on. The trouble
is we will reach outside the array boundaries. We can copy only 8 bytes, yet 20 bytes are actually
copied. Unfortunately, I don't know how to fix this code, since I'm not familiar with the program logic.
When using the memset() function, it often happens that only a part of an array is filled.
Correspondingly, when you use the memset() function, there may often be errors causing only a part of
an array to be copied. Consider the following sample:
UINT16 m_spriteram16[0x1000];
UINT16 m_spriteram16_buffered[0x1000];
static WRITE32_HANDLER( deco32_buffer_spriteram_w )
{
deco32_state *state =
space->machine().driver_data<deco32_state>();
memcpy(state->m_spriteram16_buffered,
state->m_spriteram16, 0x1000);
}
PVS-Studio: V512 A call of the 'memcpy' function will lead to underflow of the buffer 'state-
>m_spriteram16_buffered'. deco32.c 706
That's a small function. But it has an error. I think you have already guessed that multiplication by
sizeof(UINT16) is missing.
This is the correct code:
memcpy(state->m_spriteram16_buffered,
state->m_spriteram16,
0x1000 * sizeof(UINT16));
The same error here:
V512 A call of the 'memcpy' function will lead to underflow of the buffer 'state-
>m_spriteram16_2_buffered'. deco32.c 726
Misprints and Copy-Paste
In any project you can see misprints and errors caused by using the Copy-Paste technology. There are
few of them in some projects and quite many in others. In MAME these errors are not numerous, yet
they are there. Let's study some of them.
static WRITE8_HANDLER( tms70x0_pf_w )
{
...
if( ((cpustate->pf[0x03] & 0x80) == 0) &&
((data & 0x80) == 0x80 ) )
{
...
}
else if( ((data & 0x80) == 0x80 ) &&
((cpustate->pf[0x03] & 0x80) == 0) )
{
...
}
...
}
PVS-Studio: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of
logical error presence. Check lines: 577, 584. tms7000.c 577
If you look close, you will notice that the first and the second conditions are identical. They have
different order of comparisons, but it doesn't influence the result in any way.
Consider the following example.
class device_debug
{
device_disasm_interface *m_disasm;
...
int min_opcode_bytes() const
{
return (m_disasm != NULL) ?
m_disasm->max_opcode_bytes() : 1;
}
int max_opcode_bytes() const
{
return (m_disasm != NULL) ?
m_disasm->max_opcode_bytes() : 1;
}
}
PVS-Studio: V524 It is odd that the body of 'max_opcode_bytes' function is fully equivalent to the body
of 'min_opcode_bytes' function (debugcpu.h, line 150). debugcpu.h 151
The max_opcode_bytes() function is identical to the min_opcode_bytes() function. This is most likely
incorrect. I suppose that the min_opcode_bytes() function was intended to be written as follows:
int min_opcode_bytes() const
{
return (m_disasm != NULL) ?
m_disasm->min_opcode_bytes() : 1;
}
Here are some other code fragments which are most probably misprints:
• V583 The '?:' operator, regardless of its conditional expression, always returns one and the same
value: ",(%d,". 9900dasm.c 670
• V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical
error presence. Check lines: 549, 579. cdrom.c 549
• V501 There are identical sub-expressions 'offset != (0x370 >> 1)' to the left and to the right of
the '&&' operator. decoprot.c 118
• V501 There are identical sub-expressions 'offset != (0x3c0 >> 1)' to the left and to the right of
the '&&' operator. decoprot.c 118
• V501 There are identical sub-expressions 'offset != 0x2c / 2' to the left and to the right of the
'&&' operator. decoprot.c 240
• V501 There are identical sub-expressions 'offset != 0xe' to the left and to the right of the '&&'
operator. decoprot.c 447
Undefined behavior
Quite a number of warnings generated by PVS-Studio for this project refer to shift operations. These
operations lead to undefined behavior. Of course, when you use particular compilers, your code can
work properly for many years. That's why we can call these errors potential. They may reveal
themselves when moving to a different platform, compilers or optimization switches. To learn more
about it, please see the article: "Wade not in unknown waters. Part three." [4].
Consider a couple of samples causing undefined behavior. The first sample:
#define ATARIRLE_PRIORITY_SHIFT 12
#define ATARIRLE_PRIORITY_MASK 
((~0 << ATARIRLE_PRIORITY_SHIFT) & 0xffff)
PVS-Studio: V610 Undefined behavior. Check the shift operator '<<. The left operand '~0' is negative.
atarig42.c 220
Any code fragment using the ATARIRLE_PRIORITY_MASK macro leads to undefined behavior. Negative
numbers cannot be shifted. This macro should be rewritten in the following way:
#define ATARIRLE_PRIORITY_MASK 
((~(0u) << ATARIRLE_PRIORITY_SHIFT) & 0xffff)
Here is another, larger, sample:
UINT32 m_color1_mask;
#define ARRAY_LENGTH(x) (sizeof(x) / sizeof(x[0]))
PALETTE_INIT( montecar )
{
static const UINT8 colortable_source[] =
{
0x00, 0x00, 0x00, 0x01, 0x00, 0x02, 0x00, 0x03,
0x03, 0x03, 0x03, 0x02, 0x03, 0x01, 0x03, 0x00,
0x00, 0x00, 0x02, 0x00, 0x02, 0x01, 0x02, 0x02,
0x00, 0x10, 0x20, 0x30, 0x00, 0x04, 0x08, 0x0c,
0x00, 0x44, 0x48, 0x4c, 0x00, 0x84, 0x88, 0x8c,
0x00, 0xc4, 0xc8, 0xcc
};
...
for (i = 0; i < ARRAY_LENGTH(colortable_source); i++)
{
UINT8 color = colortable_source[i];
if (color == 1)
state->m_color1_mask |= 1 << i;
...
}
...
}
PVS-Studio: V610 Undefined behavior. Check the shift operator '<<. The right operand ('i' = [0..43]) is
greater than or equal to the length in bits of the promoted left operand. firetrk.c 111
The 'colortable_source' array contains 44 items. Therefore, the 'i' loop counter takes values from 0 to
43. Number '1' has the int type - it cannot be shifted more than by 31 bits. If you shift it by more bits, it
will cause undefined behavior according to the language standard.
Since there are quite few warnings related to shifts, we won't cite them in the article. You can look
through the list of these messages in the text file: mame-shift-ub.txt.
Other errors
Besides the functions memset() and memcpy(), there is memcmp() I've almost forgotten about. This
function is from the same gang. Fortunately, I've found only one error related to use of this function in
MAME.
static const char *apr_magic = "ACT Apricot disk imagex1ax04";
FLOPPY_IDENTIFY( apridisk_identify )
{
UINT8 header[APR_HEADER_SIZE];
floppy_image_read(floppy, &header, 0, sizeof(header));
if (memcmp(header, apr_magic, sizeof(apr_magic)) == 0)
...
}
PVS-Studio: V579 The memcmp function receives the pointer and its size as arguments. It is possibly a
mistake. Inspect the third argument. apridisk.c 128
The sizeof() operator calculates the pointer size instead of the number of bytes in a string. As a result,
only the first several bytes are compared. We can fix it by defining the 'apr_magic' variable as an array:
static const char apr_magic[] = "ACT Apricot disk imagex1ax04";
This is an example of an expression which is always true:
int m_led_extender;
#define CARD_A 1
#define NO_EXTENDER 0
static WRITE8_DEVICE_HANDLER( pia_ic5_porta_w )
{
...
else if ((state->m_led_extender != CARD_A)||
(state->m_led_extender != NO_EXTENDER))
...
}
PVS-Studio: V547 Expression is always true. Probably the '&&' operator should be used here. mpu4.c
934
The "X != 1 || X != 0" condition is always true. Most probably, the '&&' operator should be written
instead of the '||' operator.
Use of a pointer before a check. I will cite only one example of this. I saw other V595 messages too but
didn't note them down. In many cases the code works well, as the pointer never equals zero in those
fragments. Here is an example of odd code:
static void stv_vdp2_drawgfxzoom(...,
const gfx_element *gfx, ...)
{
...
if (gfx->pen_usage &&
transparency == STV_TRANSPARENCY_PEN)
{
...
}
if( gfx )
{
...
}
...
}
PVS-Studio: V595 The 'gfx' pointer was utilized before it was verified against nullptr. Check lines: 2457,
2483. stvvdp2.c 2457
At times I come across some strange code about which I cannot tell for sure if it has an error or not.
Maybe there is a Copy-Paste mistake. And maybe everything is correct and the two code branches are
actually intended to be identical. Here is an example:
static DEVICE_START( deco16ic )
{
...
if (intf->split)
deco16ic->pf2_tilemap_16x16 =
tilemap_create_device(device, get_pf2_tile_info,
deco16_scan_rows, 16, 16, fullwidth ?
64 : 32, fullheight ? 64 : 32);
else
deco16ic->pf2_tilemap_16x16 =
tilemap_create_device(device, get_pf2_tile_info,
deco16_scan_rows, 16, 16, fullwidth ?
64 : 32, fullheight ? 64 : 32);
...
}
PVS-Studio: V523 The 'then' statement is equivalent to the 'else' statement. deco16ic.c 943
Regardless of the condition, one and the same action is performed. Here is another similar sample:
int compute_res_net(int inputs, int channel, const res_net_info *di)
{
...
if (OpenCol)
{
rTotal += 1.0 / di->rgb[channel].R[i];
v += vOL / di->rgb[channel].R[i];
}
else
{
rTotal += 1.0 / di->rgb[channel].R[i];
v += vOL / di->rgb[channel].R[i];
}
...
}
PVS-Studio: V523 The 'then' statement is equivalent to the 'else' statement. resnet.c 628
Conclusion
As usual, I will stress that these are probably quite not all the errors that PVS-Studio can find in MAME.
The task of this article is to show that PVS-Studio is learning to check crossplatform projects. To know
how exactly you can integrate into the make-file, please see the documentation. You may also ask us if
you have any troubles analyzing projects built with MinGW.
P.S. Review of analysis results currently implies that you need the Visual Studio environment where you
can open the report and study it. Manual analysis of the report is very effortful. Maybe we will make a
special tool in future that will allow you to conveniently review the report and perform code navigation
without having Visual Studio installed.
References
1. Wikipedia. MAME. http://www.viva64.com/go.php?url=900
2. Wikipedia. MinGW. http://www.viva64.com/go.php?url=901
3. Myths about static analysis. The second myth - expert developers do not make silly mistakes.
http://www.viva64.com/en/b/0116/
4. Wade not in unknown waters. Part three. http://www.viva64.com/en/b/0142/

More Related Content

Viewers also liked

Viewers also liked (7)

The impact of innovation on travel and tourism industries (World Travel Marke...
The impact of innovation on travel and tourism industries (World Travel Marke...The impact of innovation on travel and tourism industries (World Travel Marke...
The impact of innovation on travel and tourism industries (World Travel Marke...
 
Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)
 
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika AldabaLightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
 
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job? Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
 
Open Source Creativity
Open Source CreativityOpen Source Creativity
Open Source Creativity
 
The Outcome Economy
The Outcome EconomyThe Outcome Economy
The Outcome Economy
 
The Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsThe Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post Formats
 

Similar to Software diseases: memset

Penetration Testing for Easy RM to MP3 Converter Application and Post Exploit
Penetration Testing for Easy RM to MP3 Converter Application and Post ExploitPenetration Testing for Easy RM to MP3 Converter Application and Post Exploit
Penetration Testing for Easy RM to MP3 Converter Application and Post Exploit
JongWon Kim
 

Similar to Software diseases: memset (20)

I want to sell a PVS-Studio license to the Intel company
I want to sell a PVS-Studio license to the Intel companyI want to sell a PVS-Studio license to the Intel company
I want to sell a PVS-Studio license to the Intel company
 
Pre New Year Check of PostgreSQL
Pre New Year Check of PostgreSQLPre New Year Check of PostgreSQL
Pre New Year Check of PostgreSQL
 
Safe Clearing of Private Data
Safe Clearing of Private DataSafe Clearing of Private Data
Safe Clearing of Private Data
 
200 Open Source Projects Later: Source Code Static Analysis Experience
200 Open Source Projects Later: Source Code Static Analysis Experience200 Open Source Projects Later: Source Code Static Analysis Experience
200 Open Source Projects Later: Source Code Static Analysis Experience
 
Linux Kernel, tested by the Linux-version of PVS-Studio
Linux Kernel, tested by the Linux-version of PVS-StudioLinux Kernel, tested by the Linux-version of PVS-Studio
Linux Kernel, tested by the Linux-version of PVS-Studio
 
Analyzing the Dolphin-emu project
Analyzing the Dolphin-emu projectAnalyzing the Dolphin-emu project
Analyzing the Dolphin-emu project
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects
 
Consequences of using the Copy-Paste method in C++ programming and how to dea...
Consequences of using the Copy-Paste method in C++ programming and how to dea...Consequences of using the Copy-Paste method in C++ programming and how to dea...
Consequences of using the Copy-Paste method in C++ programming and how to dea...
 
How to make fewer errors at the stage of code writing. Part N1.
How to make fewer errors at the stage of code writing. Part N1.How to make fewer errors at the stage of code writing. Part N1.
How to make fewer errors at the stage of code writing. Part N1.
 
How to make fewer errors at the stage of code writing. Part N1.
How to make fewer errors at the stage of code writing. Part N1.How to make fewer errors at the stage of code writing. Part N1.
How to make fewer errors at the stage of code writing. Part N1.
 
How to make fewer errors at the stage of code writing. Part N1
How to make fewer errors at the stage of code writing. Part N1How to make fewer errors at the stage of code writing. Part N1
How to make fewer errors at the stage of code writing. Part N1
 
Re-checking the ReactOS project - a large report
Re-checking the ReactOS project - a large reportRe-checking the ReactOS project - a large report
Re-checking the ReactOS project - a large report
 
PVS-Studio delved into the FreeBSD kernel
PVS-Studio delved into the FreeBSD kernelPVS-Studio delved into the FreeBSD kernel
PVS-Studio delved into the FreeBSD kernel
 
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
 
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
 
Why Windows 8 drivers are buggy
Why Windows 8 drivers are buggyWhy Windows 8 drivers are buggy
Why Windows 8 drivers are buggy
 
Picking Mushrooms after Cppcheck
Picking Mushrooms after CppcheckPicking Mushrooms after Cppcheck
Picking Mushrooms after Cppcheck
 
Checking the Open-Source Multi Theft Auto Game
Checking the Open-Source Multi Theft Auto GameChecking the Open-Source Multi Theft Auto Game
Checking the Open-Source Multi Theft Auto Game
 
Penetration Testing for Easy RM to MP3 Converter Application and Post Exploit
Penetration Testing for Easy RM to MP3 Converter Application and Post ExploitPenetration Testing for Easy RM to MP3 Converter Application and Post Exploit
Penetration Testing for Easy RM to MP3 Converter Application and Post Exploit
 
The Ultimate Question of Programming, Refactoring, and Everything
The Ultimate Question of Programming, Refactoring, and EverythingThe Ultimate Question of Programming, Refactoring, and Everything
The Ultimate Question of Programming, Refactoring, and Everything
 

Recently uploaded

Recently uploaded (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

Software diseases: memset

  • 1. Software diseases: memset Author: Andrey Karpov Date: 23.07.2012 While analyzing the source codes of various programs I can't help creating associations that each program has a tendency to certain diseases. In many projects you can easily make out patterns of incorrect code that can be found in different project files. In some programs these are Copy-Paste errors, while in others it's "unsigned_integer < 0"-like checks. Each project has its own disease. The sore of the next project (called MAME) we have checked is the memset() function. MAME is an emulator application designed to recreate the hardware of arcade game systems in software to preserve gaming history by preventing vintage games from being lost or forgotten [1]. Although almost all the project files have the ".c" extension, MAME is actually a C++ project. The source code's size is rather large - 110 Mbytes. Checking MAME with PVS-Studio was impossible before because it is built with MinGW on Windows. MinGW is a native software port of the GNU Compiler Collection (GCC) under Microsoft Windows [2]. It means that PVS-Studio has to provide correct support of the special features of the GCC syntax and special keywords. Support of MinGW has been available in PVS-Studio since version 4.70. It is not full yet, but it is enough to check most projects. MAME was one of the first projects to be analyzed. Note. While performing the analysis, there will be a lot of similar false reports. The odd code fragments are located in several macros widely used in various project parts. It seems at first that there are only false positives - scattered useful messages just get lost among them. However, you can easily fix it by adding just a few comments to suppress the warnings triggered by the macros. See the "Suppression of false alarms" section in the documentation to find out how to do that. Now let's study the errors we have detected. Incompletely cleared arrays As we have already said, you can find a lot of fragments in the MAME project where the memset function is used incorrectly. A typical mistake is filling only a part of an array. Consider a simple example: UINT32 m_pstars_regs[16]; static DRIVER_INIT( pstar ) { ... memset(state->m_pstars_regs, 0, 16); ... }
  • 2. PVS-Studio: V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_pstars_regs'. pgm.c 4458 Number 16 means the number of items in the "m_pstars_regs" array. But it is the number of bytes being filled in the buffer that should be passed into the memset function. As a result, only a part of the array is filled with zeroes. This is the correct code: memset(state->m_pstars_regs, 0, 16 * sizeof(UINT32)); The mistake is trivial. Programmers often think that there are few trivial errors in their programs (see the second myth [3]). It's not so. It is very simple and silly mistakes that make up the largest part of errors found in programs. Do you think the error shown above is a single one? No. Here you are at least 8 other fragments where instances of the same mistake can be found: • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_kb_regs'. pgm.c 4975 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_kb_regs'. pgm.c 4996 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_kb_regs'. pgm.c 5056 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_oldsplus_ram'. pgm.c 5780 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_oldsplus_regs'. pgm.c 5781 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_sysreg'. rungun.c 399 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_ttl_vram'. rungun.c 400 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_playfield_code'. malzak.c 392 In the example above, the number of items was defined by an absolute number. It is bad. You'd better calculate the array size instead of using constants. Unfortunately, it doesn't help to avoid the error we're discussing. UINT16 m_control_0[8]; #define ARRAY_LENGTH(x) (sizeof(x) / sizeof(x[0])) static MACHINE_RESET( tumbleb ) { ... memset(state->m_control_0, 0,
  • 3. ARRAY_LENGTH(state->m_control_0)); } PVS-Studio: V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_control_0'. tumbleb.c 2065 The ARRAY_LENGTH macro is used to calculate the number of array items. Again, it's incorrect. The programmer should have calculated the array size, not number of items it contains. There are two ways to fix it. The first: memset(state->m_control_0, 0, sizeof(state->m_control_0)); The second: memset(state->m_control_0, 0, ARRAY_LENGTH(state->m_control_0) * sizeof(UINT16)); These are some other fragments where arrays are not filled correctly in the same way: • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_pmac_read'. megadriv.c 7156 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_pmac_write'. megadriv.c 7157 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state- >m_cart_is_genesis'. megatech.c 426 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_vol_ctrl'. nycaptor.c 841 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_rotate_ctrl'. wgp.c 949 • V512 A call of the 'memset' function will lead to underflow of the buffer 'state->m_vreg'. othldrby.c 237 Misfortunes with the memset() function are over here, though I may have missed some mistakes. But it's time for another, as scary, function memcpy(). Incorrect use of the memcpy() function Let's look at a code that causes an array overrun: #define CHD_SHA1_BYTES 20 #define CHD_V4_HEADER_SIZE 108 #define CHD_MAX_HEADER_SIZE CHD_V4_HEADER_SIZE
  • 4. static chd_error header_read(...., chd_header *header) { UINT8 rawheader[CHD_MAX_HEADER_SIZE]; ... memcpy(header->parentsha1, &rawheader[100], CHD_SHA1_BYTES); ... } PVS-Studio: V512 A call of the 'memcpy' function will lead to the '& rawheader[100]' buffer becoming out of range. chd.c 1870 The 'rawheader' array consists of 108 bytes. We want to copy its contents from byte 100 on. The trouble is we will reach outside the array boundaries. We can copy only 8 bytes, yet 20 bytes are actually copied. Unfortunately, I don't know how to fix this code, since I'm not familiar with the program logic. When using the memset() function, it often happens that only a part of an array is filled. Correspondingly, when you use the memset() function, there may often be errors causing only a part of an array to be copied. Consider the following sample: UINT16 m_spriteram16[0x1000]; UINT16 m_spriteram16_buffered[0x1000]; static WRITE32_HANDLER( deco32_buffer_spriteram_w ) { deco32_state *state = space->machine().driver_data<deco32_state>(); memcpy(state->m_spriteram16_buffered, state->m_spriteram16, 0x1000); } PVS-Studio: V512 A call of the 'memcpy' function will lead to underflow of the buffer 'state- >m_spriteram16_buffered'. deco32.c 706 That's a small function. But it has an error. I think you have already guessed that multiplication by sizeof(UINT16) is missing. This is the correct code:
  • 5. memcpy(state->m_spriteram16_buffered, state->m_spriteram16, 0x1000 * sizeof(UINT16)); The same error here: V512 A call of the 'memcpy' function will lead to underflow of the buffer 'state- >m_spriteram16_2_buffered'. deco32.c 726 Misprints and Copy-Paste In any project you can see misprints and errors caused by using the Copy-Paste technology. There are few of them in some projects and quite many in others. In MAME these errors are not numerous, yet they are there. Let's study some of them. static WRITE8_HANDLER( tms70x0_pf_w ) { ... if( ((cpustate->pf[0x03] & 0x80) == 0) && ((data & 0x80) == 0x80 ) ) { ... } else if( ((data & 0x80) == 0x80 ) && ((cpustate->pf[0x03] & 0x80) == 0) ) { ... } ... } PVS-Studio: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 577, 584. tms7000.c 577 If you look close, you will notice that the first and the second conditions are identical. They have different order of comparisons, but it doesn't influence the result in any way. Consider the following example. class device_debug
  • 6. { device_disasm_interface *m_disasm; ... int min_opcode_bytes() const { return (m_disasm != NULL) ? m_disasm->max_opcode_bytes() : 1; } int max_opcode_bytes() const { return (m_disasm != NULL) ? m_disasm->max_opcode_bytes() : 1; } } PVS-Studio: V524 It is odd that the body of 'max_opcode_bytes' function is fully equivalent to the body of 'min_opcode_bytes' function (debugcpu.h, line 150). debugcpu.h 151 The max_opcode_bytes() function is identical to the min_opcode_bytes() function. This is most likely incorrect. I suppose that the min_opcode_bytes() function was intended to be written as follows: int min_opcode_bytes() const { return (m_disasm != NULL) ? m_disasm->min_opcode_bytes() : 1; } Here are some other code fragments which are most probably misprints: • V583 The '?:' operator, regardless of its conditional expression, always returns one and the same value: ",(%d,". 9900dasm.c 670 • V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 549, 579. cdrom.c 549 • V501 There are identical sub-expressions 'offset != (0x370 >> 1)' to the left and to the right of the '&&' operator. decoprot.c 118 • V501 There are identical sub-expressions 'offset != (0x3c0 >> 1)' to the left and to the right of the '&&' operator. decoprot.c 118 • V501 There are identical sub-expressions 'offset != 0x2c / 2' to the left and to the right of the '&&' operator. decoprot.c 240
  • 7. • V501 There are identical sub-expressions 'offset != 0xe' to the left and to the right of the '&&' operator. decoprot.c 447 Undefined behavior Quite a number of warnings generated by PVS-Studio for this project refer to shift operations. These operations lead to undefined behavior. Of course, when you use particular compilers, your code can work properly for many years. That's why we can call these errors potential. They may reveal themselves when moving to a different platform, compilers or optimization switches. To learn more about it, please see the article: "Wade not in unknown waters. Part three." [4]. Consider a couple of samples causing undefined behavior. The first sample: #define ATARIRLE_PRIORITY_SHIFT 12 #define ATARIRLE_PRIORITY_MASK ((~0 << ATARIRLE_PRIORITY_SHIFT) & 0xffff) PVS-Studio: V610 Undefined behavior. Check the shift operator '<<. The left operand '~0' is negative. atarig42.c 220 Any code fragment using the ATARIRLE_PRIORITY_MASK macro leads to undefined behavior. Negative numbers cannot be shifted. This macro should be rewritten in the following way: #define ATARIRLE_PRIORITY_MASK ((~(0u) << ATARIRLE_PRIORITY_SHIFT) & 0xffff) Here is another, larger, sample: UINT32 m_color1_mask; #define ARRAY_LENGTH(x) (sizeof(x) / sizeof(x[0])) PALETTE_INIT( montecar ) { static const UINT8 colortable_source[] = { 0x00, 0x00, 0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x03, 0x03, 0x03, 0x02, 0x03, 0x01, 0x03, 0x00, 0x00, 0x00, 0x02, 0x00, 0x02, 0x01, 0x02, 0x02, 0x00, 0x10, 0x20, 0x30, 0x00, 0x04, 0x08, 0x0c, 0x00, 0x44, 0x48, 0x4c, 0x00, 0x84, 0x88, 0x8c, 0x00, 0xc4, 0xc8, 0xcc
  • 8. }; ... for (i = 0; i < ARRAY_LENGTH(colortable_source); i++) { UINT8 color = colortable_source[i]; if (color == 1) state->m_color1_mask |= 1 << i; ... } ... } PVS-Studio: V610 Undefined behavior. Check the shift operator '<<. The right operand ('i' = [0..43]) is greater than or equal to the length in bits of the promoted left operand. firetrk.c 111 The 'colortable_source' array contains 44 items. Therefore, the 'i' loop counter takes values from 0 to 43. Number '1' has the int type - it cannot be shifted more than by 31 bits. If you shift it by more bits, it will cause undefined behavior according to the language standard. Since there are quite few warnings related to shifts, we won't cite them in the article. You can look through the list of these messages in the text file: mame-shift-ub.txt. Other errors Besides the functions memset() and memcpy(), there is memcmp() I've almost forgotten about. This function is from the same gang. Fortunately, I've found only one error related to use of this function in MAME. static const char *apr_magic = "ACT Apricot disk imagex1ax04"; FLOPPY_IDENTIFY( apridisk_identify ) { UINT8 header[APR_HEADER_SIZE]; floppy_image_read(floppy, &header, 0, sizeof(header)); if (memcmp(header, apr_magic, sizeof(apr_magic)) == 0) ... }
  • 9. PVS-Studio: V579 The memcmp function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. apridisk.c 128 The sizeof() operator calculates the pointer size instead of the number of bytes in a string. As a result, only the first several bytes are compared. We can fix it by defining the 'apr_magic' variable as an array: static const char apr_magic[] = "ACT Apricot disk imagex1ax04"; This is an example of an expression which is always true: int m_led_extender; #define CARD_A 1 #define NO_EXTENDER 0 static WRITE8_DEVICE_HANDLER( pia_ic5_porta_w ) { ... else if ((state->m_led_extender != CARD_A)|| (state->m_led_extender != NO_EXTENDER)) ... } PVS-Studio: V547 Expression is always true. Probably the '&&' operator should be used here. mpu4.c 934 The "X != 1 || X != 0" condition is always true. Most probably, the '&&' operator should be written instead of the '||' operator. Use of a pointer before a check. I will cite only one example of this. I saw other V595 messages too but didn't note them down. In many cases the code works well, as the pointer never equals zero in those fragments. Here is an example of odd code: static void stv_vdp2_drawgfxzoom(..., const gfx_element *gfx, ...) { ... if (gfx->pen_usage && transparency == STV_TRANSPARENCY_PEN)
  • 10. { ... } if( gfx ) { ... } ... } PVS-Studio: V595 The 'gfx' pointer was utilized before it was verified against nullptr. Check lines: 2457, 2483. stvvdp2.c 2457 At times I come across some strange code about which I cannot tell for sure if it has an error or not. Maybe there is a Copy-Paste mistake. And maybe everything is correct and the two code branches are actually intended to be identical. Here is an example: static DEVICE_START( deco16ic ) { ... if (intf->split) deco16ic->pf2_tilemap_16x16 = tilemap_create_device(device, get_pf2_tile_info, deco16_scan_rows, 16, 16, fullwidth ? 64 : 32, fullheight ? 64 : 32); else deco16ic->pf2_tilemap_16x16 = tilemap_create_device(device, get_pf2_tile_info, deco16_scan_rows, 16, 16, fullwidth ? 64 : 32, fullheight ? 64 : 32); ... }
  • 11. PVS-Studio: V523 The 'then' statement is equivalent to the 'else' statement. deco16ic.c 943 Regardless of the condition, one and the same action is performed. Here is another similar sample: int compute_res_net(int inputs, int channel, const res_net_info *di) { ... if (OpenCol) { rTotal += 1.0 / di->rgb[channel].R[i]; v += vOL / di->rgb[channel].R[i]; } else { rTotal += 1.0 / di->rgb[channel].R[i]; v += vOL / di->rgb[channel].R[i]; } ... } PVS-Studio: V523 The 'then' statement is equivalent to the 'else' statement. resnet.c 628 Conclusion As usual, I will stress that these are probably quite not all the errors that PVS-Studio can find in MAME. The task of this article is to show that PVS-Studio is learning to check crossplatform projects. To know how exactly you can integrate into the make-file, please see the documentation. You may also ask us if you have any troubles analyzing projects built with MinGW. P.S. Review of analysis results currently implies that you need the Visual Studio environment where you can open the report and study it. Manual analysis of the report is very effortful. Maybe we will make a special tool in future that will allow you to conveniently review the report and perform code navigation without having Visual Studio installed. References 1. Wikipedia. MAME. http://www.viva64.com/go.php?url=900 2. Wikipedia. MinGW. http://www.viva64.com/go.php?url=901 3. Myths about static analysis. The second myth - expert developers do not make silly mistakes. http://www.viva64.com/en/b/0116/ 4. Wade not in unknown waters. Part three. http://www.viva64.com/en/b/0142/