Pitfalls in Static Code Analysis: Resource Allocation

Author: Lukas Charvat
Published: 8/26/2023
Tags: #staticcodeanalysis #sca #finetuning #resourceleak

As software complexity increases, engineering teams rely on static code analysis tools to automatically detect bugs and security vulnerabilities before shipping products. However, many teams struggle to use these tools efficiently: they are overwhelmed by trivial alerts and unsure how to focus analysis on critical code.

This results in frustrated engineers and missed opportunities to eliminate potentially catastrophic defects. To maximize the impact of static analysis, developers need strategies to cut through the noise and aim these powerful tools at high-risk code.

The purpose of this article is to reveal common pitfalls teams encounter when rolling out static code analysis tools and provide concrete tips to avoid them.

Why Do We Need the Power of Static Code Analysis?

Static code analysis tools evaluate source code without executing programs. Leveraging techniques like data flow analysis, control flow analysis, and pattern matching, these tools can identify common coding errors such as null pointer dereferences, array bounds violations, resource leaks, and insecure calls. By flagging these issues before software is deployed, teams can fix problems early, when they are cheaper to correct.

The benefits of finding and eliminating defects prior to release are enormous. High-quality code that follows security best practices is reliable, resilient, and less vulnerable to attack. Engineers who use static analysis can do their jobs more efficiently because they waste less time chasing bugs post-launch.

By preventing costly quality escapes, security incidents, or late-stage bug fixing, static code analysis delivers an extremely high ROI. Studies show that fixing bugs after launch is 15-100x more expensive than catching them during coding phases. For products where safety and security are paramount, static analysis plays a pivotal role in minimizing catastrophic field failures.

Pitfall: Neglecting Resource Allocation and Deallocation

Proper resource allocation and deallocation are critical. Engineers must carefully manage memory, handles, locks, and other limited system resources. Failure to do so can lead to crashes, deadlocks, and vulnerabilities like denial of service.

For example, neglecting to free memory after usage can slowly leak resources over time, eventually starving the system. Failing to release file handles can wedge processes when resources become exhausted. These resource exhaustion bugs are insidious and difficult to trace when systems are under load.

Static analysis tools are excellent at detecting potential resource leaks and oversights in allocation/deallocation logic. Techniques such as control flow analysis can trace resource usage along each execution path and uncover bugs like:

Forgetting to free memory after allocation.
Failing to close files after opening.
Releasing locks out of order.

By catching these bugs pre-release, engineers can remedy them before the product ships and thereby avoid nasty production crashes.
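To make the first two bug classes concrete, here is a minimal C sketch of the kind of early-return leak that control flow analysis is designed to catch, next to one common way of fixing it. The function names load_leaky and load_fixed are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Leaky variant: two early returns skip the cleanup code. */
int load_leaky(const char *path, size_t len)
{
    char *buf = malloc(len);
    if (buf == NULL) {
        return -1;
    }
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        return -2;               /* BUG: buf is never freed */
    }
    if (fread(buf, 1, len, fp) != len) {
        return -3;               /* BUG: buf and fp are both leaked */
    }
    fclose(fp);
    free(buf);
    return 0;
}

/* Fixed variant: every exit after an acquisition goes through cleanup. */
int load_fixed(const char *path, size_t len)
{
    int status = 0;
    FILE *fp = NULL;
    char *buf = malloc(len);
    if (buf == NULL) {
        return -1;               /* nothing acquired yet */
    }
    fp = fopen(path, "rb");
    if (fp == NULL) {
        status = -2;
        goto cleanup;
    }
    if (fread(buf, 1, len, fp) != len) {
        status = -3;             /* short read: fall through to cleanup */
    }
cleanup:
    if (fp != NULL) {
        fclose(fp);
    }
    free(buf);
    return status;
}
```

The single cleanup label is only one possible convention. The point is that each path leaving the function after a successful malloc or fopen passes through the matching free or fclose, which is the property a leak checker tries to verify.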

Resource Leaks in Low-Level APIs

While the above checks are built in and generally work well for common APIs like POSIX, problems can arise when low-level functions are used instead of the standard APIs. For instance, in Azure ThreadX / NetX RTOS, every call to nx_tcp_socket_create must eventually be paired with a corresponding nx_tcp_socket_delete so that the socket is properly registered with and deregistered from the kernel. Static tools may miss these lower-level issues if they are not configured properly.

But how exactly can we instruct the static analyzer to account for these additional custom resource acquire/release functions? Well, it depends on the tool. For instance, in the Klocwork static analyzer, one has the option to specify these rules in the form of a knowledge base file (*.kb). The entries for the standard POSIX fopen and fclose file stream manipulation functions might look as follows:

fopen - ACQUIRE FILE : 1 : $$ : $$ NE 0
fclose - RELEASE FILE : 1 : $1 : 1

In this example, the knowledge base specifies that fopen acquires a resource of type FILE, which it hands back as its return value ($$). The acquisition has no preconditions (shown by 1), but it takes place only if the return value is not NULL (indicated by the $$ NE 0 post-condition). The fclose line states that the function always releases the file stream passed as the first argument ($1).

We can use a similar approach to describe contracts for the nx_tcp_socket_create and nx_tcp_socket_delete functions:

nx_tcp_socket_create - ACQUIRE NX_TCP_SOCKET : 1 : *$2 : $$ EQ 0
nx_tcp_socket_delete - RELEASE NX_TCP_SOCKET : 1 : *$1 : $$ EQ 0

The knowledge base specifies that a new socket resource of type NX_TCP_SOCKET is acquired by calling nx_tcp_socket_create with no preconditions (value of 1). The socket handle is passed by reference in the second parameter (as indicated by *$2). The resource is only acquired if the result is equal to NX_SUCCESS (described by $$ EQ 0 post-condition).

Similarly, nx_tcp_socket_delete releases the socket resource if a valid descriptor is passed via reference in the first parameter and the call returns NX_SUCCESS.

By modeling the acquire/release semantics in this way, the analyzer can now track the correctness of socket creation and deletion calls, even for these lower-level RTOS functions. As one can see, customizing the knowledge base is key to maximizing effectiveness across all code.
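The contract described above can be illustrated with a small, self-contained C sketch. The demo_socket type and the demo_socket_create and demo_socket_delete functions are hypothetical stand-ins invented for this example. They are not the NetX API and only mimic its shape: the resource is passed by reference in the second argument of the create call, and it counts as acquired only when the call returns 0. A counter plays the role of the kernel's socket list so the leak becomes observable at run time.

```c
#include <stddef.h>

/* Hypothetical stand-ins for an RTOS socket API; not the real NetX code. */
#define DEMO_SUCCESS 0
#define DEMO_ERROR   1

typedef struct { int registered; } demo_socket;

static int registered_sockets = 0;   /* models the kernel's socket list */
static int fail_next_create = 0;     /* test hook: force a create failure */

/* Acquire: registers *sock (2nd argument) only when it returns 0. */
int demo_socket_create(void *ip, demo_socket *sock)
{
    (void)ip;
    if (fail_next_create) {
        fail_next_create = 0;
        return DEMO_ERROR;
    }
    sock->registered = 1;
    registered_sockets++;
    return DEMO_SUCCESS;
}

/* Release: deregisters *sock (1st argument) when it returns 0. */
int demo_socket_delete(demo_socket *sock)
{
    if (!sock->registered) {
        return DEMO_ERROR;
    }
    sock->registered = 0;
    registered_sockets--;
    return DEMO_SUCCESS;
}

/* Leaky: one path returns after a successful create without a delete. */
int use_socket_leaky(int work_fails)
{
    demo_socket sock = {0};
    if (demo_socket_create(NULL, &sock) != DEMO_SUCCESS) {
        return -1;      /* nothing acquired, nothing to release */
    }
    if (work_fails) {
        return -2;      /* BUG: sock is still registered */
    }
    demo_socket_delete(&sock);
    return 0;
}

/* Correct: every path after a successful create reaches delete. */
int use_socket_ok(int work_fails)
{
    demo_socket sock = {0};
    if (demo_socket_create(NULL, &sock) != DEMO_SUCCESS) {
        return -1;
    }
    int status = work_fails ? -2 : 0;
    demo_socket_delete(&sock);
    return status;
}
```

Note that the path on which the create call fails needs no delete, which is exactly what the $$ EQ 0 post-condition expresses. Without that post-condition, an analyzer would presumably have to treat the failure path as a leak as well and report a false positive.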

Resource Leak in Real Production Code

The simplified C code example below comes from the production firmware of an embedded system that ran flawlessly for months and then suddenly began exhibiting sporadic anomalies.

int sender(/* ... */)
{
  /* Code omitted */

  do {
    /* Code omitted */

    /* Socket control block is located on the stack. */
    NX_TCP_SOCKET tcp_client_socket;

    /* Code omitted */

    /* Socket is created, i.e., its control block is registered
        inside ThreadX / NetX OS. */
    if (nx_tcp_socket_create(&ip, &tcp_client_socket, /* ... */)
      != NX_SUCCESS) {
      FAIL("TCP socket creation!");
    }

    /* Code omitted */

    if (nx_tcp_socket_send(&tcp_client_socket, packet, 0)
      != NX_SUCCESS) {
      FAIL("TCP packet send failed!");
      nx_packet_delete(&packet);
      nx_tcp_socket_disconnect(&tcp_client_socket);
    } else {
      LOG("TCP packet sent.");
      nx_tcp_socket_disconnect(&tcp_client_socket);

      break;
    }
  } while (retry_count < 3);

  /* Code omitted */

  /* After loop breaks, a return from function occurs but
      the socket is not deleted, i.e., its invalidated control
      block remains registered within ThreadX / NetX OS. */

  return 0;
}

In the faulty code above, a TCP socket is created and used within a retry loop. While the loop might appear robust, a subtle leak is present. The socket is disconnected on both the failure and the success path, but it is never deleted before the function returns. This leaves the invalidated, stack-allocated control block still registered in the ThreadX / NetX OS, slowly leaking resources over time.

This example illustrates how easy it is for even experienced engineers to overlook resource cleanup issues that eventually cascade into system instability. Luckily, static analysis tools customized to the environment are highly effective at catching these invisible bugs early. By taking the time to configure knowledge bases of static analysis tools thoughtfully, software engineers can uncover flaws like this pre-release when they are easy to fix.
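One possible shape of the repaired retry loop is sketched below. It is not the actual production fix. To keep the example self-contained, the RTOS calls are replaced by hypothetical demo_* stubs with simplified signatures, and a counter stands in for the kernel's list of registered sockets. Real firmware may need additional steps before a delete succeeds (for example, unbinding a client socket), which this sketch leaves out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the RTOS calls; not the real NetX API. */
#define DEMO_SUCCESS 0
#define DEMO_ERROR   1
#define MAX_ATTEMPTS 3

typedef struct { int registered; } demo_socket;

static int registered_sockets = 0;  /* models the kernel's socket list */
static int sends_to_fail = 0;       /* test hook: fail this many sends */

static int demo_socket_create(void *ip, demo_socket *sock)
{
    (void)ip;
    sock->registered = 1;
    registered_sockets++;
    return DEMO_SUCCESS;
}

static int demo_socket_send(demo_socket *sock)
{
    (void)sock;
    if (sends_to_fail > 0) {
        sends_to_fail--;
        return DEMO_ERROR;
    }
    return DEMO_SUCCESS;
}

static void demo_socket_disconnect(demo_socket *sock) { (void)sock; }

static int demo_socket_delete(demo_socket *sock)
{
    sock->registered = 0;
    registered_sockets--;
    return DEMO_SUCCESS;
}

/* Retry loop with the paired delete on both the failure and success path. */
int sender_fixed(void)
{
    int attempt;
    for (attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
        demo_socket sock;   /* control block on the stack, as in the original */
        int sent;

        if (demo_socket_create(NULL, &sock) != DEMO_SUCCESS) {
            continue;       /* nothing registered, nothing to delete */
        }
        sent = (demo_socket_send(&sock) == DEMO_SUCCESS);

        demo_socket_disconnect(&sock);
        demo_socket_delete(&sock);  /* deregister before sock leaves scope */

        if (sent) {
            return 0;
        }
    }
    return -1;              /* all attempts failed */
}
```

Placing the delete directly after the disconnect, inside the loop body, keeps the lifetime of the registration within the lifetime of the stack variable, so no iteration and no exit path can leave a stale control block behind.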

References

Azure ThreadX / NetX RTOS: Azure RTOS embedded development suite.
Klocwork C/C++ knowledge base reference: Additional reading material for Klocwork's C/C++ knowledge base files.