Open Projects
This page lists several projects that would boost the analyzer's usability and power. Most of the projects listed here are infrastructure-related, so this list is an addition to the potential checkers list. If you are interested in tackling one of these, please send an email to the cfe-dev mailing list to notify other members of the community.
- Release checkers from "alpha"
    New checkers which were contributed to the analyzer, but have not passed a rigorous evaluation process, are committed as "alpha checkers" (from "alpha version"), and are not enabled by default. Ideally, only the checkers which are actively being worked on should be in "alpha", but over the years the development of many of those has stalled. Such checkers should either be improved up to a point where they can be enabled by default, or removed from the analyzer entirely.
  - alpha.security.ArrayBound and alpha.security.ArrayBoundV2
      Array bounds checking is a desired feature, but having an acceptable rate of false positives might not be possible without proper loop widening support. Additionally, it might be more promising to perform index checking based on tainted index values. (Difficulty: Medium)
  - alpha.cplusplus.MisusedMovedObject
      The checker emits a warning on objects which were used after move. Currently it has an overly high false positive rate due to classes which have well-defined semantics for use-after-move. This property does not hold for STL objects, but is often the case for custom containers. (Difficulty: Medium)
  - alpha.unix.StreamChecker
      A SimpleStreamChecker has been presented in the Building a Checker in 24 Hours talk (slides, video). This alpha checker is an attempt to write a production grade stream checker. However, it was found to have an unacceptably high false positive rate. One of the problems found was that eagerly splitting the state based on whether the system call may fail leads to too many reports. A delayed split where the implication is stored in the state (similarly to nullability implications in TrustNonnullChecker) may produce much better results. (Difficulty: Medium)
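The use-after-move false positives mentioned for alpha.cplusplus.MisusedMovedObject can be illustrated with a small sketch. The `Buffer` class and the `sizeAfterMove` function below are hypothetical names invented for this example and are not part of the analyzer or of any library. The class documents that a moved-from object is empty, so reading it after the move is well-defined, even though a checker that treats every use-after-move as a bug would presumably still warn.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical custom container. Its contract (an assumption of this
// sketch) is that a moved-from Buffer is always empty.
class Buffer {
public:
  Buffer() = default;
  explicit Buffer(std::size_t n) : data_(n, 0) {}

  Buffer(Buffer &&other) noexcept : data_(std::move(other.data_)) {
    other.data_.clear(); // enforce the documented moved-from state
  }

  Buffer &operator=(Buffer &&other) noexcept {
    if (this != &other) {
      data_ = std::move(other.data_);
      other.data_.clear(); // enforce the documented moved-from state
    }
    return *this;
  }

  std::size_t size() const { return data_.size(); }

private:
  std::vector<int> data_;
};

// Moves out of a Buffer and then reads the moved-from object.
std::size_t sizeAfterMove() {
  Buffer a(8);
  Buffer b(std::move(a));
  (void)b;
  // Well-defined for this class: the moved-from Buffer is empty.
  return a.size();
}
```

With this contract, `sizeAfterMove()` returns 0. The same pattern applied to a standard library container would rely on an unspecified moved-from state, which is the case where a warning is warranted.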
 
- Improve C++ support
  - Handle aggregate construction.
      Aggregates are objects that can be brace-initialized without calling a constructor (that is, CXXConstructExpr does not occur in the AST), but potentially calling constructors for their fields and base classes. These constructors of sub-objects need to know what object they are constructing. Moreover, if the aggregate contains references, lifetime extension needs to be properly modeled. One can start untangling this problem by trying to replace the current ad-hoc ParentMap lookup in the CXXConstructExpr::CK_NonVirtualBase branch of ExprEngine::VisitCXXConstructExpr() with proper support for the feature. (Difficulty: Medium)
  - Handle constructors within new[]
      When an array of objects is allocated using operator new[], constructors for all elements of the array are called. We should model (potentially some of) such evaluations, and the same applies to destructors called from operator delete[].
  - Handle constructors that can be elided due to Named Return Value Optimization (NRVO)
      Local variables which are returned by value on all return statements may be stored directly at the address for the return value, eliding the copy or move constructor call. Such variables can be identified using the AST call VarDecl::isNRVOVariable.
  - Handle constructors of lambda captures
      Variables which are captured by value into a lambda require a call to a copy constructor. This call is not currently modeled.
  - Handle constructors for default arguments
      Default arguments in C++ are recomputed at every call, and are therefore local, and not static, variables.
  - Enhance the modeling of the standard library.
      The analyzer needs a better understanding of STL in order to be more useful on C++ codebases. While full library modeling is not an easy task, large gains can be achieved by supporting only a few cases: e.g. calling .length() on an empty std::string always yields zero. (Difficulty: Medium)
  - Enhance CFG to model exception-handling.
      Currently exceptions are treated as "black holes", and exception-handling control structures are poorly modeled in order to be conservative. This could be improved for both C++ and Objective-C exceptions. (Difficulty: Hard)
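Two of the constructor cases in this section, array allocation with new[] and by-value lambda captures, can be observed at run time with a small counting sketch. The `Tracked` type and the three helper functions are hypothetical and exist only for this illustration. The sketch shows the language behavior that the analyzer would need to model, not how the analyzer would model it.

```cpp
#include <cassert>

// Hypothetical type that counts its special member function calls.
struct Tracked {
  static int defaultCtors;
  static int copyCtors;
  static int dtors;

  Tracked() { ++defaultCtors; }
  Tracked(const Tracked &) { ++copyCtors; }
  ~Tracked() { ++dtors; }
};

int Tracked::defaultCtors = 0;
int Tracked::copyCtors = 0;
int Tracked::dtors = 0;

// Counts default constructor calls made by new[] for a 3-element array.
int arrayConstructorCalls() {
  int before = Tracked::defaultCtors;
  Tracked *arr = new Tracked[3];
  int calls = Tracked::defaultCtors - before;
  delete[] arr;
  return calls;
}

// Counts destructor calls made by delete[] for a 3-element array.
int arrayDestructorCalls() {
  Tracked *arr = new Tracked[3];
  int before = Tracked::dtors;
  delete[] arr;
  return Tracked::dtors - before;
}

// Counts copy constructor calls made when capturing a variable by value.
int lambdaCaptureCopies() {
  Tracked t;
  int before = Tracked::copyCtors;
  // The closure object is created and invoked in place, so the only copy
  // is the one that initializes the captured member.
  [t] { (void)t; }();
  return Tracked::copyCtors - before;
}
```

Each element of the array gets its own constructor and destructor call, and the capture performs exactly one copy. These are the implicit calls that the items above ask the analyzer to evaluate.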
 
- Core Analyzer Infrastructure
  - Handle unions.
      Currently in the analyzer the value of a union is always regarded as unknown. This problem was previously discussed on the mailing list, but no solution was implemented. (Difficulty: Medium)
  - Floating-point support.
      Currently, the analyzer treats all floating-point values as unknown. This project would involve adding a new SVal kind for constant floats, generalizing the constraint manager to handle floats, and auditing existing code to make sure it doesn't make incorrect assumptions (most notably, that X == X is always true, since it does not hold for NaN). (Difficulty: Medium)
  - Improved loop execution modeling.
      The analyzer simply unrolls each loop N times before dropping the path, for a fixed constant N. However, that results in lost coverage in cases where the loop always executes more than N times. A Google Summer of Code project was completed to make the loop bound parameterizable, but the widening problem still remains open. (Difficulty: Hard)
  - Basic function summarization support
      The analyzer performs inter-procedural analysis using either inlining or "conservative evaluation" (invalidating all data passed to the function). Often, a very simple summary (e.g. "this function is pure") would be enough to be a large improvement over conservative evaluation. Such summaries could be obtained either syntactically, or using a dataflow framework. (Difficulty: Hard)
  - Implement a dataflow framework.
      The analyzer core implements a symbolic execution engine, which performs checks (use-after-free, uninitialized value read, etc.) over a single program path. However, many useful properties (dead code, check-after-use, etc.) require reasoning over all possible paths in a program. Such reasoning requires a dataflow analysis framework. Clang already implements a few dataflow analyses (most notably, liveness), but they are implemented in an ad-hoc fashion. A proper framework would enable us to write many more useful checkers. (Difficulty: Hard)
  - Track type information through casts more precisely.
      The DynamicTypePropagation checker is in charge of inferring a region's dynamic type based on what operations the code is performing. Casts are a rich source of type information that the analyzer currently ignores. (Difficulty: Medium)
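The floating-point item above warns against assuming that X == X is always true. The following minimal sketch shows why; `equalsItself` is a hypothetical helper written for this example, and the sketch assumes IEEE 754 semantics (in particular, that the code is not compiled with options such as -ffast-math that relax NaN handling).

```cpp
#include <cassert>
#include <limits>

// Returns whether a value compares equal to itself.
// Under IEEE 754 this is false only for NaN.
bool equalsItself(double x) {
  return x == x;
}

// Convenience wrapper producing a quiet NaN for the comparison.
bool nanEqualsItself() {
  return equalsItself(std::numeric_limits<double>::quiet_NaN());
}
```

Here `equalsItself(1.0)` is true while `nanEqualsItself()` is false, so any simplification that folds `X == X` to true for symbolic values would be unsound once floats are tracked.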
 
- Fixing miscellaneous bugs
    Apart from the open projects listed above, contributors are welcome to fix any of the outstanding bugs in Bugzilla. (Difficulty: Anything)