Skip to content

Cleanup / modernization - general code audit #17

@schwehr

Description

@schwehr

I'm taking a look at what it might look like to cleanup shapelib and move it to C99 or newer. I know that opinions vary as to if these are good changes or not, so I'm doing this in a separate repo without worrying out making PRs:

https://github.com/schwehr/shapelib-experimental

I used debian testing with -Wall -Wextra and cppcheck to keep an eye on issues.

The changes are not always fully implemented. The goal is evaluate what the library might look like. There is not currently enough testing in shapelib to be sure that nothing has broken. I'm hoping to find some time to add more test coverage. I did put the core files into my local copy of GDAL and my shape tests pass.

So far, the changes I've made are roughly:

  • clang-format
  • Remove unused variables
  • Add missing include of stdio.h (beginnings of IWYU)
  • Order includes
  • Remove comments that are no longer relevant
  • Remove cvs log messages from file comments
  • Reduce bulky block comments
  • Localize variables
  • Combine definition and initialization
  • Add const when possible
  • int boolean → bool (using stdbool.h)
  • Remove dbmalloc
  • Remove C++ compilation of the C source (removed C++ casting macros)
  • Mark a few possible bugs with TODO
  • Reorder functions to not need local prototypes
  • Fix some error exit's to properly close open resources

This has already reduced the number of lines of source files:

# Before
wc *.[ch] contrib/*.[ch] | tail -1
 15995  58264 553965 total

# Currently
wc *.[ch] contrib/*.[ch] | tail -1
 13566  48865 469071 total

There are a lot more things I'd like to try out including:

  • Add C test coverage (maybe with catch2? or I could default to gtest)
  • Add command line tests using python3 to exercise the resulting programs
  • Fuzzing directly on shapelib (both to find bugs and to make it easy to find files for testing)
  • Use stdint.h types
  • Use C99 printf codes
  • Remove DEBUG code
  • convert int foo; if (a) foo=b; else foo=c; → const int foo = a ? b : c;
  • Running tests in asan and msan modes
  • Looking at coverity
  • etc.

Some of the issues I've found so far:

  • A variety of license issues. e.g. files that state public domain and contain a copyright
  • I'm not sure if some of the contrib files work
  • booleans that are true, false, and sometimes -1
  • Missing conditions in if chains / switch statements. e.g. dbfdump.c
  • Variables with incorrect Hungarian prefixes. e.g. bMinX for int and iunselect for bool
  • Multiple endian check and a few other redundancies
  • Unchecked malloc and realloc calls that may fail
  • Variables that don't change causing code to be unreachable. e.g. byRing
  • shpdxf.c has a huge number of warnings including may be used uninitialized
  • CMake doesn't run the tests
  • shputils.c has a large number of globals and is very hard to follow
  • man pages would be nice
  • functions should have const on pointer args they do not alter

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions