We need to globally exclude directories whose contents are certain never to matter, such as node_modules and __pycache__, even when they have been committed to the git repos being analyzed.
Doing the exclusion in the _foreach_gitfile or list_all_git_files function offers the best chance of staying future-proof under the current architecture.
By the same logic we might also start excluding known binary files, but that is a slippery slope back toward maintaining manual file extension lists.
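
A minimal sketch of the directory exclusion described above, assuming the file listing yields git-relative paths with forward slashes. The names EXCLUDED_DIRS, is_excluded and filter_git_files are hypothetical and only illustrate the idea; the real hook point would be inside _foreach_gitfile or list_all_git_files, whose signatures are not shown here.

```python
from pathlib import PurePosixPath
from typing import Iterable, Iterator

# Hypothetical exclusion set; the note only names these two directories.
EXCLUDED_DIRS = frozenset({"node_modules", "__pycache__"})


def is_excluded(path: str, excluded_dirs: frozenset = EXCLUDED_DIRS) -> bool:
    """Return True if any directory component of a git-relative path is excluded."""
    # Git reports paths with forward slashes, so PurePosixPath splits them the
    # same way on every platform. parts[:-1] skips the file name itself, so a
    # regular file that merely happens to be called "node_modules" is kept.
    directories = PurePosixPath(path).parts[:-1]
    return any(part in excluded_dirs for part in directories)


def filter_git_files(paths: Iterable[str]) -> Iterator[str]:
    """Yield only the paths that do not live under an excluded directory."""
    return (p for p in paths if not is_excluded(p))
```

Matching on directory names rather than file extensions keeps the list short and avoids the extension-list problem mentioned above. The check applies at any depth, so src/pkg/__pycache__/mod.pyc is dropped just like a top-level node_modules tree.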