Better handling of categoricals/imputation flags

Categoricals and imputation flags can be handled as groups using the regular expression pattern functionality here, but doing so loses the feature distribution information. That loss shouldn't be necessary: the distributions for the other features are still relevant, and we should be able to provide some sort of distribution information for the categorical variables as well.
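As a rough illustration of what distribution information for a grouped categorical could look like, here is a minimal sketch in plain Python. It assumes the categorical is one-hot encoded into 0/1 columns that share a name prefix; the function name `group_distribution`, the column names, and the data are all hypothetical and not part of the existing code.

```python
import re


def group_distribution(columns, pattern):
    """Return the share of rows in each 0/1 column whose name matches `pattern`.

    `columns` maps a column name to a list of 0/1 indicator values
    (hypothetical layout, assumed for this sketch only).
    """
    matched = {name: values for name, values in columns.items()
               if re.search(pattern, name)}
    if not matched:
        raise ValueError(f"no columns match {pattern!r}")
    # The mean of a 0/1 indicator is the fraction of rows in that level.
    return {name: sum(values) / len(values) for name, values in matched.items()}


# Toy data: one numeric feature and one categorical split into three indicators.
columns = {
    "age": [23, 35, 41, 29],
    "color_red": [1, 0, 0, 1],
    "color_blue": [0, 1, 0, 0],
    "color_green": [0, 0, 1, 0],
}

dist = group_distribution(columns, r"^color_")
```

The same idea would apply to an imputation flag, where the single matched column's mean is the fraction of imputed rows.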