fix: decode numpy arrays; feature: topostats versions #177
Decode numpy arrays of strings

While working through the TopoStats refactoring I found that including the `config` as an attribute of `TopoStats` objects resulted in some of the values, which are lists of strings, being converted to Numpy arrays (technically everything is converted to Numpy arrays, as HDF5 can't handle "lists" of mixed types and so they are coerced to Numpy arrays with a single `dtype`).

However, these could not be decoded directly and `item[()].decode("utf-8")` failed with an `AttributeError` stating that `numpy.ndarray` does not have attribute `decode`.

The solution proposed here is to capture this error and, if `item[()]` is an instance of `np.ndarray`, to iterate over it decoding each item in turn. The typing is explicitly ignored because we want it as a list rather than a dictionary.

A test is included; it passes but seems a bit light, although it does mirror the scenario encountered (had to use `group_path` as the higher level of nesting).

Handle newer topostats file versions

Restructuring TopoStats to define classes means the `TopoStats` object now holds `topostats_version` rather than `topostats_file_version`.

This commit allows both to be handled and switches to using `packaging.version` to do so, which ensures a more consistent approach to comparing version numbers.

A test for working with newer `.topostats` files where `topostats_version >= 2.3.2` has yet to be written, as work is still ongoing, but a parameterised test is in place for when that work is complete.
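
For reviewers, the version handling described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper name `get_file_version` and the plain `dict` standing in for the file's metadata are assumptions, and only `packaging.version.Version` is a real API.

```python
from packaging.version import Version


def get_file_version(metadata: dict) -> Version:
    """Return the file version from either the new or the old key.

    Hypothetical helper: ``metadata`` is a plain dict standing in for
    whatever the loader reads from the ``.topostats`` file.
    """
    # Prefer the newer key, fall back to the older one.
    for key in ("topostats_version", "topostats_file_version"):
        if key in metadata:
            # str() so that older files storing a float (e.g. 0.2) also parse.
            return Version(str(metadata[key]))
    raise KeyError("no topostats version found in metadata")


new_style = get_file_version({"topostats_version": "2.3.2"})
old_style = get_file_version({"topostats_file_version": 0.2})

# Version objects compare numerically per component, unlike plain strings.
is_new_layout = new_style >= Version("2.3.2")
```

Comparing `Version` objects avoids the pitfalls of string or float comparison, for example `"2.10.0" > "2.3.2"` is `False` as strings but `True` as versions.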
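
The decoding fallback described under "Decode numpy arrays of strings" can be sketched in the same spirit. Again this is a minimal illustration under assumptions rather than the PR's implementation: `decode_value` is a hypothetical name, and a plain Numpy array of byte strings stands in for what `item[()]` returns from the HDF5 dataset.

```python
import numpy as np


def decode_value(value):
    """Decode a bytes value, or each bytes element of an array, to str.

    Hypothetical helper; ``value`` plays the role of ``item[()]``.
    """
    try:
        # Scalar case: a single bytes object decodes directly.
        return value.decode("utf-8")
    except AttributeError:
        # numpy.ndarray has no .decode, so decode element by element.
        if isinstance(value, np.ndarray):
            return [
                element.decode("utf-8") if isinstance(element, bytes) else element
                for element in value
            ]
        # Anything else is unexpected, so let the original error surface.
        raise


single = decode_value(b"gaussian")
several = decode_value(np.array([b"above", b"below"]))
```

Here `single` is a `str` and `several` is a list of `str`, which matches the intent of returning a list rather than a dictionary for this case.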