Skip to content

Conversation

@ns-rse
Copy link
Collaborator

@ns-rse ns-rse commented Nov 18, 2025

Decode numpy arrays of strings

I found whilst working through TopoStats refactoring that the inclusion of the config as an attribute to TopoStats objects resulted in some of the values, which are lists of strings, being converted to Numpy arrays (technically everything is converted to Numpy arrays as HDF5 can't handled "lists" of mixed types and so they are coerced to Numpy arrays with a single dtype).

However, these could not be decoded directly and item[()].decode("utf-8") failed with an AttributeError stating that numpy.ndarray does not have attribute decode.

The solution proposed here is to capture this error and if item[()] is an instance of np.ndarray to iterate over the list decoding each item in turn. The typing is explicitly ignored because we want it as a list rather than a dictionary.

Test included, and whilst it passes seems a bit light but does mirror the scenario encountered (had to use group_path as the higher level of nesting).

Handle newer topostats file versions

Restructuring of TopoStats to define classes means the TopoStats object now holds the topostats_version rather than topostats_file_version.

This commit allows both to be handled and switches to using packging.version to do so which ensures a more consistent approach to comparing version numbers.

Yet to write a test for working with newer .topostats where topostats_version >= 2.3.2 as work is still on-going but parameterised test is in place for when work is complete.

I found whilst working through TopoStats refactoring that the inclusion of the `config` as an attribute to `TopoStats`
objects resulted in some of the values, which are lists of strings, being converted to Numpy arrays.

These could not be decoded direction and `item[()].decode("utf-8")` failed with an `AttributeError` stating that `numpy.ndarray
does not have attribute decode`.

The solution proposed here is to capture this error and if `item[()]` is an instance of `np.ndarray` to iterate over the
list decoding each item in turn.  The typing is explicitly ignored because we want it as a list rather than a
dictionary.

Test included, and whilst it passes seems a bit light but does mirror the scenario encountered (had to use `group_path`
as the higher level of nesting).
Restructuring of TopoStats to define classes means the `TopoStats` object now holds the `topostats_version` rather than
`topostats_file_version`.

This commit allows both to be handled and switches to using
[`packging.version`](https://packaging.pypa.io/en/stable/version.html) to do so which ensures a more consistent approach
to comparing version numbers.

Yet to write a test for working with newer `.topostats` where `topostats_version >= 2.3.2` as work is still on-going but
parameterised test is in place for when work is complete.
@ns-rse ns-rse added bug Something isn't working topostats Loading the .topotats file format labels Nov 18, 2025
@ns-rse ns-rse requested a review from SylviaWhittle November 18, 2025 11:00
@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is 63.63636% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.57%. Comparing base (ac9753c) to head (0ac2229).
⚠️ Report is 230 commits behind head on main.

Files with missing lines Patch % Lines
AFMReader/io.py 42.85% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #177      +/-   ##
==========================================
+ Coverage   74.62%   79.57%   +4.94%     
==========================================
  Files           8       12       +4     
  Lines         607      935     +328     
==========================================
+ Hits          453      744     +291     
- Misses        154      191      +37     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working topostats Loading the .topotats file format

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants