Release API and Architecture Refactoring · UCD-BDLab/BioNeuralNet

[1.2.0] - 2025-11-23
API and Architecture Refactoring (copied from CHANGELOG.md

Namespace Hierarchy Overhaul: Transitioned from a flat namespace to a hybrid hierarchical structure to enhance modularity and prevent namespace pollution.
    Core Classes: DPMON, GNNEmbedding, SubjectRepresentation, SmCCNet, and DatasetLoader remain accessible at the top level (e.g., bnn.DPMON).
    Utilities and Metrics: Functional tools are now scoped to their respective submodules (e.g., bnn.metrics.plot_network, bnn.utils.preprocess_clinical).
Utils Module Restructuring: Decomposed the monolithic utils module into specialized submodules for improved maintainability:
    utils.data: Contains summary statistics functions (e.g., variance_summary).
    utils.preprocess: Contains data transformation functions (e.g., impute_omics, normalize_omics).
    utils.reproducibility: Dedicated module for seeding functions (set_seed).

New Features

Graph Engineering Module (graph_tools): Introduced a new module for the diagnosis and repair of network topology issues.
    repair_graph_connectivity: Implemented an algorithm to reconnect fragmented network components (islands) to the global network using eigenvector centrality hubs or omics-driven correlation.
    find_optimal_graph: Added an AutoML-style search function that benchmarks various graph construction strategies (Gaussian, Correlation, Threshold) using a structural proxy task to optimize downstream stability.
    graph_analysis: Added diagnostic utilities to log topological metrics (clustering coefficient, average degree) and identify isolated subgraphs broken down by omics modality.
DPMON Enhancements: Expanded the NeuralNetwork backbone to support multiple dimensionality reduction strategies beyond the standard AutoEncoder.
    Linear Projection: Added ScalarProjection, utilizing a linear layer to map embeddings to feature weights.
    MLP Projection: Added MLPProjection, utilizing a non-linear Multilayer Perceptron for complex feature weighting.
Dataset Loaders:
    Implemented functional loaders (load_brca, load_kipan, load_lgg, load_paad, load_monet, load_example) to provide immediate access to data dictionaries, aligning with scikit-learn conventions.
    Added __getitem__ support to the DatasetLoader class for direct key access (e.g., loader['rna']).

Data Standardization

BRCA Clinical Update: Removed 15 duplicated columns from the BRCA clinical dataset, reducing the feature dimensionality from 118 to 103 to ensure data uniqueness.
Dataset Renaming:
    Renamed the synthetic dataset example1 to example.
    Renamed gbmlgg to lgg (Brain Lower Grade Glioma).
Target Variable Update: Updated the target variable for the lgg dataset from 'histological type' to 'vital_status' to better align with prognostic prediction tasks.
Key Standardization: Removed redundant _data suffixes from dataset dictionary keys (e.g., monet['mirna_data'] is now monet['mirna']).
Dataset Specifications: Updated documentation to explicitly define the dimensions (samples × features) for all included datasets:
    BRCA: miRNA (769, 503), Target (769, 1), Clinical (769, 103), RNA (769, 2500), Meth (769, 2203).
    LGG: miRNA (511, 548), Target (511, 1), Clinical (511, 13), RNA (511, 2127), Meth (511, 1823).
    PAAD: CNV (177, 1035), Target (177, 1), Clinical (177, 19), RNA (177, 1910), Meth (177, 1152).
    KIPAN: miRNA (658, 472), Target (658, 1), Clinical (658, 19), RNA (658, 2284), Meth (658, 2102).
    Monet: Gene (107, 5039), miRNA (107, 789), Phenotype (106, 1), RPPA (107, 175), Clinical (107, 5).
    Example: X1 (358, 500), X2 (358, 100), Y (358, 1), Clinical (358, 6).

Improvements and Fixes

Documentation: Refactored all docstrings across the library to adhere to strict Google Style formatting (Args/Returns) to ensure consistent API documentation generation.
Clustering:
    Hybrid Louvain: Corrected the parameter tuning logic for k3 and k4 weights and refined the iterative refinement loop for identifying phenotype-associated subgraphs.
    Correlated PageRank: Enhanced input validation to ensure proper alignment between graph nodes and omics features.

Removed

Metrics Evaluation: Removed the metrics.evaluation module. Its functionality has been consolidated into the metrics module or deprecated in favor of external validation workflows.

Left to Do

Errors: Online docs will be updated. on v1.2.1, package tests should pass once datasets get posted to pypi

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API and Architecture Refactoring

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Uh oh!