A Python preprocessing package for working with the Healthy Brain Network clinician consensus diagnostic data.
The raw HBN dataset records final clinician diagnoses across ten numbered diagnosis columns. The order of these diagnoses does not indicate severity, chronology, or importance, so the data must be reshaped before it is useful for analysis. This package transforms the data to a wider format, organized by specific diagnoses or categories rather than by diagnosis number. It can also filter by diagnostic certainty or time of diagnosis and create a visualization of the diagnostic data. The package can be run interactively from the command line (recommended if you are not familiar with the dataset) or installed as a Python package.
| DX_01 | DX_01_Cat | DX_01_Sub | DX_01_Time | DX_01_Confirmed | DX_01_Presum | DX_01_RC | DX_01_RuleOut | DX_02 | DX_02_Cat | DX_02_Sub | DX_02_Time | DX_02_Confirmed | DX_02_Presum | DX_02_RC | DX_02_RuleOut | ... |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ADHD - Hyperactive Type | Neurodevelopmental Disorders | ADHD | 1 | 1 | 0 | 0 | 0 | | | | | | | | | ... |
| Selective Mutism | Anxiety Disorders | Anxiety Disorders | 1 | 1 | 0 | 0 | 0 | Autism Spectrum Disorder | Neurodevelopmental Disorders | Autism Spectrum Disorder | 1 | 0 | 0 | 0 | 1 | ... |
↓
| Neurodevelopmental_Disorders_CategoryPresent | ADHD_Hyperactive_Type_DiagnosisPresent | ADHD_Hyperactive_Type_Time | ADHD_Hyperactive_Type_Certainty | Autism_Spectrum_DiagnosisPresent | Autism_Spectrum_Time | Autism_Spectrum_Certainty | Anxiety_Disorders_CategoryPresent | Selective_Mutism_DiagnosisPresent | Selective_Mutism_Time | Selective_Mutism_Certainty |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Current | Confirmed | 0 | | | 0 | 0 | | |
| 1 | 0 | | | 1 | Current | Rule-Out | 1 | 1 | Past | Confirmed |
For more information on the HBN data, please see the HBN Data Portal.
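The reshaping shown in the tables above can be sketched in plain Python. This is only a minimal illustration of the long-to-wide idea, not the package's implementation: the `to_wide` and `column_name` helpers, the `ID` field, and the column naming scheme are all hypothetical, and the toy records cover just two diagnosis slots.

```python
import re

# Toy records mimicking two numbered diagnosis slots (illustrative values only).
raw = [
    {"ID": "A", "DX_01": "ADHD - Hyperactive Type", "DX_02": ""},
    {"ID": "B", "DX_01": "Selective Mutism", "DX_02": "Autism Spectrum Disorder"},
]


def column_name(diagnosis: str) -> str:
    """Turn a diagnosis label into an identifier-style column name (hypothetical scheme)."""
    return re.sub(r"[^0-9A-Za-z]+", "_", diagnosis).strip("_") + "_DiagnosisPresent"


def to_wide(rows: list[dict], n_slots: int = 2) -> list[dict]:
    """Return one record per participant with a 0/1 flag per diagnosis."""
    slots = [f"DX_{i:02d}" for i in range(1, n_slots + 1)]
    # First pass: collect every diagnosis seen in any slot, in sorted order.
    seen = sorted({row[s] for row in rows for s in slots if row.get(s)})
    wide = []
    for row in rows:
        # Slot position is ignored, since the order of diagnoses carries no meaning.
        present = {row[s] for s in slots if row.get(s)}
        record = {"ID": row["ID"]}
        record.update({column_name(d): int(d in present) for d in seen})
        wide.append(record)
    return wide


wide = to_wide(raw)
print(wide[0])
```

Because the flags are keyed by diagnosis name rather than slot number, the same diagnosis always lands in the same column regardless of which `DX_NN` slot it was entered in.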
Install this package via:
pip install git+https://github.com/childmindresearch/hbn-ddp.git
hbnddp
from hbnddp import HBNData

data = HBNData.create(input_path="path/to/data.csv")
processed_data = data.process(
    # change to output path if you want to save the results
    output_path="path/to/output.csv",
    # pivot on diagnoses, subcategories, categories or all
    by="diagnoses",
    # add certainty filter if desired
    certainty_filter=None,
    # set True to visualize results
    viz=True,
)