Conversation

@NickAkhmetov (Collaborator) commented Nov 17, 2025

Summary

This branch contains the integrated dataset pages functionality.

Context:
We've received recurring requests for dedicated detail pages for EPICs, rather than having them redirect to the parent dataset. I am handling the early prototyping and will then hand off any remaining TODOs so that the functionality can be polished and development accelerated.

Lisa also pointed out that we have previously received feedback that the representation of SNARE-Seq2 data in the portal is not ideal. EPICs and SNARE-Seq2 are both cases where users are interested in processed outputs that integrate data from multiple component datasets, rather than in the raw data of individual datasets. An Integrated Dataset detail page therefore handles both use cases, while also letting us avoid presenting the internal "EPIC" name, which users are not familiar with.

When users navigate directly to an integrated dataset's detail page (from dataset search, from a publication's list of datasets, from a copied URL, etc.), they no longer see the unified view that focuses on the raw data. Instead, an alternative view prioritizes the visualization of the processed data and provides tools to read the metadata and download the bulk data for the integrated analysis and its component datasets.

Integrated datasets will also remain visible in the processed data section of their component datasets.

Designs are here: https://www.figma.com/design/LExciZTIeYVDkSYnqY7ebe/Integrated-Dataset-Detail-Pages?node-id=1371-3815&t=D21hJHYEi5MjcAV3-1

TODOs

  • Add is_integrated logic to search api transformations
  • Remove dataset relationship diagram for now - defer redesign to future work
  • Visualization
  • Integrated Data
  • Bulk Data Transfer
  • Provenance
  • Protocols
  • Workflow Details
  • Internal Datasets-specific
    • Attribution text tweak
  • External Datasets-specific
    • Attribution text tweak
    • Metadata

Design Documentation/Original Tickets

https://www.figma.com/design/LExciZTIeYVDkSYnqY7ebe/Integrated-Dataset-Detail-Pages?node-id=1371-3815&t=gNbcm7gK0d7Eb4FG-1

Testing

Describe how the feature has been tested, including both automated and manual testing strategies.

Screenshots/Video

Include screenshots or video demonstrating any significant visual or behavioral changes.

Checklist

  • Code follows the project's coding standards
    • Lint checks pass locally
    • New CHANGELOG-your-feature-name-here.md is present in the root directory, describing the change(s) in full sentences.
  • Unit tests covering the new feature have been added
  • All existing tests pass
  • Any relevant documentation in JIRA/Confluence has been updated to reflect the new feature
  • Any new functionalities have appropriate analytics functionalities added

Additional Notes

Preview

image (3)

NickAkhmetov and others added 4 commits December 12, 2025 11:04
…nt datasets

* feat(integrated-dataset): combine contributors from parent datasets

* feat(integrated-dataset): deduplicate people by ORCiD if available, and sort by average index

* fix(integrated-dataset): fix type errors

* style(integrated-dataset): code style feedback
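
The contributor-merging behavior named in these commits can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the `Person` shape and the `personKey` and `combinePeople` names are assumptions made for this example. The source only states that people are deduplicated by ORCID when available (the review summary mentions name plus affiliation as the fallback) and sorted by average index.

```typescript
// Hypothetical shape; the real contributor type in the portal may differ.
interface Person {
  name: string;
  affiliation: string;
  orcid?: string;
}

// Prefer the ORCID as the identity; fall back to name + affiliation.
function personKey(p: Person): string {
  return p.orcid ? `orcid:${p.orcid}` : `name:${p.name}|${p.affiliation}`;
}

// Merge several ordered people lists, keeping one entry per person and
// ordering by the mean position each person held in the lists they appear in.
function combinePeople(lists: Person[][]): Person[] {
  const seen = new Map<string, { person: Person; indices: number[] }>();
  for (const list of lists) {
    list.forEach((person, index) => {
      const key = personKey(person);
      const entry = seen.get(key);
      if (entry) {
        entry.indices.push(index);
      } else {
        // The first occurrence of a person is the record that is kept.
        seen.set(key, { person, indices: [index] });
      }
    });
  }
  const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
  return Array.from(seen.values())
    .sort((a, b) => mean(a.indices) - mean(b.indices))
    .map(({ person }) => person);
}
```

Averaging positions, rather than taking the first position seen, keeps someone who is consistently listed early across parent datasets near the top of the combined list.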
Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces a new “integrated dataset” detail-page variant and supporting infrastructure for integrated/derived datasets, along with related UX and data-access improvements across search, detail pages, bulk download, and metadata handling.

Changes:

  • Add an Integrated Dataset detail page variant (including visualization, files, integrated data tables, provenance, analysis details, bulk transfer, and attribution) and route-level wiring to select it based on is_integrated.
  • Refactor shared table, metadata, and download utilities (e.g., EntitiesTables, metadata TSV download, useDownloadTSV, DownloadsDropdownMenu) to support richer actions (saving, workspace actions, bulk download, TSV export) and integrated entity workflows.
  • Improve protocol URL handling (including GitHub links), visualization loading UX, publication data display (integrated data tables + collections), and several smaller UX/RBAC improvements (dataset access warnings, workspace buttons visibility, etc.).
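
As a rough illustration of the route-level wiring described in the first bullet, the sketch below picks a page variant from an `is_integrated` flag. The `DatasetEntity` type, the `DatasetPageVariant` type, and the `selectDatasetPage` function are hypothetical names chosen for this example; only the `is_integrated` field name comes from the PR.

```typescript
// Hypothetical minimal entity shape; only is_integrated is taken from the PR.
interface DatasetEntity {
  uuid: string;
  is_integrated?: boolean;
}

type DatasetPageVariant = 'integrated' | 'standard';

// Treat a missing flag as "not integrated" so that existing datasets
// keep rendering the standard detail page.
function selectDatasetPage(entity: DatasetEntity): DatasetPageVariant {
  return entity.is_integrated === true ? 'integrated' : 'standard';
}
```

Defaulting to the standard variant means documents indexed before the flag existed would behave as they did previously, under the assumption that the flag is simply absent on them.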

Reviewed changes

Copilot reviewed 82 out of 84 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
| --- | --- |
| context/app/utils.py | Extends should_redirect_entity to not redirect integrated datasets, allowing them to have their own detail pages. |
| context/app/static/js/typings/search.ts | Extends search document typings to include donor BMI fields and dataset description, enabling new columns and filters. |
| context/app/static/js/shared-styles/tables/columns.tsx | Adds dataset access warning component, new donor and description columns, safer joins for array fields, and wiring for sortable/width-configured columns. |
| context/app/static/js/shared-styles/tables/NumSelectedHeader/style.ts | Makes the header wrapper optionally omit its bottom border for use inside composite header rows. |
| context/app/static/js/shared-styles/tables/NumSelectedHeader/NumSelectedHeader.tsx | Reuses the new HeaderWrapperProps for consistent typing and styling. |
| context/app/static/js/shared-styles/tables/EntitiesTable/types.ts | Extends EntitiesTabTypes to support per-tab headerActions, initial sort state, and tooltip text. |
| context/app/static/js/shared-styles/tables/EntitiesTable/EntityTableHeaderCell.tsx | Adjusts header cell flex alignment and text wrapping to better accommodate filter controls and long labels. |
| context/app/static/js/shared-styles/tables/EntitiesTable/EntityTable.tsx | Allows non-selectable tables, adds an optional extra sticky header row for selection and header actions, and correctly manages sticky offsets. |
| context/app/static/js/shared-styles/tables/EntitiesTable/EntitiesTables.tsx | Refactors into separate tabs/bodies components, adds skeleton loading, per-tab emptiness handling, selection reset on tab change, and support for inline header actions. |
| context/app/static/js/shared-styles/icons/sectionIconMap.ts | Adds icons for new integrated-data and protocols-and-workflow-details sections and normalizes visualization icons. |
| context/app/static/js/pages/Sample/Sample.tsx | Switches to DetailContextProvider and passes explicit entity type, simplifying and standardizing detail context setup. |
| context/app/static/js/pages/Publication/Publication.tsx | Uses DetailContextProvider, replaces publication-specific bulk transfer with the generic BulkDataTransfer (using ancestor dataset UUIDs), and wires in the new PublicationsDataSection. |
| context/app/static/js/pages/Donor/Donor.tsx | Uses DetailContextProvider with entity type, aligning donor detail pages with the new context model. |
| context/app/static/js/pages/Dataset/utils.ts | Adds combinePeopleLists to deduplicate and order contributors/contacts across multiple datasets using ORCID or name+affiliation. |
| context/app/static/js/pages/Dataset/hooks.ts | Renames the processed dataset section key to protocols-and-workflow-details to match new icons and section components. |
| context/app/static/js/pages/Dataset/IntegratedDataset.tsx | Implements the new Integrated Dataset detail page (summary, visualization, integrated data tables, metadata, files, bulk transfer, provenance, analysis details, collections, publications, and attribution). |
| context/app/static/js/pages/Dataset/DatasetPageSummaryChildren.tsx | Adds reusable summary content that links to assay docs and organ pages, aggregating organs across origin samples. |
| context/app/static/js/pages/Dataset/Dataset.tsx | Wires integrated dataset variant, reuses SummaryDataChildren, routes processed-data section through ProcessedData, and delegates to IntegratedDatasetPage when appropriate. |
| context/app/static/js/hooks/useUBKG.ts | Hardens cellTypeDetail against empty IDs, avoiding unnecessary requests with empty identifiers. |
| context/app/static/js/hooks/useSearchData.ts | Fixes multisearch to send the correct number of Elasticsearch endpoints (one per request), aligning with multiFetch expectations. |
| context/app/static/js/hooks/useProtocolData.ts | Refactors protocol URL formatting to separate Protocols.io vs GitHub URLs and handle multiple/variant URL formats more robustly. |
| context/app/static/js/hooks/useProtocolData.spec.ts | Adds TypeScript tests for useFormattedProtocolUrls and isGithubUrl, covering mixed URL formats and edge cases. |
| context/app/static/js/hooks/useProtocolData.spec.js | Removes the legacy JS test file in favor of the new TS test suite. |
| context/app/static/js/hooks/useEntityData.ts | Extends useEntitiesData with shouldFetch and useDefaultQuery options to better control downstream useSearchHits behavior. |
| context/app/static/js/hooks/useDownloadTSV.ts | Introduces a reusable hook for TSV downloads driven by search/lineup entities, including analytics and error handling. |
| context/app/static/js/components/types.ts | Extends Entity and Dataset types with immediate ancestor/descendant IDs and is_integrated, enabling integrated-dataset-specific logic. |
| context/app/static/js/components/search/MetadataMenu/DownloadTSVItem.tsx | Refactors TSV download item to delegate to useDownloadTSV, simplifying local state and behavior. |
| context/app/static/js/components/publications/PublicationsDataSection/index.ts | Exports the new TypeScript PublicationsDataSection implementation. |
| context/app/static/js/components/publications/PublicationsDataSection/PublicationsDataSection.tsx | Reimplements publication “Data” section using integrated data tables plus collections, even when an associated collection is present. |
| context/app/static/js/components/publications/PublicationsDataSection/PublicationsDataSection.jsx | Removes the old JS implementation that used PublicationRelatedEntities. |
| context/app/static/js/components/publications/PublicationRelatedEntities/index.ts | Removes unused index barrel for the old related-entities component. |
| context/app/static/js/components/publications/PublicationRelatedEntities/hooks.ts | Removes the specialized publication-related-entities search hooks in favor of integrated data logic. |
| context/app/static/js/components/publications/PublicationRelatedEntities/PublicationRelatedEntities.tsx | Removes the old related-entities UI, replaced by integrated data tables. |
| context/app/static/js/components/publications/PublicationCollections/PublicationCollections.tsx | Simplifies the collections sub-section to render within the new “Data” section instead of owning its own section wrapper. |
| context/app/static/js/components/detailPage/visualization/style.ts | Centralizes the vitessceFixedHeight constant for reuse across visualization wrappers and skeletons. |
| context/app/static/js/components/detailPage/visualization/VitessceSkeleton/index.ts | Adds barrel export for the new Vitessce skeleton component. |
| context/app/static/js/components/detailPage/visualization/VitessceSkeleton/VisualizationSkeleton.tsx | Implements a skeleton layout approximating Vitessce’s multi-panel visualization while data is loading. |
| context/app/static/js/components/detailPage/visualization/VisualizationWrapper/style.ts | Simplifies wrapper styling to use the shared vitessceFixedHeight and drops the now-unused generic background wrapper. |
| context/app/static/js/components/detailPage/visualization/VisualizationWrapper/VisualizationWrapper.tsx | Switches Suspense fallback to the new visualization skeleton for a smoother loading experience. |
| context/app/static/js/components/detailPage/visualization/VisualizationWrapper/VisualizationSuspenseFallback.tsx | Reworks fallback to use the skeleton plus header, decoupled from being the Suspense fallback. |
| context/app/static/js/components/detailPage/visualization/Visualization/style.ts | Moves the fixed Vitessce height into a shared style file and keeps visualization styling/export surface consistent. |
| context/app/static/js/components/detailPage/visualization/Visualization/Visualization.tsx | Uses the new shared height and skeleton when Vitessce config is unavailable, improving user feedback during load. |
| context/app/static/js/components/detailPage/visualization/IntegratedDatasetVisualizationSection/index.ts | Barrel export for the integrated-dataset visualization section. |
| context/app/static/js/components/detailPage/visualization/IntegratedDatasetVisualizationSection/IntegratedDatasetVisualizationSection.tsx | Adds a dedicated “Visualization” section for integrated datasets with descriptive copy and wrapped Vitessce viewer. |
| context/app/static/js/components/detailPage/provenance/ProvTabs/ProvTabs.tsx | Adds an integratedDataset mode that skips tabs and shows only the graph (with large-graph warning) for integrated datasets. |
| context/app/static/js/components/detailPage/provenance/ProvSection/ProvSection.tsx | Wires integratedDataset through to ProvTabs, preserving other provenance behavior. |
| context/app/static/js/components/detailPage/multi-assay/MultiAssayMetadataTabs/MultiAssayMetadataTabs.tsx | Enhances multi-assay metadata tabs to disambiguate duplicate labels by appending HuBMAP IDs. |
| context/app/static/js/components/detailPage/files/file-fixtures.spec.ts | Extends test fixtures with entityType to match the new DetailContext shape. |
| context/app/static/js/components/detailPage/files/MultiFileDownloader/MultiFileDownloader.tsx | Changes to accept UnprocessedFile[] and generate download URLs via useFileLinks. |
| context/app/static/js/components/detailPage/files/IntegratedDatasetFiles/index.ts | Barrel export for the integrated dataset files section. |
| context/app/static/js/components/detailPage/files/IntegratedDatasetFiles/IntegratedDatasetFiles.tsx | Adds an integrated dataset-specific “Files” section wired to FilesTabs and DetailContext. |
| context/app/static/js/components/detailPage/files/FilesTabs/index.ts | Barrel export for the new shared FilesTabs component. |
| context/app/static/js/components/detailPage/files/FilesTabs/FilesTabs.tsx | Consolidates “Data Products” and “File Browser” into a single tabbed UI, tracking tab changes for analytics. |
| context/app/static/js/components/detailPage/files/Files/Files.spec.tsx | Updates tests to use DataProductProvider instead of processed dataset context for file-link generation. |
| context/app/static/js/components/detailPage/files/FileBrowserNode/FileBrowserNode.spec.tsx | Adjusts tests to wrap in DataProductProvider to satisfy new context requirements. |
| context/app/static/js/components/detailPage/files/FileBrowserFile/FileBrowserFile.spec.tsx | Similar test adjustments to use DataProductProvider. |
| context/app/static/js/components/detailPage/files/FileBrowser/FileBrowser.spec.tsx | Updates tests for the new DataProduct-based context usage. |
| context/app/static/js/components/detailPage/files/DataProducts/hooks.ts | Refactors file-link creation to use a new DataProductContext for dataset UUIDs instead of processed dataset context. |
| context/app/static/js/components/detailPage/files/DataProducts/DataProducts.tsx | Switches “Download All” behavior to pass raw UnprocessedFile[] into MultiFileDownloader; continues using file links internally. |
| context/app/static/js/components/detailPage/files/DataProducts/DataProducts.spec.tsx | Adds DataProductProvider to tests to match new context requirements and extends detail context with entityType. |
| context/app/static/js/components/detailPage/files/DataProducts/DataProductContext.tsx | Introduces a simple context/provider for the dataset UUID used in data-product file links. |
| context/app/static/js/components/detailPage/files/DataProducts/DataProduct.tsx | Performs minor cleanup (removal of commented pipeline-info import) while still using useFileLink. |
| context/app/static/js/components/detailPage/entityHeader/EntityHeaderActionButtons/EntityHeaderActionButtons.tsx | Gates the Processed Data Workspace menu behind isWorkspacesUser, consistent with workspace access controls. |
| context/app/static/js/components/detailPage/Protocol/Protocol.tsx | Integrates new formatted protocol URL structure and adds support for rendering GitHub repository links separately. |
| context/app/static/js/components/detailPage/ProcessedData/ProcessedDataset/ProcessedDataset.tsx | Refactors file tabbing to use shared FilesTabs, standardizes the protocols/workflow section id, and updates copy/analytics hooks. |
| context/app/static/js/components/detailPage/MetadataSection/hooks.ts | Adds a hook to compute per-entity metadata tables and TSV download URLs using UBKG field descriptions. |
| context/app/static/js/components/detailPage/MetadataSection/columns.ts | Expands TSV columns to include HuBMAP ID and entity label alongside key/value/description. |
| context/app/static/js/components/detailPage/MetadataSection/MetadataSection.tsx | Refactors metadata rendering to use the new hook and table layout, adds a primary “Download” button, and adjusts dataset descriptions for integrated datasets. |
| context/app/static/js/components/detailPage/IntegratedData/index.ts | Barrel export for the integrated data section. |
| context/app/static/js/components/detailPage/IntegratedData/IntegratedDataTables.tsx | Implements integrated data tables over donors, samples, and datasets using EntitiesTables with rich header actions. |
| context/app/static/js/components/detailPage/IntegratedData/IntegratedData.tsx | Adds the “Integrated Data” section (with TSV download for all integrated entities) used on integrated dataset and publication pages. |
| context/app/static/js/components/detailPage/DetailContext.tsx | Extends detail context with entityType and provides a typed DetailContextProvider used across entity pages. |
| context/app/static/js/components/detailPage/BulkDataTransfer/PublicationBulkDataTransfer.tsx | Removes the publication-specific bulk transfer wrapper in favor of the generalized section. |
| context/app/static/js/components/detailPage/BulkDataTransfer/BulkDataTransferSection.tsx | Generalizes bulk data transfer to support both regular datasets and integrated entities (dataset or publication) including CLT instructions and UUID sets. |
| context/app/static/js/components/detailPage/Attribution/Attribution.tsx | Adds specialized attribution descriptions for integrated and external datasets while retaining existing behavior for standard datasets. |
| context/app/static/js/components/detailPage/AnalysisDetails/AnalysisDetailsSection.tsx | Introduces a “Protocols & Workflow Details” section combining protocol links and analysis details with graceful fallbacks. |
| context/app/static/js/components/detailPage/AnalysisDetails/AnalysisDetails.tsx | Small type-cleanup (rename of props interface) for clarity. |
| context/app/static/js/components/data-transfer/DownloadsDropdownMenu/index.ts | Barrel export for the new downloads dropdown menu. |
| context/app/static/js/components/data-transfer/DownloadsDropdownMenu/DownloadsDropdownMenu.tsx | Adds a reusable “Download” dropdown (bulk datasets + TSV metadata) wired to selection state and access checks. |
| context/app/static/js/components/Routes/Routes.jsx | Threads the integrated flag from Flask data into the Dataset route to select the integrated dataset variant. |
| context/app/static/js/components/Contexts.tsx | Extends FlaskDataContextType with an integrated flag for integrated datasets. |
| context/app/routes_browse.py | Reads is_integrated from entities and injects it into flask_data for the React app to consume. |
| CHANGELOG-integrated-dataset-pages.md | Adds changelog entry for the introduction of integrated dataset pages. |
| CHANGELOG-integrated-dataset-pages-continued.md | Adds follow-on changelog entries for bulk data transfer and publication data section behavior. |
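
To make the protocol URL change listed for useProtocolData.ts more concrete, here is a minimal sketch of separating GitHub links from other protocol links. The `isGithubUrl` name appears in the summary above, but its body here is an assumption, and `splitProtocolUrls` is a hypothetical helper invented for this example; the actual hook may parse and format URLs differently.

```typescript
// Assumed behavior: a URL counts as a GitHub link when it points at github.com.
function isGithubUrl(url: string): boolean {
  return /^https?:\/\/(www\.)?github\.com\//i.test(url);
}

// Hypothetical helper: split a comma- or whitespace-separated string of
// protocol URLs into GitHub links and everything else.
function splitProtocolUrls(raw: string): { github: string[]; other: string[] } {
  const urls = raw.split(/[\s,]+/).filter((u) => u.length > 0);
  return {
    github: urls.filter((u) => isGithubUrl(u)),
    other: urls.filter((u) => !isGithubUrl(u)),
  };
}

// Placeholder URLs, used for illustration only.
const example = splitProtocolUrls(
  'https://github.com/example/repo, https://example.org/protocol',
);
console.log(example.github.length, example.other.length);
```

Keeping the two groups separate lets the rendering component label repository links differently from protocol documents.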

@NickAkhmetov NickAkhmetov marked this pull request as ready for review January 26, 2026 19:53
@NickAkhmetov (Collaborator, Author) commented Jan 26, 2026

See #3899 for up-to-date screenshots. The review of that PR covered all currently present functionality.

Comment on lines 42 to 64
const labelCounts: Record<string, number> = {};
entities.forEach(({ label }) => {
  labelCounts[label] = (labelCounts[label] || 0) + 1;
});

const deduplicated = entities.map(({ label, hubmap_id, tableRows, ...rest }) => {
  const count = labelCounts[label];
  return {
    hubmap_id,
    tableRows,
    label:
      count > 1 ? (
        <Stack key={hubmap_id} spacing={0} alignItems="center">
          <div>{label}</div>
          <div>({hubmap_id})</div>
        </Stack>
      ) : (
        label
      ),
    ...rest,
  };
});
return deduplicated;
Collaborator

Should we dedupe before counting? Also counting feels like the use case for reduce.

Collaborator Author

@NickAkhmetov NickAkhmetov Jan 27, 2026

This logic is meant to disambiguate cases where there is more than one ancestor of a given type - e.g. if an object x analyte has 20 different datasets of the same data type, but with different metadata, this processing avoids having two labels with the same title by appending the HuBMAP ID.

As a result, we do need the counts - but calling it "deduplicated" is definitely unclear.

I'll revise this to use a .reduce call for the initial count aggregation and rename things accordingly, but the logic is otherwise sound.
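
A rough sketch of what the `.reduce`-based revision might look like is shown below. It is an assumption, not the merged code: the `LabeledEntity` shape and the `countLabels` and `disambiguateLabels` names are hypothetical, labels are plain strings here rather than the `Stack` element used in the component, and a `Map` stands in for the `Record` in the original snippet.

```typescript
// Hypothetical entity shape; the real tab entities also carry tableRows and other fields.
interface LabeledEntity {
  label: string;
  hubmap_id: string;
}

// Count how many entities share each label, using reduce as suggested in review.
function countLabels(entities: LabeledEntity[]): Map<string, number> {
  return entities.reduce(
    (counts, { label }) => counts.set(label, (counts.get(label) || 0) + 1),
    new Map<string, number>(),
  );
}

// Append the HuBMAP ID only to labels that occur more than once,
// leaving unique labels untouched.
function disambiguateLabels(entities: LabeledEntity[]): LabeledEntity[] {
  const counts = countLabels(entities);
  return entities.map((entity) =>
    (counts.get(entity.label) || 0) > 1
      ? { ...entity, label: `${entity.label} (${entity.hubmap_id})` }
      : entity,
  );
}
```

Using a `Map` avoids accidental collisions with inherited object keys if a label ever matched a property name such as `constructor`, which a plain object accumulator would not guard against.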

Collaborator

@john-conroy john-conroy left a comment

Looks good! Most of the comments resolved in earlier PRs. Thanks!

@NickAkhmetov NickAkhmetov merged commit c5e8f29 into main Jan 27, 2026
9 checks passed
@NickAkhmetov NickAkhmetov deleted the nickakhmetov/integrated-dataset-pages branch January 27, 2026 21:10