@ivanmilevtues

This PR includes high-level mermaid diagrams representing the starfish codebase. You can view them here: https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/starfish/on_boarding.md

The goal is to help new contributors quickly understand the code. We're especially interested in whether these diagrams are useful for scientists or others who use code but aren't full-time engineers—like at Pfizer. What does your current onboarding look like, and could something like this fit in?

We generate the diagrams using static analysis and LLMs. Feedback is very welcome! We're also building a GitHub Action to auto-update the diagrams on each main/release merge.

Full disclosure: we're exploring turning this into a startup and are still in the early stages.

@ivanmilevtues
Author

Hey, a quick update on our side: this week we released our diagram generation engine as an open-source project! If you are interested in seeing how the generation works, you can find it at https://github.com/CodeBoarding/CodeBoarding

@shachafl shachafl requested a review from Copilot August 6, 2025 13:17

Copilot AI left a comment

Pull Request Overview

This PR adds high-level mermaid diagrams to help new contributors understand the starfish codebase architecture, specifically targeting scientists and non-full-time engineers. The diagrams provide visual representations of the system's components and their relationships.

  • Introduces comprehensive documentation with interactive mermaid diagrams showing system architecture
  • Creates detailed component breakdowns for six major subsystems of the starfish project
  • Provides clickable links to source code and component details for deeper exploration

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| .codeboarding/on_boarding.md | Main overview diagram showing high-level architecture and component relationships |
| .codeboarding/Spot_Intensity_Analysis.md | Detailed diagram for spot detection, decoding, and intensity analysis workflows |
| .codeboarding/Mask_Label_Management.md | Component diagram for binary mask, label image, and segmentation processing |
| .codeboarding/Image_Processing_Management.md | Diagram showing image data structures, parsers, and processing algorithms |
| .codeboarding/Expression_Matrix_Generation.md | Simple diagram for expression matrix creation and management |
| .codeboarding/Experiment_Data_Core.md | Core data structures and their hierarchical relationships |
| .codeboarding/Core_Infrastructure.md | Infrastructure components for configuration, logging, and utilities |


click Expression_Matrix_Generation href "https://github.com/spacetx/starfish/blob/master/.codeboarding//Expression_Matrix_Generation.md" "Details"

click Core_Infrastructure href "https://github.com/spacetx/starfish/blob/master/.codeboarding//Core_Infrastructure.md" "Details"

Copilot AI Aug 6, 2025

The URL contains double slashes (//) before the filename. This should be a single slash to form a valid GitHub URL path.

Suggested change
-click Core_Infrastructure href "https://github.com/spacetx/starfish/blob/master/.codeboarding//Core_Infrastructure.md" "Details"
+click Experiment_Data_Core href "https://github.com/spacetx/starfish/blob/master/.codeboarding/Experiment_Data_Core.md" "Details"
+click Image_Processing_Management href "https://github.com/spacetx/starfish/blob/master/.codeboarding/Image_Processing_Management.md" "Details"
+click Spot_Intensity_Analysis href "https://github.com/spacetx/starfish/blob/master/.codeboarding/Spot_Intensity_Analysis.md" "Details"
+click Mask_Label_Management href "https://github.com/spacetx/starfish/blob/master/.codeboarding/Mask_Label_Management.md" "Details"
+click Expression_Matrix_Generation href "https://github.com/spacetx/starfish/blob/master/.codeboarding/Expression_Matrix_Generation.md" "Details"
+click Core_Infrastructure href "https://github.com/spacetx/starfish/blob/master/.codeboarding/Core_Infrastructure.md" "Details"
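
The double-slash issue flagged above could also be handled when the links are generated. The following is a minimal Python sketch, not the tool's actual code: the helper name `collapse_path_slashes` is hypothetical, and it assumes only the path part of the URL should be touched so that the `https://` separator stays intact.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_path_slashes(url: str) -> str:
    """Collapse runs of '/' in the URL path only (hypothetical helper)."""
    parts = urlsplit(url)
    # Only the path is rewritten; scheme, host, query and fragment are kept.
    clean_path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(parts._replace(path=clean_path))


broken = (
    "https://github.com/spacetx/starfish/blob/master/"
    ".codeboarding//Core_Infrastructure.md"
)
fixed = collapse_path_slashes(broken)
print(fixed)
```

Splitting the URL first avoids accidentally collapsing the `//` that follows the scheme, which a plain string replace would do.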


@shachafl
Collaborator

Hey @ivanmilevtues,
Thank you for contributing.
I apologize for the delayed response. I am extremely busy this summer and haven't had the bandwidth to explore your tool.
At first glance this looks interesting, especially the flow and the automatic diagram generation and updates. But I did notice some issues with the generated links: in some cases no link is generated, and in others the link does not lead to the source code.
Your most recent builds might solve those issues; can you rerun the tool on our repo?
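
Both link problems described above (a component with no link, or a link that points away from the source) could be checked mechanically. This is a rough Python sketch under stated assumptions: the function `audit_click_links`, the list of expected node names, and the `source_prefix` argument are all hypothetical, and the regular expression only covers the `click <node> href "<url>"` form that appears in this PR.

```python
import re

# Matches lines of the form: click Node_Name href "https://..." "Details"
CLICK_RE = re.compile(r'^\s*click\s+(\w+)\s+href\s+"([^"]+)"', re.MULTILINE)


def audit_click_links(mermaid_text, expected_nodes, source_prefix):
    """Return (nodes without a link, links not under source_prefix)."""
    links = dict(CLICK_RE.findall(mermaid_text))
    missing = [node for node in expected_nodes if node not in links]
    off_target = {
        node: url for node, url in links.items()
        if not url.startswith(source_prefix)
    }
    return missing, off_target


# Illustrative input only; the second URL is a made-up placeholder.
sample = '''
click Core_Infrastructure href "https://github.com/spacetx/starfish/blob/master/.codeboarding/Core_Infrastructure.md" "Details"
click Spot_Intensity_Analysis href "https://example.com/elsewhere" "Details"
'''

missing, off_target = audit_click_links(
    sample,
    expected_nodes=[
        "Core_Infrastructure",
        "Spot_Intensity_Analysis",
        "Mask_Label_Management",
    ],
    source_prefix="https://github.com/spacetx/starfish/",
)
print(missing)
print(off_target)
```

A prefix check only confirms that a link stays inside the repository; verifying that the target file actually exists would need a separate request or a local checkout.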

Regarding onboarding for starfish, I think the current documentation is reasonable, including the manually drawn diagrams (although they are not uniform, so some are better than others). But your tool does a better job of giving a high-level summary while still letting readers look at the various components in more detail.

@berl, you have a better perspective than I do on onboarding new people and the burden of documentation. What are your thoughts?

@ivanmilevtues
Author

@shachafl, you are absolutely right! We have added a lot of new things and updated our approach (which is also open source now).

I regenerated the graphs for you; you can see how they render here: https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/starfish/on_boarding.md
