[feature:] Create a vignette #8

@kellijohnson-NOAA

Description

Describe the situation that led to the request and a solution

I think it would be good to create a vignette that shows what {projectstats} can do. The vignette could be general, or it could summarize all NSAP-relevant repositories and be rebuilt on a regular schedule, preferably every Sunday, so that the report is ready for the NSAP meeting on Monday.

Alternative solutions

No response

Statistical validity

No response

Additional context

Below is the code in the NSAP summary that was created for the 2025 NSAP meeting. Note that it should be looked at in "raw" form if it is to be copied and pasted. It was a .qmd file.


---
title: "GitHub Repositories"
format:
  html:
    theme: default
---

## Why

Within the National Stock Assessment Program (NSAP), it is a priority to place versioned code on GitHub Enterprise so that the code is backed up automatically, without extra work from the individual user. This summary lists the number of repositories each user has under their personal GitHub account and how many of those are forks, which do not need to be on Enterprise.

```{r setup}
#| echo: false
#| warning: false

library(projectstats)
library(dplyr)
library(ggplot2)
library(knitr)
library(purrr)
library(tidyr)

# All GitHub user names for people in NSAP
user_names <- c(
  "Steven-Saul-NOAA",
  # Abby
  "AndreaChan-NOAA",
  "ClaireGonzales-NOAA",
  # Jeff
  # Len
  "kellijohnson-NOAA",
  "msupernaw",
  "Melissa-Karp",
  "ChristineStawitz-NOAA",
  "Andrea-Havron-NOAA",
  "Bai-Li-NOAA",
  "k-doering-NOAA",
  "PatrickLynch-NOAA",
  "e-perl-NOAA",
  "Schiano-NOAA",
  "sbreitbart-NOAA"
)

projects <- data.frame(
  organization = c("NOAA-FIMS", "nmfs-ost", "nmfs-ost", "nmfs-ost"),
  repository = c("FIMS", "DisMAP", "asar", "ss3-source-code")
)

```

## Data

The following GitHub user names were searched: `r glue::glue_collapse({user_names}, sep = ", ", last = ", and ")`. For each user name, we determined the number of repositories held by the user and whether each repository represented original work or a fork of someone else's work.

```{r}
# Pull the data from GitHub for each user name
repository_data <- purrr::map_df(
  user_names,
  get_repositories,
  type = "users"
)

# Get the stars for each repository listed in `projects`
stars_data <- purrr::map2_df(
  .x = projects[, "organization"],
  .y = projects[, "repository"],
  .f = get_stargazers,
  .id = "project"
) |>
  calculate_cumulative_stars()
```

## Results

```{r}
#| tbl-cap: "Number of forked and original repositories held by each GitHub user name."
#| echo: false

# Count forked versus original repositories per user, one row per user
repository_data |>
  dplyr::mutate(fork = ifelse(fork, "forked", "original")) |>
  dplyr::group_by(login, fork) |>
  dplyr::count() |>
  tidyr::pivot_wider(names_from = "fork", values_from = "n") |>
  dplyr::arrange(dplyr::desc(original)) |>
  knitr::kable()
```

```{r}
#| fig-cap: "Number of stars since the first day a repository was starred."
#| echo: false

# Plot cumulative stars over time, one line per project
ggplot2::ggplot(
  stars_data,
  ggplot2::aes(day, cumulative_stars, color = project)
) +
  ggplot2::geom_line() +
  ggplot2::xlab("Days since first star") +
  ggplot2::ylab("Number of stars")
```
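
For a vignette, it might help to show the table logic on data that does not require a call to GitHub. Below is a minimal sketch of the fork-versus-original summary using a made-up data frame. The object `toy_repositories` and the logins `user-a` and `user-b` are hypothetical; only the `login` and `fork` columns are taken from the code above, and the real output of `get_repositories()` presumably has more columns.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the output of get_repositories(); only the two
# columns used by the table code (login, fork) are included.
toy_repositories <- data.frame(
  login = c("user-a", "user-a", "user-b", "user-b", "user-b"),
  fork = c(TRUE, FALSE, FALSE, FALSE, FALSE)
)

toy_summary <- toy_repositories |>
  # Relabel the logical column so the values become readable column names
  dplyr::mutate(fork = ifelse(fork, "forked", "original")) |>
  # One row per login and repository type with the count in column n
  dplyr::count(login, fork) |>
  # One row per login; fill missing combinations with 0 rather than NA
  tidyr::pivot_wider(
    names_from = "fork",
    values_from = "n",
    values_fill = 0
  ) |>
  # Users with the most original repositories first
  dplyr::arrange(dplyr::desc(original))

print(toy_summary)
```

The sketch uses `values_fill = 0`, which the summary code above does not, so a user without forks shows 0 in the table instead of `NA`. Whether that is preferable in the report is a judgment call.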

Metadata

Assignees

No one assigned

Labels

NSAP-code-cleanup: To be worked on by a member of the National Stock Assessment Program during code cleanup hours