hdigital commented Jan 3, 2026

Add codebook rendering script and Markdown formatting.

  • An R script (codebook-render.R) that renders the codebook (Rmd) to HTML, PDF, and Markdown
  • A formatting prompt to standardize Markdown style
  • Format the codebook Markdown (Rmd)

hdigital commented Jan 3, 2026

Claude Sonnet 4.5 prompt usage

Perfect! The codebook formatting is already stable. Running the render and format cycle twice produced identical results with no changes to codebook.md. The codebook.Rmd file is properly formatted with the correct escapes and line breaks to ensure stable output through the render+format pipeline.
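The stability claim above can be checked mechanically. The following is a minimal sketch in base R, assuming the render and format steps can be wrapped in a single function; `is_stable` and the stand-in cycles are hypothetical names, not part of this PR.

```r
# Hypothetical helper: TRUE when running `cycle` leaves the file at `path` unchanged.
# `cycle` stands in for one render + format pass; it is an assumption, not PR code.
is_stable <- function(path, cycle) {
  before <- unname(tools::md5sum(path))
  cycle()
  after <- unname(tools::md5sum(path))
  identical(before, after)
}

# Usage with a throwaway file and two stand-in cycles
tmp <- tempfile(fileext = ".md")
writeLines(c("# Codebook", "", "Some text."), tmp)

# A cycle that changes nothing is reported as stable
noop_stable <- is_stable(tmp, function() invisible(NULL))

# A cycle that appends a line is reported as unstable
append_stable <- is_stable(tmp, function() cat("extra\n", file = tmp, append = TRUE))
```

In practice the cycle would call the rendering script and the formatter on codebook.md; comparing checksums before and after the second pass is one way to confirm the output has converged.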

Copilot AI left a comment

Pull request overview

This PR adds infrastructure for rendering and formatting the Party Facts codebook by introducing:

  • A new R script (codebook-render.R) that renders the codebook to HTML, Markdown, and PDF formats
  • A formatting prompt document that provides guidelines for synchronizing and standardizing Markdown formatting between the source Rmd and generated Markdown files
  • Extensive formatting updates to both codebook.Rmd and codebook.md including text wrapping, consistent use of en-dashes, proper escaping of special characters, and standardized spacing

Key Changes

  • Added codebook-render.R script that automates rendering to multiple output formats with error handling for optional PDF generation
  • Created formatting prompt documentation in .github/prompts/ with a convenience symlink in scripts/
  • Applied comprehensive formatting standardization throughout the codebook files including line wrapping, escape character handling, and punctuation consistency
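The "error handling for optional PDF generation" mentioned above could look roughly like the sketch below. It is not the code from codebook-render.R; `run_optional` is a hypothetical helper, and the failing step only simulates a missing LaTeX installation.

```r
# Hypothetical helper: run an optional step and report success instead of stopping.
run_optional <- function(label, step) {
  tryCatch(
    {
      step()
      message(label, ": done")
      TRUE
    },
    error = function(e) {
      # Report the reason and continue with the remaining formats
      message(label, ": skipped (", conditionMessage(e), ")")
      FALSE
    }
  )
}

# Simulated steps: the PDF step fails, the HTML step succeeds
pdf_ok <- run_optional("PDF", function() stop("LaTeX not found"))
html_ok <- run_optional("HTML", function() invisible(NULL))
```

In the real script the step would be a call such as `rmarkdown::render(input_file, output_format = "pdf_document")`, so a missing LaTeX toolchain does not abort the HTML and Markdown outputs.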

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| codebook/codebook-render.R | New R script to render codebook to HTML, Markdown, and PDF formats with error handling |
| .github/prompts/format-codebook.prompt.md | Documentation of the formatting workflow and guidelines for maintaining consistency |
| scripts/format-codebook.prompt.md | Symlink to the prompt file for easier access from the scripts directory |
| codebook/codebook.Rmd | Source RMarkdown file with extensive formatting updates including text wrapping, consistent punctuation, proper escaping, and spacing standardization |
| codebook/codebook.md | Generated Markdown output reflecting all formatting changes from the Rmd source |

cat("\n\n2. Rendering Markdown (GitHub-flavored)...")
rmarkdown::render(
input = input_file,
output_format = "md_document",

Copilot AI Jan 3, 2026

The comment on line 27 states "GitHub-flavored Markdown", but the actual output format specified is "md_document" which is a generic markdown format. To render GitHub-flavored markdown, you should specify output_format = "github_document" instead, or use md_document with variant: gfm as shown in the YAML header of codebook.Rmd.

Note that the md_document format with variant gfm (as specified in the YAML) should produce GitHub-flavored markdown, but the explicit call to rmarkdown::render() uses just "md_document" without variant specification. Consider using github_document for clarity and consistency with the comment.

Suggested change
- output_format = "md_document",
+ output_format = "github_document",
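If the intent is to keep md_document, another option is to make the variant explicit in the call itself rather than relying on the format name. A minimal sketch, assuming the rmarkdown package is installed; `render_gfm` is a hypothetical wrapper, not a function from this PR.

```r
# Sketch: pass a format object so the GFM variant is visible at the call site.
# `render_gfm` is a hypothetical helper name used for illustration only.
render_gfm <- function(input_file) {
  if (!requireNamespace("rmarkdown", quietly = TRUE)) {
    stop("The rmarkdown package is required to render the codebook.")
  }
  rmarkdown::render(
    input = input_file,
    output_format = rmarkdown::md_document(variant = "gfm")
  )
}
```

Calling `render_gfm("codebook.Rmd")` would then state the variant in the script, so the "GitHub-flavored" comment and the code agree regardless of what the YAML header says.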

Copilot uses AI. Check for mistakes.
description: 'Synchronize codebook.Rmd and codebook.md with consistent formatting.'
---

1. Render the RMarkdown file: `Rscript -e "rmarkdown::render('codebook.Rmd', output_format = 'github_document')"`

Copilot AI Jan 3, 2026

The formatting prompt instructions (line 6) specify using output_format = 'github_document', but the actual rendering script (codebook-render.R, line 31) uses output_format = "md_document". These should be consistent to avoid confusion.

If the intention is to use GitHub-flavored markdown (as suggested by the prompt and the YAML header's variant: gfm), both the prompt and the rendering script should use the same format specification.

  low = "thistle2", high = "darkblue",
  # low="#fff7bc", high="#d95f0e",
- guide="colorbar", na.value="lightgrey", name = expression(parties (log[e]))
+ guide = "colorbar", na.value = "lightgrey", name = expression(parties(log[e]))

Copilot AI Jan 3, 2026

The change from expression(parties (log[e])) to expression(parties(log[e])) removes the space between "parties" and the opening parenthesis. This alters the visual appearance of the axis label in the plot.

If this formatting change is intentional for consistency, it's acceptable. However, if the previous spacing was deliberate for readability of the mathematical expression, consider whether this change is desired. The space might have been added to make "parties (log_e)" more readable as "parties (in logarithmic scale)" rather than appearing as a function call.
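Whether the spacing change can affect the label is easy to verify. R's parser generally does not treat whitespace between a name and its opening parenthesis as significant, so both forms should parse to the same call. A minimal base R check:

```r
# Parse both spellings without evaluating them
old_label <- quote(parties (log[e]))
new_label <- quote(parties(log[e]))

# TRUE when the two spellings are the same call object
same_call <- identical(old_label, new_label)

# The deparsed form shows how R itself writes the call
deparsed <- deparse(old_label)
```

If `same_call` is TRUE, the change is purely cosmetic in the source and the rendered legend title is unaffected.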

  }
-
- body {
+ &#10;body {

Copilot AI Jan 3, 2026

The change replaces the empty line before body { with an HTML line-feed entity (for example &#10;), which combines the CSS style tag closing with the body selector. While this may be part of the formatting standardization, it reduces readability of the source code.

The entity represents a line feed character and appears to be an artifact of the Markdown rendering/formatting process. Consider whether this change improves or harms readability: CSS selectors on their own lines are typically more readable than HTML entities standing in for line breaks within the code block.

Suggested change
- &#10;body {
+ body {

Comment on lines +307 to +319
+ For public datasets we complete the linking of all parties that meet
+ the
+ Party Facts population criteria (\>5%, see above). Public datasets are
+ visible on PF-Web without login and are archived regularly in the
+ PF-Dataverse.

- For smaller parties, we may apply a staggered approach. All parties >5%, are linked during the initial import. Parties **<5% and >1%** (or 2%) are imported but **may not be linked completely** in Party Facts. A section "Linking status" in the "readme" of the dataset documents the status.
+ For smaller parties, we may apply a staggered approach. All parties
+ \>5%, are linked during the initial import. Parties **\<5% and \>1%**
+ (or 2%) are imported but **may not be linked completely** in Party
+ Facts. A section "Linking status" in the "readme" of the dataset
+ documents the status.

- **Note** -- Staggered inclusion for parties <5% introduced in March 2021. Previously, all public and archived datasets were completely linked.
+ **Note** – Staggered inclusion for parties \<5% introduced in March

Copilot AI Jan 3, 2026

Lines 308, 309, 313, and 319 contain trailing spaces (two spaces at the end of lines) which are used in Markdown to create hard line breaks. While this is valid Markdown syntax, trailing spaces can be problematic as they are often invisible and may be automatically removed by editors or pre-commit hooks.

Consider using explicit line break syntax (backslash \ at end of line) instead of trailing spaces for better visibility and maintainability. Some lines in the same section already use backslash line breaks (e.g., lines 225-226, 230-231), so using backslash consistently would be better.
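The conversion could be automated with a small helper. This is a minimal sketch in base R, assuming hard breaks are marked by two or more trailing spaces after visible text; `use_backslash_breaks` is a hypothetical name, and the sketch does not try to skip fenced code blocks.

```r
# Hypothetical helper: turn trailing-space hard breaks into backslash breaks.
# Only lines that end in non-space text followed by 2+ spaces are changed.
use_backslash_breaks <- function(lines) {
  sub("(\\S) {2,}$", "\\1\\\\", lines)
}

example <- c("first line  ", "second line", "third line   ", "   ")
converted <- use_backslash_breaks(example)
```

Lines without trailing spaces, and lines that contain only whitespace, are left untouched, so the helper can be applied to a whole file read with `readLines()`.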

Comment on lines +9 to +10
script_path <- sub("--file=", "", args[grep("--file=", args)])
script_dir <- dirname(normalizePath(script_path))

Copilot AI Jan 3, 2026

The script attempts to extract the script path using grep("--file=", args) which may return an empty result when the script is run in certain contexts (e.g., via Rscript without the --file parameter, or through source()). This will cause sub() to operate on an empty character vector, and normalizePath() will fail.

Consider adding error handling to check if the script path was found, or provide a fallback mechanism. For example, you could check if length(script_path) > 0 before calling normalizePath(), or use tryCatch() around the working directory setup.

Suggested change
- script_path <- sub("--file=", "", args[grep("--file=", args)])
- script_dir <- dirname(normalizePath(script_path))
+ script_arg_index <- grep("--file=", args)
+ if (length(script_arg_index) > 0) {
+   script_path <- sub("^--file=", "", args[script_arg_index[1L]])
+   script_dir <- dirname(normalizePath(script_path))
+ } else {
+   # Fallback: use current working directory if script path is not available
+   script_dir <- getwd()
+ }
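The same fallback can be wrapped in a function, which makes it possible to exercise both branches without launching the script through Rscript. A minimal sketch; `resolve_script_dir` is a hypothetical helper and the argument vectors below are made up for illustration.

```r
# Hypothetical helper wrapping the suggested fallback so it can be tested.
resolve_script_dir <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_args <- grep("^--file=", args, value = TRUE)
  if (length(file_args) == 0L) {
    # No --file= argument (e.g. interactive session): fall back to the cwd
    return(getwd())
  }
  script_path <- sub("^--file=", "", file_args[1L])
  dirname(normalizePath(script_path, mustWork = FALSE))
}

# Exercise both branches with synthetic argument vectors
tmp_script <- tempfile(fileext = ".R")
file.create(tmp_script)
from_file <- resolve_script_dir(c("R", "--no-echo", paste0("--file=", tmp_script)))
from_fallback <- resolve_script_dir(c("R", "--no-echo"))
```

Anchoring the pattern with `^` avoids matching other arguments that merely contain the text `--file=`.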
