Add script to render codebook #83

hdigital · 2026-01-03T18:22:05Z

Add codebook rendering script and Markdown formatting.

- An R script (`codebook-render.R`) that renders the codebook (Rmd) to HTML, PDF, and Markdown
- A formatting prompt to standardize Markdown style
- Formatted codebook Markdown (Rmd)

hdigital · 2026-01-03T18:23:23Z

Claude Sonnet 4.5 prompt usage

Perfect! The codebook formatting is already stable. Running the render and format cycle twice produced identical results with no changes to codebook.md. The codebook.Rmd file is properly formatted with the correct escapes and line breaks to ensure stable output through the render+format pipeline.

Copilot

Pull request overview

This PR adds infrastructure for rendering and formatting the Party Facts codebook by introducing:

- A new R script (codebook-render.R) that renders the codebook to HTML, Markdown, and PDF formats
- A formatting prompt document that provides guidelines for synchronizing and standardizing Markdown formatting between the source Rmd and generated Markdown files
- Extensive formatting updates to both codebook.Rmd and codebook.md including text wrapping, consistent use of en-dashes, proper escaping of special characters, and standardized spacing

Key Changes

- Added codebook-render.R script that automates rendering to multiple output formats with error handling for optional PDF generation
- Created formatting prompt documentation in .github/prompts/ with a convenience symlink in scripts/
- Applied comprehensive formatting standardization throughout the codebook files including line wrapping, escape character handling, and punctuation consistency

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `codebook/codebook-render.R` | New R script to render codebook to HTML, Markdown, and PDF formats with error handling |
| `.github/prompts/format-codebook.prompt.md` | Documentation of the formatting workflow and guidelines for maintaining consistency |
| `scripts/format-codebook.prompt.md` | Symlink to the prompt file for easier access from the scripts directory |
| `codebook/codebook.Rmd` | Source RMarkdown file with extensive formatting updates including text wrapping, consistent punctuation, proper escaping, and spacing standardization |
| `codebook/codebook.md` | Generated Markdown output reflecting all formatting changes from the Rmd source |

Copilot · 2026-01-03T18:27:25Z

codebook/codebook-render.R

+cat("\n\n2. Rendering Markdown (GitHub-flavored)...")
+rmarkdown::render(
+  input = input_file,
+  output_format = "md_document",


The comment on line 27 states "GitHub-flavored Markdown", but the actual output format specified is "md_document" which is a generic markdown format. To render GitHub-flavored markdown, you should specify output_format = "github_document" instead, or use md_document with variant: gfm as shown in the YAML header of codebook.Rmd.

Note that the md_document format with variant gfm (as specified in the YAML) should produce GitHub-flavored markdown, but the explicit call to rmarkdown::render() uses just "md_document" without variant specification. Consider using github_document for clarity and consistency with the comment.

Suggested change

-  output_format = "md_document",
+  output_format = "github_document",
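Another option, sketched here under assumptions, is to pass a format object to the render call so the Markdown variant is spelled out in the script rather than implied. `render_gfm` is a hypothetical helper name, not something from this PR; the sketch assumes the rmarkdown package is installed and that pandoc is available at render time.

```r
# Hypothetical helper (not part of the PR): render an Rmd file to
# GitHub-flavored Markdown with the variant stated explicitly.
# Assumes the rmarkdown package is installed and pandoc is on the PATH.
render_gfm <- function(input_file) {
  rmarkdown::render(
    input = input_file,
    # md_document() takes a `variant` argument; "gfm" selects GitHub-flavored Markdown
    output_format = rmarkdown::md_document(variant = "gfm")
  )
}
```

Calling `render_gfm("codebook.Rmd")` from the codebook directory would then produce the Markdown output without depending on the format name alone.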

Copilot · 2026-01-03T18:27:26Z

.github/prompts/format-codebook.prompt.md

+description: 'Synchronize codebook.Rmd and codebook.md with consistent formatting.'
+---
+
+1. Render the RMarkdown file: `Rscript -e "rmarkdown::render('codebook.Rmd', output_format = 'github_document')"`


The formatting prompt instructions (line 6) specify using output_format = 'github_document', but the actual rendering script (codebook-render.R, line 31) uses output_format = "md_document". These should be consistent to avoid confusion.

If the intention is to use GitHub-flavored markdown (as suggested by the prompt and the YAML header's variant: gfm), both the prompt and the rendering script should use the same format specification.

Copilot · 2026-01-03T18:27:26Z

codebook/codebook.Rmd

+  low = "thistle2", high = "darkblue",
  # low="#fff7bc", high="#d95f0e",
-  guide="colorbar", na.value="lightgrey", name = expression(parties (log[e]))
+  guide = "colorbar", na.value = "lightgrey", name = expression(parties(log[e]))


The change from expression(parties (log[e])) to expression(parties(log[e])) removes the space between "parties" and the opening parenthesis. R's parser ignores whitespace between a function name and its opening parenthesis, so both spellings parse to the same call and the rendered label should be unchanged.

If a visible gap between "parties" and "(log_e)" is wanted, so that the label reads as "parties (in logarithmic scale)" rather than as a function call, it has to be requested explicitly in the plotmath expression, because spacing in the source alone does not produce it.
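A quick way to check whether the two spellings differ is to compare the parsed expressions directly. The sketch below uses base R only; the variable names are illustrative, and the `~` variant is just one possible way to request a visible space in plotmath, not something this PR uses.

```r
# Compare the two spellings from the diff after parsing.
with_space    <- expression(parties (log[e]))
without_space <- expression(parties(log[e]))

# TRUE when both spellings parse to the same call
same_call <- identical(with_space, without_space)
print(same_call)

# In plotmath, `~` inserts a space between two terms, so this variant
# asks for a gap between "parties" and the parenthesized log[e].
spaced_label <- expression(parties ~ (log[e]))
print(spaced_label)
```

If `same_call` is `TRUE`, the formatting change has no effect on the rendered label.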

Copilot · 2026-01-03T18:27:26Z

codebook/codebook.Rmd

 }
-
-body {
+&#10;body {


The change from an empty line to `&#10;body {` combines the CSS style tag closing with the body selector using an HTML entity for newline. While this may be part of the formatting standardization, it reduces readability of the source code.

The HTML entity `&#10;` represents a line feed character. This appears to be an artifact of the markdown rendering/formatting process. Consider whether this change improves or harms readability - typically having CSS selectors on their own lines is more readable than using HTML entities to represent line breaks within the code block.

Suggested change

-&#10;body {
+body {

Copilot · 2026-01-03T18:27:27Z

codebook/codebook.Rmd

+For public datasets we complete the linking of all parties that meet
+the  
+Party Facts population criteria (\>5%, see above). Public datasets are  
+visible on PF-Web without login and are archived regularly in the
+PF-Dataverse.

-For smaller parties, we may apply a staggered approach. All parties >5%, are linked during the initial import. Parties **<5% and >1%** (or 2%) are imported but **may not be linked completely** in Party Facts. A section "Linking status" in the "readme" of the dataset documents the status.
+For smaller parties, we may apply a staggered approach. All parties  
+\>5%, are linked during the initial import. Parties **\<5% and \>1%**
+(or 2%) are imported but **may not be linked completely** in Party
+Facts. A section "Linking status" in the "readme" of the dataset
+documents the status.

-**Note** -- Staggered inclusion for parties <5% introduced in March 2021. Previously, all public and archived datasets were completely linked.
+**Note** – Staggered inclusion for parties \<5% introduced in March  


Lines 308, 309, 313, and 319 contain trailing spaces (two spaces at the end of lines) which are used in Markdown to create hard line breaks. While this is valid Markdown syntax, trailing spaces can be problematic as they are often invisible and may be automatically removed by editors or pre-commit hooks.

Consider using explicit line break syntax (backslash \ at end of line) instead of trailing spaces for better visibility and maintainability. Some lines in the same section already use backslash line breaks (e.g., lines 225-226, 230-231), so using backslash consistently would be better.
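As a rough illustration of how such lines could be found and converted, here is a base R sketch. `find_trailing_space_breaks` and `to_backslash_breaks` are hypothetical helper names, and the sample lines are copied from the diff hunk above; nothing here is part of the PR.

```r
# Hypothetical helpers (not part of the PR).

# Return the indices of lines that end in two or more spaces,
# i.e. Markdown hard line breaks written with trailing whitespace.
find_trailing_space_breaks <- function(lines) {
  which(grepl(" {2,}$", lines))
}

# Replace a trailing run of two or more spaces with a single backslash.
to_backslash_breaks <- function(lines) {
  sub(" {2,}$", "\\\\", lines)
}

sample_lines <- c(
  "the  ",
  "Party Facts population criteria (\\>5%, see above). Public datasets are  ",
  "visible on PF-Web without login and are archived regularly in the",
  "PF-Dataverse."
)

break_lines <- find_trailing_space_breaks(sample_lines)
converted   <- to_backslash_breaks(sample_lines)
```

On the full file, `readLines("codebook.Rmd")` could supply the input; whether backslash breaks survive the render and format cycle unchanged would still need to be checked.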

Copilot · 2026-01-03T18:27:27Z

codebook/codebook-render.R

+script_path <- sub("--file=", "", args[grep("--file=", args)])
+script_dir <- dirname(normalizePath(script_path))


The script attempts to extract the script path using grep("--file=", args) which may return an empty result when the script is run in certain contexts (e.g., via Rscript without the --file parameter, or through source()). This will cause sub() to operate on an empty character vector, and normalizePath() will fail.

Consider adding error handling to check if the script path was found, or provide a fallback mechanism. For example, you could check if length(script_path) > 0 before calling normalizePath(), or use tryCatch() around the working directory setup.

Suggested change

-script_path <- sub("--file=", "", args[grep("--file=", args)])
-script_dir <- dirname(normalizePath(script_path))
+script_arg_index <- grep("--file=", args)
+if (length(script_arg_index) > 0) {
+  script_path <- sub("^--file=", "", args[script_arg_index[1L]])
+  script_dir <- dirname(normalizePath(script_path))
+} else {
+  # Fallback: use current working directory if script path is not available
+  script_dir <- getwd()
+}
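The same fallback logic can be wrapped in a function, which makes both branches easy to exercise without launching Rscript. This is a sketch only: `resolve_script_dir` is a hypothetical name, and `mustWork = FALSE` is an assumption added here to avoid a warning for paths that do not exist.

```r
# Hypothetical helper (not part of the PR): resolve the directory of the
# running script from the command-line arguments, falling back to the
# current working directory when no --file= argument is present.
resolve_script_dir <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_args <- grep("^--file=", args, value = TRUE)
  if (length(file_args) == 0L) {
    # e.g. interactive session or source(): no script path available
    return(getwd())
  }
  script_path <- sub("^--file=", "", file_args[1L])
  dirname(normalizePath(script_path, mustWork = FALSE))
}

# Exercise both branches with explicit argument vectors.
tmp_script <- tempfile(fileext = ".R")
file.create(tmp_script)

dir_without_file <- resolve_script_dir(c("R", "--no-echo"))
dir_with_file    <- resolve_script_dir(c("R", paste0("--file=", tmp_script)))
```

In the render script itself, `setwd(resolve_script_dir())` would replace the two original lines.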

hdigital added 3 commits January 3, 2026 16:48

Add script to render codebook into all formats

9f03356

Add prompt to format codebook

6e5bff6

Format Markdown codebook

5f9ae66

hdigital requested a review from Copilot January 3, 2026 18:23

Copilot started reviewing on behalf of hdigital January 3, 2026 18:23
