@abhijit9040

Key Changes
PDF Conversion Engine:-

  • Prioritized LibreOffice (headless mode) for PDF generation across all platforms.
  • Added automatic detection of LibreOffice on Windows, including the default installation path when it is not available in PATH.
  • Implemented a temporary user profile using -env:UserInstallation to:
      • Prevent file locking
      • Enable parallel and reliable PDF conversions
      • Improve stability in CI/CD and Docker environments
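
For reviewers, here is a minimal sketch of the detection and isolated-profile idea described above. It is not the code in scripts/convert.py: the function names, the Windows default path, and the timeout are assumptions made for illustration. Only the LibreOffice options themselves (`--headless`, `--convert-to`, `--outdir`, `-env:UserInstallation`) are taken as given.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List, Optional

# Assumed default install location on Windows; the PR does not state the exact path.
WINDOWS_DEFAULT_SOFFICE = Path(r"C:\Program Files\LibreOffice\program\soffice.exe")


def find_soffice() -> Optional[str]:
    """Return a LibreOffice executable from PATH or the assumed default path."""
    for name in ("soffice", "libreoffice"):
        found = shutil.which(name)
        if found:
            return found
    if WINDOWS_DEFAULT_SOFFICE.is_file():
        return str(WINDOWS_DEFAULT_SOFFICE)
    return None


def build_convert_command(
    soffice: str, source: Path, out_dir: Path, profile_dir: Path
) -> List[str]:
    """Build a headless conversion command that uses its own user profile."""
    return [
        soffice,
        # A private profile per run avoids lock conflicts between parallel runs.
        f"-env:UserInstallation={profile_dir.resolve().as_uri()}",
        "--headless",
        "--convert-to",
        "pdf",
        "--outdir",
        str(out_dir),
        str(source),
    ]


def convert_to_pdf(source: Path, out_dir: Path) -> Path:
    """Convert one document to PDF and return the expected output path."""
    soffice = find_soffice()
    if soffice is None:
        raise RuntimeError("LibreOffice was not found on this machine")
    with tempfile.TemporaryDirectory(prefix="lo_profile_") as profile:
        cmd = build_convert_command(soffice, source, out_dir, Path(profile))
        subprocess.run(cmd, check=True, timeout=300)  # timeout is an arbitrary choice
    return out_dir / (source.stem + ".pdf")
```

Because the profile directory is created per call and removed afterwards, several conversions can run side by side without sharing LibreOffice state.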

Template Migration:-

  • Replaced .docx guide templates with .odt (OpenDocument Text) to ensure consistent, pixel-accurate PDF output via LibreOffice.
  • Updated scripts/convert.py to prioritize .odt templates for the guide layout.

Enhanced XML Processing:-

  • Refactored replace_text_in_xml_file to robustly support:
      • ODT internal XML (content.xml, styles.xml)
      • Existing IDML XML structures
  • Optimized replacement logic by sorting keys from longest to shortest, preventing partial placeholder corruption during substitution.
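
To illustrate why the key ordering matters, here is a small sketch. It is not the actual replace_text_in_xml_file: the helper names and the `card_1` / `card_10` placeholders are hypothetical. The ODT helper relies on an .odt file being a ZIP archive and assumes each placeholder sits unbroken inside content.xml or styles.xml; replacement values are XML-escaped.

```python
import zipfile
from typing import Dict
from xml.sax.saxutils import escape


def replace_placeholders(text: str, replacements: Dict[str, str]) -> str:
    """Replace every key with its value, trying the longest keys first."""
    for key in sorted(replacements, key=len, reverse=True):
        text = text.replace(key, replacements[key])
    return text


def replace_in_odt(src: str, dst: str, replacements: Dict[str, str]) -> None:
    """Copy an ODT archive, substituting placeholders in its main XML parts."""
    escaped = {key: escape(value) for key, value in replacements.items()}
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # Iterating infolist() keeps the original entry order and per-entry
        # compression, so "mimetype" stays first and uncompressed.
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename in ("content.xml", "styles.xml"):
                text = replace_placeholders(data.decode("utf-8"), escaped)
                data = text.encode("utf-8")
            zout.writestr(item, data)


# With shortest-first ordering, "card_1" would match inside "card_10"
# and leave a stray "0" behind.
print(replace_placeholders("card_10 / card_1", {"card_1": "Ace", "card_10": "Ten"}))
# prints: Ten / Ace
```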

Script Robustness:-

  • Moved Microsoft Word–dependent imports (docx, docx2pdf) into targeted functions, allowing:
      • The script to run without these libraries when LibreOffice is available
      • Cleaner execution in Linux, Docker, and CI environments
  • Improved metadata extraction and language file matching to better handle:
      • Legacy naming conventions
      • Edge cases such as against-security editions
      • Variations in language code formats
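
The deferred-import pattern can be sketched as follows. The function names and the backend-selection rule are hypothetical; only `docx2pdf.convert` is an existing library call. The point is that importing the module never touches docx2pdf, so it loads cleanly on machines without Word.

```python
import shutil
from pathlib import Path


def pick_backend() -> str:
    """Prefer LibreOffice when its executable is on PATH (assumed rule)."""
    if shutil.which("soffice") or shutil.which("libreoffice"):
        return "libreoffice"
    return "word"


def convert_with_word(docx_path: Path, pdf_path: Path) -> None:
    """Fallback path: docx2pdf is imported only when this function runs."""
    try:
        from docx2pdf import convert  # deferred import, requires MS Word
    except ImportError as exc:
        raise RuntimeError(
            "docx2pdf is not installed and LibreOffice is unavailable"
        ) from exc
    convert(str(docx_path), str(pdf_path))
```

On Linux, Docker, or CI, `pick_backend()` would normally return `"libreoffice"`, so `convert_with_word` is never called and the missing library is never noticed.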

Cleanup and Performance:-

  • Automated cleanup of intermediate .odt files after successful PDF generation.
  • Added more granular debug-level logging to improve observability and troubleshooting of the conversion pipeline.
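
As a rough sketch of the cleanup step (the function name, logger name, and the "PDF exists and is non-empty" success check are assumptions, not the PR's actual code):

```python
import logging
from pathlib import Path

logger = logging.getLogger("convert")  # hypothetical logger name


def cleanup_intermediate(odt_path: Path, pdf_path: Path) -> bool:
    """Delete the intermediate .odt only when the PDF was actually produced."""
    if pdf_path.is_file() and pdf_path.stat().st_size > 0:
        odt_path.unlink(missing_ok=True)  # missing_ok needs Python 3.8+
        logger.debug("Removed intermediate file %s", odt_path)
        return True
    logger.debug("Keeping %s because %s is missing or empty", odt_path, pdf_path)
    return False
```

Keeping the .odt when the PDF is missing leaves something to inspect after a failed conversion.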

Issue:-#2110 – Implement Cross-Platform PDF Generation without MS Word dependency

Collaborator

@sydseter sydseter left a comment

Have a look at my comments. Thank you for your efforts!

@sydseter
Collaborator

sydseter commented Feb 1, 2026

Remember to make sure you run the tests before pushing your commits.

@sydseter
Collaborator

sydseter commented Feb 1, 2026

nc

@sydseter sydseter closed this Feb 1, 2026
@sydseter sydseter reopened this Feb 1, 2026
@abhijit9040 abhijit9040 requested a review from sydseter February 2, 2026 05:20
@abhijit9040
Author

Hi @sydseter, please review my PR and let me know if any further changes are needed.

@sydseter
Collaborator

sydseter commented Feb 3, 2026

I will need some time to test it out. I'll get back to you.

@sydseter
Collaborator

sydseter commented Feb 3, 2026

This works quite well. It would be great if we could also do the following:
Could we document the LibreOffice installation here? https://github.com/owasp/cornucopia/blob/master/scripts/README.md

  • Windows: winget install -e --id TheDocumentFoundation.LibreOffice
  • Mac OS X: ?
  • Ubuntu: ?

@abhijit9040
Author

Yes, that makes sense. I’ll add LibreOffice installation instructions to scripts/README.md.

@abhijit9040
Author

Hi @sydseter, please take a final look. I have updated the LibreOffice installation instructions. Let me know if any further changes are needed.

# install build dependencies
-RUN apt-get update -y && apt-get install -y build-essential git nodejs npm \
-&& apt-get clean && rm -f /var/lib/apt/lists/*_*
+RUN apt-get update -y && (apt-get install -y build-essential git nodejs npm || (sleep 10 && apt-get update -y && apt-get install -y build-essential git nodejs npm)) \
Collaborator

Why are these packages being installed?

Collaborator

As this is work on the converter, it’s better not to make fixes to copi here. If you believe this could be an improvement, open an issue and explain why we should make these changes.

Author

I opened a separate issue proposing this as an improvement, with an explanation of why it could be useful.

# - Ex: hexpm/elixir:1.14.2-erlang-25.1.1-debian-bullseye-20220801-slim
#
-FROM --platform=linux/amd64 hexpm/elixir:1.19-erlang-28.3-debian-bullseye-20251208@sha256:9d1e59c326674de89a2eac9cd7f118ae2917e1c6cde02e8fa4cd785198ca9be0 as builder
+FROM hexpm/elixir:1.19-erlang-28.3-debian-bullseye-20251208@sha256:9d1e59c326674de89a2eac9cd7f118ae2917e1c6cde02e8fa4cd785198ca9be0 AS builder
Collaborator

Maybe revert this?


Cornucopia is developed, maintained, updated and promoted by a worldwide team of volunteers. The contributors to date have been:

- Abhijit
Collaborator

You can also add your full name. Your choice.

# start a new build stage so that the final image will only contain
# the compiled release and other runtime necessities
-FROM --platform=linux/amd64 hexpm/elixir:1.19-erlang-28.3-debian-bullseye-20251208@sha256:9d1e59c326674de89a2eac9cd7f118ae2917e1c6cde02e8fa4cd785198ca9be0
+FROM hexpm/elixir:1.19-erlang-28.3-debian-bullseye-20251208@sha256:9d1e59c326674de89a2eac9cd7f118ae2917e1c6cde02e8fa4cd785198ca9be0
Collaborator

Fix in separate issue.

Collaborator

This can be done together with the retry mechanism, but not here.
