EIM-519: add streaming tar.xz decompression using xz2 #453

Open
alirana01 wants to merge 2 commits into master from windows-installer-tar-xz-oom-449

Conversation

@alirana01
Collaborator

EIM-519: Fix OOM during tar.xz decompression on Windows

Problem

The ESP-IDF installation fails on Windows during the tar.xz decompression step because of a memory allocation failure. The root cause is that decompress_tar_xz reads the entire decompressed payload into memory via lzma_rs::xz_decompress(&mut reader, &mut decompressed_data) before passing it to the tar extractor. For large archives, this exhausts the available memory.
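
To make the difference concrete, here is a minimal, std-only sketch of the two patterns. It is not the project's code: `buffered_len` and `streamed_len` are hypothetical names, and any `Read` value stands in for the decoder output. The first function collects the whole payload before anything downstream can run, while the second forwards it in chunks and never holds more than a small copy buffer.

```rust
use std::io::{self, Read};

/// Buffered pattern (hypothetical stand-in for the old code path):
/// the whole decoded payload is collected in a Vec before use,
/// so peak memory grows with the decompressed size.
fn buffered_len<R: Read>(mut decoded: R) -> io::Result<usize> {
    let mut payload = Vec::new();
    decoded.read_to_end(&mut payload)?;
    Ok(payload.len())
}

/// Streaming pattern: bytes are forwarded to the consumer chunk by chunk.
/// `io::sink()` stands in for the tar extractor here; only a small
/// internal copy buffer is alive at any time.
fn streamed_len<R: Read>(mut decoded: R) -> io::Result<u64> {
    io::copy(&mut decoded, &mut io::sink())
}
```

Calling both with something like `io::repeat(0).take(n)` processes the same `n` bytes, but only `buffered_len` has to allocate all of them at once.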

Fixes #449

Solution

Replace the in-memory decompression approach with streaming decompression using the xz2 crate's XzDecoder, which wraps liblzma. The decoder is piped directly into tar::Archive, so data flows from disk through the XZ decoder into the tar extractor without buffering the full decompressed content in memory.

Key changes:

  • New dependency: xz2 = "0.1.6" (with static feature for cross-platform builds) replaces the in-memory lzma-rs usage for tar.xz decompression
  • decompress_tar_xz: Rewritten to use an XzDecoder::new(BufReader::new(file)) → Archive::new(decoder) streaming pipeline
  • Per-entry extraction: Entries are extracted individually via entry.unpack_in(), with debug logging for each entry and summary statistics on completion
  • Removed: Old in-memory decompression path using lzma_rs::xz_decompress and Vec<u8> buffer

Testing

  • Monitored the memory consumption of the process with system monitoring tools during extraction
  • Confirmed that tar.xz archives extract successfully without excessive memory consumption
  • Verified no regression on existing zip and tar.gz decompression paths

Checklist

  • Old decompress_tar_xz function and lzma-rs dependency cleaned up
  • Doc comments added following project conventions
  • Tested on Windows, macOS, and Linux

@alirana01 alirana01 self-assigned this Feb 18, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @alirana01, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical Out-Of-Memory issue encountered during tar.xz archive decompression on Windows by transitioning from an in-memory decompression method to a streaming one. The change significantly reduces memory consumption by processing archive contents on the fly, thereby improving the reliability and performance of the ESP-IDF installation process for large archives.

Highlights

  • Memory Optimization: Replaced in-memory tar.xz decompression with a streaming approach to prevent Out-Of-Memory (OOM) errors, especially on Windows with large archives.
  • Dependency Update: Switched from the lzma-rs crate to the xz2 crate (with the static feature) for XZ decompression, leveraging liblzma for streaming capabilities.
  • Streaming Decompression Pipeline: The decompress_tar_xz function was rewritten to pipe the XzDecoder directly into tar::Archive, allowing data to flow from disk through decompression to extraction without full buffering.
  • Detailed Extraction Logging: Implemented per-entry extraction with debug logging for each file/directory and summary statistics (files, directories, total bytes) upon completion.

Changelog
  • src-tauri/Cargo.lock
    • Removed lzma-rs from the dependency lock file.
    • Added xz2 to the dependency lock file.
  • src-tauri/Cargo.toml
    • Replaced the lzma-rs dependency with xz2 and enabled its static feature.
  • src-tauri/src/lib/mod.rs
    • Imported debug from the log crate for more granular logging.
    • Imported XzDecoder from xz2::read to enable streaming XZ decompression.
    • Rewrote the decompress_tar_xz function to utilize XzDecoder for streaming decompression.
    • Implemented an iterative process to extract each entry from the tar archive individually using entry.unpack_in().
    • Added debug logging for each extracted file or directory, including its type, path, and size.
    • Introduced counters for files, directories, and total bytes to provide extraction summary statistics.
    • Removed the previous implementation that used lzma_rs::xz_decompress to load the entire decompressed payload into memory.
Activity
  • No specific activity (comments, reviews, etc.) was provided in the context.

gemini-code-assist[bot]

This comment was marked as outdated.

Collaborator

@Hahihula Hahihula left a comment

As discussed offline, the streaming approach is a good idea, but this PR is not, as it is a regression to the already abandoned xz2 crate, which is a wrapper for the liblzma C library. This caused a lot of issues in the past.

tar = { version = "0.4", default-features = false }
zip = { version = "2.2.2", default-features = true }
lzma-rs = "0.3.0"
xz2 = { version = "0.1.6", features = ["static"] }
Collaborator

Do not introduce this dependency on C.

Development

Successfully merging this pull request may close these issues.

[Bug Report] Issue with ESP-IDF Installation (EIM-519)
