UTF-8 decode error when downloading binary files (tar.gz) #21

@ag26jan

Title

UTF-8 decode error when downloading binary files (tar.gz) - bundled PGPy issue


Description

When downloading binary files (e.g., .tar.gz) using the SendSafely Python SDK, the download fails with a UTF-8 decode error at approximately 8-9% completion. This occurs in both version 1.0.1 and the latest version 1.0.9.6.

Environment

  • Python version: 3.10.12
  • SendSafely SDK version: 1.0.9.6 (also tested with 1.0.1)
  • OS: Linux (Ubuntu)
  • cryptography version: 46.0.4

Error Message

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x94 in position 0: invalid start byte

Full traceback:

File "/site-packages/sendsafely/pgpy/packet/packets.py", line 1245, in parse
    self.filename = packet[:fnl].decode()

pgpy.errors.PGPError: 'utf-8' codec can't decode byte 0x94 in position 0: invalid start byte

Steps to Reproduce

  1. Install sendsafely SDK: pip install sendsafely==1.0.9.6
  2. Attempt to download a binary file (tar.gz) from SendSafely:
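from sendsafely import SendSafely

# base_url, api_key, api_secret, secure_link, and file_id are placeholders
# for your own SendSafely host, credentials, package link, and file ID.
ss = SendSafely(base_url, api_key, api_secret)
package = ss.load_package_from_link(secure_link)
package.download_and_decrypt_file(file_id, download_directory="/tmp")
  3. The download starts but fails at around 8-9% with the UTF-8 decode error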

Root Cause Analysis

The issue is in the bundled PGPy library at sendsafely/pgpy/packet/packets.py line 1245:

self.filename = packet[:fnl].decode()

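This line calls .decode() with no arguments, which means UTF-8 with strict error handling. When the bytes in packet[:fnl] are not valid UTF-8 (here, 0x94 at position 0), the call raises UnicodeDecodeError, which surfaces as the PGPError shown above. This happens while binary data chunks are being processed during decryption.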
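
The failure mode can be reproduced in isolation, without the SDK. This is a minimal sketch: only the leading 0x94 byte comes from the error message, the remaining sample bytes and the helper name try_decode are made up for illustration.

```python
# Hypothetical stand-in for packet[:fnl]; only the leading 0x94 is from the report.
raw = bytes([0x94, 0x1F, 0x8B, 0x08])


def try_decode(data: bytes) -> str:
    """Mimic the bare .decode() call: UTF-8 with errors='strict'."""
    try:
        return data.decode()
    except UnicodeDecodeError as exc:
        return f"failed: {exc.reason} at position {exc.start}"


print(try_decode(raw))          # failed: invalid start byte at position 0
print(try_decode(b"a.tar.gz"))  # a.tar.gz
```

A plain ASCII filename decodes fine, so the error only shows up when the bytes handed to this line are not text.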

Attempted Workaround

Patching line 1245 to use errors='replace':

self.filename = packet[:fnl].decode(errors='replace')

This suppresses the UTF-8 error but exposes a secondary issue: decryption then fails with:

PGPError: This message is not encrypted!

This suggests deeper issues with the bundled PGPy library's packet parsing for certain encrypted file formats.

Suggested Fix

  1. Update line 1245 in sendsafely/pgpy/packet/packets.py:

    self.filename = packet[:fnl].decode(errors='replace')

    Or use latin-1 encoding, which maps every byte value to a character and therefore never raises:

    self.filename = packet[:fnl].decode('latin-1')
  2. Investigate the deeper PGP packet parsing issue that causes "This message is not encrypted!" errors after the UTF-8 fix.
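
The two decoding options in item 1 behave differently, which may matter if the decoded filename is ever written back out. A small sketch of the difference, using made-up sample bytes (only 0x94 is taken from the error message):

```python
# Hypothetical sample; 0x94 is the offending byte from the report, "foo" is filler.
raw = bytes([0x94, 0x66, 0x6F, 0x6F])

replaced = raw.decode(errors="replace")  # invalid bytes become U+FFFD
latin1 = raw.decode("latin-1")           # each byte maps to exactly one code point

print(repr(replaced))
print(repr(latin1))
print(latin1.encode("latin-1") == raw)   # True: latin-1 round-trips the bytes
print(replaced.encode("utf-8") == raw)   # False: errors='replace' is lossy
```

Either option only stops the exception at this line. Neither explains why non-text bytes reach the filename field, which is what item 2 asks to be investigated.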

Impact

This bug prevents downloading binary files (tar.gz, zip, etc.) from SendSafely using the Python SDK, which is a critical functionality issue.

Additional Context

  • The error occurs at the same point (~8.5%) regardless of file size
  • The issue appears to be in how encrypted chunks are parsed, not in the initial connection or authentication
  • Both SDK versions (1.0.1 and 1.0.9.6) exhibit the same behavior since they both bundle the same PGPy code
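
To check which code a given installation actually bundles, the line in question can be read straight from the installed package. This is a rough sketch using only the standard library; the helper name read_bundled_line is hypothetical, and the path and line number are the ones from the traceback, which may differ in other SDK versions.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def read_bundled_line(package: str, relative_path: str, line_no: int) -> Optional[str]:
    """Return line `line_no` (1-based) of a file inside an installed package.

    Returns None if the package or file is not found, or the line is out of range.
    """
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        path = Path(location) / relative_path
        if path.is_file():
            lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
            return lines[line_no - 1] if 0 < line_no <= len(lines) else None
    return None


# Path and line number taken from the traceback in this report.
print(read_bundled_line("sendsafely", "pgpy/packet/packets.py", 1245))
```

Running this under each SDK version would show whether both really contain the same .decode() call at that line.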
