Google Drive performance: ~280x faster recursive listing with OpenDAL batch queries #443

@mro68

Summary

Two pending OpenDAL PRs add efficient batch recursive listing to the Google Drive backend, making recursive listing ~280x faster than the current implementation in the test below. This issue tracks when rustic_core can pick up the improvement.

Problem

When using rustic with Google Drive via OpenDAL (opendal:gdrive), operations like repoinfo, check, or backup were extremely slow compared to using rclone:gdrive.

Root cause: OpenDAL's generic FlatLister made one API call per directory. For a repository with 256 data/XX/ subdirectories + other dirs, this meant ~260+ sequential API calls.
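To make the call count concrete, here is a minimal sketch that simulates a one-request-per-directory walk over an in-memory tree. The tree shape (256 `data/XX/` directories plus `index`, `keys` and `snapshots`) and all function names are assumptions for illustration only; this is not OpenDAL's FlatLister code.

```rust
use std::collections::HashMap;

/// In-memory stand-in for a remote folder tree:
/// maps a directory path to its child directories.
type Tree = HashMap<String, Vec<String>>;

/// Builds a tree shaped like the repository described above.
/// The top-level directory names are assumed for illustration.
fn sample_repo_tree() -> Tree {
    let mut tree = Tree::new();
    let top = ["data", "index", "keys", "snapshots"];
    tree.insert(String::new(), top.iter().map(|d| d.to_string()).collect());
    let data_children: Vec<String> = (0..256).map(|i| format!("data/{:02x}", i)).collect();
    tree.insert("data".to_string(), data_children);
    tree
}

/// Walks the tree one directory at a time and counts
/// one simulated list request per directory visited.
fn count_calls_per_directory(tree: &Tree, dir: &str) -> usize {
    let mut calls = 1; // one list request for `dir` itself
    if let Some(children) = tree.get(dir) {
        for child in children {
            calls += count_calls_per_directory(tree, child);
        }
    }
    calls
}

fn main() {
    let tree = sample_repo_tree();
    // root + data + 256 data/XX + index + keys + snapshots
    println!("sequential list calls: {}", count_calls_per_directory(&tree, ""));
}
```

Under these assumptions the walk issues 261 requests, in line with the "~260+" figure above. Because each request waits for the previous one, round-trip latency dominates the total time.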

Solution (implemented in OpenDAL)

A custom GdriveFlatLister was implemented that uses Google Drive's OR query syntax to batch multiple directory lookups:

('id1' in parents or 'id2' in parents or 'id3' in parents ...)

This is the same approach rclone uses for its ListR implementation.
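As a rough illustration of the query shape above, the following sketch builds one OR expression per batch of parent folder IDs. The function names and the batch size are hypothetical; the actual GdriveFlatLister may chunk, escape, or combine clauses differently.

```rust
/// Builds a Drive query expression matching children of any of the given folder IDs.
/// Callers should pass at least one ID; an empty slice would yield "()".
fn build_parents_query(parent_ids: &[&str]) -> String {
    let clauses: Vec<String> = parent_ids
        .iter()
        .map(|id| format!("'{}' in parents", id))
        .collect();
    format!("({})", clauses.join(" or "))
}

/// Splits the folder IDs into batches and builds one query per batch,
/// so each batch can be listed with a single request.
fn build_batched_queries(parent_ids: &[&str], batch_size: usize) -> Vec<String> {
    parent_ids
        .chunks(batch_size.max(1)) // guard against a zero batch size
        .map(|chunk| build_parents_query(chunk))
        .collect()
}

fn main() {
    let ids = ["id1", "id2", "id3"];
    // All three parents in one request:
    println!("{}", build_parents_query(&ids));
    // The same IDs with a (hypothetical) batch size of 2:
    for query in build_batched_queries(&ids, 2) {
        println!("{}", query);
    }
}
```

Batching trades a longer query string for fewer round trips, which is why the number of requests grows with the number of batches rather than the number of directories.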

Performance Results

Tested with rustic repoinfo against a real Google Drive repository (~2100 files, 130 GiB):

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Scanning time | 14 minutes | 3 seconds | ~280x faster |
| API calls | ~260+ | ~12 | ~20x fewer |
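The figures above can be sanity-checked with a little arithmetic. The sketch below recomputes the speedup from the two timings and estimates how many requests a level-by-level batched traversal would need. The level sizes (1 root, 4 top-level directories, 256 `data/XX/` directories) and the batch size of 50 are assumptions; the issue does not state the real batch size, and the estimate ignores result pagination, so it is not expected to match the measured ~12 exactly.

```rust
/// Number of requests needed when each tree level's directories
/// are grouped into batches of `batch_size` (one request per batch).
fn batched_calls(level_sizes: &[usize], batch_size: usize) -> usize {
    let b = batch_size.max(1); // guard against a zero batch size
    level_sizes.iter().map(|&n| (n + b - 1) / b).sum()
}

fn main() {
    // Speedup from the measured timings: 14 minutes versus 3 seconds.
    let before_secs = 14 * 60;
    let after_secs = 3;
    println!("speedup: {}x", before_secs / after_secs);

    // Assumed directory counts per tree level: root, top-level dirs, data/XX dirs.
    let levels = [1, 4, 256];
    println!("one request per directory: {}", batched_calls(&levels, 1));
    println!("batches of 50 (hypothetical): {}", batched_calls(&levels, 50));
}
```

With a batch size of 1 the estimate falls back to one request per directory (261); with batches of 50 it drops to 8, the same order of magnitude as the measured count.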

Dependencies

This improvement requires two OpenDAL PRs to be merged and released:

  1. apache/opendal#7058 - fix(services/gdrive): include size and modifiedTime in list() metadata

    • Required for rustic to work at all (without this, file sizes are missing)
  2. apache/opendal#7059 - feat(services/gdrive): implement batch recursive listing

    • The performance improvement itself

Timeline

  1. ⏳ Wait for OpenDAL PRs to be reviewed and merged
  2. ⏳ Wait for OpenDAL release (0.55.1 or 0.56.0)
  3. 🔧 Update rustic_core's OpenDAL dependency (see #442, "Update OpenDAL integration for 0.55+ (Scheme enum removed)", for the required API changes)
  4. ✅ Benefit from improved Google Drive performance

Related Issues

Notes

Users who need this fix now can use a patched OpenDAL version by adding the following to Cargo.toml:

[patch.crates-io]
opendal = { git = "https://github.com/mro68/opendal", branch = "feat/gdrive-batch-recursive-listing" }

This branch includes both the metadata fix and the performance improvement.
