Skip to content

Conversation

@Matt711
Copy link
Contributor

@Matt711 Matt711 commented Jan 14, 2026

Closes #216

Adds progress bars via --progress flag. The basic design is
straightforward. I'm using a global thread-safe ProgressTracker
that holds a HashMap<Table, TableProgress>. Each TableProgress
contains counters for parts and buffers, plus an indicatif::ProgressBar
for visualizing the bars. The tracker is cloned to all tasks and they all
update the same ProgressTracker. Updates happen at two points:
after each buffer/chunk write and after each file completes. The MultiProgress
coordinator is stored to keep the display alive. The progress visualization
is disabled by default. tpchgen-rs is usually too fast to turn it on until
about sf10 from local macbook experiments. As @clflushopt said,

This seems like a good UX improvement, although being fast enough that a progress bar isn't necessary might be better 😅

So lets keep going 🚀

Screen.Recording.2026-01-13.at.10.43.42.AM.mov

@alamb
Copy link
Collaborator

alamb commented Jan 17, 2026

Thank you @Matt711 -- this looks very cool. I am not sure I will have time to review it properly. The screen recording looks really nice

One question I had was what do you think about showing progress by default? It seems like this might interact weirdly with #225 from @kevinjqliu -- specifically logging)

@kevinjqliu
Copy link
Collaborator

Just merged #225 which defines the logging level based on the given param (default, --quiet, --verbose)

// Configure logging
if self.quiet {
// Quiet mode: only show error-level logs
env_logger::builder()
.filter_level(LevelFilter::Error)
.init();
} else if self.verbose {
env_logger::builder().filter_level(LevelFilter::Info).init();
info!("Verbose output enabled (ignoring RUST_LOG environment variable)");
} else {
// Default: show warnings and errors, but respect RUST_LOG if set
env_logger::builder()
.filter_level(LevelFilter::Warn)
.parse_default_env()
.init();
}

I think we should take this into consideration when rendering the process bar. i.e. turn progress bar off for --quiet

@alamb
Copy link
Collaborator

alamb commented Jan 18, 2026

I think we should take this into consideration when rendering the process bar. i.e. turn progress bar off for --quiet

Yeah, I was thinking we could potentially go all the way and enable the progress bar by default -- though I am not sure how that would interact with the printing of the default logging


/// Tracks progress for all tables being generated
#[derive(Clone, Debug)]
pub struct ProgressTracker {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing some other users have asked for is to reuse some of the tpchgen-cli machinery -- I suspect they may have ideas about using the ProgressTracker, etc

What do you think about abstracting the interface out -- e.g. so this is like Box<dyn ProgressTracker> or something

A default implementaion using multi_progress seems good

tables: Mutex<HashMap<Table, TableProgress>>,
// MultiProgress is unused but we need to it alive to
// manage the registered progress bars.
#[allow(dead_code)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why allow dead code?

@kevinjqliu
Copy link
Collaborator

Yeah, I was thinking we could potentially go all the way and enable the progress bar by default -- though I am not sure how that would interact with the printing of the default logging

I had some time to think about the different interactions when implementing --quiet. The conclusion I've came to is

  • --quiet is explicit and absolute. The user specifies this to silence ALL output. We should not display a progress bar when --quiet is specified.
  • --verbose is a request for more information. Internally, we raise the logging level to output more information. Progress bar should show for this case.
  • For the "default" case, we can show the progress bar. I think its a nice improvement to the UI

LMK what you think!

@kevinjqliu
Copy link
Collaborator

i tested this out locally and found an edge case where skip generation (because the file already exists) has some weird interactions

➜  tpchgen-rs git:(fea/progress-tracker) ✗ cargo run --bin tpchgen-cli --release -- --tables region --format csv -P              
    Finished `release` profile [optimized] target(s) in 0.45s
     Running `target/release/tpchgen-cli --tables region --format csv -P`
region     [░░░░░░░░░░░░░░░░░░░░░░░░░░░░] Parts:   0/1    Buffers:     0   0%                                                                                                                               Info: ./region.csv already exists, skipping generation
region     [████████████████████████████] Parts:   1/1    Buffers:     0 100%                                                                                                                               

@alamb
Copy link
Collaborator

alamb commented Jan 19, 2026

LMK what you think!

That sounds like a good idea to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Track table generation with progress bars

3 participants