4 changes: 2 additions & 2 deletions docs/site/reference/formatter-spec.md
@@ -30,8 +30,8 @@ The formatter's CLI must be of the form:
Where:

- `<command>` is the name of the formatter executable.
-- `[options]` is any number of flags and options that the formatter accepts.
-- `[...<files>]` is one or more files given to the formatter for processing.
+- `[options]` are any number of flags and options that the formatter accepts.
+- `[...<files>]` are one or more files given to the formatter for processing.

Example:

55 changes: 55 additions & 0 deletions docs/site/reference/stdin-spec.md
@@ -0,0 +1,55 @@
---
outline: deep
---

# Stdin Specification

Formatters **MAY** also implement the Stdin Specification, which allows
formatting "virtual files" passed via stdin.
Comment on lines +7 to +8
Member

How is this support expressed in the treefmt.toml config?

Collaborator Author

@jfly jfly Aug 10, 2025

I was thinking we'd add a new formatter.<mylanguage>.stdin (or stdin_supported) bool that defaults to false if unspecified. Feedback appreciated.

Do you think the stdin spec is the place to put this? I was thinking this would go in our config docs.


A formatter **MUST** implement the Stdin Specification if its formatting behavior
can depend on the name of the file being formatted.
Member

this is a bit hard to read

Collaborator Author

@jfly jfly Aug 10, 2025

I agree, but I don't know how to fix it. Does flipping the sentence around help? Suggestions appreciated.

> If a formatter's behavior can depend on the name of the file being formatted, then it MUST implement the Stdin Specification.


## Rules

In order for the formatter to comply with this spec, it **MUST** implement the
vanilla [Formatter Specification](/reference/formatter-spec), and additionally
satisfy the following:

### 1. `--stdin-filepath` flag
Member

Suggested change:
- ### 1. `--stdin-filepath` flag
+ ### 1. `--stdin` flag

KISS?

Collaborator Author

I'm lightly opposed to this, for 2 reasons:

  1. I think --stdin-filepath is clearer that the option takes a value.
  2. treefmt itself currently implements a --stdin flag with different semantics, and I would like for treefmt to implement the new stdin spec. Changing this might be awkward?

WDYT?

@jfly FWIW, that sounds reasonable to me (I agree with "and I would like for treefmt to implement the new stdin spec").


The formatter's CLI **MUST** be of the form:

```
<command> [options] [--stdin-filepath <path>]
```

Where:

- `<command>` is the name of the formatting tool.
- `[options]` are any number of flags and options that the formatter accepts.
- `--stdin-filepath <path>` is an optional flag that puts the formatter in
  "stdin mode". In stdin mode, the formatter reads file contents from stdin
  rather than the filesystem.
  - The formatter _MAY_ alter its behavior based on the given `<path>`, for
    example if there are different formatting rules in different
    directories. If the formatter's behavior doesn't depend on the given
    `<path>`, it's ok to ignore it.
  - The formatter _MAY_ understand `--stdin-filepath=<path>` as well, but **MUST**
    understand the space-separated variant.
Member

It would make sense to outline the purpose of the <path> argument explicitly. Both you and I know why it's there, but maybe not the reader.

Member

If the path argument is not needed by the tool, it's also valid to ignore it.

Collaborator Author

I've taken a stab at this. Please take a look.

Comment on lines +31 to +39
@cormacrelf cormacrelf Nov 21, 2025

There are tools out there whose cmdline parsers just do not support double dash flags with values. For example:

https://pkg.go.dev/flag#hdr-Command_line_flag_syntax

According to that documentation, any go-based formatter cannot support this without swapping away from the standard library's flags package. Treefmt itself is saved by using cobra.

Here are a few examples of stdin mode from a repo of mine. I count ~~5~~ ~~6~~ 7 unique ways of specifying this:

  • detects stdin, doesn't need a filepath: rustfmt, gofmt, alejandra --quiet
  • requires a - path: ruff format --silent -, mdformat -, tofu fmt -
  • buildifier -path=$path -
  • taplo format - --stdin-filepath=$path
  • prettier --stdin-filepath=$path
  • biome format --write --stdin-file-path $path
  • An old internal tool using --stdin-filename=...

Given this extensive variation, I believe the structure of the command line flags should not be specified AT ALL. You should just let the user describe what the tool accepts. The way jj fix configures this is quite good: it substitutes $path in command args like --stdin-filepath=$path (coincidentally every tool must run in stdin mode). Making treefmt work with everything it possibly can is more important than making a nice spec. I dread having to make PRs to 15-year-old barely-maintained build tools to make them support the spec, or writing endless formatting wrapper scripts. Basically please don't make me do that.

If the tool does support stdin mode, I don't see why you would ever want to run it in non-stdin mode. For example, rustfmt has both modes, but rustfmt file.rs does a filesystem walk from there for every mod statement it discovers. That's got to be bad for performance if treefmt is scheduling one rustfmt per file, since a file is formatted N times if it's N modules deep in a tree! (Edit: Unless you're running multiple files per formatter instance, in which case rustfmt does not duplicate work and will keep its code in cache better, but you still may not want this if some other formatter is single threaded.)

Therefore, I think you should just add a toml option saying "this tool runs in stdin mode", or perhaps just an alternate set of command line flags to use in stdin mode. Stdin mode is preferable in many cases, for example you can coordinate 5 formatter processes and pipe their stdin all the way through without buffering after every formatter. So many tools support it already with so many different flags that it's worth being flexible so this can be useful without waiting 2 years for a bunch of CLI flag adjustments to go through.

Collaborator Author

> According to that documentation, any go-based formatter cannot support this without swapping away from the standard library's flags package

Sorry, I don't follow this point. From the page you linked to, I see this:

> The following forms are permitted:
>
> --flag   // double dashes are also permitted

What am I missing?

> Basically please don't make me do that. [...] it's worth being flexible so this can be useful without waiting 2 years for a bunch of CLI flag adjustments to go through.

I hear you. We will definitely need a compatibility story, but it's TBD if that lives in treefmt itself, or treefmt-nix.

> If the tool does support stdin mode, I don't see why you would ever want to run it in non-stdin mode. [...] Edit: Unless you're running multiple files per formatter instance, in which case rustfmt does not duplicate work and will keep its code in cache better

This is exactly why: invoking the formatter N / B times (where B is our batch size) rather than N times.
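
As a rough illustration of the N / B point, here is a minimal sketch of batching. The helper name `batch`, the file names, and the batch size are all illustrative assumptions, not treefmt's actual implementation or defaults. When N is not a multiple of B, the count rounds up.

```python
import math


def batch(files: list[str], batch_size: int) -> list[list[str]]:
    """Split files into consecutive groups of at most batch_size."""
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]


files = [f"src/mod_{i}.rs" for i in range(10)]  # N = 10 (illustrative)
batches = batch(files, batch_size=4)            # B = 4 (illustrative)

# One formatter process per batch instead of one per file.
assert len(batches) == math.ceil(len(files) / 4)
print(len(files), "files ->", len(batches), "invocations")
```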

> you still may not want this if some other formatter is single threaded

This is a good question. We probably kind of assume that formatters are single threaded.

@cormacrelf cormacrelf Dec 2, 2025

Ah, it seems the docs page for go flags is just terrible. It says "the following forms are permitted" and enumerates -flag and --flag as if to be exhaustive, but doesn't list --flag=value or --flag value. These do in fact work, as I just discovered. Notoriously posix getopt does not support that.

> Compatibility story... TBD if that lives in treefmt itself, or treefmt-nix.

Please put it in treefmt. I don't want to use treefmt-nix. Sorry but I'm not learning yet another config schema, especially an opinionated one. And presuming you mean treefmt-nix would contain wrappers for all the well-known formatters to force them to fit the spec, one of the worst parts of Nix generally is that e.g. clang is buried about 40 layers deep under hacky bash wrappers each made of 6 layers of @@TEMPLATING_SYNTAX@@ and you never know what flags are actually being passed to the thing. Aside from making treefmt difficult to use for anything not covered by treefmt-nix, I don't know why you'd want to perpetuate that or take on that workload.

For everyday difficulty, try this example... let's say you make a typo over and over again. You have a code formatter to replace that specific typo. (This is legitimately very useful with jj fix for renaming a type or rewriting code across 80 commits at once, including code that was introduced and then replaced in a later commit.) You use sd to do it:

```
[formatter.cameron]
command = "sd"
options = ["Totanic", "Titanic"]
includes = ["*.md"]
```

This is instantly not compatible with the stdin spec. Basic things like this should be silently using tempfiles and not bothering me about spec compliance.

The batching etc

I think the best solution is to have a batch mode and a stdin mode. If no stdin mode is specified, we use a temp file for stdin tasks. If no regular (batch mode) options are specified, we use stdin when batch mode is called for. Bikeshed the option names, but basically:

```
[formatter.rust]
command = "rustfmt"
options = ["--edition", "2018"]
stdin_options = ["--edition", "2018", "--stdin-filepath=$path"]
includes = ["*.rs"]
```
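
To make the proposal concrete, here is a minimal sketch of how a runner might pick between the two modes. It assumes the `stdin_options` key and the `$path` placeholder proposed in this comment, neither of which exists in treefmt today, and `plan_stdin_invocation` is a made-up name for illustration.

```python
from string import Template


def plan_stdin_invocation(formatter: dict, path: str) -> tuple[str, list[str]]:
    """Decide how to format a virtual file for one formatter entry.

    `formatter` mirrors one [formatter.<name>] table. `stdin_options` and
    the `$path` placeholder are the hypothetical names from this thread.
    """
    command = formatter["command"]
    stdin_options = formatter.get("stdin_options")
    if stdin_options is not None:
        # Substitute $path in each argument; any other text is left as-is.
        args = [Template(opt).safe_substitute(path=path) for opt in stdin_options]
        return "stdin", [command, *args]
    # No stdin mode configured: fall back to a temp file. The caller would
    # create that file and append its name to the regular options.
    return "tempfile", [command, *formatter.get("options", [])]


rust = {
    "command": "rustfmt",
    "options": ["--edition", "2018"],
    "stdin_options": ["--edition", "2018", "--stdin-filepath=$path"],
}
sd = {"command": "sd", "options": ["Totanic", "Titanic"]}

print(plan_stdin_invocation(rust, "src/main.rs"))
print(plan_stdin_invocation(sd, "scene-01.md"))
```

With this shape, a formatter like `sd` that has no stdin mode is still usable for virtual files, at the cost of a temp file.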

Collaborator Author

@jfly jfly Dec 2, 2025

> You use sd to do it:
>
> ```
> [formatter.cameron]
> command = "sd"
> options = ["Totanic", "Titanic"]
> includes = ["*.md"]
> ```
>
> This is instantly not compatible with the stdin spec.

I don't see the problem here. You are correct that this tool does not implement the stdin spec, but what's the issue? It's optional to implement. From the proposed spec:

> Formatters MAY also implement the Stdin Specification [...] A formatter MUST implement the Stdin Specification if its formatting behavior can depend on the name of the file being formatted.

> I think best solution is have a batch mode and a stdin mode

I like your options and stdin_options suggestion here!

I think my point is it's optional, but if you don't implement it, what does treefmt --stdin scene-01.md do? And if your answer is that treefmt uses a tempfile because it doesn't support the spec, how do we know that? And how does treefmt decide to use --stdin-filepath=file.rs for something that does support the spec?

Realistically if we follow this to its conclusion, we get to either "we always pass --stdin-filepath=... and if the tool does not support it, that's their fault", or "we sometimes pass it, depending on a config flag for each formatter". Either option has a dead zone of formatters that don't support the spec when running treefmt --stdin which can't be supported except with a wrapper script (which, notably, you can't write easily in bash, because of getopt not supporting double dash flags).

The stdin_options approach does not have a dead zone and only comes at the cost of repeating yourself slightly.

Collaborator Author

@jfly jfly Dec 2, 2025

> And if your answer is that treefmt uses a tempfile because it doesn't support the spec, how do we know that?

I'm sorry, but I don't see the problem here. This is something treefmt already supports, and I'm not proposing changing that. It uses a tempfile under the hood because how else could it possibly work?

> And how does treefmt decide to use --stdin-filepath=file.rs for something that does support the spec?

We could add a boolean for stdin_spec, but I hear you: at the point that we're implementing something like stdin_options, it's probably simpler to just be ~~opinionated~~ unopinionated.

@cormacrelf cormacrelf Dec 2, 2025

> It uses a tempfile under the hood because how else could it possibly work?

I asked only to prompt the necessity of some kind of config, instead of unconditionally assuming it supports the spec and adding --stdin-filepath=.... I think we understand each other now.


Example:

```
$ echo "{}" | nixfmt --stdin-filepath path/to/file.nix
```
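
For formatter authors, a minimal sketch of parsing this CLI shape with Python's standard `argparse` might look as follows. The program name `myfmt` is hypothetical. `argparse` accepts both the space-separated and the `=` form of a long option without extra work.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "myfmt" is a hypothetical formatter name used only for this sketch.
    parser = argparse.ArgumentParser(prog="myfmt", allow_abbrev=False)
    parser.add_argument(
        "--stdin-filepath",
        metavar="PATH",
        default=None,
        help="read file contents from stdin; PATH is only a hint",
    )
    # Regular (non-stdin) mode still takes files, per the Formatter Specification.
    parser.add_argument("files", nargs="*")
    return parser


spaced = build_parser().parse_args(["--stdin-filepath", "path/to/file.nix"])
equals = build_parser().parse_args(["--stdin-filepath=path/to/file.nix"])
regular = build_parser().parse_args(["a.nix", "b.nix"])

print(spaced.stdin_filepath, equals.stdin_filepath, regular.stdin_filepath)
```

Here stdin mode is simply "the flag was given", so `regular.stdin_filepath` stays `None` and the formatter processes `files` as usual.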

### 2. Print to stdout, do not assume file is present on filesystem

When in stdin mode, the formatter:

1. **MUST** print the formatted file to stdout.
2. **MUST NOT** attempt to read the file on the filesystem. Instead, it
   **MUST** read from stdin.
3. **MUST NOT** write to the given path on the filesystem. It _MAY_ write to
   temporary files elsewhere on disk, but _SHOULD_ clean them up when done.
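
The three requirements above can be sketched as a small stdin-mode entry point. This is an illustration under stated assumptions, not a reference implementation: `format_text` is a toy rule (strip trailing whitespace) standing in for real formatting logic, and both function names are made up.

```python
import io
from typing import TextIO


def format_text(source: str, path: str) -> str:
    """Toy rule: strip trailing whitespace and end with a single newline.

    `path` is accepted so a real formatter could pick per-directory rules;
    this sketch ignores it, which the spec allows.
    """
    lines = [line.rstrip() for line in source.splitlines()]
    return "\n".join(lines) + "\n"


def run_stdin_mode(path: str, stdin: TextIO, stdout: TextIO) -> None:
    # The path is never opened for reading or writing, so the file does
    # not need to exist on disk.
    source = stdin.read()
    stdout.write(format_text(source, path))


out = io.StringIO()
run_stdin_mode("path/to/file.nix", io.StringIO("{}   \n"), out)
print(repr(out.getvalue()))
```

A real entry point would pass `sys.stdin` and `sys.stdout` instead of the in-memory streams used here.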