Skip to content

Enhance Git functionality with various improvements and fixes#1

Open
jame-crypto wants to merge 51 commits intoabraithwaite:alan/fetch-blob-size-limitfrom
jame-crypto:master
Open

Enhance Git functionality with various improvements and fixes#1
jame-crypto wants to merge 51 commits intoabraithwaite:alan/fetch-blob-size-limitfrom
jame-crypto:master

Conversation

@jame-crypto
Copy link

Thanks for taking the time to contribute to Git! Please be advised that the
Git community does not use github.com for their contributions. Instead, we use
a mailing list (git@vger.kernel.org) for code submissions, code reviews, and
bug reports. Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/)
to conveniently send your Pull Requests commits to our mailing list.

For a single-commit pull request, please leave the pull request description
empty
: your commit message itself should describe your changes.

Please read the "guidelines for contributing" linked above!

gitster and others added 30 commits February 5, 2026 20:36
For a patch that touches a symbolic link, it is perfectly normal
that the contents ends with "\ No newline at end of file".  The
checks introduced recently to detect incomplete lines (i.e., a text
file that lack the newline on its final line) should not trigger.

Disable the check early for symbolic links, both in "git apply" and
"git diff" and test them.  For "git apply", we check only when the
postimage is a symbolic link regardless of the preimage, and we only
care about preimage when applying in reverse.  Similarly, "git diff"
would warn only when the postimage is a symbolic link, or the
preimage when running "git diff -R".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git checkout", "git switch", and "git restore" commands share a
single implementation, checkout_main(), which switches error message
it gives using the usage string passed by each of these three
front-ends.

In order to be able to tweak behaviours of the commands based on
which one we are executing, invent an enum that denotes which one of
these three commands is currently executing, and pass that to
checkout_main() instead.  With this step, there is no externally
visible behaviour change, as this enum parameter is only used to
choose among the three usage strings.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git checkout <dwim>" and "git switch <dwim>" need to error out
due to ambiguity of the branch name <dwim>, these two commands give
an advise message with a sample command that tells the user how to
disambiguate from the parse_remote_branch() function.  The sample
command hardcodes "git checkout", since this feature predates "git
switch" by a large margin.  To a user who said "git switch <dwim>"
and got this message, it is confusing.

Pass the "enum checkout_command", which was invented in the previous
step for this exact purpose, down the call chain leading to
parse_remote_branch() function to change the sample command shown to
the user in this advise message.

Also add a bit more test coverage for this "fail to DWIM under
ambiguity" that we lack, as well as the message we produce when we
fail.

Reported-by: Simon Cheng <cyqsimon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both subcommands in git-repo(1) accept the "keyvalue" format. This
format is newline-delimited, where the key is separated from the
value with an equals sign.

The name of this option is suboptimal though, as it is both too
limiting while at the same time not really indicating what it
actually does:

  - There is no mention of the format being newline-delimited, which
    is the key differentiator to the "nul" format.

  - Both "nul" and "keyvalue" have a key and a value, so the latter
    is not exactly giving any hint what makes it so special.

  - "keyvalue" requires there to be, well, a key and a value, but we
    want to add additional output that is only going to be newline
    delimited.

Taken together, "keyvalue" is kind of a bad name for this output
format.

Luckily, the git-repo(1) command is still rather new and marked as
experimental, so things aren't cast into stone yet. Rename the
format to "lines" instead to better indicate that the major
difference is that we'll get newline-delimited output. This new name
will also be a better fit for a subsequent extension in git-repo(1).

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user wants to find what are the available keys, they need to
either check the documentation or to ask for all the key-value pairs
by using --all.

Add a new flag --keys for listing only the available keys without
listing the values.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When setting up the revision walk in git-history(1) we also perform some
verifications whether the request actually looks sane. Unfortunately,
these verifications come _after_ we have already asked the user for the
commit message of the commit that is to be rewritten. So in case any of
the verifications fails, the user will have lost their modifications.

Extract the function to set up the revision walk and call it before we
ask for user input to fix this.

Adapt one of the tests that is expected to fail because of this check
to use false(1) as editor. If the editor had been executed by Git, it
would fail with the error message "Aborting commit as launching the
editor failed."

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The replay infrastructure is not yet capable of replaying merge commits.
Unfortunately, we only notice that we're about to replay merges after we
have already asked the user for input, so any commit message that the
user may have written will be discarded in that case.

Fix this by checking whether the revwalk contains merge commits before
we ask for user input.

Adapt one of the tests that is expected to fail because of this check
to use false(1) as editor. If the editor had been executed by Git, it
would fail with the error message "Aborting commit as launching the
editor failed."

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-history(1) command has the ability to perform a dry-run
that will not end up modifying any references. Instead, we'll only print
any ref updates that would happen as a consequence of performing the
operation.

This mode is somewhat hidden though behind the "--ref-action=print"
option. This command line option has its origin in git-replay(1), where
it's probably an okayish interface as this command is sitting more on
the plumbing side of tools. But git-history(1) is a user-facing tool,
and this way of achieving a dry-run is way too technical and thus not
very discoverable.

Besides usability issues, it also has another issue: the dry-run mode
will always operate as if the user wanted to rewrite all branches. But
in fact, the user also has the option to only update the HEAD reference,
and they might want to perform a dry-run of such an operation, too. We
could of course introduce "--ref-action=print-head", but that would
become even less ergonomic.

Replace "--ref-action=print" with a new "--dry-run" toggle. This new
toggle works with both "--ref-action={head,branches}" and is way more
discoverable.

Add a test to verify that both "--ref-action=" values behave as
expected.

This patch is best viewed with "--ignore-space-change".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the preceding commit we have changed "--ref-action=" to only
control which refs are supposed to be updated, not what happens with
them. As a consequence, the option is now somewhat misnamed, as we don't
control the action itself anymore.

Rename it to "--update-refs=" to better align it with its new use.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we document the values that can be passed to the "--update-refs="
option, we don't give the user any hint what the default behaviour is.
Document it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "lstrip" and "rstrip" options to the %(refname) placeholder both
accept a negative length, which asks us to keep that many path
components (rather than stripping that many).

The code to count components and convert the negative value to a
positive was copied from lstrip to rstrip in 1a34728 (ref-filter: add
an 'rstrip=<N>' option to atoms which deal with refnames, 2017-01-10).

Let's factor it out into a separate function. This reduces duplication
and also makes the lstrip/rstrip functions much easier to follow, since
the bulk of their code is now the actual stripping.

Note that the computed "remaining" value is currently stored as a
"long", so in theory that's what our function should return. But this is
purely historical. When the variable was added in 0571979 (tag: do
not show ambiguous tag names as "tags/foo", 2016-01-25), we parsed the
value from strtol(), and thus used a long. But these days we take "len"
as an int, and also use an int to count up components. So let's just
consistently use int here. This value could only overflow in a
pathological case (e.g., 4GB worth of "a/a/...") and even then will not
result in out-of-bounds memory access (we keep stripping until we run
out of string to parse).

The minimal Myers diff here is a little hard to read; with --patience
the code movement is shown much more clearly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're walking forward in the string, skipping path components from
left-to-right. So when we've stripped as much as we want, the pointer we
have is a complete NUL-terminated string and we can just return it
(after duplicating it, of course). So there is no need for a temporary
allocated string.

But we do make an extra temporary copy due to f0062d3 (ref-filter:
free item->value and item->value->s, 2018-10-18). This is probably from
cargo-culting the technique used in rstrip_ref_components(), which
_does_ need a separate string (since it is stripping from the end and
ties off the temporary string with a NUL).

Let's drop the extra allocation. This is slightly more efficient, but
more importantly makes the code much simpler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're stripping path components from the end of a string, which we do by
assigning a NUL as we parse each component, shortening the string. This
requires an extra temporary buffer to avoid munging our input string.

But the way that we allocate the buffer is unusual. We have an extra
"to_free" variable. Usually this is used when the access variable is
conceptually const, like:

   const char *foo;
   char *to_free = NULL;

   if (...)
           foo = to_free = xstrdup(...);
   else
           foo = some_const_string;
   ...
   free(to_free);

But that's not what's happening here. Our "start" variable always points
to the allocated buffer, and to_free is redundant. Worse, it is marked
as const itself, requiring a cast when we free it.

Let's drop to_free entirely, and mark "start" as non-const, making the
memory handling more clear. As a bonus, this also silences a warning
from glibc-2.43 that our call to strrchr() implicitly strips away the
const-ness of "start".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To strip path components from our refname string, we repeatedly call
strrchr() to find the trailing slash, shortening the string each time by
assigning NUL over it. This has two downsides:

  1. Calling strrchr() in a loop is quadratic, since each call has to
     call strlen() under the hood to find the end of the string (even
     though we know exactly where it is from the last loop iteration).

  2. We need a temporary buffer, since we're munging the string with NUL
     as we shorten it (which we must do, because strrchr() has no other
     way of knowing what we consider the end of the string).

Using memrchr() would let us fix both of these, but it isn't portable.
So instead, let's just open-code the string traversal from back to
front as we loop.

I doubt that the quadratic nature is a serious concern. You can see it
in practice with something like:

  git init
  git commit --allow-empty -m foo
  echo "$(git rev-parse HEAD) refs/heads$(perl -e 'print "/a" x 500_000')" >.git/packed-refs
  time git for-each-ref --format='%(refname:rstrip=-1)'

That takes ~5.5s to run on my machine before this patch, and ~11ms
after. But I don't think there's a reasonable way for somebody to infect
you with such a garbage ref, as the wire protocol is limited to 64k
pkt-lines. The difference is measurable for me for a 32k-component ref
(about 19ms vs 7ms), so perhaps you could create some chaos by pushing a
lot of them. But we also run into filesystem limits (if the loose
backend is in use), and in practice it seems like there are probably
simpler and more effective ways to waste CPU.

Likewise the extra allocation probably isn't really measurable. In fact,
since our goal is to return an allocated string, we end up having to
make the same allocation anyway (though it is sized to the result,
rather than the input). My main goal was simplicity in avoiding the need
to handle cleaning it up in the early return path.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch" takes "--from=<user ident>" command line option
and uses the given ident for patch e-mails, but this is not applied
to the cover letter, the option is ignored and the committer ident
of the current user is used. This has been the case ever since
"--from" was introduced in a908047 (teach format-patch to place
other authors into in-body "From", 2013-07-03).

Teach the make_cover_letter() function to honor the option, instead of
always using the current committer identity. Change variable name from
"committer" to "from" to better reflect the purpose of the variable.

Signed-off-by: Mirko Faina <mroik@delayed.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several tests use a pattern that writes to a temporary file like this:

  printf "do something with %d\n" $(test_seq <count>) >tmpfile &&
  git do-something --stdin <tmpfile

Other tests use test_seq's -f parameter, but still write to a temporary file:

  test_seq -f "do something with %d" <count> >input &&
  git do-something --stdin <input

Simplify both of these patterns to

  test_seq -f "do something with %d" <count> |
  git do-something --stdin

Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c39952b (fetch: choose a sensible default with --jobs=0 again,
2023-02-20), the `--jobs=0` behavior was (re)introduced, but it went
undocumented. Since this is the same behavior as `git -c fetch.parallel=0
fetch`, which is documented, this change creates symmetry between the two
documentation sections.

Signed-off-by: Daniel D. Beck <daniel@ddbeck.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
help.c has its own get_alias() config callback that duplicates the
parsing logic in alias.c. Consolidate by teaching list_aliases() to
also store the alias values (via the string_list util field), then
use it in list_all_cmds_help_aliases() instead of the private
callback.

This preserves the existing error checking for value-less alias
definitions by checking in alias.c rather than help.c.

No functional change intended.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch git_unknown_cmd_config() from skip_prefix() to
parse_config_key() for alias parsing. This properly handles the
three-level config key structure and prepares for the new
alias.*.command subsection syntax in the next commit.

This is a compatibility break: the alias configuration parser used
to be overly permissive and accepted "alias.<subsection>.<key>" as
defining an alias "<subsection>.<key>". With this change,
alias.<subsection>.<key> entries are silently ignored (unless <key>
is "command", which will be given meaning in the next commit).

This behavior was arguably a bug, since config subsections were never
intended to work this way for aliases, and aliases with dots in their
names have never been documented or intentionally supported.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git alias names are limited to ASCII alphanumeric characters and
dashes because aliases are implemented as config variable names.
This prevents aliases being created in languages using characters outside that range.

Add support for arbitrary alias names by using config subsections:

    [alias "förgrena"]
        command = branch

The subsection name is matched as-is (case-sensitive byte comparison),
while the existing definition without a subsection (e.g.,
"[alias] co = checkout") remains case-insensitive for backward
compatibility. This uses existing config infrastructure since
subsections already support arbitrary bytes, and avoids introducing
Unicode normalization.

Also teach the help subsystem about the new syntax so that "git help
-a" properly lists subsection aliases and the autocorrect feature can
suggest them. Use utf8_strwidth() instead of strlen() for column
alignment so that non-ASCII alias names display correctly.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The zsh completion function __git_zsh_cmd_alias() uses 'git config
--get-regexp' to enumerate aliases and then strips the "alias." prefix
from each key. For subsection-style aliases (alias.name.command), this
leaves "name.command" as the completion candidate instead of just
"name".

The bash completion does not have this problem because it goes through
'git --list-cmds=alias', which calls list_aliases() in C and already
handles both alias syntaxes correctly. However, zsh needs both the
alias name and its value for descriptive completion, which
--list-cmds=alias does not provide.

Add a hidden --aliases-for-completion option to 'git help', following
the existing --config-for-completion pattern. It outputs NUL-separated
"name\nvalue" pairs using list_aliases(), which correctly resolves both
the traditional (alias.name) and subsection (alias.name.command)
formats. Update __git_zsh_cmd_alias() to use it.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "run-test-slice.sh" script executes the test helper to slice up
tests passed to it. As the execution is part of a pipe though, we end up
ignoring any potential error code returned by the helper.

Make the code more robust by storing the tests in a variable first so
that we can split up the pipeline.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "ci/run-test-slice.sh" script can be used to slice up all of our
tests into N pieces and then run each of them on a separate CI job.
This is used by both GitLab and GitHub CI to speed up Windows tests,
which would otherwise be painfully slow.

The infra itself is fueled by `test-tool path-utils slice-tests`. This
tool receives as input an "offset" and a "stride" that can be combined
to slice up tests. This framing can be misleading though: you are
expected to pass a zero-based index as "offset", and the complete number
of slices to the "stride". The latter makes sense, but it is somewhat
surprising that the offset needs to be zero-based. And this is in fact
biting us: while GitHub passes zero-based indices, GitLab passes
`$CI_NODE_INDEX`, which is a one-based indice.

Ideally, we should have verification that the parameters make sense.
And naturally, one would for example expect that it's an error to call
the binary with an offset larger than the stride. But with the current
framing as "offset" it's not even wrong to do so, as it is of course
well-defined to start at a larger offset than the stride.

This means that we get this wrong on GitLab's CI, as we pass a one based
index there, and this causes us to skip one of the tests. Interestingly,
it's not the lexicographically first test that we skip. Instead, as we
sort tests by size before slicing them, we skip the _smallest_ test.

Reframe the problem to instead talk about "slice number" and "total
number of slices". For all of our use cases this is semantically
equivalent, but it allows us to perform some verifications:

  - The total number of slices must be greater than 1.

  - The selected slice must be between 1 <= nr <= slices_total.

As the indices are now one-based it means that GitLab's CI is fixed.
The GitHub workflow is updated accordingly.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, we execute tests with "--no-bin-wrappers". This has been
introduced via a87e427 (ci: speed up Windows phase, 2019-01-29) to
save some time: spawning processes is expensive on Windows, and shell
scripts tend to spawn a bunch of them. So overall, the bin-wrappers led
to a performance overhead of ~10-30%.

This causes test failures when using Meson on Windows:

  failure: t7610.28 mergetool --tool-help shows recognized tools
    ++ git mergetool --tool-help
    /d/a/git/git/build/git-mergetool--lib: line 45: cd: D:/a/git/git/build/mergetools: No such file or directory

The root cause here is that our bin-wrappers are usually responsible for
setting up the `MERGE_TOOL_DIR` environment variable so that we can
locate these scripts. But as we don't use the bin-wrappers, we'll
instead use the default location for merge tools, which is derived from
`GIT_EXEC_PATH`. And as `GIT_EXEC_PATH` points to our build directory,
which won't ever contain any of the merge tools, we will fail to locate
any of the merge tools.

This issue has went unnoticed for a long time given that we only skip
bin-wrappers on Windows, and because the CI jobs on Windows didn't
execute due to a bug.

Fix the issue by always setting the `MERGE_TOOL_DIR` environment
variable to the correct directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the win+Meson test jobs run in GitHub workflows, the shell script
that is supposed to run the jobs is seemingly not running at all. All
that the CI job prints is the following:

  Run ci/run-test-slice-meson.sh build 1 10
    ci/run-test-slice-meson.sh build 1 10
    shell: C:\Program Files\PowerShell\7\pwsh.EXE -command ". '{0}'"
    env:
      DEVELOPER: 1

The step is currently defined to use PowerShell, and of course it
doesn't know how to execute POSIX shell scripts. What's surprising
though is that this step doesn't even lead to a CI failure.

Fix the issue by using Bash instead of PowerShell, as we do in other
steps that execute shell scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commit we have adjusted test slicing to be one-based
when using the "ci/run-test-slice.sh" script. But we also have an
equivalent script for Meson that is still zero-based, which is of course
inconsistent.

Adapt the script to be one-based, as well, and adapt the GitHub workflow
accordingly. Note that GitLab doesn't yet use the script, so it does not
need to be adapted. This will change in the next commit though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While our GitHub workflow already uses "ci/run-test-slice-meson.sh",
GitLab CI open-codes the parameters. Adapt the latter to also use the
same script so that we always use the same Meson options across both CI
systems.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The MSVC+Meson job does not currently have any logic to print failing
tests, nor does it upload the failed test artifacts. Backfill this logic
to make help debugging efforts in case any of its jobs has failed.

GitHub already knows to do this, so we don't need an equivalent change
over there.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two locations that iterate over the preferred bitmap tips as
configured by the user via "pack.preferBitmapTips". Both of these
callsites are subtly wrong: when the preferred bitmap tips contain an
exact refname match, then we will hit a `BUG()`.

Prepare for the fix by unifying the two callsites into a new
`for_each_preferred_bitmap_tip()` function.

This removes the last callsite of `bitmap_preferred_tips()` outside of
"pack-bitmap.c". As such, convert the function to be local to that file
only. Note that the function is still used by a second caller, so we
cannot just inline it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "pack.preferBitmapTips" configuration allows the user to specify
which references should be preferred when generating bitmaps. This
option is typically expected to be set to a reference prefix, like for
example "refs/heads/".

It's not unreasonable though for a user to configure one specific
reference as preferred. But if they do, they'll hit a `BUG()`:

    $ git -c pack.preferBitmapTips=refs/heads/main repack -adb
    BUG: ../refs/iterator.c:366: attempt to trim too many characters
    error: pack-objects died of signal 6

The root cause for this bug is how we enumerate these references. We
call `refs_for_each_ref_in()`, which will:

  - Yield all references that have a user-specified prefix.

  - Trim each of these references so that the prefix is removed.

Typically, this function is called with a trailing slash, like
"refs/heads/", and in that case things work alright. But if the function
is called with the name of an existing reference then we'll try to trim
the full reference name, which would leave us with an empty name. And as
this would not really leave us with anything sensible, we call `BUG()`
instead of yielding this reference.

One could argue that this is a bug in `refs_for_each_ref_in()`. But the
question then becomes what the correct behaviour would be:

  - Do we want to skip exact matches? In our case we certainly don't
    want that, as the user has asked us to generate a bitmap for it.

  - Do we want to yield the reference with the empty refname? That would
    lead to a somewhat weird result.

Neither of these feel like viable options, so calling `BUG()` feels like
a sensible way out. The root cause ultimately is that we even try to
trim the whole refname in the first place. There are two possible ways
to fix this issue:

  - We can fix the bug by using `refs_for_each_fullref_in()` instead,
    which does not strip the prefix at all. Consequently, we would now
    start to accept all references that start with the configured
    prefix, including exact matches. So if we had "refs/heads/main", we
    would both match "refs/heads/main" and "refs/heads/main-branch".

  - Or we can fix the bug by appending a slash to the prefix if it
    doesn't already have one. This would mean that we only match
    ref hierarchies that start with this prefix.

While the first fix leaves the user with strictly _more_ configuration
options, we have already fixed a similar case in 10e8a93 (refs.c:
stop matching non-directory prefixes in exclude patterns, 2025-03-06) by
using the second option. So for the sake of consistency, let's apply the
same fix here.

Clarify the documentation accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pks-t and others added 21 commits February 19, 2026 10:41
All callers of `refs_for_each_ref_in()` pass in a string that is
terminated with a trailing slash to indicate that they only want to see
refs in that specific ref hierarchy. This is in fact a requirement if
one wants to use this function, as the function trims the prefix from
each yielded ref. So if there was a reference that was called
"refs/bisect" as in our example, the result after trimming would be the
empty string, and that's something we disallow.

Fix this by adding the trailing slash.

Furthermore, taking a closer look, we strip the prefix only to re-add it
in `mark_for_removal()`. This is somewhat roundabout, as we can instead
call `refs_for_each_fullref_in()` to not do any stripping at all. Do so
to simplify the code a bit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We declare the refs_for_removal string_list as NODUP, forcing us to
manually allocate strings we insert. And then when it comes time to
clean up, we set strdup_strings so that string_list_clear() will free
them for us.

This is a confusing pattern, and can be done much more simply by just
declaring the list with the DUP initializer in the first place.

It was written this way originally because one of the callsites
generated the item using xstrfmt(). But that spot switched to a plain
xstrdup() in the preceding commit. That means we can now just let the
string_list code handle allocation itself.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We've got a couple of tests that exercise Git with different encodings,
typically around commit messages. All of these tests depend on the ICONV
prerequisite, which is set when Git was built with support for iconv.

Many of those tests also end up using the iconv(1) executable to
reencode text. But while tests can rely on the fact that Git does have
support for iconv, they cannot assume that the iconv(1) executable
exists. The consequence is thus that tests will break in case Git is
built with iconv, but the executable doesn't exist. In fact, some of the
tests even use the iconv(1) executable unconditionally, regardless of
whether or not the ICONV prerequisite is set.

Git for Windows has recently (unintentionally) shipped a change where
the iconv(1) binary is not getting installed anymore [1]. And as we use
Git for Windows directly in MSVC+Meson jobs in GitLab CI this has caused
such tests to break. The missing iconv(1) binary is considered a bug
that will be fixed in Git for Windows. But regardless of that it makes
sense to not assume the binary to always exist so that our test suite
passes on platforms that don't have iconv at all.

Extend the ICONV prerequisite so that we know to skip tests in case the
iconv(1) binary doesn't exist. We'll adapt tests that are currently
broken in subsequent commits.

[1]: git-for-windows#6083

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We've got a couple of tests related to diffs in t40xx that use the
iconv(1) executable to convert the encoding of a commit message. All of
these tests are prepared to handle a missing ICONV prereq, in which case
they will simply use UTF-8 encoding.

But even if the ICONV prerequisite has failed we try to use the iconv(1)
executable, even though it's not safe to assume that the executable
exists in that case. And besides that, it's also unnecessary to use
iconv(1) in the first place, as we would only use it to convert from
UTF-8 to UTF-8, which should be equivalent to a no-op.

Fix the issue and skip the call to iconv(1) in case the prerequisite is
not set. This makes tests work on systems that don't have iconv at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t4205 we have a bunch of tests that depend on the iconv prereq. This
is for most of the part because we format commit messages that have been
encoded in an encoding different than UTF-8.

Those tests fall into two classes though:

  - One class of tests outputs the data as-is without reencoding.

  - One class of tests outputs the data with "i18n.logOutputEncoding" to
    reencode it.

Curiously enough, both of these classes are marked with the ICONV
prereq, even though one might expect that the first class wouldn't need
the prereq. This is because we unconditionally use ISO-8859-1 encoding
for the initial commit message, and thus we depend on converting to
UTF-8 indeed.

This creates another problem though: when the iconv(1) executable does
not exist the test setup fails, even in the case where the ICONV prereq
has not been set.

Fix these issues by making the test encoding conditional on ICONV: if
it's available we use ISO-8859-1, otherwise we use UTF-8. This fixes the
test setup on platforms without iconv(1), and it allows us to drop the
ICONV prereq from a bunch of tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We've got a bunch of tests in t5550 that connect to "$HTTPD_URL/error"
to ensure that error messages are properly forwarded. This URL executes
the "t/lib-httpd/error.sh" script, which in turn depends on the iconv(1)
executable to reencode the message.

This executable may not exist on platforms, which will make the tests
fail. Guard them with the ICONV prereq to fix such failures.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two tests in t6006 depend on the iconv(1) prerequisite to reencode a
commit message. This executable may not even exist though in case the
prereq is not set, which will cause the tests to fail.

Fix this by using UTF-8 instead when the prereq is not set.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a strip option to the %(refname) placeholder is asked to leave N
path components, we first count up the path components to know how many
to remove. That happens with a loop like this:

	/* Find total no of '/' separated path-components */
	for (i = 0; p[i]; p[i] == '/' ? i++ : *p++)
		;

which is a little hard to understand for two reasons.

First, the dereference in "*p++" is seemingly useless, since nobody
looks at the result. And static analyzers like Coverity will complain
about that. But removing the "*" will cause gcc to complain with
-Wint-conversion, since the two sides of the ternary do not match (one
is a pointer and the other an int).

Second, it is not clear what the meaning of "p" is at each iteration of
the loop, as its position with respect to our walk over the string
depends on how many slashes we've seen. The answer is that by itself, it
doesn't really mean anything: "p + i" represents the current state of
our walk, with "i" counting up slashes, and "p" by itself essentially
meaningless.

None of this behaves incorrectly, but ultimately the loop is just
counting the slashes in the refname. We can do that much more simply
with a for-loop iterating over the string and a separate slash counter.

We can also drop the comment, which is somewhat misleading. We are
counting slashes, not components (and a comment later in the function
makes it clear that we must add one to compensate). In the new code it
is obvious that we are counting slashes here.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git repo info" learns "--keys" action to list known keys.

* lo/repo-info-keys:
  repo: add new flag --keys to git-repo-info
  repo: rename the output format "keyvalue" to "lines"
A handful of places used refs_for_each_ref_in() API incorrectly,
which has been corrected.

* ps/for-each-ref-in-fixes:
  bisect: simplify string_list memory handling
  bisect: fix misuse of `refs_for_each_ref_in()`
  pack-bitmap: fix bug with exact ref match in "pack.preferBitmapTips"
  pack-bitmap: deduplicate logic to iterate over preferred bitmap tips
UI improvements for "git history reword".

* ps/history-ergonomics-updates:
  Documentation/git-history: document default for "--update-refs="
  builtin/history: rename "--ref-action=" to "--update-refs="
  builtin/history: replace "--ref-action=print" with "--dry-run"
  builtin/history: check for merges before asking for user input
  builtin/history: perform revwalk checks before asking for user input
Code clean-up.

* jk/ref-filter-lrstrip-optim:
  ref-filter: clarify lstrip/rstrip component counting
  ref-filter: avoid strrchr() in rstrip_ref_components()
  ref-filter: simplify rstrip_ref_components() memory handling
  ref-filter: simplify lstrip_ref_components() memory handling
  ref-filter: factor out refname component counting
"git switch <name>", in an attempt to create a local branch <name>
after a remote tracking branch of the same name gave an advise
message to disambiguate using "git checkout", which has been
updated to use "git switch".

* jc/checkout-switch-restore:
  checkout: tell "parse_remote_branch" which command is calling it
  checkout: pass program-readable token to unified "main"
It does not make much sense to apply the "incomplete-line"
whitespace rule to symbolic links, whose contents almost always
lack the final newline.  "git apply" and "git diff" are now taught
to exclude them for a change to symbolic links.

* jc/whitespace-incomplete-line:
  whitespace: symbolic links usually lack LF at the end
CI update.

* ps/ci-gitlab-msvc-updates:
  gitlab-ci: handle failed tests on MSVC+Meson job
  gitlab-ci: use "run-test-slice-meson.sh"
  ci: make test slicing consistent across Meson/Make
  github: fix Meson tests not executing at all
  meson: fix MERGE_TOOL_DIR with "--no-bin-wrappers"
  ci: don't skip smallest test slice in GitLab
  ci: handle failures of test-slice helper
Some tests assumed "iconv" is available without honoring ICONV
prerequisite, which has been corrected.

* ps/tests-wo-iconv-fixes:
  t6006: don't use iconv(1) without ICONV prereq
  t5550: add ICONV prereq to tests that use "$HTTPD_URL/error"
  t4205: improve handling of ICONV prerequisite
  t40xx: don't use iconv(1) without ICONV prereq
  t: don't set ICONV prereq when iconv(1) is missing
Extend the alias configuration syntax to allow aliases using
characters outside ASCII alphanumeric (plus '-').

* jh/alias-i18n:
  completion: fix zsh alias listing for subsection aliases
  alias: support non-alphanumeric names via subsection syntax
  alias: prepare for subsection aliases
  help: use list_aliases() for alias listing
"git format-patch --from=<me>" did not honor the command line
option when writing out the cover letter, which has been corrected.

* mf/format-patch-honor-from-for-cover-letter:
  format-patch: fix From header in cover letter
Doc update.

* db/doc-fetch-jobs-auto:
  doc: fetch: document `--jobs=0` behavior
Test clean-up.

* ap/use-test-seq-f-more:
  t: use test_seq -f and pipes in a few more places
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@abraithwaite abraithwaite force-pushed the alan/fetch-blob-size-limit branch 2 times, most recently from 4bf3e1e to fa1ea69 Compare March 6, 2026 23:38
@abraithwaite abraithwaite force-pushed the alan/fetch-blob-size-limit branch 2 times, most recently from a63a589 to 480453b Compare March 15, 2026 01:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants