Skip to content

Conversation

@jackye1995
Copy link
Contributor

@jackye1995 jackye1995 commented Oct 28, 2025

This PR moves the release mechanism to follow the semantic versioning standard with a streamlined RC voting process and automated breaking change detection. See release_process.md for the updated workflow and examples.

@github-actions github-actions bot added the ci Github Action or Test issues label Oct 28, 2025
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@jackye1995 jackye1995 force-pushed the fix-release-minor branch 2 times, most recently from d32a12a to 6fae7eb Compare November 4, 2025 19:33
@jackye1995 jackye1995 changed the title ci: fix version bump from preview to stable ci!: move to semantic versioning release mechanism Nov 4, 2025

### Standard Major/Minor Release

```bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe?

Suggested change
```bash
```markdown

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this seems like too many scripts and workflows. Can we consolidate this into a single script?

python ci/bump_version.py beta
python ci/bump_version.py rc
python ci/bump_version.py stable

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I shared the same concern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that we have now a few different workflows that do some overlapping but overall quite different steps, if you read the release process doc for different types of releases. I can definitely squeeze everything to one or two scripts, but I feel having a 1-1 relationship with the GitHub workflow and match what user would eventually do seems to make more sense.

When Will commented on this I was trying to just split all steps to different scripts and that was definitely too much, but this current state is already a version that tries to reuse things as much as possible.

@jackye1995 jackye1995 marked this pull request as draft November 4, 2025 21:59
@jackye1995 jackye1995 marked this pull request as ready for review November 5, 2025 07:49
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 53 to 58
- name: Create RC
id: create_rc
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_REPOSITORY: ${{ github.repository }}
GITHUB_OUTPUT: ${{ github.output }}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid overriding GITHUB_OUTPUT in release workflows

The new release workflows set GITHUB_OUTPUT: ${{ github.output }} for the steps that run ci/create_major_minor_rc.sh, ci/create_rc.sh, ci/promote_to_stable.sh, and ci/publish_beta.sh. There is no github.output context, so this assignment replaces GitHub’s automatically populated $GITHUB_OUTPUT file path with an empty string. Each script then writes its outputs to that variable, but the redirections silently fail, leaving the step outputs unset. Subsequent steps attempt to push branches and tags using steps.<id>.outputs.*, which will be blank and cause the workflow to fail (git push origin "" etc.) whenever the job is run. Drop the manual GITHUB_OUTPUT assignment and let Actions supply the variable so that the scripts can publish outputs correctly. This misconfiguration appears in all four newly added workflows.

Useful? React with 👍 / 👎.

Comment on lines 3 to 12
Lance uses a semantic versioning release workflow with Release Candidates (RC) for voting before stable releases. The process features linear commit history on branches.

## Overview

The release process follows these stages:
1. **Development** on `main` branch with beta versions
2. **Preview Releases** published from beta versions (beta.1+)
3. **Release Candidates** created for voting (rc.1, rc.2, etc.)
4. **Stable Releases** promoted from approved RCs
5. **Patch Releases** managed through release branches
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a suggested summary:

Suggested change
Lance uses a semantic versioning release workflow with Release Candidates (RC) for voting before stable releases. The process features linear commit history on branches.
## Overview
The release process follows these stages:
1. **Development** on `main` branch with beta versions
2. **Preview Releases** published from beta versions (beta.1+)
3. **Release Candidates** created for voting (rc.1, rc.2, etc.)
4. **Stable Releases** promoted from approved RCs
5. **Patch Releases** managed through release branches
Lance uses [semantic versioning](https://semver.org/) and maintains a linear commit history.
* All pull requests are merged into the `main` branch.
* Beta releases (or preview releases) are created on-demand from the `main` branch.
* Stable releases (non-prereleases) are created only after a voting process and come from a release branch `release/vX.Y`. These are typically created once every two weeks.
* Release Candidates (RC) are created from release branches prior to voting.
* Patch releases are created by cherry-picking fixes from `main` into the release branch, voting on a new RC, and releasing.
```mermaid
gitGraph
commit
branch feature
checkout feature
commit
checkout main
merge feature
branch bugfix
checkout bugfix
commit id: "bugfix"
checkout main
branch "release/v1.4"
checkout "release/v1.4"
commit tag: "1.4.0-rc.1" tag: "1.4.0"
checkout main
merge bugfix
commit id: "merged"
checkout "release/v1.4"
cherry-pick id: "merged"
commit tag: "1.4.1-rc.1" tag: "1.4.1"
checkout main
commit tag: "1.5.0-beta.1"
```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to make "These are typically created once every two weeks." more specific like minor version typically every two weeks, and major version typically every 2 months?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly the major version might be more often, as I realized DataFusion often does major releases once a month. I say let's wait on giving specifics for major releases until we see how often it becomes in practice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also updated the patch release case as "Patch releases are created by committing fixes directly to the release branch, voting on a new RC, and releasing." since there could be cases that we commit a simpler fix in the release branch, and do a larger refactoring in the main branch.

Copy link
Contributor Author

@jackye1995 jackye1995 Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I think based on the latest discussion "* Patch releases are created by cherry-picking fixes from main into the release branch, voting on a new RC, and releasing." is the expected behavior, changed back to that, directly patching release is unrecommended except for emergent cases, but we don't need to write about exceptions

Comment on lines 33 to 49
**Stable Releases (no suffix)**:
- **Rust**: crates.io
- **Python**: PyPI
- **Java**: Maven Central
- **Protobuf**: Buf Schema Registry

**RC Releases (`-rc.N`)**:
- **Rust**: Not published (use git tag)
- **Python**: fury.io
- **Java**: Maven Central (pre-release)
- **Protobuf**: Buf Schema Registry (pre-release)

**Beta Releases (`-beta.N`)**:
- **Rust**: Not published (use git tag)
- **Python**: fury.io
- **Java**: Maven Central (pre-release)
- **Protobuf**: Buf Schema Registry (pre-release)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this a markdown table:

Suggested change
**Stable Releases (no suffix)**:
- **Rust**: crates.io
- **Python**: PyPI
- **Java**: Maven Central
- **Protobuf**: Buf Schema Registry
**RC Releases (`-rc.N`)**:
- **Rust**: Not published (use git tag)
- **Python**: fury.io
- **Java**: Maven Central (pre-release)
- **Protobuf**: Buf Schema Registry (pre-release)
**Beta Releases (`-beta.N`)**:
- **Rust**: Not published (use git tag)
- **Python**: fury.io
- **Java**: Maven Central (pre-release)
- **Protobuf**: Buf Schema Registry (pre-release)
| Language | Stable release | RC release | Beta release |
|----------|----------------|------------|--------------|
| Rust | [crates.io](https://crates.io/crates/lance) | Tag on GitHub [^1] | Tag on GitHub [^1] |
| Python | [PyPI](https://pypi.org/project/pylance/) [^2] | fury.io [^3] | fury.io [^3] |
| Java | [Maven Central](https://mvnrepository.com/artifact/com.lancedb/lance-core) | [Maven Central](https://mvnrepository.com/artifact/com.lancedb/lance-core) | [Maven Central](https://mvnrepository.com/artifact/com.lancedb/lance-core) |
| Protobuf | [Buf Schema Registry](https://buf.build/lancedb/lance) | [Buf Schema Registry](https://buf.build/lancedb/lance) | [Buf Schema Registry](https://buf.build/lancedb/lance) |
[^1]: Since Rust is a source-only release, with no built artifacts, we just publish non-stable releases as source code on GitHub.
[^2]: Note: there is another project called "lance" on PyPI, but it is not affiliated with Lance. https://pypi.org/project/lance/
[^3]: To install from fury.io, pass `--pre --extra-index-url https://pypi.fury.io/lancedb/`

### GitHub Releases and Release Notes

**RC Releases (`-rc.N`)**:
- Created as GitHub Pre-Releases with custom-generated release notes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find "custom-generated" release notes unclear. Does that mean a person wrote them? Or a script?

I think the goal of this section should be to make clear how and when release notes are created. If there is a script that generates them, we should link to that.

Copy link
Contributor Author

@jackye1995 jackye1995 Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key is just from which commit to which commit should the release note cover. I changed it to a markdown table as well, let me know if that looks more clear

on fury.io. This allows us to release frequently and get feedback on new features
while keeping under the PyPI project size limits.
**Flow explanation:**
- **Main branch** at `1.3.0-beta.2` (has release-root tag) → create RC checks for breaking changes → if none, RC is `1.3.0-rc.1`, main bumps to `1.4.0-beta.0` with new release-root tag → `1.4.0-beta.1` (preview, tagged) → create RC for v1.4.0 → main bumps to `1.5.0-beta.0`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did you think of my earlier idea of making RC's just tags, but not reflected in the version field?

So instead, the release branch immediately changes version to stable 1.3.0, but tags the commit with v1.3.0-rc.1. Once the vote passes, then add another tag v1.3.0 and trigger the final release.

Is that just hard to do with the current release artifacts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did you think of my earlier idea of making RC's just tags, but not reflected in the version field?

Yeah I was implementing that but I had to revert to have dedicated RC commits. The main reason is basically the release artifacts. I want to be able to actually publish the python/java artifacts to fury and MavenCentral that corresponds to the version, and then we can ask codex to create a branch for testing that rc in lancedb, as well as internally. If I just tag it, then the artifact will have name 1.3.0 directly, it would cause potential issue with resolution if someone configures both the public and fury repo (like geneva), and also if we have multiple RCs then they will end up publishing to the same version, which I think would fail? Even if it can be overwritten it might cause confusion down the road if some artifacts in some platforms fail and the artifacts are in a mixed state.

I also experimented with making the branch just holding beta releases, so each RC just bumps up the beta release version, and we just tag a specific beta release is for RC.N (e.g. 1.3.0-beta.2 is for rc.1). That would solve the issue above, but I ended up feeling if we are doing that mapping of which RC maps to which beta release, it's probably cleaner to just have a dedicated rc commit.

Let me know what you think about it and if there are better ways

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this seems fine. We might find a way later to simplify in practice. I agree that the artifacts make it necessary to do the RCs for now.


There are three conditions that can trigger a full release:
1. **publish-beta.yml** - Publish beta preview releases from any branch
2. **create-major-minor-rc.yml** - Create initial RC for new major/minor version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we instead call this create-release-branch? I think that's more the action I would think of wanting to do. "create-major-minor-rc" feels more wordy and unclear.

1. **publish-beta.yml** - Publish beta preview releases from any branch
2. **create-major-minor-rc.yml** - Create initial RC for new major/minor version
3. **create-rc.yml** - Create RC on existing release branch (for new patch release RC or iterations of an existing RC)
4. **promote-rc-to-stable.yml** - Promote any RC to stable (works for all release types)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be renamed to "approve RC".

Suggested change
4. **promote-rc-to-stable.yml** - Promote any RC to stable (works for all release types)
4. **approve-rc.yml** - Promote any RC to stable (works for all release types)

To publish preview releases for testing before creating RCs:

## Make a preview release
### Publish Beta Preview from Main
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following sections are really wordy. I think we should try to consolidate into three brief step-by-step guides:

  1. Create a beta / preview release
  2. Create a major / minor release
  3. Create a patch / bugfix release

If we can, let's put unnecessary detail in a <details> block that's collapsed by default.

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve of release_process.md

Comment on lines 10 to 12
3. **Release Candidates** created for voting (rc.1, rc.2, etc.)
4. **Stable Releases** promoted from approved RCs
5. **Patch Releases** managed through release branches
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear here (and maybe it doesn't need to be) if the release branches are introduced at the release candidates stage or the patch releases stage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superseded by #5089 (comment) which mentions the release branch

Comment on lines +25 to +51
- **beta.0**: Unreleased version (exists on branch but not published)
- Created after cutting an RC to mark the next unreleased version
- Indicates no preview has been published yet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a beta.0? Is this an actual tag or branch? Or is it just a placeholder in Cargo.toml when something hasn't been released?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it's just a placeholder for unreleased content since the last release branch cut. We always first commit the version bump, and then tag on that bumped version, and then trigger all the publish workflows with it, so there won't really be beta.0 release version, the first beta version will always be x.y.z-beta.1.

An alternative could be to reverse the order so we tag the latest main first and then bump version. But that created other complications in detecting breaking changes because you could be in x.y.z-beta.1 and have to tag it with a (x+1).0.0-beta.1 because it actually contains breaking changes. So I ended up using this beta.0 approach that feels the most logical when user sees the code at any commit version.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of always use 1.3.0-dev on main and only update the version while doing release. So

  • dev version never been released.
  • we don't need to have a beta.0.

Comment on lines +20 to +45
- **Beta**: `X.Y.Z-beta.N` (e.g., `1.3.0-beta.1`, `1.3.0-beta.2`)
- **RC**: `X.Y.Z-rc.N` (e.g., `1.3.0-rc.1`, `1.3.0-rc.2`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean if there is both a 1.3.0-beta.2 and a 1.3.0-rc.2? Is it safe to assume the rc comes after the beta?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because when there is 1.3.0-rc.2 it means we already have cut a release branch release/1.3 and the main will be bumped to 1.4.0-beta.0 or 2.0.0-beta.0.

### Main Branch
- Always contains the latest development work
- Version format: `X.Y.Z-beta.N`
- After RC creation, bumped to next minor version with `-beta.0` (unreleased)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I look in main and I look at the Cargo.toml do I see "the version most recently released"? Or do I see "the next version to be released?" Or do I see something else?

Copy link
Contributor Author

@jackye1995 jackye1995 Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you see -beta.0 that means there was just a release branch cut, and this is there is no any new beta releases or new branch cuts yet.

After a first beta release, you see the version of the beta release that was released previously.

The main reasons I did not end up using the released version but use the next version + -beta.0 are:

  1. for beta releases, you see the version and you know this is that release + some new commits. But for the released version, because of the divergence after the branch cut, the content in main is no longer exactly the released version + some new commits, so I wanted to make that distinction.
  2. if I use the released version on main as the new version after cutting the branch, then I need to handle one more case both when I cut the next release branch and when I do a beta release, which becomes quite complicated compared to the current setup.
  3. in this setup basically when people build any commit that is not tagged as stable, it will be shown as some sort of beta or rc. When it is recorded in the writer version we will have an idea that it is building some version off the main branch instead of using the released version.

@jackye1995 jackye1995 force-pushed the fix-release-minor branch 3 times, most recently from fee5a40 to af6ab97 Compare November 6, 2025 07:16
Copy link
Collaborator

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, only some questions.

* Beta releases (or preview releases) are created on-demand from the `main` branch.
* Stable releases (non-prereleases) are created only after a voting process and come from a release branch `release/vX.Y`. These are typically created once every two weeks.
* Release Candidates (RC) are created from release branches prior to voting.
* Patch releases are created by committing fixes directly to the release branch, voting on a new RC, and releasing.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we always require a commit on main first, then backport to the release branch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with that

Comment on lines +25 to +51
- **beta.0**: Unreleased version (exists on branch but not published)
- Created after cutting an RC to mark the next unreleased version
- Indicates no preview has been published yet
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of always use 1.3.0-dev on main and only update the version while doing release. So

  • dev version never been released.
  • we don't need to have a beta.0.


| Language | Stable release | RC release | Beta release |
|--------------|---------------------|-----------------------------------|-----------------------------------|
| **Rust** | crates.io | Not published (use git tag) | Not published (use git tag) |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of always release them so downstream can just use version instead of playing with git tag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed a bit, landed with let's use git for now, publishing everything could make the page look messy, also we don't want to make beta releases too easily consumable that people start to use that in prod unintentionally.

| **RC (Major/Minor)** | Pre-Release | `release-root/X.Y.0-beta.N` | `vX.Y.0-rc.N` | All changes for the release |
| **RC (Patch)** | Pre-Release | `vX.Y.(Z-1)` | `vX.Y.Z-rc.N` | Only changes in this patch release |
| **RC (Iterations)** | Pre-Release | `vX.Y.(Z-1)` | `vX.Y.Z-rc.N` | Only changes in this patch release (not changes against previous RC) |
| **Stable (Major/Minor)** | Release | `release-root/X.Y.0-beta.N` | `vX.Y.0` | All changes from main + RC fixes |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's place stable release first: stable -> rc -> beta

- Format: `release/v{major}.{minor}` (e.g., `release/v1.3`)
- Created when cutting initial RC for major/minor release
- Maintained for patch releases
- Version progression: `rc.1``rc.2` → stable → `beta.0``rc.1` (for patches)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same: if we are using v1.3.0-dev always, we don't need to commit rc.1, beta.0 in repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is basically the same issue with #5089 (comment), that it would mess up with release artifacts.

The previous implementation tried to parse the UTC time string and convert it,
which failed on GitHub Actions runners. Now we calculate both UTC and Pacific
times directly using the date command with timezone support.
- Move release-root tags to correct commits (they point backwards to commits before RC creation)
- Show M0 has both release-root/1.3.0-beta.N and release-root/1.4.0-beta.N
- Show M2 gets release-root/1.5.0-beta.N when cutting next RC
- Update flow explanation to clarify tag placement
- Fix technical details section to specify tags point to main commit, not RC commit
- Remove unnecessary release-root/1.3.0-beta.N from M0
- Change 1.5.0-beta.0 to 2.0.0-beta.1 (breaking changes scenario)
- Show both release-root/1.4.0-beta.N and release-root/2.0.0-beta.N on M0 (same commit)
- Update explanation to clarify both tags point to same commit
@jackye1995
Copy link
Contributor Author

Thanks everyone for the review!

@jackye1995 jackye1995 merged commit 3f74e85 into lance-format:main Nov 7, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change ci Github Action or Test issues java python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants