Skip to content

Fix #18654: Add safe_numel()#18698

Open
JiwaniZakir wants to merge 1 commit intopytorch:mainfrom
JiwaniZakir:fix/18654-add-safe-numel
Open

Fix #18654: Add safe_numel()#18698
JiwaniZakir wants to merge 1 commit intopytorch:mainfrom
JiwaniZakir:fix/18654-add-safe-numel

Conversation

@JiwaniZakir
Copy link
Copy Markdown

Closes #18654

Summary

Adds TensorImpl::safe_numel(), a bounds-checked alternative to numel() that returns a Result<ssize_t> instead of silently overflowing.

The existing numel_ member is computed in the TensorImpl constructor and stored as a plain integer, meaning overflow goes undetected and can lead to heap overflows. Because numel() is called directly in constructors and must remain ABI/BC-compatible with ATen, we cannot change its return type. Instead, safe_numel() is introduced as a new method that callers can opt into when they need overflow safety.

Changes:

  • runtime/core/portable_type/tensor_impl.h: Adds the Result<ssize_t> safe_numel() const declaration and includes result.h.
  • runtime/core/portable_type/tensor_impl.cpp: Implements safe_numel() by iterating over sizes_ and checking for overflow before each multiply using numel <= std::numeric_limits<ssize_t>::max() / sizes_[i]. Returns InvalidArgument if any dimension is negative or if the product would overflow ssize_t. Includes <limits> for std::numeric_limits.

Test plan

Three new unit tests added in runtime/core/portable_type/test/tensor_impl_test.cpp:

  • SafeNumelReturnsCorrectValue: verifies that a {3, 2} tensor returns 6.
  • SafeNumelScalar: verifies that a 0-dim tensor (no sizes) returns 1.
  • SafeNumelOverflowReturnsError: constructs a tensor with dimensions {INT32_MAX, INT32_MAX, 3}, whose product overflows ssize_t on both 32-bit and 64-bit platforms, and asserts that safe_numel() returns a non-ok Result.

This PR was created with AI assistance (Claude). The changes were reviewed by quality gates and a critic model before submission.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 4, 2026 03:05
@pytorch-bot
Copy link
Copy Markdown

pytorch-bot bot commented Apr 4, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18698

Note: Links to docs will display an error until the docs builds have been completed.

❌ 11 Awaiting Approval, 1 New Failure

As of commit 9e8f017 with merge base 6020c29 (image):

AWAITING APPROVAL - The following workflows need approval before CI can run:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 4, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new TensorImpl::safe_numel() API to compute element counts with explicit overflow/validation checks, returning executorch::runtime::Result<ssize_t> instead of relying on the existing numel_ value (which can overflow).

Changes:

  • Introduces TensorImpl::safe_numel() declaration returning Result<ssize_t>.
  • Implements safe_numel() with per-dimension validation and pre-multiply overflow checks against std::numeric_limits<ssize_t>::max().
  • Adds unit tests covering correct results, scalar behavior, and overflow error reporting.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
runtime/core/portable_type/tensor_impl.h Declares safe_numel() and includes result.h.
runtime/core/portable_type/tensor_impl.cpp Implements overflow-checked safe_numel() using sizes_ and std::numeric_limits.
runtime/core/portable_type/test/tensor_impl_test.cpp Adds tests validating safe_numel() success cases and an overflow error case.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +154 to +155
* Returns the number of elements in the tensor, or an error if the result
* would overflow ssize_t.
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment for safe_numel() says it errors only on overflow, but the implementation also returns InvalidArgument when any dimension size is negative. Please update the comment to reflect all error conditions (e.g., negative sizes and overflow) so callers know what to expect.

Suggested change
* Returns the number of elements in the tensor, or an error if the result
* would overflow ssize_t.
* Returns the number of elements in the tensor, or an error if any
* dimension size is negative or if computing the result would overflow
* ssize_t.

Copilot uses AI. Check for mistakes.
Comment on lines +72 to +80
Result<ssize_t> TensorImpl::safe_numel() const {
ssize_t numel = 1;
for (const auto i : c10::irange(dim_)) {
ET_CHECK_OR_RETURN_ERROR(
sizes_[i] >= 0,
InvalidArgument,
"Size must be non-negative, got %zd at dimension %zd",
static_cast<ssize_t>(sizes_[i]),
i);
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

safe_numel() duplicates much of compute_numel()’s logic (iteration + non-negative size checks). Consider factoring the shared computation into a single helper (e.g., a checked-multiply routine) so future changes to size validation/numel semantics don’t have to be kept in sync in multiple places.

Copilot uses AI. Check for mistakes.
sizes_[i] == 0 ||
numel <= std::numeric_limits<ssize_t>::max() / sizes_[i],
InvalidArgument,
"Tensor numel overflows ssize_t");
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overflow failure path logs/returns "Tensor numel overflows ssize_t" without any context (which dimension triggered it, the offending size, or the partial product). Including at least the dimension index and size value (and optionally the partial numel) would make this error actionable when debugging shape issues.

Suggested change
"Tensor numel overflows ssize_t");
"Tensor numel overflows ssize_t at dimension %zd: size=%zd, partial_numel=%zd",
i,
static_cast<ssize_t>(sizes_[i]),
numel);

Copilot uses AI. Check for mistakes.
Comment on lines +60 to +65
// Three large dimensions whose product overflows ssize_t on any platform:
// On 64-bit: INT32_MAX^2 * 3 > INT64_MAX; on 32-bit: INT32_MAX^2 > INT32_MAX.
SizesType sizes[3] = {
std::numeric_limits<SizesType>::max(),
std::numeric_limits<SizesType>::max(),
3};
Copy link

Copilot AI Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test assumes ssize_t is at most 64-bit (so INT32_MAX^2*3 always overflows). On platforms with wider ssize_t (or if the project ever targets such), safe_numel() could legitimately succeed and this test would fail. Consider constructing the shape to overflow based on std::numeric_limits<ssize_t>::max() (e.g., increasing the number of max-sized dimensions until overflow is guaranteed) or guarding the expectation with a static/runtime check of ssize_t width.

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 4, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add safe_numel()

3 participants