[RV64_DYNAREC] config vector before VLE in sse_get_reg_vector #3624

Open
zqb-all wants to merge 1 commit into ptitSeb:main from zqb-all:rv64-fix-sse_get_reg_vector
Conversation

@zqb-all
Contributor

@zqb-all zqb-all commented Mar 7, 2026

No description provided.

@zqb-all
Contributor Author

zqb-all commented Mar 7, 2026

INST_NAME("PMOVSXBW Gx, Ex");
nextop = F8;
if (cpuext.xtheadvector) {
    SET_ELEMENT_WIDTH(x1, VECTOR_SEW8, 1);
    GETEX_vector(q1, 0, 0, VECTOR_SEW8);
    GETGX_empty_vector(q0);
    v0 = fpu_get_scratch_lmul(dyn, VECTOR_LMUL2);
    vector_vsetvli(dyn, ninst, x1, VECTOR_SEW8, VECTOR_LMUL1, 0.5);
    VWADD_VX(v0, q1, xZR, VECTOR_UNMASKED);
    SET_ELEMENT_WIDTH(x1, VECTOR_SEW16, 1);
    VMV_V_V(q0, v0);
} else {
    if (!MODREG) SET_ELEMENT_WIDTH(x1, VECTOR_SEW8, 1);
    GETEX_vector(q1, 0, 0, VECTOR_SEW8);

Line 520: when MODREG is true, the if condition is not met, so SET_ELEMENT_WIDTH is not executed here.
Line 521: GETEX_vector is then invoked.

#define GETEX_vector(a, w, D, sew)                                                                  \
    if (MODREG) {                                                                                   \
        a = sse_get_reg_vector(dyn, ninst, x1, (nextop & 7) + (rex.b << 3), w, sew);                \
    } else {                                                                                        \
        SMREAD();                                                                                   \
        addr = geted(dyn, addr, ninst, nextop, &ed, x3, x2, &fixedaddress, rex, NULL, 0, D);        \
        a = fpu_get_scratch(dyn);                                                                   \
        VLE_V(a, ed, sew, VECTOR_UNMASKED, VECTOR_NFIELD1);                                         \
    }

Line 456: GETEX_vector calls sse_get_reg_vector, which may then issue a VLE without a preceding SET_ELEMENT_WIDTH.

My initial thought was to remove the if (!MODREG) condition in instructions like PMOVSXBW, always SET_ELEMENT_WIDTH. However, it seems cleaner to add SET_ELEMENT_WIDTH directly inside sse_get_reg_vector, and it also avoids potential bugs in other places that call sse_get_reg_vector without SET_ELEMENT_WIDTH first.
If modifying sse_get_reg_vector introduces issues I'm not currently aware of, we can fall back to removing the if (!MODREG) condition in PMOVSXBW.
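
To make the failure mode concrete, here is a minimal toy model of the two patterns discussed above. It is only a sketch: every name in it (vec_state_t, set_element_width, guarded_get_ex, unconditional_get_ex) is hypothetical and not box64's API, and it assumes a 128-bit VLEN and a load that transfers vl elements of the requested width.

```c
#include <assert.h>

/* Toy model only: hypothetical names, not box64's real data structures. */
typedef struct {
    int sew;           /* element width (bits) of the last configuration */
    int vl;            /* element count set by the last configuration */
    int vsetvli_count; /* number of configuration instructions "emitted" */
} vec_state_t;

/* Reconfigure only when the tracked width differs from the request.
 * Assumes VLEN = 128 bits, so vl = 128 / sew. */
static void set_element_width(vec_state_t *st, int sew)
{
    assert(sew == 8 || sew == 16 || sew == 32 || sew == 64);
    if (st->sew != sew) {
        st->sew = sew;
        st->vl = 128 / sew;
        st->vsetvli_count++;
    }
}

/* A unit-stride load moves vl elements of eew bits each. */
static int load_bytes(const vec_state_t *st, int eew)
{
    return st->vl * eew / 8;
}

/* Guarded pattern: configure only on the memory-operand path. */
static int guarded_get_ex(vec_state_t *st, int modreg, int sew)
{
    if (!modreg) set_element_width(st, sew);
    return load_bytes(st, sew);
}

/* Unconditional pattern: always configure before the load. */
static int unconditional_get_ex(vec_state_t *st, int modreg, int sew)
{
    (void)modreg;
    set_element_width(st, sew);
    return load_bytes(st, sew);
}
```

In this model, starting from a stale 64-bit configuration (vl = 2), the guarded pattern on the register path loads 2 bytes instead of 16, while the unconditional pattern loads 16 at the cost of one extra configuration instruction.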

@ptitSeb ptitSeb requested a review from ksco March 7, 2026 14:41
Collaborator

@ksco ksco left a comment

My initial thought was to remove the if (!MODREG) condition in instructions like PMOVSXBW, always SET_ELEMENT_WIDTH. However, it seems cleaner to add SET_ELEMENT_WIDTH directly inside sse_get_reg_vector, and it also avoids potential bugs in other places that call sse_get_reg_vector without SET_ELEMENT_WIDTH first.

Thanks. SET_ELEMENT_WIDTH is not free, so I think it's better to keep it explicit. Replacing all the if (!MODREG) SET_ELEMENT_WIDTH(x1, VECTOR_SEW8, 1); with SET_ELEMENT_WIDTH(x1, VECTOR_SEW8, 1); is preferred.

Collaborator

@ksco ksco left a comment

Sorry, wrong approval selected.

@zqb-all
Contributor Author

zqb-all commented Mar 9, 2026

I have another concern:
PMOVSXBW is supposed to read only 8 bytes, but SET_ELEMENT_WIDTH(x1, VECTOR_SEW8, 1) will in fact cause 16 bytes to be read (as vl is 16). A better way to control this behavior would be something similar to vector_vsetvli(dyn, ninst, x1, VECTOR_SEW8, VECTOR_LMUL1, 0.5); as done in the xtheadvector branch.

what do you think?
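
The arithmetic behind this concern can be sketched as follows. This is an illustration under stated assumptions, not box64 code: it assumes a 128-bit VLEN and that the final argument of vector_vsetvli scales the element count, as the comment above implies. The helper names elements_for and bytes_loaded are hypothetical, and the multiplier is passed as a fraction to avoid floating point.

```c
#include <assert.h>

/* Assumption: VLEN is 128 bits, matching one XMM register. */
#define VLEN_BITS 128

/* Element count for a given element width (bits) and a multiplier
 * expressed as the fraction mult_num / mult_den (1/2 stands for "0.5"). */
static int elements_for(int sew_bits, int mult_num, int mult_den)
{
    assert(sew_bits == 8 || sew_bits == 16 || sew_bits == 32 || sew_bits == 64);
    assert(mult_num > 0 && mult_den > 0);
    return (VLEN_BITS / sew_bits) * mult_num / mult_den;
}

/* Bytes moved by a unit-stride load of that many elements. */
static int bytes_loaded(int sew_bits, int mult_num, int mult_den)
{
    return elements_for(sew_bits, mult_num, mult_den) * sew_bits / 8;
}
```

With 8-bit elements, a multiplier of 1 gives 16 elements and therefore a 16-byte read, while a multiplier of 1/2 gives 8 elements and the 8-byte read that PMOVSXBW needs.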

@ksco
Collaborator

ksco commented Mar 9, 2026

I have another concern: PMOVSXBW is supposed to read only 8 bytes, but SET_ELEMENT_WIDTH(x1, VECTOR_SEW8, 1) will in fact cause 16 bytes to be read (as vl is 16). A better way to control this behavior would be something similar to vector_vsetvli(dyn, ninst, x1, VECTOR_SEW8, VECTOR_LMUL1, 0.5); as done in the xtheadvector branch.

what do you think?

Oh yeah, indeed. We need some GETEX64_vector/GETEX32_vector/GETEX16_vector macros.
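
As a rough illustration of why three partial widths come up, the sketch below computes how many source bytes each PMOVSX/PMOVZX form consumes when filling a 128-bit destination, and gives a plain reference version of PMOVSXBW. The function names pmov_source_bytes and pmovsxbw_ref are hypothetical helpers for this note, not part of box64.

```c
#include <assert.h>
#include <stdint.h>

/* Source bytes consumed when src_bits-wide elements are widened to
 * dst_bits-wide elements in a 128-bit destination register. */
static int pmov_source_bytes(int src_bits, int dst_bits)
{
    assert(src_bits > 0 && src_bits < dst_bits);
    assert(128 % dst_bits == 0);
    int lanes = 128 / dst_bits; /* number of destination elements */
    return lanes * src_bits / 8;
}

/* Reference semantics of PMOVSXBW: sign-extend 8 bytes to 8 words.
 * Only src[0..7] is read. */
static void pmovsxbw_ref(int16_t dst[8], const int8_t *src)
{
    for (int i = 0; i < 8; i++) {
        dst[i] = (int16_t)src[i];
    }
}
```

Under this calculation the byte-to-word, word-to-dword and dword-to-qword forms read 8 bytes, the byte-to-dword and word-to-qword forms read 4 bytes, and the byte-to-qword form reads 2 bytes, which lines up with the 64/32/16 suffixes suggested above.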

@zqb-all
Contributor Author

zqb-all commented Mar 9, 2026

Thanks, that makes sense. I will add GETEX64_vector/GETEX32_vector/GETEX16_vector.

PMOV not read 128bit, need GETEX64/GETEX32/GETEX16
@zqb-all zqb-all force-pushed the rv64-fix-sse_get_reg_vector branch from 3548b05 to 8c7bff7 Compare March 9, 2026 07:40
@zqb-all zqb-all requested a review from ksco March 9, 2026 08:59

// Get EX as a quad, (x1 is used)
#define GETEX_vector(a, w, D, sew) \
    SET_ELEMENT_WIDTH(x1, sew, 1); \
Contributor Author

GETEX_PARTIAL_vector internally contains SET_ELEMENT_WIDTH. Should we also add SET_ELEMENT_WIDTH here? If so, the next step would be to remove all SET_ELEMENT_WIDTH calls that appear before any invocation of GETEX_vector.