-
Notifications
You must be signed in to change notification settings - Fork 932
[sw,otbnsim] Add KMAC interface to OTBNsim #29025
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
93a47e0 to
860684f
Compare
etterli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this first implementation! It looks pretty good.
I reviewed the CSR/WSR and integration part but not yet completely the kmac.py as well as the software test. My feedback is mostly about error handling, otherwise the concept looks good to me.
| if self._custom_read: | ||
| return self._custom_read(self) | ||
| # Otherwise, use the parent logic. | ||
| return super().read_unsigned() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the parent's read_unsigned() will throw a NotImplementedError as it is nowhere overwritten.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup correct. For now I decided to crash because it's easier to debug. At some point it could be smart to either add an error bit or just return 0. Since I wasn't sure which one I wanted to do I just left it like this for now.
| self._custom_write(self, value) | ||
| # Otherwise, use the parent logic. | ||
| else: | ||
| super().write_unsigned(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problem as for the super().read_unsigned() case.
| if mask & ~valid_mask: | ||
| raise ValueError(f"Invalid reset mask {hex(mask)}.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if this error triggers? I think the whole simulator does crash?
Shouldn't an invalid mask simply be ignored? Because I think it is a possibility that a program will write an invalid mask (e.g. the random instruction generator). On the RTL side, such an invalid mask will either lead to an exception or the write will be ignored. If we decided that it should lead to an exception the simulator should also generate an OTBN exception. But I would prefer if an invalid mask would simply be ignored.
But maybe I don't see the intention of this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I agree. I'll make sure the simulation doesn't crash anymore once we are satisfied with the overall behaviour.
| raise RuntimeError( | ||
| f"Configuration value {hex(mask)} exceeds the allowed {KMAC_INTR_BITS}-bit range." | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as for the set_if_status check.
| if not (0 <= value < (1 << KMAC_CFG_BITS)): | ||
| raise RuntimeError( | ||
| f"Configuration value {hex(value)} exceeds the allowed {KMAC_CFG_BITS}-bit range." | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as for the set_if_status check.
| if not (0 <= value < (1 << KMAC_CMD_BITS)): | ||
| raise RuntimeError( | ||
| f"Command value {hex(value)} exceeds the allowed {KMAC_CMD_BITS}-bit range." | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar case as for set_if_status. But here I can see a reason that an invalid command should raise an exception. However, crashing the simulator is probably not the right way to do. I think writing an invalid command should be handled in one of the following two ways:
- Raise an exception similar to how an instruction reading/writing an invalid CSR crashes (as for example here)
- Ignore the invalid command and set an error bit in the
kmac_if_statusCSR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I believe for this we have the KMAC_INTR and KMAC_ERROR registers. I think we should just ignore all the bits except for the KMAC_CMD_BITS. I'll get rid of runtime errors once everyone is happy with how the model works conceptually.
| raise RuntimeError( | ||
| f"Strobe value {hex(value)} exceeds the allowed {KMAC_STROBE_BITS}-bit range." | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as for set_cmd.
| if not (0 <= value < (1 << KMAC_WSR_BITS)): | ||
| raise RuntimeError( | ||
| f"Message value {hex(value)} exceeds the allowed {KMAC_WSR_BITS}-bit range." | ||
| ) | ||
|
|
||
| if share not in (0, 1): | ||
| raise RuntimeError("Share must be 0 or 1.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Crashing here and in _get_data makes sense as these functions are not directly called by an instruction and all checks cover a state which should be unreachable.
| if self._custom_read: | ||
| return self._custom_read(self) | ||
| # Otherwise, use the parent logic. | ||
| return super().read_unsigned() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problematic as for CustomCSR.
| self._custom_write(self, value) | ||
| # Otherwise, use the parent logic. | ||
| else: | ||
| super().write_unsigned(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problematic as for CustomCSR.
This commit adds a new KMAC model that can be used to model a KMAC-OBTN interface. This model can be used to interface KMAC from OTBN. This model should behave like the KMAC HW IP. Signed-off-by: Hakim Filali <[email protected]>
Use the new KMAC OTBNsim model to extend OTBNsim. This commit adds new WSRs and CSRs to OTBNsim and connects them to the new KMAC model. Signed-off-by: Hakim Filali <[email protected]>
This commit adds an assembly test that tests all allowed combinations of KMAC modes and kstrengths. This commit shall also serve as a first example of how to use the interface. Signed-off-by: Hakim Filali <[email protected]>
|
Thanks @etterli for your comments. Nice finds! I added some TODOs to make sure your comments aren't forgotten. I also adapted the documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should try to simplify this by first restricting the number of bits to squeezed out of SHA3 rate buffer to 64 instead of 256. Supporting reasons:
- No more awkward alignment issues because the rate buffer can always be fully drained with 64-bit chunks. This significantly simplifies the implementation of arbitrary logic that will make use of the KMAC interface. The latency overhead might be insignificant given that the (large) rate buffer can be read quickly.
- This is only a point of contention for the XOFs that need to potentially squeeze thousands of bytes, the running time for KMAC and the hash functions with fixed-length digests will be dominated by the data absorption stage.
- The induced hardware circuit will be of similar complexity if not simpler.
- Increased re-usability through a leaner API. The example below is fine as a correctness test but is not sufficient to convey how the interface could be efficiently used for arbitrary use cases. The usage of the API is now a major engineering challenge even though the SHA3 function in itself is simply a building block of much larger applications. A complex to use API could be fine if it can be hidden in a black box but due to the lack abstraction and expressiveness of OTBN assembly this is a void hope. Every application needs to implement its custom logic and for that to be feasible the API should be simple and straightforward.
| * | ||
| * clobbered registers: x22 | ||
| */ | ||
| run_kmac_test_vector: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split this into several routines.
kmac_{sha,shake,kmac}_initkmac_absorb_{masked,unmasked}kmac_squeeze_{masked,unmasked}kmac_finish
Here is a leaner absorption routine:
/**
* Absorb a message of size n bytes.
*
* This routine must only be called after the initialization of the KMAC
* interface (see `xof_{sha3,shake128,shake256,kmac}_init`). The message can be
* in masked (2 Boolean shares at DMEM[x21], DMEM[x22] with entrypoint
* `xof_absorb_masked`) or unmasked (at DMEM[x21] with entrypoint
* `xof_absorb_unmasked`) form.
*
* @param[in] x20: n, size of the input message.
* @param[in] x21: DMEM address of the 1st message share.
* @param[in] x22: DMEM address of the 2nd message share (ignored when unmasked)
*/
xof_absorb_unmasked:
/*
* When the message is unmasked, we use a constant zero word as the stand-in
* for the second message share and set the address for increment for it to 0.
*/
la x22, _xof_zero_word
addi x23, x0, 0
jal x0, _xof_absorb_loop
xof_absorb_masked:
/*
* Use a 32-byte (one 256-bit WDR) as the increment for the second message
* share.
*/
addi x23, x0, 32
/*
* In each iteration, the KMAC interface absorbs min(k, 32) bytes, where k is
* the number of bytes left.
*/
_xof_absorb_loop:
/* Exit the absorption loop, if k == 0. */
beq x20, x0, _xof_absorb_end
/*
* Set the strobe register value such that
*
* KMAC_BYTE_STROBE = 2^32-1 >> (32 - x),
*
* where x = 0, if k >= 32, else x = k.
*/
/* Clamp k: x = 0, if k >= 32, else x = k. */
addi x24, x20, -32
srai x24, x24, 31
and x24, x20, x24
/* (32 - x). */
addi x25, x0, 32
sub x25, x25, x24
/* 2^32-1 >> (32 - x). */
addi x24, x0, -1
srl x24, x24, x25
csrrw x0, KMAC_BYTE_STROBE, x24
/* Make sure KMAC is ready to absorb data. */
addi x24, x0, KMAC_IF_STATUS_MSG_WRITE_RDY
jal x1, _xof_kmac_if_status_poll
/* Write the message word to the KMAC interface, ensuring that the shares are
not combined in an unsecure fashion. */
addi x24, x0, 24
bn.lid x24++, 0(x21)
bn.wsrw KMAC_DATA_S0, w24
bn.xor w31, w31, w31 /* Dummy instruction */
bn.lid x24++, 0(x22)
bn.wsrw KMAC_DATA_S1, w25
/* Trigger the absorption of the written message word. */
addi x24, x0, 1
csrrw x0, KMAC_MSG_SEND, x24
/*
* Decrement the number of remaining message bytes such that
*
* k = max(k - 32, 0).
*/
/* (k - 32) */
addi x20, x20, -32
/* k = max(k - 32, 0). */
srai x24, x20, 31
addi x25, x0, -1
xor x24, x24, x25
and x20, x20, x24
/* Advance the message share DMEM addresses. */
addi x21, x21, 32
add x22, x22, x23
jal x0, _xof_absorb_loop
_xof_absorb_end:
/* Overwrite the message registers with random bits before exiting. */
bn.wsrr w24, RND
bn.wsrw KMAC_DATA_S0, w24
bn.wsrr w24, RND
bn.wsrw KMAC_DATA_S1, w24
ret| self.KeyS1H = KeyWSR('KeyS1H', 256, self.KeyS1) | ||
| self.KMAC_DATA_S0 = CustomWSR( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: The self.KMAC_DATA_S0 should probably follow the existing naming scheme and should be renamed to self.KmacDataS0. This also applies to KMAC_DATA_S1.
This PR adds a new KMAC interface to OTBNsim.
The first commit adds an implementation for the model.
The second commit connects the KMAC model to OTBN.
The third commit contains a working assembly example.