libswdg: fix potential sleeping forever#63
Conversation
After calling channel callback, it was possible to sleep on condWait without proper timeout. YT: NIL-1079
Summary of ChangesHello @mateuszkobak, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical bug in the software watchdog (libswdg) where the system could enter an indefinite sleep state. The fix ensures that the watchdog's deadline is consistently updated, preventing miscalculations that could lead to the watchdog failing to trigger or reset as expected, thereby enhancing system reliability. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request fixes a bug where the watchdog thread could sleep forever if a watchdog channel timed out by correctly recalculating the next timeout value after a channel's callback is executed. However, a medium-severity vulnerability was identified: a potential integer overflow that could lead to incorrect watchdog behavior, effectively a denial of service of the watchdog functionality. Furthermore, the new logic introduces a critical issue where a busy-loop can occur if a watchdog channel is configured with a negative timeout limit, as diff would become negative and cause condWait to misbehave. Specific comments with suggested fixes and remediations have been provided for these issues.
|
|
||
| diff = deadline - now; | ||
| DEBUG("Channel %zu diff = %llu\n", i, diff); | ||
| if ((timeout == 0) || (timeout > diff)) { |
There was a problem hiding this comment.
The variable diff can become negative if swdg_common.chan[i].limit is a negative value. time_t is signed, and swdg_chanConfig() doesn't validate the limit parameter. A negative diff assigned to timeout would cause condWait() to likely return immediately, resulting in a busy-loop (100% CPU usage). This is a critical issue.
To safeguard against this, you should ensure diff is positive before updating timeout. While the root cause is in swdg_chanConfig(), a local fix here can prevent the busy-loop.
if (diff > 0 && ((timeout == 0) || (timeout > diff))) {There was a problem hiding this comment.
@gemini-code-assist i think the real isssue here is using time_t for timeout-like variables.
I think that valid solution for that situation is to change the API.
If we don't want to change API for some reason, validate limit arg to swdg_chanConfig should be check if it's > 0
It's not directly connected with fixed issue in this commit though
|
|
||
| /* Reload if there was no reset */ | ||
| _swdg_reload(i, now); | ||
| deadline = now + swdg_common.chan[i].limit; |
There was a problem hiding this comment.
The addition on this line is vulnerable to a signed integer overflow. Both now and limit are of type time_t (a signed integer). If their sum exceeds the maximum value for time_t, the result (deadline) will wrap around and become a negative number. This can lead to incorrect watchdog behavior, either triggering the callback prematurely or causing the watchdog thread to sleep for an unintended duration, effectively disabling the watchdog mechanism. It's recommended to add a check to prevent the overflow. You may need to include <limits.h> to use LLONG_MAX.
if ((swdg_common.chan[i].limit > 0) && (now > LLONG_MAX - swdg_common.chan[i].limit)) {
deadline = LLONG_MAX;
} else {
deadline = now + swdg_common.chan[i].limit;
}
anglov
left a comment
There was a problem hiding this comment.
Agree with gemini. It require major refactor. Not in scope of current issue fix.
After calling channel callback, it was possible to sleep on condWait without proper timeout.
YT: NIL-1079
Description
Motivation and Context
Types of changes
How Has This Been Tested?
Checklist:
Special treatment