plat-qcom: Add diagnostic logging support#7
plat-qcom: Add diagnostic logging support#7zelvam95 wants to merge 3 commits intoqualcomm-linux:qcom-nextfrom
Conversation
cd68754 to
995ec2b
Compare
@zelvam95 how is this allocation done on IMEM for the diag buffer? |
Hi @b49020, I referred to QTEE Code & in QTEE, this region is being used for DIAG Buffer. Its reserved/marked down for that purpose. Have we changed the layout/do we have a separate layout for our current open source boot loader stack? |
This DIAG buffer is internal to QTEE memory layout where it is put on IMEM I think. How does the crashdump flow gets to know this address? Is it hard coded in the tooling somewhere for each SoC? |
Its not hard coded; Its rather retrieved from another IMEM address. For QTEE DIAG Log, they write the DIAG Buffer start address at offset 0x720 & crashscope/tools read that offset, understand where the DIAG Log is present and then does the parsing. |
Does that mean whole of IMEM is reserved for the crashdump tooling? Isn't XBL using any of the IMEM regions too? I think the memory layout has to be studied further. Would you be able to put the memory layout on a confluence page such that we can have further alignment? |
|
Kodiak: 100K starting 0x14680000 Start Address Page Size Region Lemans: 200K starting 0x14680000 Start Address Page Size Region |
|
Hi @b49020, This info is documented and available internally already. I have shared high level split above. The OCIMEM is split across different sub systems for different use-cases as required. We can review this and align on this. |
|
@zelvam95 thanks for gathering that info, I have updated TF-A BL2 for Lemans to load and execute from pIMEM instead here: qualcomm-linux/trusted-firmware-a#15. Feel free to test your changes with that. |
|
Also, TF-A BL2 on Kodiak was already executing from pIMEM as you can see here: https://github.com/qualcomm-linux/trusted-firmware-a/blob/qcom-next/plat/qti/hoya/kodiak/rb3gen2/inc/platform_def.h#L15 |
| return (reg_value == DLOAD_MAGIC_COOKIE); | ||
| } | ||
|
|
||
| void qcom_diag_log_init(void) |
There was a problem hiding this comment.
should this function return an error code?
There was a problem hiding this comment.
Even if we return error code, all we can do is log it in calling place... I think since we're already logging here during successful init + magic is updated to indicate any initialization failure which can be used by tooling/HLOS drivers as required, so I think we needn't necessarily return an error here.
Let me know if you feel its better to return for any reason? & we can plan accordingly;
There was a problem hiding this comment.
yes, at least do generate a WARN on diag being configured but not available.
Tracing errors is important on secure platforms.
core/drivers/qcom_diag_log.c
Outdated
| #define DLOAD_MAGIC_COOKIE 0x10 /* Download mode detection value */ | ||
|
|
||
| /* | ||
| * struct diag - Main diagnostic region structure |
There was a problem hiding this comment.
no need for these comments. IMO we should use standard semantics for the circular buffer instead of these definitions.
Would you mind using the usual head, tail? maybe check how linux implements a simple one (include/linux/circ_buf.h)
struct diag {
uint32_t magic;
uint32_t ring_len; /* must be power of 2 */
uint32_t head; /* write index, monotonically increasing */
uint32_t tail; /* read index, monotonically increasing */
uint32_t overrun;
struct mutex lock;
uint8_t log_buf[];
};
could we also provide a reader or it makes no sense?
Otherwise, if we are not going to provide a reader lets not call them overruns and we dont need the tail...they are just wraps I think?
for (p = str; *p; p++) {
head = diag->head++;
if (head && !(head & (diag->ring_len - 1)))
diag->wrap++;
diag->log_buf[head & (diag->ring_len - 1)] = *p;
}
dsb();
mutex_unlock(&diag->lock);
There was a problem hiding this comment.
Reader not required. version was suggested during wider discussion with SSG Team.
This log driver would be used across arch + there will be tooling built around this. We don't need mutex around this. Mutex is there in the calling place for diag_log_putc (plat_trace_ext_puts). Similarly tail is not required.
ring_offset & ring_len also have a purpose & it will be used by HLOS/tooling infra built around this. We can potentially introduce more variables in the future version before log_buf & ring_offset would print to exact start of log_buf.
Also, ring_len gives out the log buf size which can be used by tooling infra.
This struct was aligned on based on discussion with SSG Team & we plan to re-use same infra/structure across TF-A & OP-TEE. I got a comment to make both version & magic as uint32_t which I will be updating.
There was a problem hiding this comment.
I dont understand your points
- the diag region is global memory. every write access to it must be protected by a mutex. the mutex must belong to the structure.
- the identifiers/names in the structure seem to come from a school primer on computer science. we can/should do better. if there is no reader this is not really a ring buffer, it is just a circular buffer (at least for me).
These sort of definitions seem more appropriate to me (layout may vary)
struct circ_wo_buf {
uint8_t *buf;
size_t size;
unsigned int head;
unsigned int wrap;
struct mutex lock;
};
struct diag_buf {
uint16_t version;
uint 16_t magic
struct circ_wo_buf wo_cbuf; /*maybe also *wo_cbuf in case address varies with version */
}
There was a problem hiding this comment.
Hi Jorge,
Reg. variable naming/structure definition:
- I was clarifying on the layout as to why its required and about the tooling requirements downstream for retaining a particular layout.
- Also, the current struct variables like offset, len etc. namings are already used in BCM's implementation which is available upstream. Its a local variable there & I have made it as part of the header, thats the main diff between QCOM vs BCM's impl..
- I feel different persons might have different perspective/preferences in this. Do you feel the BCM's impl is not upto the mark and their naming is also like from a school primer? I used that as reference since thats the only implementation available upstream which uses this plat_trace_ext_puts.
- Anyways, I am open to make minor modifications as suggested above with nested structure + slight name changes..
Reg. mutex - Your concern about global memory access is valid in general, but in this specific case:
- The trace infrastructure already provides serialization: All calls to
qcom_diag_log_puts()go throughtrace_ext_puts()which ensures serialized access. - Single controlled entry point: Unlike general-purpose drivers, this has a single entry point through the trace infrastructure, which acts as the serialization gatekeeper.
- This follows the established pattern in OP-TEE:
-
- BCM's elog implementation (
plat-bcm/bcm_elog.c) uses the same approach: global memory buffer, no mutex, relies on trace infrastructure
- BCM's elog implementation (
Note: Would request you to please check the trace_ext_puts function -> where plat_trace_ext_puts is called & also check the BCM's implementation of log driver where a similar diag log impl is already available and upstreamed.
There was a problem hiding this comment.
struct circ_wo_buf {
unsigned int buf_off;
size_t size;
unsigned int head;
unsigned int wrap;
uint8_t buf[];
};
struct diag_buf {
uint32_t version;
uint32_t magic;
struct circ_wo_buf wo_cbuf;
};
There was a problem hiding this comment.
Planning to update above way now. We're using "flexible array" at the end & so can't update as uint8_t* buf at the start. buf_off is to give the "offset" to buf[]. In future, even if we add new fields, the external toolings etc would read the buf_off to identify buf[] start and so wouldn't have issues.
Let me know what you think of this struct + mutex requirements.
There was a problem hiding this comment.
WRT to the lock, yes, drop it from the circular buffer definition - I didnt notice we were using the trace infrastructure (I should have clone the repo and read it)
WRT to the names, yes, do use head for semantics and define a circular buffer explicitly.
These are not slight name changes - these are software semantics that make the code readable - the distance between the word "offset" or "head" is similar to the difference between an atom and a horse.
I am not part of the SSG discussions on the structure defintions, but do not use a flat layout - use abstractions...ie see below for another example....
struct diag_hdr {
uint16_t version;
uint16_t magic;
};
struct circ_buf_conf {
uint32_t base;
uint32_t size;
};
struct circ_wo_buf {
uint32_t wrap;
uint32_t head;
uint8_t buf[];
};
struct diag {
struct diag_hdr hdr;
struct circ_buf_conf conf;
struct circ_wo_buf wo_cbuf;
};
There was a problem hiding this comment.
Yes, I am using trace infra. Mutex part is clear now :).
Reg. structure variable naming:
I used "BCMs" implementation of the log driver as reference for this & in their implementation if you check, you'd notice they are using "offset" instead of "head" for a very similar log driver that they have. I assumed what's already upstream available can be used as a standard for reference but probably not all changes upstream follow these software semantics. I mentioned them as slight name changes because otherwise if it was considered as part of sw semantics, then they should have knocked down the BCMs impl when it was posted for review?
Note - I am just trying to highlight the point of reference - that it was already avail upstream.
Anyways, You can check the latest patch where I have modified to using head instead of offset. I will incrementally do changes as suggested above with abstractions & post same for review.
There was a problem hiding this comment.
yeah, don't rely on what's upstream :) as you know (better than I do!) we have many platforms to upstream so the better our abstractions (the more indirections) the easier will be to extend and maintain.
995ec2b to
8038358
Compare
1e67f46 to
33edc8f
Compare
Add diagnostic ring buffer driver for Qualcomm platforms to enable persistent logging across system resets and crashes. The driver provides a circular buffer in reserved memory that persists across warm resets, allowing post-mortem analysis of system state before crashes or unexpected resets. Key features: - Circular ring buffer with wraparound tracking - Download mode detection to preserve logs during crash dumps Signed-off-by: Selvam Sathappan Periakaruppan <speriaka@qti.qualcomm.com>
Platform-specific target headers with memory configurations for Kodiak (0x14696000) and Lemans (0x146AF000) Signed-off-by: Selvam Sathappan Periakaruppan <speriaka@qti.qualcomm.com>
Enable the diagnostic ring buffer driver for Kodiak and Lemans platforms. Integration points: - plat_trace_ext_puts(): Routes trace output to diagnostic buffer - plat_console_init(): Initializes diagnostic buffer before UART console - CFG_QCOM_DIAG_LOG: Enabled in debug builds (CFG_TEE_CORE_DEBUG=y) The diagnostic buffer is initialized early in the boot process to capture logs from the earliest stages, providing valuable debugging information for system crashes and unexpected resets. Signed-off-by: Selvam Sathappan Periakaruppan <speriaka@qti.qualcomm.com>
33edc8f to
c302bd4
Compare
|
|
||
| for (p = str; *p; p++) { | ||
| diag->wo_cbuf.buf[diag->wo_cbuf.head++] = *p; | ||
| if (diag->wo_cbuf.head >= diag->wo_cbuf.size) { |
There was a problem hiding this comment.
it is safer to mask rather than this sort of checks to avoid corrupting memory on glitches.
we can use this sort of construct as long as we can guarantee the size of the buffer is ^2 I think. then there is zero risk or running over the memory
for (p = str; *p; p++) {
head = diag->head++;
if (head && !(head & (diag->wo_cbuf.size - 1)))
diag->wrap++;
diag->log_buf[head & (diag->wo_cbuf.size - 1)] = *p;
}
There was a problem hiding this comment.
Unfortunately, we cannot guarantee power-of-2 as size for these buffers. There are many existing platforms where this buffer size is not power of 2. Also, We have some header which we have to reduce from DIAG SIZE. So, lets say even if DIAG Size overall is 8KB (power of 2). If we subtract the header. Then the actual circ buffer size would become 4KB.
In our current case, the DIAG size is 12KB
Wasted space: Rounds down to power-of-2 (e.g., 12KB → 8KB usable)
~33% loss: For 12KB DIAG_SIZE, loses 4KB
For 8KB diag log we'd lose 50% and so on.
This region is ideally expected to be access protected (write access is restricted to AP-S). So, would it be safe to assume SEL1/SEL3 will not corrupt this? Or at least the risk is comparatively low? (Though ideally it is possible in some cases?)
There was a problem hiding this comment.
um, ok...it is a shame that the buffers cant be a power of 2.
|
Please remove the driver initialization from main.c - just initialize it in the driver itself (driver_init, early_init or whatever is appropiate). then qcom_diag_log_puts should keep a static pointer to a valid diag. And we shouldnt need to check for the magic every time... |
I would ideally prefer if this diag log is available as early as possible & I assumed that it'd be good to have this at same point of console init. Even BCM's implementation which is upstream, follows exact same approach where its init is done from main.c. In PRODUCTION -> Console logs would be disabled & DIAG Logs could be our only way for debug & ideally it'd be better to have this enabled as early as possible to capture everything?
This part of the comment I will check further & try to handle this. |
so why a driver? why not just a log feature...drivers (except the console) typically have the syscalls to decide when to load. @b49020 also do we need to integrate the tracing feature with the FFA console? I dont see any hooks there...do you know why? |
There isn't anything like FF-A console since there isn't any ABI requirement with TF-A there. But as you said for this diag feature, let's add support in the common platform code rather than a driver since it's not really running in a driver context but all the OP-TEE contexts. |
there is exactly an FF-A console (as an alternative to the console or the semihosting console)
none provide the hooks to trace. |
Ah I see, that's news to me. Not sure how that FF-A console manages sync among different SPs as each can clobber other console logs. Maybe there is SP identifier. Anyhow, let's not tie this with FF-A since this feature is expected to work on platforms which don't support FF-A too. BTW, this is related to Qualcomm crashdump tooling as well, so we need to support diagnostic logs that the tooling understands during a crash. Also, thinking about it more the driver vs platform code. I think similar to UART drivers, let's keep the diag as a special driver code. |
| #ifndef DIAG_TARGET_H | ||
| #define DIAG_TARGET_H | ||
|
|
||
| #define IMEM_BASE UL(0x14680000) |
There was a problem hiding this comment.
Let's move this macro to platform specific config file instead.
|
|
||
| #define TCSR_BOOT_MISC_DETECT UL(0x1FD3000) | ||
|
|
||
| #define DIAG_BASE UL(0x14696000) |
There was a problem hiding this comment.
Let's move this macro to platform specific config file too. After that all other macros can be moved to common header and avoid any duplication.
I dont think it should be a driver...it is just writing raw to a qcom specific memory region (no protocols, no registers, no locking, nothing...just init it early and it is good to go.) |
I originally thought its similar to UART driver, and kept it here; But as Jorge pointed out, it doesn't really have any registers + no locking/etc. as well & the current logging drivers avail in upstream also has put it under plat-*/ and not drivers/... So I think moving it there makes sense & planning to do that change. @b49020, Hope thats fine. If incase there are any stronger reasons to put it under driver, please do let us know so we can plan accordingly; |
|
just for future reference: OP-TEE/optee_os#7775 |

This patch series introduces a diagnostic ring buffer driver for Qualcomm
platforms, enabling persistent early boot logging that survives warm resets
and facilitates post-mortem analysis of system failures.
The implementation is structured as two patches:
Key capabilities:
Memory layout:
This implementation enables comprehensive system debugging by providing
persistent logs accessible to bootloaders and diagnostic tools across
system resets.