FEAT: rpma_cq_wait() performance optimization #1697
Rationale
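ibv_ack_cq_events() appears to be the main bottleneck in librpma when the completion event channel is used. According to the ibv_get_cq_event(3) man page, the call must take a mutex, which makes it relatively expensive in the datapath.
To reduce this cost, ibv_ack_cq_events() shall be called less frequently, acknowledging several events in one call.
It also makes sense to call it before ibv_get_cq_event() rather than after: at that point it is more likely that there is still some spare time before a new event becomes available via ibv_get_cq_event().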
Description
The struct rpma_cq shall be extended with a field unsigned int unack_cqe, which is set to 0 in rpma_cq_new():
(*cq_ptr)->cq = cq;
(*cq_ptr)->unack_cqe = 0;
unack_cqe shall be incremented every time ibv_get_cq_event() returns a valid event in rpma_cq_wait():
rpma_cq_wait(struct rpma_cq *cq)
{
	...
	if (ibv_get_cq_event(cq->channel, &ev_cq, &ev_ctx))
		return RPMA_E_NO_COMPLETION;

	++cq->unack_cqe;
At a minimum, ibv_ack_cq_events() shall be called before the ibv_cq is destroyed inside rpma_cq_delete(), since ibv_destroy_cq() waits for all completion events to be acknowledged:
if (cq->unack_cqe)
	(void) ibv_ack_cq_events(cq->cq, cq->unack_cqe);

errno = ibv_destroy_cq(cq->cq);
It must also be called periodically as part of rpma_cq_wait(). Note that the ibv_ack_cq_events() call is moved before ibv_get_cq_event():
/*
 * cq.c -- librpma completion-queue-related implementations
 */
...
#define RPMA_MAX_UNACK_CQE UINT_MAX
...
int
rpma_cq_wait(struct rpma_cq *cq)
{
	...
	/*
	 * ACK the CQ events collected so far.
	 */
	if (cq->unack_cqe >= RPMA_MAX_UNACK_CQE) {
		ibv_ack_cq_events(cq->cq, cq->unack_cqe);
		cq->unack_cqe = 0;
	}

	/* wait for the completion event */
	struct ibv_cq *ev_cq;	/* unused */
	void *ev_ctx;		/* unused */
	if (ibv_get_cq_event(cq->channel, &ev_cq, &ev_ctx))
		return RPMA_E_NO_COMPLETION;
	...
}
...
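To make the counting logic easier to check in isolation, here is a minimal sketch that models the proposal without libibverbs. Everything prefixed with mock_ is hypothetical and exists only for illustration: struct mock_cq stands in for struct rpma_cq, mock_ack_cq_events() stands in for ibv_ack_cq_events() and merely records the call, and the threshold is passed as a parameter instead of using RPMA_MAX_UNACK_CQE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rpma_cq; only the counters are modelled. */
struct mock_cq {
	unsigned int unack_cqe;   /* events obtained but not yet acknowledged */
	unsigned int ack_calls;   /* number of times the mock ACK was invoked */
	unsigned int acked_total; /* total number of events acknowledged */
};

/* Stand-in for ibv_ack_cq_events(): records the call, touches no verbs. */
static void
mock_ack_cq_events(struct mock_cq *cq, unsigned int nevents)
{
	cq->ack_calls++;
	cq->acked_total += nevents;
}

/* Mirrors the proposed rpma_cq_wait(): ACK the batch first, then take one event. */
static void
mock_cq_wait(struct mock_cq *cq, unsigned int max_unack)
{
	assert(cq != NULL && max_unack > 0);

	if (cq->unack_cqe >= max_unack) {
		mock_ack_cq_events(cq, cq->unack_cqe);
		cq->unack_cqe = 0;
	}

	/* a real implementation would block in ibv_get_cq_event() here */
	++cq->unack_cqe;
}

/* Mirrors the proposed rpma_cq_delete(): flush whatever is still pending. */
static void
mock_cq_delete(struct mock_cq *cq)
{
	assert(cq != NULL);

	if (cq->unack_cqe) {
		mock_ack_cq_events(cq, cq->unack_cqe);
		cq->unack_cqe = 0;
	}
}
```

With a threshold of 4, ten calls to mock_cq_wait() trigger two ACK calls covering eight events, and mock_cq_delete() acknowledges the remaining two. The proposal itself sets the threshold to UINT_MAX, in which case the periodic ACK chiefly keeps the counter from wrapping and nearly all events are acknowledged at delete time.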