scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
author Justin Tee <justin.tee@broadcom.com>
Thu, 12 Sep 2024 23:24:40 +0000 (16:24 -0700)
committer Martin K. Petersen <martin.petersen@oracle.com>
Fri, 13 Sep 2024 01:21:18 +0000 (21:21 -0400)

During HBA stress testing, a flood of received PLOGIs exposes a resource
recovery bug that leaks lpfc_sglq entries from the global
phba->sli4_hba.lpfc_els_sgl_list.

The issue is in lpfc_els_flush_cmd(), where the driver attempts to recover
outstanding ELS sgls when walking the txcmplq.  Only CMD_ELS_REQUEST64_CRs
and CMD_GEN_REQUEST64_CRs are added to the abort and cancel lists.  A check
for CMD_XMIT_ELS_RSP64_WQE is missing, so sgls taken from
phba->sli4_hba.lpfc_els_sgl_list for LS_ACC responses are never recovered.

Fix by adding CMD_XMIT_ELS_RSP64_WQE as part of the txcmplq walk when
adding WQEs to the abort and cancel list in lpfc_els_flush_cmd().  Also,
update naming convention from CRs to WQEs.
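
The effect of the added check can be illustrated with a small userspace
model.  This is only a sketch, not driver code: the model_* names and enum
values are hypothetical stand-ins for the lpfc command codes, and it
assumes every ELS request, ELS response, and GEN request on the queue
holds one ELS sgl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's command codes.  The numeric
 * values are arbitrary and are NOT the real lpfc definitions.
 */
enum model_cmd {
	MODEL_ELS_REQUEST64_WQE,
	MODEL_XMIT_ELS_RSP64_WQE,
	MODEL_GEN_REQUEST64_WQE,
	MODEL_OTHER_WQE,
};

/* Selection as described for the code before the fix: ELS responses
 * (LS_ACC) are not picked up by the txcmplq walk.
 */
static bool model_flush_before(enum model_cmd cmd)
{
	return cmd == MODEL_ELS_REQUEST64_WQE ||
	       cmd == MODEL_GEN_REQUEST64_WQE;
}

/* Selection after the fix: ELS responses are picked up as well. */
static bool model_flush_after(enum model_cmd cmd)
{
	return model_flush_before(cmd) ||
	       cmd == MODEL_XMIT_ELS_RSP64_WQE;
}

/* Count queue entries that hold an ELS sgl (model assumption: anything
 * except MODEL_OTHER_WQE) but are skipped by the given selection.
 */
static size_t model_count_skipped(const enum model_cmd *q, size_t n,
				  bool (*selects)(enum model_cmd))
{
	size_t skipped = 0;

	assert(q != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		bool holds_els_sgl = q[i] != MODEL_OTHER_WQE;

		if (holds_els_sgl && !selects(q[i]))
			skipped++;
	}
	return skipped;
}
```

Passing a queue that contains ELS responses to model_count_skipped() with
model_flush_before reports those entries as skipped, while
model_flush_after reports none.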

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/lpfc/lpfc_els.c

index de0ec945d2f1ea088d1a70ca180322b8cfd42312..7219b1ada1ea3f57b657cf6d0d46a9863f07f2de 100644
@@ -9658,11 +9658,12 @@ lpfc_els_flush_cmd(struct lpfc_vport *vport)
                if (piocb->cmd_flag & LPFC_DRIVER_ABORTED && !mbx_tmo_err)
                        continue;
 
-               /* On the ELS ring we can have ELS_REQUESTs or
-                * GEN_REQUESTs waiting for a response.
+               /* On the ELS ring we can have ELS_REQUESTs, ELS_RSPs,
+                * or GEN_REQUESTs waiting for a CQE response.
                 */
                ulp_command = get_job_cmnd(phba, piocb);
-               if (ulp_command == CMD_ELS_REQUEST64_CR) {
+               if (ulp_command == CMD_ELS_REQUEST64_WQE ||
+                   ulp_command == CMD_XMIT_ELS_RSP64_WQE) {
                        list_add_tail(&piocb->dlist, &abort_list);
 
                        /* If the link is down when flushing ELS commands