MT7996 WiFi error -11 — why rmmod won't fix it (BPI-R4 / Pro 8X)

TL;DR: The -11 error means the WiFi MCU left a hardware semaphore locked. Neither rmmod/modprobe nor any userspace tool can clear it. A proper SW reset requires a GPIO wired to the WiFi NIC 12V rail — MTK implemented this on their reference board (RFB), but Sinovoip did not wire it on BPI-R4 or BPI-R4 Pro 8X.

The symptom

WiFi on BPI-R4 / Pro 8X occasionally stops working completely and only a full power cycle restores it. In the kernel log, PCIe comes up fine and then the MCU stops responding:

mt7996e_hif 0001:01:00.0: assign IRQ: got 131
mt7996e_hif 0001:01:00.0: enabling device (0000 -> 0002)
mt7996e_hif 0001:01:00.0: enabling bus mastering
mt7996e_hif 0001:01:00.0: disabling ASPM L0s L1
mt7996e 0000:01:00.0: assign IRQ: got 128
mt7996e 0000:01:00.0: enabling device (0000 -> 0002)
mt7996e 0000:01:00.0: enabling bus mastering
mt7996e 0000:01:00.0: disabling ASPM L0s L1
mt7996e 0000:01:00.0: attaching wed device 0 version 3.0
mt7996e 0000:01:00.0: HW/SW Version: 0x8a108a10, Build Time: 20260311120419a
mt7996e 0000:01:00.0: Message 00000007 (seq N) timeout
mt7996e 0000:01:00.0: Failed to start patch
mt7996e 0000:01:00.0: Message 00000010 (seq N+1) timeout
mt7996e 0000:01:00.0: Failed to release patch semaphore
mt7996e 0000:01:00.0: probe with driver mt7996e failed with error -11

After this, every subsequent rmmod + modprobe fails immediately with:

mt7996e 0000:01:00.0: Failed to get patch semaphore
mt7996e 0000:01:00.0: failed to load patch: -11
mt7996e 0000:01:00.0: probe with driver mt7996e failed with error -11

Error -11 = -EAGAIN (resource temporarily unavailable — in practice, permanently until power cycle).
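
To double-check the mapping on your own system, here is a minimal userspace sketch. The helper names kernel_ret_to_errno and print_kernel_ret are made up for this post and are not part of any driver. Note that the numeric value of EAGAIN depends on the architecture; it is 11 on the common Linux targets, including arm64.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn a kernel-style negative return code
 * (e.g. -11) into the positive errno number from <errno.h>. */
int kernel_ret_to_errno(int ret)
{
    return ret < 0 ? -ret : ret;
}

/* Hypothetical helper: print the errno number and libc's description,
 * and flag the code when it matches this platform's EAGAIN. */
void print_kernel_ret(int ret)
{
    int err = kernel_ret_to_errno(ret);

    printf("%d -> errno %d (%s)%s\n", ret, err, strerror(err),
           err == EAGAIN ? " [EAGAIN]" : "");
}
```

Calling print_kernel_ret(-11) from a small main() shows whether 11 is EAGAIN on the machine you compile on.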

rmmod mt7996e && modprobe mt7996e does not help. The error reappears immediately.
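
If you want a watchdog to tell the two situations apart, the log strings above are enough. The following is only a sketch: the enum and the function name mt7996_classify_log_line are invented for illustration, and the matched strings are taken from the logs in this post, so they may differ on other driver versions.

```c
#include <string.h>

/* Hypothetical classification of a single kernel log line. */
enum mt7996_log_state {
    MT7996_LOG_OTHER = 0,      /* not one of the known markers */
    MT7996_LOG_FIRST_FAILURE,  /* MCU stopped responding during this probe */
    MT7996_LOG_SEM_STUCK       /* later probe: semaphore already locked */
};

/* Match the driver messages quoted in this post by substring. */
enum mt7996_log_state mt7996_classify_log_line(const char *line)
{
    if (strstr(line, "Failed to get patch semaphore"))
        return MT7996_LOG_SEM_STUCK;
    if (strstr(line, "Failed to start patch") ||
        strstr(line, "Failed to release patch semaphore"))
        return MT7996_LOG_FIRST_FAILURE;
    return MT7996_LOG_OTHER;
}
```

Once a line is classified as MT7996_LOG_SEM_STUCK, reloading the module again is pointless and the tool can go straight to asking for a power cycle.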

Root cause — hardware semaphore in MCU

The MT7996 firmware download sequence in mt76/mt7996/mcu.c uses a hardware semaphore to coordinate between the host driver and the WiFi MCU:

/* mt76/mt7996/mcu.c — mt7996_load_patch() */
sem = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, 1);  /* acquire */
switch (sem) {
case PATCH_IS_DL:
    return 0;                          /* firmware already loaded, skip */
case PATCH_NOT_DL_SEM_SUCCESS:
    break;                             /* we got the semaphore, proceed */
default:
    dev_err(dev->mt76.dev, "Failed to get patch semaphore\n");
    return -EAGAIN;                    /* ← stuck here on next probe */
}

/* ... firmware download ... */

out:
sem = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, 0);  /* release */

Normal flow: driver acquires semaphore → starts patch → downloads firmware → releases semaphore.

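Stuck flow: driver acquires semaphore → MCU stops responding during patch start (Message 00000007 times out, "Failed to start patch") → driver still sends the release command, but the MCU is dead and that times out too (Message 00000010, "Failed to release patch semaphore"), two 5-second timeouts in total → semaphore stays locked in hardware.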

rmmod unloads the driver and PCIe config space is reset, but the semaphore register inside the MT7996 MCU block is not cleared — it is part of the WiFi SoC internal state, not the PCIe config space. modprobe loads the driver again, tries to acquire the semaphore → MCU denies it → -EAGAIN.

The only way to clear it: cut power to the WiFi NIC.
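
The reasoning above can be written down as a toy state model. This is not driver code: struct nic_model and the three functions are invented for this post, and the probe path is heavily simplified. The only point is to show which operations touch NIC-side state and which do not.

```c
#include <stdbool.h>

#define MODEL_EAGAIN 11   /* value of EAGAIN on most Linux architectures */

/* State that lives inside the NIC and survives a driver reload. */
struct nic_model {
    bool mcu_alive;    /* MCU still answers host commands */
    bool sem_locked;   /* patch semaphore is held */
    bool fw_loaded;    /* patch/firmware already downloaded */
};

/* Cutting power is the only operation that resets NIC-side state. */
void nic_power_cycle(struct nic_model *nic)
{
    nic->mcu_alive = true;
    nic->sem_locked = false;
    nic->fw_loaded = false;
}

/* rmmod: frees host-side driver state only; the NIC is not touched. */
void driver_remove(struct nic_model *nic)
{
    (void)nic;
}

/* modprobe: simplified patch-load path; mcu_hangs injects the fault. */
int driver_probe(struct nic_model *nic, bool mcu_hangs)
{
    if (!nic->mcu_alive || nic->sem_locked)
        return -MODEL_EAGAIN;      /* "Failed to get patch semaphore" */
    if (nic->fw_loaded)
        return 0;                  /* PATCH_IS_DL: nothing to do */

    nic->sem_locked = true;        /* acquire */
    if (mcu_hangs) {
        nic->mcu_alive = false;    /* start and release both time out */
        return -MODEL_EAGAIN;      /* semaphore is never released */
    }
    nic->fw_loaded = true;         /* download succeeded */
    nic->sem_locked = false;       /* release */
    return 0;
}
```

In this model, a probe with mcu_hangs set fails, every driver_remove() plus driver_probe() after it fails as well, and only nic_power_cycle() makes the next probe succeed.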

The proper SW fix — GPIO reset

MTK implemented a solution on their reference board (RFB). There are two pieces:

1. PCIe driver patch

mtk-openwrt-feeds: patches-6.6/999-pcie-03-pcie-mediatek-gen3-Add-WIFI-HW-reset-flow.patch
(author: Jianguo Zhang, MediaTek, 2024-11-16)

It adds a GPIO-controlled reset pulse in mtk_pcie_startup_port(), executed before PCIe endpoint detection:

/* pcie-mediatek-gen3.c */
if (pcie->wifi_reset) {
    gpiod_set_value_cansleep(pcie->wifi_reset, 1);   /* assert reset */
    msleep(pcie->wifi_reset_delay_ms);
    gpiod_set_value_cansleep(pcie->wifi_reset, 0);   /* deassert */
    msleep(pcie->wifi_deassert_delay_ms);             /* wait for NIC to boot */
}

The GPIO descriptor is fetched with devm_gpiod_get_optional — boards without the DTS property silently skip this block. No code changes needed per-board.

2. DTS overlay — RFB board

mt7988d-rfb-2pcie.dtso:

/* overlay for pcie@11290000 */
&{/soc/pcie@11290000} {
    wifi-reset-gpios  = <&pio 7 GPIO_ACTIVE_LOW>;
    wifi-reset-msleep = <100>;
    wifi-deassert-msleep = <500>;
    status = "okay";
};

GPIO 7 on the RFB board is connected to the WiFi NIC 12V power/reset rail. Asserting it for 100 ms cuts power to the NIC, which clears the stuck semaphore. The MCU comes back clean and firmware load succeeds.
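
One detail that is easy to misread: with GPIO_ACTIVE_LOW, the logical value 1 passed to gpiod_set_value_cansleep() drives the pin physically low. The userspace sketch below models that mapping together with the "optional" behaviour and the delays from the overlay. struct line_model, struct pulse_result and wifi_reset_pulse are hypothetical names, and the assumption that a low pin is what disables the rail on the RFB follows from the DTS flag, not from a schematic I can quote here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one GPIO line as described in the DTS. */
struct line_model {
    bool active_low;   /* GPIO_ACTIVE_LOW flag */
    int physical;      /* 0 = pin low, 1 = pin high */
};

/* What one reset pulse did, for inspection by the caller. */
struct pulse_result {
    bool performed;             /* false when no GPIO was described */
    int level_while_asserted;   /* physical pin level during reset */
    int level_after;            /* physical pin level after deassert */
    unsigned int total_ms;      /* assert delay + deassert delay */
};

/* Logical-to-physical mapping as done by the gpiod consumer API:
 * logical 1 means "asserted", which is a low pin on an active-low line. */
void line_set_logical(struct line_model *line, int logical)
{
    line->physical = line->active_low ? !logical : !!logical;
}

/* Model of the reset block: a NULL line (no wifi-reset-gpios) is skipped. */
struct pulse_result wifi_reset_pulse(struct line_model *line,
                                     unsigned int assert_ms,
                                     unsigned int deassert_ms)
{
    struct pulse_result r = { false, -1, -1, 0 };

    if (!line)
        return r;

    line_set_logical(line, 1);            /* assert reset */
    r.level_while_asserted = line->physical;
    line_set_logical(line, 0);            /* deassert */
    r.level_after = line->physical;
    r.total_ms = assert_ms + deassert_ms; /* delays only added up, not slept */
    r.performed = true;
    return r;
}
```

With the RFB values (active low, 100 ms, 500 ms) the model reports a low pin during the pulse, a high pin afterwards, and 600 ms of added delay at PCIe port startup. Passing NULL models a board without the DTS property.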

Why this doesn't work on BPI-R4 and BPI-R4 Pro 8X

On Sinovoip boards the WiFi NIC 12V power is controlled by a physical slide switch (SW4, MSK-12C01-07). There is no GPIO connected to that rail.

The MTK patch uses devm_gpiod_get_optional — if no wifi-reset-gpios is in the DTS, it returns NULL and the reset block is skipped entirely. You could add the DTS property, but without the physical connection on the PCB it does nothing.

The fix on the RFB board required one extra wire from GPIO 7 to the WiFi 12V control line. On the RFB schematic, GPIO 7 (mt7988a_gpio7) is routed to the NIC reset. On the BPI-R4 Pro 8X schematic, SW4 is the only control on that rail: nothing is wired in parallel with it and there is no GPIO pad.

To enable SW reset on BPI-R4 / Pro 8X, Sinovoip would need to:

  1. Wire GPIO 7 (or any free GPIO) in parallel with SW4 on the PCB
  2. Add wifi-reset-gpios = <&pio 7 GPIO_ACTIVE_LOW> (adjusted to the GPIO actually used) to the board DTS

No kernel changes are needed beyond the MTK patch, which already supports this.

Workaround

Power cycle only. There is no SW workaround. The semaphore is locked inside the MT7996 MCU hardware block — no kernel driver, userspace tool, or debugfs interface can reach it after a failed probe. Only cutting power to the NIC clears it.

On BPI-R4 Pro 8X: use the SW4 slide switch to cut 12V to the WiFi NIC, wait 2 seconds, switch back. On BPI-R4: full power cycle.

Summary

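|                                   | BPI-R4 / Pro 8X                | MTK RFB                              |
|-----------------------------------|--------------------------------|--------------------------------------|
| WiFi NIC 12V control              | SW4 physical switch            | GPIO 7                               |
| DTS wifi-reset-gpios              | Not possible (no HW)           | <&pio 7 GPIO_ACTIVE_LOW>             |
| SW recovery after stuck semaphore | ❌ Power cycle only            | GPIO reset pulse at PCIe port startup |
| PCB change needed                 | Yes: GPIO pad + trace to SW4   | Already done                         |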

The infrastructure for SW WiFi reset exists in the kernel and in the MTK feeds. On BPI-R4 and Pro 8X the limiting factor is a missing PCB connection, not software.


Tested on BPI-R4 Pro 8X (MT7988A + MT7996), OpenWrt / kernel 6.12.
Source refs: mt76/mt7996/mcu.c, pcie-mediatek-gen3.c, mt7988d-rfb-2pcie.dtso.