[BPI-R64] PCIe issues

Good to know :) I have tried both ports… same result. I think the reset algorithm (as mentioned on GitHub, see the link in my post above) is the likely cause… I hope this can be solved by changing mtk_pcie_startup_port_v2 in pcie-mediatek.c accordingly. https://bugzilla.kernel.org/show_bug.cgi?id=84821#c48

So at least there is something that can be tried out :)

Jasmin

My Compex WLE900VX was recognized with split PCIe ports in CN25, and I was able to open an AP on it.

Btw, did you apply this patch? https://github.com/frank-w/BPI-R2-4.14/commit/9a89cdc0a9ac454bbb52ac2b359efdd80a064902

The issue is caused by the Google Coral PCIe card reporting a class code of zero, so the Linux PCI framework doesn't assign resources to it after the scan.

The workaround is to remove the "class == PCI_CLASS_NOT_DEFINED" check in setup-bus.c, thanks.

static void __dev_sort_resources(struct pci_dev *dev,
                                 struct list_head *head)
{
        u16 class = dev->class >> 8;

        /* Don't touch classless devices or host bridges or ioapics.  */
        if (class == PCI_CLASS_NOT_DEFINED || class == PCI_CLASS_BRIDGE_HOST)
                return;

P.S. Please refer to page 8 of the PDF below for more information about class IDs: https://pcisig.com/sites/default/files/files/PCI_Code-ID_r_1_11__v24_Jan_2019.pdf

You linked the mutex issue… do you mean that one, or the BAR0 unassigned issue happening on Coral cards?

I linked to the wrong issue. I meant that the BAR0 unassigned issue is caused by the class ID being 0, which is an invalid value, thanks.

Ok, if it is in the driver code, the card should not work with other PCIe controllers either, or those controllers allow classless devices…

Does anyone here have this Google card running on another PCIe controller, and if so, which one?

As the workaround is in the generic framework, and I guess the driver should work with other controllers (the class may be set from some value that is read), it looks like an assignment issue.

I wonder why the class is shifted into a variable before it is compared with PCI_CLASS_NOT_DEFINED… maybe the bits for the class are not in the right position?
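On the shift question: in the kernel, `dev->class` holds a 24-bit value (base class, subclass, programming interface), while the `PCI_CLASS_*` constants compared against here are 16-bit base-class/subclass values, so `>> 8` just drops the programming-interface byte. A minimal userspace sketch, assuming the two constants as they are defined in the kernel's pci_ids.h; the helper name `class_is_skipped` is hypothetical. The value 0x0000ff is the class the Coral card reports in the dmesg output further down this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the kernel's include/linux/pci_ids.h. */
#define PCI_CLASS_NOT_DEFINED 0x0000
#define PCI_CLASS_BRIDGE_HOST 0x0600

/*
 * Hypothetical helper mirroring the check in __dev_sort_resources().
 * class24 is the 24-bit value the kernel keeps in dev->class:
 * (base class << 16) | (subclass << 8) | programming interface.
 */
bool class_is_skipped(uint32_t class24)
{
        /* Drop the programming-interface byte, keep base class + subclass. */
        uint16_t class = (uint16_t)(class24 >> 8);

        return class == PCI_CLASS_NOT_DEFINED ||
               class == PCI_CLASS_BRIDGE_HOST;
}
```

With this, `class_is_skipped(0x0000ff)` is true (the Coral card: base class and subclass are both zero, only the programming-interface byte is set), while `class_is_skipped(0x060400)` is false (the PCI-to-PCI bridge). So the bits look to be in the expected position; the card is skipped because its upper 16 bits really are zero.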

Just to link the issue thread…

I think we can use an x86 PC and the lspci command to check the class ID. Maybe most Google cards have a valid ID and only a few cards have this issue, thanks.
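Besides `lspci -nn`, the same class code can be read from sysfs on any Linux PC, where each device has a `class` file containing a string like `0x060400`. A rough sketch of reading and splitting it; the struct and function names are made up for illustration, and the bus address passed in (for example `0000:01:00.0`) depends on the machine.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container for the three bytes of a PCI class code. */
struct pci_class_code {
        uint8_t base;    /* base class, e.g. 0x06 = bridge */
        uint8_t sub;     /* subclass */
        uint8_t prog_if; /* programming interface */
};

/* Parse a sysfs "class" string such as "0x060400\n". Returns 0 on success. */
int parse_pci_class(const char *text, struct pci_class_code *out)
{
        char *end;
        unsigned long v = strtoul(text, &end, 16);

        if (end == text || v > 0xffffff)
                return -1;
        out->base = (uint8_t)((v >> 16) & 0xff);
        out->sub = (uint8_t)((v >> 8) & 0xff);
        out->prog_if = (uint8_t)(v & 0xff);
        return 0;
}

/* Read /sys/bus/pci/devices/<bdf>/class, e.g. bdf = "0000:01:00.0". */
int read_pci_class(const char *bdf, struct pci_class_code *out)
{
        char path[128], buf[32];
        FILE *f;

        snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/class", bdf);
        f = fopen(path, "r");
        if (!f)
                return -1;
        if (!fgets(buf, sizeof(buf), f)) {
                fclose(f);
                return -1;
        }
        fclose(f);
        return parse_pci_class(buf, out);
}
```

A card with a valid class would show a non-zero `base`; a card like the one in this thread would parse as base 0x00, subclass 0x00, prog-if 0xff.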

https://diego.assencio.com/?index=649b7a71b35fc7ad41e03b6d0e825f07

I have applied the PCI interrupt patch and the PCI controller splitting DTS patch from frank, and I have tested 10 WLE900VX cards, but not all of them work with the R64.

  1. 3 WLE900VX cards work perfectly with the R64.
  2. 3 WLE900VX cards can't be detected by the R64.
  3. 4 WLE900VX cards sometimes work and sometimes can't be detected by the R64.

I also think it's a RESET signal sequence problem. I tried sleeping 1000 ms before ending the reset, but that doesn't fix all of the problems.

static int mtk_pcie_startup_port_v2(struct mtk_pcie_port *port)
{
        struct mtk_pcie *pcie = port->pcie;
        struct resource *mem = &pcie->mem;
        const struct mtk_pcie_soc *soc = port->pcie->soc;
        u32 val;
        size_t size;
        int err;

        /* MT7622 platforms need to enable LTSSM and ASPM from PCIe subsys */
        if (pcie->base) {
                val = readl(pcie->base + PCIE_SYS_CFG_V2);
                val |= PCIE_CSR_LTSSM_EN(port->slot) |
                       PCIE_CSR_ASPM_L1_EN(port->slot);
                writel(val, pcie->base + PCIE_SYS_CFG_V2);
        }

        /* Assert all reset signals */
        writel(0, port->base + PCIE_RST_CTRL);

        /*
         * Enable PCIe link down reset, if link status changed from link up to
         * link down, this will reset MAC control registers and configuration
         * space.
         */
        writel(PCIE_LINKDOWN_RST_EN, port->base + PCIE_RST_CTRL);

        if (port->slot == 0){
                dev_err(pcie->dev, "pcie port0 sleep 1000ms to wait reset for QCA988X device!!!!\n");
                msleep(1000);
        }

        /* De-assert PHY, PE, PIPE, MAC and configuration reset */
        val = readl(port->base + PCIE_RST_CTRL);
        val |= PCIE_PHY_RSTB | PCIE_PERSTB | PCIE_PIPE_SRSTB |
               PCIE_MAC_SRSTB | PCIE_CRSTB;
        writel(val, port->base + PCIE_RST_CTRL);

        /* Set up vendor ID and class code */
        if (soc->need_fix_class_id) {
                val = PCI_VENDOR_ID_MEDIATEK;
                writew(val, port->base + PCIE_CONF_VEND_ID);

                val = PCI_CLASS_BRIDGE_PCI;
                writew(val, port->base + PCIE_CONF_CLASS_ID);
        }

        /* 100ms timeout value should be enough for Gen1/2 training */
        err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_V2, val,
                                 !!(val & PCIE_PORT_LINKUP_V2), 20,
                                 100 * USEC_PER_MSEC);
        if (err)
                return -ETIMEDOUT;

        /* Set INTx mask */
        val = readl(port->base + PCIE_INT_MASK);
        val &= ~INTX_MASK;
        writel(val, port->base + PCIE_INT_MASK);

        if (IS_ENABLED(CONFIG_PCI_MSI))
                mtk_pcie_enable_msi(port);

        /* Set AHB to PCIe translation windows */
        size = mem->end - mem->start;
        val = lower_32_bits(mem->start) | AHB2PCIE_SIZE(fls(size));
        writel(val, port->base + PCIE_AHB_TRANS_BASE0_L);

        val = upper_32_bits(mem->start);
        writel(val, port->base + PCIE_AHB_TRANS_BASE0_H);

        /* Set PCIe to AXI translation memory space.*/
        val = fls(0xffffffff) | WIN_ENABLE;
        writel(val, port->base + PCIE_AXI_WINDOW0);

        return 0;
}

I will try your way, hope it can support WLE900VX.

Please try only the front slot (CN25), because the slot behind it (CN8) has a hardware limitation.

Yes, all tests were done with the CN25 PCIe slot.

Hi frank-w,

I have updated the setup-bus.c file and removed the class == PCI_CLASS_NOT_DEFINED check. I now get a resource collision error, as follows:

pi@bpi-iot-ros-ai:~$ dmesg | grep pci
[    1.494119] mtk-pcie 1a143000.pcie: host bridge /pcie@1a143000 ranges:
[    1.505642] mtk-pcie 1a143000.pcie: Parsing ranges property...
[    1.515992] mtk-pcie 1a143000.pcie:   MEM 0x20000000..0x2fffffff -> 0x20000000
[    1.556439] mtk-pcie 1a143000.pcie: PCI host bridge to bus 0000:00
[    1.568448] pci_bus 0000:00: root bus resource [bus 00-ff]
[    1.573946] pci_bus 0000:00: root bus resource [mem 0x20000000-0x2fffffff]
[    1.580841] pci_bus 0000:00: scanning bus
[    1.584897] pci 0000:00:00.0: [14c3:3258] type 01 class 0x060400
[    1.590957] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x1ffffffff 64bit pref]
[    1.599829] pci_bus 0000:00: fixups for bus
[    1.604022] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 0
[    1.610729] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    1.618747] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 1
[    1.625563] pci_bus 0000:01: scanning bus
[    1.629683] pci 0000:01:00.0: [1ac1:089a] type 00 class 0x0000ff
[    1.635984] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit pref]
[    1.643318] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x000fffff 64bit pref]
[    1.651524] pci 0000:01:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x1 link at 0000:00:00.0 (capable of 4.000 Gb/s with 5 GT/s x1 link)
[    1.666709] pci_bus 0000:01: fixups for bus
[    1.670896] pci_bus 0000:01: bus scan returning with max=01
[    1.676472] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    1.683093] pci_bus 0000:00: bus scan returning with max=01
[    1.688684] pci 0000:00:00.0: BAR 0: no space for [mem size 0x200000000 64bit pref]
[    1.696347] pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x200000000 64bit pref]
[    1.704355] pci 0000:00:00.0: BAR 8: assigned [mem 0x20000000-0x201fffff]
[    1.711151] pci 0000:01:00.0: BAR 2: assigned [mem 0x20000000-0x200fffff 64bit pref]
[    1.718983] pci 0000:01:00.0: BAR 0: assigned [mem 0x20100000-0x20103fff 64bit pref]
[    1.726814] pci 0000:00:00.0: PCI bridge to [bus 01]
[    1.731787] pci 0000:00:00.0:   bridge window [mem 0x20000000-0x201fffff]
[    1.738888] mtk-pcie 1a145000.pcie: host bridge /pcie@1a145000 ranges:
[    1.745425] mtk-pcie 1a145000.pcie: Parsing ranges property...
[    1.751269] mtk-pcie 1a145000.pcie:   MEM 0x28000000..0x2fffffff -> 0x28000000
[    1.758508] mtk-pcie 1a145000.pcie: resource collision: [mem 0x28000000-0x2fffffff] conflicts with pcie@1a143000 [mem 0x20000000-0x2fffffff]
[    1.771141] mtk-pcie: probe of 1a145000.pcie failed with error -16
[    8.381659]  pci_enable_resources+0x68/0x18c
[    8.381665]  pcibios_enable_device+0xc/0x14
[    8.381670]  do_pci_enable_device+0x50/0xd8
[    8.381675]  pci_enable_device_flags+0x100/0x15c
[    8.381680]  pci_enable_device+0x10/0x18
[    8.381684]  pci_enable_bridge+0x50/0x78
[    8.381689]  pci_enable_device_flags+0x98/0x15c
[    8.381694]  pci_enable_device+0x10/0x18
[    8.381703]  apex_pci_probe+0x38/0x468 [apex]
[    8.381709]  pci_device_probe+0xa0/0x144
[    8.381755]  __pci_register_driver+0x40/0x48
[    8.381811] pci 0000:00:00.0: Assigned....BAR 8 [mem 0x20000000-0x201fffff]........
[    8.381817] pci 0000:00:00.0: Assigned and claimed....BAR 8 [mem 0x20000000-0x201fffff]........
[    8.381823] pci 0000:00:00.0: enabling device (0000 -> 0002)
[    8.381837] pci 0000:00:00.0: enabling bus mastering
[    8.381880]  pci_enable_resources+0x68/0x18c
[    8.381884]  pcibios_enable_device+0xc/0x14
[    8.381889]  do_pci_enable_device+0x50/0xd8
[    8.381893]  pci_enable_device_flags+0x100/0x15c
[    8.381898]  pci_enable_device+0x10/0x18
[    8.381906]  apex_pci_probe+0x38/0x468 [apex]
[    8.381911]  pci_device_probe+0xa0/0x144
[    8.381953]  __pci_register_driver+0x40/0x48

As a result, I am unable to run inference on the Google Coral.

How can we resolve this collision error?
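Reading the log, the collision looks like plain address overlap: the first host bridge (pcie@1a143000) claims MEM 0x20000000..0x2fffffff, which already contains the window 0x28000000..0x2fffffff that the second one (pcie@1a145000) asks for, so the second probe fails with -16 (EBUSY). A minimal sketch of that overlap test; the struct and function names are hypothetical, and the narrower first window used in the note after the code (0x20000000..0x27ffffff) is only an assumption for illustration, not a value taken from any tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, like the [mem start-end] pairs in dmesg. */
struct mem_range {
        uint64_t start;
        uint64_t end;
};

/* Two inclusive ranges overlap if each one starts before the other ends. */
bool ranges_overlap(struct mem_range a, struct mem_range b)
{
        return a.start <= b.end && b.start <= a.end;
}
```

With the values from the log, `ranges_overlap({0x20000000, 0x2fffffff}, {0x28000000, 0x2fffffff})` is true. If the first window ended at 0x27ffffff instead, the two would no longer overlap, which is why the ranges property in the device tree is the first thing to check.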

I am working on the following repo

In this tree the ranges are wrong (left over from tests); please use 5.4-main or change the ranges property to the values from this tree

Hi frank-w,

I have compared the branches 5.4-main and 5.4-dsa w.r.t. the PCIe sources. The memory map range is the same. PCI splitting is also the same.

I am sure it will give me the same "memory not claimed" issue with 5.4-main. Additionally, if I apply the changes to detect PCIe devices with class code 0000, it should give me the resource collision error.

What is the benefit of using 5.4-main?

I thought there was a difference…

5.4-main will be updated. 5.4-dsa was only a testing branch and will be deleted…

Otherwise I have no idea, because there I only apply patches I get from MTK/others.

Regarding MiniPCIe card issue:

I was able to go through this thread and put together a patch that works, at least with WLE900VX modules. Thanks to @frank-w (for the PCIe split patches) and @bourne_hlm (for identifying the RESET signal sequence problem, although I find a 3000 ms sleep works better than 1000 ms). I did not try @jasmin's patches, though. I've attached the consolidated patch to help whoever is hitting this problem. It is on top of a clean, latest OpenWrt 4.19.101. With this patch, the WLE900VX is consistently recognized and the ath10k driver comes up successfully. Phew!!!

patch (7.7 KB)

Spoke too soon. Even with a 3000 ms sleep, the miniPCIe card is sometimes detected and sometimes not. Any other clues I can try?

Yes, msleep doesn't fix the whole problem; sometimes it works, sometimes not.

Same issue with @jasmin's patch as well. Sometimes it's detected and sometimes it's not.

Hi

I have a 4.19.81 kernel release from Banana Pi. In this kernel the Coral PCIe card is detected and memory is allocated (using slot CN25).

However, when I load the apex module, there is a 32-bit GCB register at the following location:

/* Determine if GCB is in reset state. */
static bool is_gcb_in_reset(struct gasket_dev *gasket_dev)
{
        u32 val = gasket_dev_read_32(gasket_dev, APEX_BAR_INDEX,
                                     APEX_BAR2_REG_SCU_3);

        /* Masks rg_rst_gcb bit of SCU_CTRL_2 */
        return (val & SCU3_CUR_RST_GCB_BIT_MASK);
}

#define APEX_BAR_INDEX 2
APEX_BAR2_REG_SCU_BASE = 0x1A300
#define APEX_BAR2_REG_SCU_3 (APEX_BAR2_REG_SCU_BASE + 0x18)
#define SCU3_CUR_RST_GCB_BIT_MASK 0x10

Can you help me understand what these register bits indicate? Bit 5 is responsible for resetting the PCI port.

For comparison, I have a PCIe Coral inserted in a laptop (Ubuntu 16.04, kernel 4.4). The 32-bit value read on the laptop is 0xe0050804, whereas the 32-bit value read on the R64 is 0xe0050014.

/* Reset the hardware, then quit reset. Called on device open. */
static int apex_reset(struct gasket_dev *gasket_dev)
{
        int ret;

        if (bypass_top_level)
                return 0;

        if (!is_gcb_in_reset(gasket_dev)) {
                /* We are not in reset - toggle the reset bit so as to force
                 * re-init of custom block
                 */
                dev_dbg(gasket_dev->dev, "%s: toggle reset\n", __func__);

                ret = apex_enter_reset(gasket_dev);
                if (ret)
                        return ret;
        }
        ret = apex_quit_reset(gasket_dev);

        return ret;
}

From this I think it is necessary to reset the PCI driver (apex) when it is accessed for the first time. This happens on the laptop but not on the R64. Can anyone please help me understand this bitwise structure of the gasket (apex) driver and its impact?
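To make the two readings easier to compare, here is a small sketch that applies the same mask test as is_gcb_in_reset() to an already-read value and shows which bits differ. Only the mask 0x10 and the two register values come from the posts above; the function names are hypothetical. Note that 0x10 is bit 4 when counting from zero, i.e. the fifth bit when counting from one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask quoted from the apex driver excerpt above. */
#define SCU3_CUR_RST_GCB_BIT_MASK 0x10

/* Same test as is_gcb_in_reset(), applied to an already-read value. */
bool gcb_in_reset(uint32_t scu3)
{
        return (scu3 & SCU3_CUR_RST_GCB_BIT_MASK) != 0;
}

/* Bits that differ between two register snapshots. */
uint32_t changed_bits(uint32_t a, uint32_t b)
{
        return a ^ b;
}
```

By this test the laptop value 0xe0050804 has the mask bit clear (GCB not in reset), so apex_reset() goes through apex_enter_reset() and then apex_quit_reset(). The R64 value 0xe0050014 has the bit set (GCB already reported as in reset), so apex_reset() skips apex_enter_reset() and only calls apex_quit_reset(). The two values differ in 0x810, which is the reset bit plus bit 11; what bit 11 means is not stated in the excerpts here.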