BPi-R3 - Modules to get nvme working? Wrong pinctrl

I'm trying to get openSUSE running on an R3. The kernel loads & runs, but fails to mount the SSD & eventually drops to emergency mode, where I can log in.

The kernel is 6.6.2 & I am using the device tree blob generated for that kernel.

I would have thought that loading the nvme & pcie_mediatek_gen3 modules would be enough to create the devices, but no. There is not even a sniff of /dev/nvme0 or its children.

It would appear that for some reason pinctrl-moorefield is taking precedence over pinctrl-mt7986. Other than rebuilding the kernel, is there any way of prioritising mt7986 over moorefield?

[    4.890363][   T38] mt7986a-pinctrl 1001f000.pinctrl: pin GPIO_4 already requested by pinctrl_moore:521; cannot claim for 11280000.pcie
[    4.902586][   T38] mt7986a-pinctrl 1001f000.pinctrl: pin-9 (11280000.pcie) status -22
[    4.916687][   T38] mt7986a-pinctrl 1001f000.pinctrl: could not request pin 9 (GPIO_4) from group pcie_clk  on device pinctrl_moore
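
For anyone wanting to pull the pin and the two claimants out of messages like these, here is a minimal sketch in Python. The function name find_pin_conflicts and the regular expression are my own illustration, written only against the message format quoted above, so treat them as assumptions rather than anything the kernel documents.

```python
import re

# Pattern modelled on the "already requested by" message quoted above.
CONFLICT_RE = re.compile(
    r"pin (?P<pin>\S+) already requested by (?P<owner>[^;\s]+); "
    r"cannot claim for (?P<claimant>\S+)"
)


def find_pin_conflicts(log_text):
    """Return (pin, current_owner, rejected_claimant) tuples found in log_text."""
    return [
        (m.group("pin"), m.group("owner"), m.group("claimant"))
        for m in CONFLICT_RE.finditer(log_text)
    ]


# The first log line from this post, used as sample input.
sample = (
    "[    4.890363][   T38] mt7986a-pinctrl 1001f000.pinctrl: pin GPIO_4 "
    "already requested by pinctrl_moore:521; cannot claim for 11280000.pcie"
)
print(find_pin_conflicts(sample))
```

Feeding it the full dmesg output instead of the sample string should list every pin that two consumers are fighting over.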

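I guess pin 9 (GPIO_4) is used twice in the device tree. Note that pinctrl_moore is not the Intel Moorefield driver; it is the common MediaTek pinctrl code that the mt7986 driver is built on, so this is not one driver taking precedence over another. The pinctrl driver has: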

static int mt7986_pcie_clk_pins[] = { 9, };

So it is claimed through the pcie_clk group of pcie-pins here in the dts:

	pcie_pins: pcie-pins {
		mux {
			function = "pcie";
			groups = "pcie_clk", "pcie_pereset";
		};
	};

But also in:

		reset-key {
			label = "reset";
			linux,code = <KEY_RESTART>;
			gpios = <&pio 9 GPIO_ACTIVE_LOW>;
		};

So no need to rebuild the kernel, only the dts.
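
To spot this kind of clash without reading the whole dts by eye, something like the following Python sketch could help. It is only an illustration: the helper names are made up, it only recognises the simple `gpios = <&pio N ...>` form shown above, and the pin list is copied by hand from mt7986_pcie_clk_pins[].

```python
import re

# Matches GPIO references of the form: gpios = <&pio 9 GPIO_ACTIVE_LOW>;
GPIO_REF_RE = re.compile(r"gpios\s*=\s*<\s*&pio\s+(\d+)\b")


def gpio_pins_in_dts(dts_text):
    """Return the set of pin numbers referenced through &pio in dts_text."""
    return {int(number) for number in GPIO_REF_RE.findall(dts_text)}


def clashing_pins(dts_text, group_pins):
    """Return pins that are GPIO-referenced in the dts and also in group_pins."""
    return sorted(gpio_pins_in_dts(dts_text) & set(group_pins))


# Pin list copied from mt7986_pcie_clk_pins[] quoted above.
PCIE_CLK_PINS = [9]

sample = """
reset-key {
    label = "reset";
    linux,code = <KEY_RESTART>;
    gpios = <&pio 9 GPIO_ACTIVE_LOW>;
};
"""
print(clashing_pins(sample, PCIE_CLK_PINS))
```

With the reset-key node above it reports pin 9; a node using a different pin would give an empty list.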

SYS_WATCHDOG seems to be GPIO0 (pin 5) on the v1.1 board.

Commenting out the reset-key section got things a little further. Now I get:

[    4.902633] mtk-pcie-gen3 11280000.pcie: host bridge /soc/pcie@11280000 ranges:
[    4.914826] mtk-pcie-gen3 11280000.pcie:      MEM 0x0020000000..0x002fffffff -> 0x0020000000
[...]
[    5.172502] mtk-pcie-gen3 11280000.pcie: PCIe link down, current LTSSM state: detect.quiet (0x0)
[    5.182086] mtk-pcie-gen3: probe of 11280000.pcie failed with error -110
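
As an aside, the negative numbers in these messages are Linux errno values. A small Python sketch to decode them from a log follows; the table only covers the two codes that appear in this thread (-22 earlier and -110 here), and the function name is just for illustration.

```python
import re

# Linux errno values; only the two codes that appear in this thread.
ERRNO_NAMES = {
    22: ("EINVAL", "Invalid argument"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

PROBE_FAIL_RE = re.compile(r"probe of (\S+) failed with error -(\d+)")


def explain_probe_failures(log_text):
    """Return (device, errno_name, description) for each failed probe in log_text."""
    results = []
    for device, code in PROBE_FAIL_RE.findall(log_text):
        name, text = ERRNO_NAMES.get(int(code), ("unknown", "not in this table"))
        results.append((device, name, text))
    return results


line = "[    5.182086] mtk-pcie-gen3: probe of 11280000.pcie failed with error -110"
print(explain_probe_failures(line))
```

So -110 is a timeout, which fits the preceding "PCIe link down" line: the driver appears to have given up waiting for the link to come up.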

Ho hum!

Try kernel 6.5. I think I had the same on R3 mini…

Guess you’re out of luck and need to rebuild.

Edit:

I remembered wrong, I had this issue:

[    1.472717] nvme nvme0: pci function 0000:01:00.0
[    1.477447] nvme 0000:01:00.0: enabling device (0000 -> 0002)
[    6.492982] nvme nvme0: Device not ready; aborting initialisation, CSTS=0x0

It was bad in 6.7-rc (net-next), but was ok in 6.5.12.

Edit2:

Now I’m running 6.7.0-rc2 (net-next) and I do not get “aborting initialisation, CSTS=0x0” there either.

You could try the patch mentioned in:

I guess the right reset-key definition is:

		reset-key {
			label = "reset";
			linux,code = <KEY_RESTART>;
			gpios = <&pio 5 GPIO_ACTIVE_LOW>;
		};

So use pin 5 (GPIO0 in the schematic) instead of pin 9.

Well, I applied the patch to pcie-mediatek-gen3.c & I just deleted the reset-key section, but it still fails. I also loaded up a shed load of kernel modules to see if something was missing, but still nothing.

Does anyone have a working config for >=6.6 kernels that I can try?

I am sure no one is interested, but doing make modules on the R3 takes about 26 hours on a single core, and it needs some swap. The swap was created on the nvme whilst running & building on openwrt.

Build fewer modules? Cross-compile the kernel?

I run archlinuxarm on the R3. Its packages normally need to be built on the same architecture, so I modified the linux-kernel package to use the cross-compiler when makepkg is run from x86_64.

My defconfig is here (it is a defconfig, not a full .config):

archlinuxarm-repo/defconfig at linux-bpir64-git · ericwoud/archlinuxarm-repo (github.com)

A compiled version here: ftp://woudstra.mywire.org/repo/aarch64/linux-bpir64-git-6.6.6.bpi-1-aarch64.pkg.tar.xz

It is an archlinuxarm package, but it is also simply a .tar.xz archive file, so easy to extract. Filename will change when version updates.

I can double-check if exactly this version works with my nvme (on my R3mini at the moment) later this week, but it should be fine.

Perhaps the hardware is not compatible? We should keep a list of known working nvme devices… Perhaps openwrt adds a quirk for your nvme, if it works ok there?

Thanks Eric, I’ll try & manipulate it.

The nvme does work fine under openwrt, it’s a Kingston KINGSTON SNV2S250G.

I think the problem is the pcie bus is not starting as there is nothing defined in /proc/bus/pci/devices.
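
A quick way to check that from a script rather than by eye is sketched below in Python. It assumes the usual procfs layout, where the first field of each line is bus and devfn in hex and the second is vendor and device ID in hex; the function names are only illustrative.

```python
from pathlib import Path


def parse_pci_line(line):
    """Split one procfs line into bus, slot, function, vendor ID and device ID."""
    fields = line.split()
    bus = int(fields[0][:2], 16)
    devfn = int(fields[0][2:], 16)
    return {
        "bus": bus,
        "slot": devfn >> 3,        # upper five bits of devfn
        "function": devfn & 0x7,   # lower three bits of devfn
        "vendor": fields[1][:4],
        "device": fields[1][4:],
    }


def list_pci_devices(path="/proc/bus/pci/devices"):
    """Return parsed entries, or an empty list if the file is absent or empty."""
    proc_file = Path(path)
    if not proc_file.exists():
        return []
    return [
        parse_pci_line(line)
        for line in proc_file.read_text().splitlines()
        if line.strip()
    ]
```

An empty list from list_pci_devices() matches what is described here: no host bridge and no endpoint were enumerated, so the nvme driver has nothing to bind to.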

The clk’s, i2c’s, spi’s, regulators & phy’s are all getting loaded, just pcie_mediatek_gen3 having a hissy fit when I reload it.

I’m building directly on the R3 just for fits’n’giggles, even if it does take a long time.

Well, that self built kernel worked. Now to start dissecting what is required.

You built it with my defconfig? From same source? Same devicetree?

Yes, built with your defconfig. Just copied it across, then make defconfig, all & install. Same source, same devicetree, nothing else changed. Still a few issues though, such as the nvme root being mounted ro, but progress.

By default the rootfs is mounted ro; it stays that way unless the partition is remounted rw afterwards (e.g. via fstab).

I was not concerned about ro/rw, that was easily fixed.

Just cannot seem to get the modules loaded though.

Unless both pcie-mediatek-gen3 & phy-mtk-tphy were baked into the running kernel, the pcie bus would not be brought up.
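
To see how a given kernel has these two drivers configured, a Python sketch along these lines could be used. CONFIG_PCIE_MEDIATEK_GEN3 and CONFIG_PHY_MTK_TPHY are the upstream Kconfig symbols for the two drivers; the helper names are made up, and reading /proc/config.gz assumes the kernel was built with CONFIG_IKCONFIG_PROC, which a distribution kernel may not have.

```python
import gzip
import re

# Kconfig symbols for the two drivers named above.
REQUIRED = ["CONFIG_PCIE_MEDIATEK_GEN3", "CONFIG_PHY_MTK_TPHY"]


def option_state(config_text, name):
    """Return 'y' (built in), 'm' (module) or None (unset or absent)."""
    match = re.search(rf"^{re.escape(name)}=([ym])$", config_text, re.MULTILINE)
    return match.group(1) if match else None


def report(config_text, names=REQUIRED):
    """Map each symbol name to its state in config_text."""
    return {name: option_state(config_text, name) for name in names}


def running_kernel_config(path="/proc/config.gz"):
    """Read the running kernel's config; needs CONFIG_IKCONFIG_PROC."""
    with gzip.open(path, "rt") as handle:
        return handle.read()
```

Calling report(running_kernel_config()) on the board, or report() on the text of a /boot/config-* file, should show 'y' for both symbols on a kernel that brings the bus up, going by the observation above.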

When loading pcie-mediatek-gen3 as a module, the load order did not matter; the error I got was:

[    5.172502][   T11] mtk-pcie-gen3 11280000.pcie: PCIe link down, current LTSSM state: detect.quiet (0x0)
[    5.182086][   T11] mtk-pcie-gen3: probe of 11280000.pcie failed with error -110

Why not build in pcie and mtk-tphy?

I'm trying to get it to run with the standard openSUSE install, that is, without having to resort to rebuilding the kernel. Like I said earlier, it’s for fits’n’giggles.

If you want to use some generic kernel, it is very unlikely to have the right CONFIG_XXX parameters. 99.99% it has not.

The default arm64 defconfig should have everything needed for all platforms (and maybe everything else compiled as modules), which also makes it very large… but this is a personal decision… I like kernels as small as possible :)

I’ve reached the end of my ken on this.

Loading the kernel module fails at the point where it de-asserts PCIE_PE_RSTB.

After applying the 611-pcie-mediatek-gen3-PERST-for-100ms.patch patch, everything (return codes, parameters, etc.) up to the following is identical between the built-in and module builds:

        /* De-assert PERST# signals */
        val &= ~(PCIE_PE_RSTB);
        writel_relaxed(val, pcie->base + PCIE_RST_CTRL_REG);

It does not matter how long the delay is before or after the call; I even set it to a full second and it still fails.

If I comment out the err test & return that follow it, the PCI bus can be seen, but the attached nvme cannot be connected.

Can’t seem to find any docs on this either.