Error writing NAND from linux

I haven’t really used NAND before, so I thought I’d give it a try and write to NAND from linux.

[root@bpir3 ~]# cat /boot/dtbos/nand-enable.dts
/dts-v1/;
/plugin/;

&spi0 {
  #address-cells = <1>;
  #size-cells = <0>;

  spi_nand: flash@0 {
    compatible = "spi-nand";
    reg = <0>;
    spi-max-frequency = <10000000>;
    spi-tx-buswidth = <4>;
    spi-rx-buswidth = <4>;

    partitions {
      compatible = "fixed-partitions";
      #address-cells = <1>;
      #size-cells = <1>;

      partition@0 {
        label = "bl2";
        reg = <0x0 0x200000>;
//      read-only;
      };

      partition@200000 {
        label = "ubi";
        reg = <0x200000 0x7a80000>;
      };
    };
  };
};
[root@bpir3 ~]# cat /proc/mtd 
dev:    size   erasesize  name
mtd0: 07a80000 00020000 "ubi"
mtd1: 00200000 00020000 "bl2"
[root@bpir3 ~]# uname -a
Linux bpir3 6.14.7-bpi #15 SMP PREEMPT Mon May 19 04:07:16 UTC 2025 aarch64 GNU/Linux

Somehow they are in reverse order, but I see no problem there…

[root@bpir3 ~]# dmesg | grep spi
[    2.942816] spi-nand spi0.0: Winbond SPI NAND was found.
[    2.948171] spi-nand spi0.0: 128 MiB, block size: 128 KiB, page size: 2048, OOB size: 64
[    2.962687] 2 fixed-partitions partitions found on MTD device spi0.0
[    2.969152] Creating 2 MTD partitions on "spi0.0":
[root@bpir3 ~]# dmesg | grep -i nand
[    1.693575] jffs2: version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
[    2.942816] spi-nand spi0.0: Winbond SPI NAND was found.
[    2.948171] spi-nand spi0.0: 128 MiB, block size: 128 KiB, page size: 2048, OOB size: 64
[    6.816889] mtdblock: MTD device 'ubi' is NAND, please consider using UBI block devices instead.
[    9.299055] mtdblock: MTD device 'bl2' is NAND, please consider using UBI block devices instead.
[   15.520377] mtdblock: MTD device 'ubi' is NAND, please consider using UBI block devices instead.
[   15.556748] mtdblock: MTD device 'bl2' is NAND, please consider using UBI block devices instead.
[root@bpir3 ~]# dmesg | grep mtd
[    6.527979] mtdblock: MTD device 'ubi' is NAND, please consider using UBI block devices instead.
[    9.009387] mtdblock: MTD device 'bl2' is NAND, please consider using UBI block devices instead.
[    9.264195] I/O error, dev mtdblock1, sector 1008 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[    9.275279] I/O error, dev mtdblock1, sector 1008 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[    9.284142] Buffer I/O error on dev mtdblock1, logical block 126, async page read
[   15.159296] mtdblock: MTD device 'ubi' is NAND, please consider using UBI block devices instead.
[   15.183431] mtdblock: MTD device 'bl2' is NAND, please consider using UBI block devices instead.

Whoops
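
To see where on the flash the failing read lands, the sector number from the I/O error can be converted to a byte offset and an erase block. This is only a small sketch: sector_to_block is a made-up helper name, and it assumes the 512-byte sectors the block layer reports plus the 0x20000 erase size shown in /proc/mtd.

```shell
#!/bin/sh
# Map a 512-byte mtdblock sector number to a byte offset and erase block.
# sector_to_block is a hypothetical helper; the erase size defaults to
# 131072 bytes (0x20000, 128 KiB), the value listed in /proc/mtd.
sector_to_block() {
    sector=$1
    erasesize=${2:-131072}
    offset=$((sector * 512))
    printf 'offset=0x%x block=%d\n' "$offset" "$((offset / erasesize))"
}

sector_to_block 1008
```

For sector 1008 this prints offset=0x7e000 block=3, which matches "logical block 126" in the log (126 * 4096 = 0x7e000), so the failing read is in the fourth erase block of bl2.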

[root@bpir3 ~]# dd if=/openwrt-23.05.2-mediatek-filogic-bananapi_bpi-r3-snand-preloader.bin of=/dev/mtdblock1
[  134.322219] mtdblock: MTD device 'bl2' is NAND, please consider using UBI block devices instead.
[  134.333230] I/O error, dev mtdblock1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
dd: writing to '/dev/mtdblock1': Input/output error
1+0 records in
0+0 records out
0 bytes copied, 0.0109003 s, 0.0 kB/s

But:

[root@bpir3 ~]# nandwrite -p /dev/mtd1 /openwrt-23.05.2-mediatek-filogic-bananapi_bpi-r3-snand-preloader.bin
Writing data to block 0 at offset 0x0
Writing data to block 1 at offset 0x20000
Bad block at 20000, 1 block(s) will be skipped
Writing data to block 2 at offset 0x40000
Bad block at 40000, 1 block(s) will be skipped
Writing data to block 3 at offset 0x60000

But this did not succeed: the bootrom sees nothing valid on the NAND.

I’ve even put u-boot in the boot chain (leaving it out was at one time a problem with eMMC), but the result is still the same.
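
One way to check whether the data actually reached the flash is to erase, write and read back the raw partition with mtd-utils instead of going through mtdblock. The following is a sketch under assumptions, not a verified procedure for this board: flash_to_mtd is a made-up helper name, /tmp/readback.bin is an arbitrary path, and it assumes flash_erase, nandwrite and nanddump from mtd-utils plus GNU cmp are installed. It defaults to a dry run that only prints the commands, because erasing bl2 is destructive.

```shell
#!/bin/sh
# Sketch: erase, write and read back a raw NAND partition with mtd-utils.
# flash_to_mtd is a hypothetical helper. With DRY_RUN=1 (the default here)
# the commands are only printed so they can be reviewed first.
flash_to_mtd() {
    dev=$1
    img=$2
    dump=${3:-/tmp/readback.bin}
    size=$(($(wc -c < "$img")))      # image size in bytes
    run=""
    [ "${DRY_RUN:-1}" = 1 ] && run="echo"
    # Erase the whole partition (start offset 0, block count 0 = to the end).
    $run flash_erase "$dev" 0 0 &&
    # Write the image, padding the last page (-p).
    $run nandwrite -p "$dev" "$img" &&
    # Read back, skipping bad blocks the same way nandwrite skipped them.
    $run nanddump --bb=skipbad -f "$dump" "$dev" &&
    # Compare only the first $size bytes; the rest is erased flash.
    $run cmp -n "$size" "$dump" "$img"
}
```

Running flash_to_mtd /dev/mtd1 followed by the image path prints the four commands; setting DRY_RUN=0 in front of the call executes them.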

EDIT:

I have built u-boot myself, but it seems the MTD config options are not present in the bpi-r3 default config. Therefore, u-boot is not touching the spi-nand hardware at all. I’ll change the PKGBUILD for the R3 and add all MTD options, then try again with u-boot in the chain. Then I can also try the mtd command from u-boot and see if that works.

Looking at the devicetrees of the R3 in u-boot and in linux, I noticed a small difference that might explain the situation, but I have yet to test this.

UBOOT:
	spi_flash_pins: spi0-pins-func-1 {
		mux {
			function = "flash";
			groups = "spi0", "spi0_wp_hold";
		};

		conf-pu {
			pins = "SPI2_CS", "SPI2_HOLD", "SPI2_WP";
			drive-strength = <MTK_DRIVE_8mA>;
			bias-pull-up = <MTK_PUPD_SET_R1R0_00>;
		};

		conf-pd {
			pins = "SPI2_CLK", "SPI2_MOSI", "SPI2_MISO";
			drive-strength = <MTK_DRIVE_8mA>;
			bias-pull-down = <MTK_PUPD_SET_R1R0_00>;
		};
	};
LINUX:
	spi_flash_pins: spi-flash-pins {
		mux {
			function = "spi";
			groups = "spi0", "spi0_wp_hold";
		};
	};

So it seems to me that in u-boot some pins are configured with a drive strength (in milliamps) and with pull-ups/pull-downs, while the linux devicetree only sets the mux. I suspect the following:

If this configuration is skipped, because u-boot is not used or because u-boot is not using the spi-nand, then the pins are never configured correctly, and that may be why I’m having trouble in linux.

This is all an assumption from an early investigation; I still need to try it out and test. I do not have much time for it in the coming weeks, so it may take quite some time before I can confirm.

I’ve noticed while upstreaming r4 that this property should be spi-rx-bus-width, and the same for tx. But I have not yet tested writing from linux, only from u-boot, where I did not get errors.

Thanks, indeed, that is (also) a problem. Searching on github mainline linux, spi-rx-buswidth only finds the R3 and spi-rx-bus-width finds all other hardware…
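
A quick way to find the misspelled property in local devicetree sources is a plain grep. This is just a sketch: find_buswidth_typos is a made-up helper name, and the file paths passed to it are whatever .dts/.dtsi files you want to check.

```shell
#!/bin/sh
# List lines in devicetree sources that use the misspelled
# spi-rx-buswidth / spi-tx-buswidth instead of spi-rx-bus-width /
# spi-tx-bus-width. find_buswidth_typos is a hypothetical helper name.
find_buswidth_typos() {
    # -n prints line numbers; the pattern does not match the correct
    # spelling because that has a hyphen between "bus" and "width".
    grep -nE 'spi-(rx|tx)-buswidth' "$@"
}
```

For example, find_buswidth_typos /boot/dtbos/nand-enable.dts would list the two lines from the overlay at the top of this thread. grep exits with status 1 when nothing is found, so the helper can also be used in a script as a check.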

I think that if the pins need configuration, linux should not depend on u-boot for it, so the dts in linux may need to add that. Anyway, I still need to confirm this…

I applied both fixes, but still i/o errors.

/dts-v1/;
/plugin/;

#define MTK_PUPD_SET_R1R0_00 100

&spi_flash_pins {
    conf-pu {
      pins = "SPI2_CS", "SPI2_HOLD", "SPI2_WP";
      drive-strength = <8>;
      bias-pull-up = <MTK_PUPD_SET_R1R0_00>;
    };

    conf-pd {
      pins = "SPI2_CLK", "SPI2_MOSI", "SPI2_MISO";
      drive-strength = <8>;
      bias-pull-down = <MTK_PUPD_SET_R1R0_00>;
    };
};

&spi0 {
  #address-cells = <1>;
  #size-cells = <0>;

  spi_nand: flash@0 {
    compatible = "spi-nand";
    reg = <0>;
    spi-max-frequency = <10000000>;
    spi-tx-bus-width = <4>;
    spi-rx-bus-width = <4>;

    partitions {
      compatible = "fixed-partitions";
      #address-cells = <1>;
      #size-cells = <1>;

      partition@0 {
        label = "bl2";
        reg = <0x0 0x200000>;
//      read-only;
      };

      partition@200000 {
        label = "ubi";
        reg = <0x200000 0x7a80000>;
      };
    };
  };
};

So my NAND was in bad shape: it had bad blocks, including the very first block. That was fixed within u-boot with mtd erase.dontskipbad, erasing the entire NAND. After that I was able to write a file to the NAND within u-boot, and this got rid of the i/o errors in linux.

But:

Writing from linux does not always work. I boot u-boot with distro_boot, which does not access the NAND. After such a boot, even with u-boot in the chain, I write a file in linux, but when I read it back, it differs.

Then I reboot, interrupt u-boot, enter the command mtd list and continue booting with run bootcmd. Now when I write to the NAND and read it back, there is no difference.

So u-boot is still doing something when the mtd list command is called that ‘fixes’ writing to mtdblock1 from linux.

[root@bpir3 ~]# stat /openwrt-23.05.2-mediatek-filogic-bananapi_bpi-r3-snand-preloader.bin 
  File: /openwrt-23.05.2-mediatek-filogic-bananapi_bpi-r3-snand-preloader.bin
  Size: 205560          Blocks: 408        IO Block: 4096   regular file
Device: 179,3   Inode: 266736      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2025-05-20 11:53:14.000000000 +0200
Modify: 2025-05-20 11:52:54.000000000 +0200
Change: 2025-05-20 11:53:14.109532325 +0200
 Birth: -
[root@bpir3 ~]# dd of=/dev/mtdblock1 if=/openwrt-23.05.2-mediatek-filogic-bananapi_bpi-r3-snand-preloader.bin 
[  571.736433] mtdblock: MTD device 'bl2' is NAND, please consider using UBI block devices instead.
401+1 records in
401+1 records out
205560 bytes (206 kB, 201 KiB) copied, 0.398897 s, 515 kB/s
[root@bpir3 ~]# dd if=/dev/mtdblock1 of=/dump.bin ; truncate --size=205560 /dump.bin
[  596.849029] mtdblock: MTD device 'bl2' is NAND, please consider using UBI block devices instead.
4096+0 records in
4096+0 records out
2097152 bytes (2.1 MB, 2.0 MiB) copied, 2.44648 s, 857 kB/s
[root@bpir3 ~]# diff /dump.bin /openwrt-23.05.2-mediatek-filogic-bananapi_bpi-r3-snand-preloader.bin 
Binary files /dump.bin and /openwrt-23.05.2-mediatek-filogic-bananapi_bpi-r3-snand-preloader.bin differ
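
The dump-then-truncate step in the transcript above can be avoided by comparing only the first N bytes, where N is the image size. A minimal sketch, assuming GNU cmp (for the -n option); compare_prefix is a made-up helper name.

```shell
#!/bin/sh
# Compare only the first N bytes of a readback against the image, where N is
# the image size. compare_prefix is a hypothetical helper name.
compare_prefix() {
    image=$1
    readback=$2
    size=$(($(wc -c < "$image")))    # image size in bytes
    if cmp -n "$size" "$image" "$readback"; then
        echo "first $size bytes match"
    else
        # cmp has already printed the first differing byte offset.
        echo "mismatch within first $size bytes"
        return 1
    fi
}
```

With the files from the transcript this would be called as compare_prefix followed by the preloader image and /dump.bin. Unlike diff, cmp also reports the offset of the first differing byte, which shows which page or erase block the corruption starts in.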

But if I do the same with mtd list run from u-boot, then I do not have a difference in the binary files.

Maybe @hackpascal has an idea? As we are planning to move the linux dts to of_upstream (still on my list, but first we need the mt7988 linux dts), NAND access in linux should not depend on settings done by u-boot.

Currently I only stick to the linux kernel from openwrt (more precisely, 6.6). With u-boot from mtk-openwrt configured with xxxx_spim_nand_rfb_defconfig, all works fine.

Please note that in u-boot the pinconf is applied only if the device itself is probed. That is, the pinconf is applied the first time the spi-nand driver initializes the device.

The default config in mtk-openwrt/u-boot has the environment configured in ubi, so the spi-nand is initialized while u-boot boots.

I don’t know the exact config of the u-boot @ericwoud is using. Maybe it’s being configured without environment, or from other devices like emmc? If so, spi-nand will not be initialized during u-boot stage and all pinconfs will thus not be applied.

I use linux-rolling-stable (6.14.7) and the latest u-boot stable 2025.04. R3 sd defconfig.

The thing is, it should not matter what u-boot does or does not do. The linux driver should function correctly no matter what u-boot does. That is the linux kernel philosophy. But now it does not always do so.

When I boot with or without the mtd list command, so with or without u-boot applying the pinctrl config of the spi-nand, I see no difference (I already added the missing pinctrl dts nodes to the linux devicetree).

cat /sys/kernel/debug/pinctrl/1001f000.pinctrl-pinctrl_moore/pinconf-pins

Is the same in both cases.

Also

cat /sys/kernel/debug/clk/clk_summary

Is the same.

So perhaps there is a hardware register not being initialized in the linux driver, or something like that?

Have you compared the spi settings? Speed, bus width, …

These debugfs entries could help to show if pins are really right and mapped to device(-driver):

# cat /sys/kernel/debug/gpio
# cat /sys/kernel/debug/pinctrl/pinctrl-handles
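
To compare these entries between a boot with and a boot without mtd list, the files can be saved into one directory per boot and diffed afterwards. A small sketch: snapshot_files is a made-up helper name and the output directory is arbitrary.

```shell
#!/bin/sh
# Save a set of (debugfs) files under one directory per boot so that two
# boots can be compared with diff -r. snapshot_files is a hypothetical
# helper name.
snapshot_files() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for f in "$@"; do
        # Flatten the path into a file name, e.g. /a/b becomes _a_b
        name=$(printf '%s' "$f" | tr '/' '_')
        if [ -r "$f" ]; then
            cat "$f" > "$outdir/$name"
        else
            echo "skipped: $f" >&2
        fi
    done
}
```

For example, call snapshot_files with a directory such as /root/snap-a followed by /sys/kernel/debug/gpio, /sys/kernel/debug/pinctrl/pinctrl-handles and the pinconf-pins and clk_summary paths mentioned earlier, repeat with /root/snap-b after the other kind of boot, and then run diff -r on the two directories.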