BPI-R2 IDE/SATA doesn't seem production quality, causes data corruption

Is anyone else using SATA extensively with the BPI-R2? We’ve spent nearly a year on this board and keep getting IDE errors. We’ve tried multiple BPI-R2 boards, manufacturer-supplied SATA cables, other OEM SATA cables, different SATA drives, and different kernel versions, and we still keep getting these errors.

Sometimes they show up on first boot; other times it takes 4 months, by which point you think the system is stable, and then it ends in filesystem corruption and complete data loss…

Is anyone else experiencing this problem? Does anyone have a link to the IDE spec sheet, so we can try to debug it?

I’m posting this here in the hope that someone who actually works for the manufacturer can help with the questions above; otherwise we’ll have to switch platforms and scrap the R2 as a viable option.

Thanks

[    8.395118] ata2.00: exception Emask 0x10 SAct 0x8 SErr 0xc00000 action 0x6 frozen
[    8.402632] ata2.00: irq_stat 0x08000000, interface fatal error
[    8.408524] ata2: SError: { Handshk LinkSeq }
[    8.412845] ata2.00: failed command: READ FPDMA QUEUED
[    8.417959] ata2.00: cmd 60/08:18:00:08:00/00:00:00:00:00/40 tag 3 ncq dma 4096 in
[    8.417959]          res 40/00:14:a0:7c:e0/00:00:e8:00:00/40 Emask 0x10 (ATA bus error)
[    8.433405] ata2.00: status: { DRDY }
[    8.437054] ata2: hard resetting link
[    8.766524] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    8.795723] ata2.00: configured for UDMA/133
[    8.799993] ata2: EH complete

[   24.995160] ata2.00: exception Emask 0x10 SAct 0x1000 SErr 0xc00000 action 0x6 frozen
[   25.002945] ata2.00: irq_stat 0x08000000, interface fatal error
[   25.008851] ata2: SError: { Handshk LinkSeq }
[   25.013177] ata2.00: failed command: READ FPDMA QUEUED
[   25.018299] ata2.00: cmd 60/08:60:a8:87:e0/00:00:e8:00:00/40 tag 12 ncq dma 4096 in
[   25.018299]          res 40/00:5c:a8:88:e0/00:00:e8:00:00/40 Emask 0x10 (ATA bus error)
[   25.033868] ata2.00: status: { DRDY }
[   25.037560] ata2: hard resetting link
[   25.366532] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   25.386197] ata2.00: configured for UDMA/133
[   25.386255] ata2: EH complete

[   26.335148] ata2.00: exception Emask 0x10 SAct 0xfc00000 SErr 0xc00000 action 0x6 frozen
[   26.343305] ata2.00: irq_stat 0x08000008, interface fatal error
[   26.349204] ata2: SError: { Handshk LinkSeq }
[   26.353527] ata2.00: failed command: READ FPDMA QUEUED
[   26.358644] ata2.00: cmd 60/00:b0:e0:04:00/01:00:00:00:00/40 tag 22 ncq dma 131072 in
[   26.358644]          res 40/00:b4:e0:04:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[   26.374347] ata2.00: status: { DRDY }
[   26.377989] ata2.00: failed command: READ FPDMA QUEUED
[   26.383090] ata2.00: cmd 60/d0:b8:00:84:e0/00:00:e8:00:00/40 tag 23 ncq dma 106496 in
[   26.383090]          res 40/00:b4:e0:04:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[   26.398797] ata2.00: status: { DRDY }
[   26.402426] ata2.00: failed command: READ FPDMA QUEUED
[   26.407542] ata2.00: cmd 60/08:c0:d8:84:e0/00:00:e8:00:00/40 tag 24 ncq dma 4096 in
[   26.407542]          res 40/00:b4:e0:04:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[   26.423071] ata2.00: status: { DRDY }
[   26.426710] ata2.00: failed command: READ FPDMA QUEUED
[   26.431811] ata2.00: cmd 60/10:c8:e8:84:e0/00:00:e8:00:00/40 tag 25 ncq dma 8192 in
[   26.431811]          res 40/00:b4:e0:04:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[   26.447339] ata2.00: status: { DRDY }
[   26.450968] ata2.00: failed command: READ FPDMA QUEUED
[   26.456078] ata2.00: cmd 60/20:d0:00:85:e0/00:00:e8:00:00/40 tag 26 ncq dma 16384 in
[   26.456078]          res 40/00:b4:e0:04:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[   26.471697] ata2.00: status: { DRDY }
[   26.475338] ata2.00: failed command: READ FPDMA QUEUED
[   26.480440] ata2.00: cmd 60/a8:d8:28:85:e0/00:00:e8:00:00/40 tag 27 ncq dma 86016 in
[   26.480440]          res 40/00:b4:e0:04:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[   26.502825] ata2.00: status: { DRDY }
[   26.506483] ata2: hard resetting link
[   26.836520] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   26.856185] ata2.00: configured for UDMA/133
[   26.860572] ata2: EH complete
[   27.255125] ata2: limiting SATA link speed to 3.0 Gbps
[   27.260234] ata2.00: exception Emask 0x10 SAct 0x1c00000 SErr 0xc00000 action 0x6 frozen
[   27.275110] ata2.00: irq_stat 0x0c000000, interface fatal error
[   27.280982] ata2: SError: { Handshk LinkSeq }
[   27.285320] ata2.00: failed command: READ FPDMA QUEUED
[   27.290423] ata2.00: cmd 60/08:b0:00:86:e0/00:00:e8:00:00/40 tag 22 ncq dma 4096 in
[   27.290423]          res 40/00:b4:00:86:e0/00:00:e8:00:00/40 Emask 0x10 (ATA bus error)
[   27.305955] ata2.00: status: { DRDY }
[   27.309584] ata2.00: failed command: READ FPDMA QUEUED
[   27.314685] ata2.00: cmd 60/50:b8:10:86:e0/00:00:e8:00:00/40 tag 23 ncq dma 40960 in
[   27.314685]          res 40/00:b4:00:86:e0/00:00:e8:00:00/40 Emask 0x10 (ATA bus error)
[   27.330302] ata2.00: status: { DRDY }
[   27.333931] ata2.00: failed command: READ FPDMA QUEUED
[   27.339048] ata2.00: cmd 60/98:c0:68:86:e0/00:00:e8:00:00/40 tag 24 ncq dma 77824 in
[   27.339048]          res 40/00:b4:00:86:e0/00:00:e8:00:00/40 Emask 0x10 (ATA bus error)
[   27.354666] ata2.00: status: { DRDY }
[   27.358386] ata2: hard resetting link
[   27.686522] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   27.706023] ata2.00: configured for UDMA/133
[   27.710340] ata2: EH complete
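
For anyone reading the log above, here is a minimal Python sketch that decodes the `SErr` and `SStatus` values. The bit names are the labels the kernel itself prints (the log confirms that 0xc00000 maps to Handshk and LinkSeq); the rest of the table and the function names are my own assumptions, based on how I understand the libata error handler and the SATA register layout, so treat it as a reading aid rather than a reference.

```python
import re

# SError bit positions, labelled the way the kernel log prints them.
# Assumption: table reconstructed from the libata error report format.
SERROR_BITS = {
    0: "RecovData", 1: "RecovComm", 8: "UnrecovData", 9: "Persist",
    10: "Proto", 11: "HostInt", 16: "PHYRdyChg", 17: "PHYInt",
    18: "CommWake", 19: "10B8B", 20: "Dispar", 21: "BadCRC",
    22: "Handshk", 23: "LinkSeq", 24: "TrStaTrns", 25: "UnrecFIS",
    26: "DevExch",
}

# SPD field of SStatus: negotiated SATA generation.
LINK_SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def decode_serror(value):
    """Return the names of all bits set in an SError value."""
    return [name for bit, name in sorted(SERROR_BITS.items())
            if (value >> bit) & 1]


def decode_sstatus(value):
    """Split SStatus into DET (bits 0-3), SPD (bits 4-7), IPM (bits 8-11)."""
    return {
        "det": value & 0xF,
        "speed": LINK_SPEEDS.get((value >> 4) & 0xF, "unknown"),
        "ipm": (value >> 8) & 0xF,
    }


def summarize(log_text):
    """Collect decoded SError sets and link speeds from a dmesg excerpt."""
    errors = [decode_serror(int(v, 16))
              for v in re.findall(r"SErr (0x[0-9a-fA-F]+)", log_text)]
    # The kernel prints SStatus in hex without a 0x prefix.
    speeds = [decode_sstatus(int(v, 16))["speed"]
              for v in re.findall(r"SStatus ([0-9a-fA-F]+)", log_text)]
    return errors, speeds


# Two lines copied from the log above.
SAMPLE = """\
[   27.260234] ata2.00: exception Emask 0x10 SAct 0x1c00000 SErr 0xc00000 action 0x6 frozen
[   27.686522] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
"""

errors, speeds = summarize(SAMPLE)
print(errors)   # [['Handshk', 'LinkSeq']]
print(speeds)   # ['3.0 Gbps']
```

Every exception in the log above carries the same SErr value, 0xc00000, so each one decodes to the same pair of flags.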

Searching the forum for the error message (READ FPDMA QUEUED) brings you to this…

http://forum.banana-pi.org/t/searching-testing-people-for-hdmi-wifi-in-kernel-4-16/5830/13
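
The `cmd` line printed with each failed READ FPDMA QUEUED entry in the log can be unpacked to see which sectors were being read. This is a rough Python sketch, assuming the field order is command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device (my understanding of the libata output), that NCQ commands carry the sector count in the feature fields and the tag in the upper bits of nsect, and that logical sectors are 512 bytes. The function name is hypothetical.

```python
import re

# One two-digit hex field per taskfile register, in the assumed print order.
CMD_RE = re.compile(
    r"cmd (?P<cmd>[0-9a-f]{2})/"
    r"(?P<feat>[0-9a-f]{2}):(?P<nsect>[0-9a-f]{2}):"
    r"(?P<lbal>[0-9a-f]{2}):(?P<lbam>[0-9a-f]{2}):(?P<lbah>[0-9a-f]{2})/"
    r"(?P<hfeat>[0-9a-f]{2}):(?P<hnsect>[0-9a-f]{2}):"
    r"(?P<hlbal>[0-9a-f]{2}):(?P<hlbam>[0-9a-f]{2}):(?P<hlbah>[0-9a-f]{2})/"
    r"(?P<dev>[0-9a-f]{2})"
)


def decode_ncq_cmd(line):
    """Decode LBA, sector count and tag from a libata 'cmd' log line."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    f = {key: int(val, 16) for key, val in m.groupdict().items()}
    # 48-bit LBA: the three "hob" bytes are the high-order half.
    lba = (f["hlbah"] << 40 | f["hlbam"] << 32 | f["hlbal"] << 24
           | f["lbah"] << 16 | f["lbam"] << 8 | f["lbal"])
    # For NCQ commands the sector count lives in the feature registers.
    sectors = f["hfeat"] << 8 | f["feat"]
    return {
        "command": f["cmd"],
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * 512,   # assumes 512-byte logical sectors
        "tag": f["nsect"] >> 3,   # NCQ tag is stored in bits 7:3 of nsect
    }


# A line copied from the log in the first post.
line = ("[    8.417959] ata2.00: cmd 60/08:18:00:08:00/00:00:00:00:00/40 "
        "tag 3 ncq dma 4096 in")
print(decode_ncq_cmd(line))
```

The decoded byte count and tag can be checked against the `dma 4096` and `tag 3` values the kernel prints on the same line, which is a quick sanity check on the field order assumed here.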

Which kernel do you use? IMHO it should be fixed in newer kernels; if not, I will look into it.

Hi Frank,

We have 0073-mtk-pcie-bug-fix.patch, and the error happens on every kernel between 4.9.44 and 4.9.180. 4.9.44 is the kernel we started on, and we kept rolling updates up to 4.9.180; that’s how long this issue has persisted.

The link you pointed to seems to be about identifying the IDE, and that works fine. It’s just that the IDE causes corruption, sometimes after working for months.

It’d be really nice to get the IDE specs for this chip

http://www.asmedia.com.tw/eng/e_show_products.php?item=118

https://datasheetspdf.com/pdf-file/949526/ASMedia/ASM1061/1

Have you tried upgrading to a newer kernel? Maybe there is a bug that is already fixed in mainline; I have not seen it in 4.14… I also have an SSD connected via SATA (where my /var, LXC VMs and some other data not related to the OS live).

Thanks for the links. We have not tried kernels newer than 4.9.

Have you checked whether the power and SATA sockets are soldered properly? Have you tested/replaced your cable? Maybe some kind of vibration or similar causes the issue.

Yeah, like I said in the top post… we tried different cables, different drives, different boards.

Are you running yours at 6 Gb/s or 3? Can you paste the output of:

smartctl -a /dev/sda|grep current:

Which cable manufacturer/model do you have?

[19:35] frank@bpi-r2-e:~$ sudo smartctl -a /dev/sda|grep current
[sudo] password for frank: 
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
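
To compare link speeds across several boards, the `SATA Version is:` line above can be parsed automatically. This is a small Python sketch, assuming the line always has the shape shown in the output above; the function name and the `downgraded` flag are my own additions for illustration.

```python
import re

# Matches e.g. "SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)".
# The "(current: ...)" part is treated as optional.
SATA_RE = re.compile(
    r"SATA Version is:\s+(?P<version>SATA [\d.]+),\s+"
    r"(?P<max>[\d.]+) Gb/s"
    r"(?:\s+\(current:\s+(?P<current>[\d.]+) Gb/s\))?"
)


def parse_sata_speed(smartctl_output):
    """Extract the supported and current link speed from smartctl -a text."""
    m = SATA_RE.search(smartctl_output)
    if m is None:
        return None
    max_gbps = float(m.group("max"))
    current = m.group("current")
    current_gbps = float(current) if current is not None else None
    return {
        "version": m.group("version"),
        "max_gbps": max_gbps,
        "current_gbps": current_gbps,
        # True when the link is running below what the drive supports.
        "downgraded": current_gbps is not None and current_gbps < max_gbps,
    }


# The line pasted above.
sample = "SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)"
print(parse_sata_speed(sample))
```

A drive whose link has fallen back after repeated resets, like the `limiting SATA link speed to 3.0 Gbps` message in the first post, would show up here with `downgraded` set to True.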

I bought my cable on eBay and don’t know the vendor… I currently have the same cable on my test R2 (for uboot-sata)…

[image: IMG_20190730_193730_157]

Ok, that looks like the manufacturer’s special cable for the bpi-r1. I have about 10 of those and they show the failure too.

Left to right: the original bpi-r1 SATA cable, then two different cables (also low profile), and the orange special bpi-r2 cable made by the board manufacturer.

All of those cables show the problem, with different boards and different drives…

[image: cable]