Update to All:
Working Hardware:
4 Western Digital Blue - WDC WD40EZRZ-00GXCB0 - 4TB
Syba SY-ENC50104 4 Bay 3.5” SATA III HDD NON-RAID Enclosure – Supports USB 3.0 & eSATA Interface
IOCREST - mini PCIe SATA card - AS1061R
The error shown below would occur and the system would reset the link to the attached drives. Things would work for a little while and then the error would occur again.
When mdadm was building the RAID it would happen about every 30 seconds. It took 2 days to finish building the RAID, with no issues other than
this error causing the resets and the delays. When it was done I built the ext4 filesystem. The same errors kept repeating until it finished.
This took about 1 hour. When it was done I mounted the RAID and started testing file transfers. File transfers were showing the same results.
At this point I decided to turn off the WRITE CACHE on the drives. When I did this, amazingly, the problem stopped.
I’ve been running file transfer tests for 2 days now without a single occurrence.
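For anyone who wants to try the same workaround, here is a minimal sketch of how the write cache could be switched off on all four drives. It assumes hdparm is installed and that the drives show up as /dev/sda through /dev/sdd; the device names and the helper function names are my assumptions, not something taken from the setup above. By default it only prints the commands so nothing is changed by accident.

```python
import subprocess

def write_cache_cmd(device, enable):
    """Build the hdparm command that turns a drive's write cache on or off."""
    # hdparm -W1 enables write caching, -W0 disables it.
    return ["hdparm", "-W1" if enable else "-W0", device]

def set_write_cache(devices, enable, dry_run=True):
    """Return one command per drive; actually run them only if dry_run is False."""
    cmds = [write_cache_cmd(dev, enable) for dev in devices]
    if not dry_run:
        for cmd in cmds:
            subprocess.run(cmd, check=True)  # needs root privileges
    return cmds

if __name__ == "__main__":
    # Hypothetical device names; adjust to match your own system.
    drives = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]
    for cmd in set_write_cache(drives, enable=False):
        print(" ".join(cmd))
```

Note that a setting made this way may not survive a power cycle, so it might need to be reapplied at boot.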
Error message:
[ 750.620645] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 750.627869] ata1.00: failed command: FLUSH CACHE EXT
[ 750.632908] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 8
[ 750.632908] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 750.646454] ata1.00: status: { DRDY }
[ 750.650299] ata1: hard resetting link
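To check how often this happens without watching the console, something like the following could be run over saved kernel log text (for example the output of dmesg). It is only a rough sketch: the regular expression is written against the line format shown above, and the function names are made up for illustration.

```python
import re

# Matches lines like: [ 750.627869] ata1.00: failed command: FLUSH CACHE EXT
FAILED = re.compile(r"\[\s*(\d+\.\d+)\]\s+(ata\d+(?:\.\d+)?): failed command: (.+)")

def failed_commands(log_text):
    """Return (timestamp, port, command) for every 'failed command' line."""
    events = []
    for line in log_text.splitlines():
        m = FAILED.search(line)
        if m:
            events.append((float(m.group(1)), m.group(2), m.group(3).strip()))
    return events

def mean_interval(events):
    """Average number of seconds between failures, or None with fewer than two."""
    times = [t for t, _, _ in events]
    if len(times) < 2:
        return None
    return (times[-1] - times[0]) / (len(times) - 1)

if __name__ == "__main__":
    sample = (
        "[ 750.620645] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen\n"
        "[ 750.627869] ata1.00: failed command: FLUSH CACHE EXT\n"
        "[ 750.650299] ata1: hard resetting link\n"
    )
    for when, port, command in failed_commands(sample):
        print(f"{when:.3f}s {port} {command}")
```

With a longer log, mean_interval gives a quick number to compare before and after a change such as swapping a cable or turning off the write cache.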
From a Google search, most people say it is the SATA cable.
I’ve tried 4 different cables, all with the exact same results. It is not a cable issue.
Others say it is a hard drive issue. The SMART info below indicates no issue, and it is the same for all 4 drives.
The drives are all brand-new.
Tried a different PCIe SATA card - AS1061R
SMART info:
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue
Device Model: WDC WD40EZRZ-00GXCB0
Serial Number: WD-XXXXXXXXXXXX
LU WWN Device Id: 5 0014ee 20ff6bc36
Firmware Version: 80.00A80
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jul 8 18:34:16 2018 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
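Since the same check has to be repeated for all four drives, here is a small sketch that pulls the overall-health result out of saved smartctl text. It assumes the output of each drive has already been captured as a string (for example from smartctl -H or smartctl -a); the device names and function names are hypothetical.

```python
import re

# Matches the health line as it appears in the smartctl output above.
HEALTH = re.compile(r"SMART overall-health self-assessment test result:\s*(\S+)")

def health_result(smartctl_output):
    """Return the health verdict (e.g. 'PASSED'), or None if the line is missing."""
    m = HEALTH.search(smartctl_output)
    return m.group(1) if m else None

def all_passed(outputs):
    """True only if every drive's output reports PASSED."""
    return all(health_result(text) == "PASSED" for text in outputs.values())

if __name__ == "__main__":
    sample = (
        "=== START OF READ SMART DATA SECTION ===\n"
        "SMART overall-health self-assessment test result: PASSED\n"
    )
    # Hypothetical device names mapped to their captured smartctl text.
    outputs = {"/dev/sda": sample, "/dev/sdb": sample}
    print(all_passed(outputs))
```

A PASSED result only reflects the drive's own self-assessment, so it rules out some drive faults but says nothing about the cable, enclosure, or controller.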
Performance was not as good as I would have liked.
I received the replacement enclosure, and that turned out to be a total mess.
None of the drives were even being recognized. I returned the enclosure for a replacement and the replacement worked.
I tried with the WRITE CACHE back on and it is still working flawlessly.
So it seems the problems were compounded by both the enclosure and the miniPCIe card.
I have been testing ever since, passing close to 30 TB in reads and writes to the disks without error.
Performance is what is expected with a software RAID.
> /dev/md0:
Timing cached reads: 1036 MB in 2.00 seconds = 517.83 MB/sec
Timing buffered disk reads: 174 MB in 3.01 seconds = 57.81 MB/sec
Thank you very much, Frank.
BPI should be paying you for all your contributions!