Huawei OptiXstar S800E XGSPON SFP+ ONU is not detected on the BPI-R4 with current mainline OpenWrt (kernel 6.6.52).
ethtool -m eth2 reports that the device is not found. I have waited 30 minutes and done soft reboots, but the module is still not detected.
The boot log does not show any information pertaining to sfp2.
The SFP+ ONU works if I plug it into a switch (used as a media converter in this case) and then connect the switch to SFP2 on the BPI-R4 via a DAC cable.
I have tried disabling autoneg and manually setting the link to 10G duplex, but that does not work.
I suspect the SFP+ module is not detected or not powered up correctly.
Any help or pointers?
Even with the hardware modification where Q12 is removed and R102 is shorted with a 0R bridge, the OptiXstar S800E is still not detected on the BPI-R4 running OpenWrt 24.10-rc4.
The SFP module appears to be properly powered after the changes, as it is now hot to the touch. The SFP power LED is also lit. However, the device does not show up on I2C:
[ 11.306555] sfp sfp1: Host maximum power 3.0W
[ 11.311453] sfp sfp2: Host maximum power 3.0W
...
[ 17.862546] sfp sfp1: please wait, module slow to respond
[ 73.272569] sfp sfp1: failed to read EEPROM: -ENXIO
I have confirmed that the sfp1 port still works properly after the mod. When I tried another SFP device in the same cage, it shows up correctly:
[ 588.003346] sfp sfp1: module SOURCEPHOTONICS SFP+-T30 rev 10 sn HP202411041001 dc 241104
[ 588.043211] hwmon hwmon2: temp1_input not attached to any thermal zone
With the OptiXstar S800E in the cage, this is the sfp status:
I have also tried the following, without success:
- holding down the RST button, powering the board, then releasing RST after a minute, so that the SFP stick should have booted before OpenWrt starts, in case the SFP stick was trying to emulate an EEPROM
- using the i2csfp tool to manually poke the tx-disable GPIO and then restoring the ports, which I assume would probe the SFP device again
This appears to be a problem that is specific to the BPI-R4: this S800E SFP module runs with no problems and no hacks on my x86-based OpenWrt device (also 24.10-rcX) using a Mellanox ConnectX-3 card. I’d be very grateful if anyone can point me to ways I can further troubleshoot this issue.
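For anyone comparing boot logs, the sfp driver messages quoted above can be condensed into one state per cage. This is only a rough sketch: the function name sfp_log_summary and the state labels are made up for illustration, and it only recognises the message types that appear in this thread.

```shell
#!/bin/sh
# Summarise sfp driver kernel messages per cage.
# Reads kernel log text on stdin and prints "<cage> <state>" lines.
# The state names are illustrative labels, not anything the kernel prints.
sfp_log_summary() {
    awk '
        # Match lines such as "[ 73.27] sfp sfp1: failed to read EEPROM: -ENXIO"
        match($0, /sfp sfp[0-9]+: /) {
            cage = substr($0, RSTART + 4, RLENGTH - 6)   # e.g. "sfp1"
            msg  = substr($0, RSTART + RLENGTH)          # text after "sfpN: "
            seen[cage] = 1
            # The most recent matching message wins.
            if (msg ~ /^failed to read EEPROM/) state[cage] = "eeprom-read-failed"
            else if (msg ~ /^module /)          state[cage] = "module-identified"
            else if (msg ~ /slow to respond/)   state[cage] = "waiting-for-module"
        }
        END {
            for (c in seen)
                print c, ((c in state) ? state[c] : "no-module-message")
        }
    ' | sort
}
```

On the router it could be used as `dmesg | sfp_log_summary`. A cage that only ever logs "Host maximum power" is reported as no-module-message, which is what an unpowered or undetected module looks like in the logs above.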
frank-w and glassdoor, thank you both for your suggestions:
I’ve just tried fw_setenv bootdelay 60; however, even with the additional wait, the EEPROM is still unavailable:
[ 68.232653] sfp sfp1: failed to read EEPROM: -ENXIO
I intend to try the suggestion from frank-w to temporarily change sfp->module_t_wait to a much higher default value. If I understand correctly, the key difference between this and changing bootdelay is that sfp_module_tx_enable(sfp); should first pull the tx-disable pin low before waiting. In the event that the S800E’s processor reset is directly tied to the tx-disable pin, this should allow the cpu to start up and properly emulate an EEPROM.
I currently do not have an OpenWrt build environment ready, so it might take me a while to test this and report back. (Is there a faster or easier way to do this without recompiling? Maybe swapping the GPIOs in the devicetree?)
eicwoud, thank you for your suggestion. I’ll look into it.
Small update on a failed experiment: I removed R106 which connects GPIO70 to SFP1_TX_Disable. My assumption is that tx-disable will be permanently pulled to GND via R101, and therefore any installed SFP module will always be enabled regardless of the cpu/driver state. However this did not work, and I am still getting the same EEPROM error.
Do you still get the “slow to respond” error? Have you increased the startup time via a quirk and verified that the quirk was actually called (e.g. with a debug message in the function)?
AFAIR there was a way to tag the EEPROM as broken (IMHO only for checksum errors), but here it seems the EEPROM cannot be accessed at all.
frank-w: I have not tried rebuilding the kernel to insert the delay in sfp.c. Unfortunately my OpenWrt build environment is unavailable right now so I can’t try the fix that you suggested yet. I tried the resistor removal hack as it was a faster and easier alternative to check if tx-disable was causing any issues.
Adding this as a quirk will be difficult because the EEPROM is unavailable. The sfp_lookup_quirk function expects the EEPROM to at least provide the vendor_name and vendor_pn, which aren’t available here with the S800E.
If I had to hack this delay in temporarily, I would probably call sfp_fixup_long_startup(sfp); after the quirk lookup.
glassdoor: Those components were desoldered. I am concerned that ripping the parts off might also tear off the underlying traces.
Regarding your original post on whether a software-based approach is possible, I would cautiously suggest modifying the devicetree so that the pin at moddef0 is not used by sfp1. You might then be able to use pin 82 as a GPIO output and manually drive it low so that MOSFET Q12 can deliver power to the module.
Edit: bear in mind that the sfp driver only starts probing for the module when the configured moddef0 pin is grounded, so you’ll also have to figure out how to tell the kernel to start probing for the inserted SFP module.
Thanks for the offer of your custom build. I am not sure that will work, as the current quirk implementation expects the EEPROM to at least work (discussed in this same post in my reply to frank-w).
How do you manually drive pin 82 low in the devicetree? How do you change it from input to output? That is what I am really interested in. Any suggestions?
Edit:
Output from cat /sys/kernel/debug/gpio:
gpiochip0: GPIOs 512-595, parent: platform/1001f000.pinctrl, pinctrl_moore:
gpio-512 ( |tx-disable ) out lo
gpio-513 ( |tx-fault ) in lo IRQ
gpio-514 ( |los ) in lo IRQ
gpio-515 ( |rate-select0 ) in hi ACTIVE LOW
gpio-517 ( |reset ) out hi ACTIVE LOW
gpio-524 ( |cd ) in lo IRQ ACTIVE LOW
gpio-526 ( |WPS ) in hi IRQ ACTIVE LOW
gpio-533 ( |rate-select0 ) in lo ACTIVE LOW
gpio-566 ( |los ) in lo IRQ
gpio-575 ( |blue:wps ) out lo
gpio-581 ( |tx-fault ) in lo IRQ
gpio-582 ( |tx-disable ) out lo
gpio-591 ( |green:status ) out hi
gpio-594 ( |mod-def0 ) in lo IRQ ACTIVE LOW
gpio-595 ( |mod-def0 ) in lo IRQ ACTIVE LOW
glassdoor: Yes, you’re right about that. I added an edit shortly after posting: you might have to find a way to get the sfp driver to probe after a certain period, since the moddef0 pin would be driven as an output and could no longer tell whether a module is actually inserted into the cage.
The guide at https://openwrt.org/docs/techref/hardware/port.gpio might be able to steer you in the right direction. I originally tried this approach but I could not export the gpio as it was already in use by the sfp driver. That was when I decided to go for a hardware mod to bypass the mosfet.
In your logs, gpio-594 ( |mod-def0 ) in lo IRQ ACTIVE LOW appears to be the moddef0 gpio for your sfp1.
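To pick the GPIO numbers out of that debug listing without reading it by eye, something like the following could work. It is a minimal sketch that assumes the listing keeps the "gpio-N ( |label ) ..." layout shown in this thread; the function name gpio_numbers_for_label is hypothetical.

```shell
#!/bin/sh
# Print the global GPIO number of every line whose consumer label equals $1.
# Reads text in the layout of /sys/kernel/debug/gpio on stdin.
gpio_numbers_for_label() {
    awk -v want="$1" '
        /^ *gpio-[0-9]+/ {
            num = $1
            sub(/^gpio-/, "", num)          # "gpio-594" -> "594"
            label = $0
            sub(/^[^|]*[|]/, "", label)     # drop everything up to and including "|"
            sub(/ *[)].*$/, "", label)      # drop padding, ")" and the pin state
            if (label == want) print num
        }
    '
}
```

Example: `gpio_numbers_for_label mod-def0 < /sys/kernel/debug/gpio`. Both cages use the same label, so the listing alone does not say which number belongs to which cage; per the posts here, 594 is sfp1 and 595 is sfp2.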
Edit: I tried manually driving the pin on my sfp2 (since my sfp1 was already bypassed). The sfp driver was first unloaded to free the pins. When the pin was driven low (“0”), I could see the green power LED light up, so at least that seems to work physically. In your case, you will want to replace 595 with 594 for sfp1.
root@OpenWrt:~# cat /sys/kernel/debug/gpio
gpiochip0: GPIOs 512-595, parent: platform/1001f000.pinctrl, pinctrl_moore:
gpio-512 ( |tx-disable ) in lo
gpio-513 ( |tx-fault ) in hi IRQ
gpio-514 ( |los ) in hi IRQ
gpio-515 ( |rate-select0 ) in hi ACTIVE LOW
gpio-517 ( |reset ) out hi ACTIVE LOW
gpio-524 ( |cd ) in lo IRQ ACTIVE LOW
gpio-526 ( |WPS ) in hi IRQ ACTIVE LOW
gpio-533 ( |rate-select0 ) in lo ACTIVE LOW
gpio-566 ( |los ) in hi IRQ
gpio-575 ( |blue:wps ) out lo
gpio-581 ( |tx-fault ) in hi IRQ
gpio-582 ( |tx-disable ) in hi
gpio-591 ( |green:status ) out hi
gpio-594 ( |mod-def0 ) in hi IRQ ACTIVE LOW
gpio-595 ( |mod-def0 ) in hi IRQ ACTIVE LOW
root@OpenWrt:~# rmmod sfp.ko
root@OpenWrt:~# cat /sys/kernel/debug/gpio
gpiochip0: GPIOs 512-595, parent: platform/1001f000.pinctrl, pinctrl_moore:
gpio-517 ( |reset ) out hi ACTIVE LOW
gpio-524 ( |cd ) in lo IRQ ACTIVE LOW
gpio-526 ( |WPS ) in hi IRQ ACTIVE LOW
gpio-575 ( |blue:wps ) out lo
gpio-591 ( |green:status ) out hi
root@OpenWrt:~# echo "595" > /sys/class/gpio/export
root@OpenWrt:~# echo "out" > /sys/class/gpio/gpio595/direction
root@OpenWrt:~# echo "1" > /sys/class/gpio/gpio595/value
root@OpenWrt:~# echo "0" > /sys/class/gpio/gpio595/value
root@OpenWrt:~#
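The manual steps in the transcript above can be wrapped in a small helper. This is a sketch under the same assumptions as the transcript: the legacy /sys/class/gpio interface is available and the sfp driver has already been unloaded so the pin is free. The function name gpio_drive and its optional third argument (an alternative sysfs root, handy for dry runs) are made up for illustration.

```shell
#!/bin/sh
# Drive a GPIO through the legacy sysfs interface.
# $1 = global GPIO number (e.g. 595), $2 = value (0 or 1),
# $3 = sysfs root, defaults to /sys/class/gpio
gpio_drive() {
    gd_num=$1
    gd_val=$2
    gd_root=${3:-/sys/class/gpio}
    case "$gd_val" in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 2 ;;
    esac
    # Export only if the pin is not exported yet; the export fails
    # while another driver (such as sfp) still holds the pin.
    if [ ! -d "$gd_root/gpio$gd_num" ]; then
        echo "$gd_num" > "$gd_root/export" || return 1
    fi
    echo out > "$gd_root/gpio$gd_num/direction" || return 1
    echo "$gd_val" > "$gd_root/gpio$gd_num/value" || return 1
}
```

With the sfp module unloaded, `gpio_drive 595 0` corresponds to the last commands of the transcript (pin driven low, which is what lit the power LED on sfp2); use 594 for sfp1.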
I can confirm that freeing the sfp1 mod-def0 pin allows it to be configured as an output and pulled low. The S800E now gets power. Removing the mod-def0 GPIO definition from the devicetree also seems OK, as other SFP modules still work in
However, the S800E still fails to work (previously nothing showed up for sfp1, as it was not getting powered):
sfp sfp1: please wait, module slow to respond
…further down the bootlog
sfp sfp1: failed to read EEPROM: -ENXIO
The above applies with or without the quirk patch below. It seems your suspicion was right: the EEPROM read fails, and as such the sfp_quirk does not apply.