Huawei OptiXstar S800E XGSPON SFP+ ONU not detected

Huawei OptiXstar S800E XGSPON SFP+ ONU not detected on BPI-R4 with current mainline OpenWrt, kernel 6.6.52.

ethtool -m eth2 shows device not found. I have waited 30 minutes and done soft reboots, but the device is still not detected.

bootlog does not show any info pertaining to sfp2

the sfp+ ONU works if i plug it into a switch (used as a media converter in this case) and then connect the switch to the bpi-r4 via a DAC cable to SFP2 on the bpi-r4.

I have tried disabling autoneg and manually setting the link to 10G duplex, but it does not work.

I suspect the sfp+ module is not detected or powered up correctly. Any help or pointers?

Take a look at the green LED next to the USB socket.

If it does not light up after inserting the module, it means that the module is not powered and cannot work.

No green LED next to the USB socket. If i plug in a DAC cable, the green led lights up.

Any way to force the power on for the sfp port?

I’m afraid you need to modify the hardware.

Remove Q12 and attach R102 (Q13/R127).

thanks for the reply. not going to do any soldering anytime soon.

going to leave it as a test bed for now.

Any software based workarounds to this issue?

Hi all,

Even with the hardware modification where Q12 is removed and R102 is shorted with a 0R bridge, the OptiXstar S800E is still not detected on the bpi-r4 running on OpenWrt 24.10-rc4.

The SFP module appears to be properly powered after the changes, as it is now hot to the touch. The sfp power LED is also lit. However, the device does not show up on i2c:

[   11.306555] sfp sfp1: Host maximum power 3.0W
[   11.311453] sfp sfp2: Host maximum power 3.0W
...
[   17.862546] sfp sfp1: please wait, module slow to respond
[   73.272569] sfp sfp1: failed to read EEPROM: -ENXIO

I have confirmed that the sfp1 port still works properly after the mod. When I tried another SFP device in the same cage, it shows up correctly:


[  588.003346] sfp sfp1: module SOURCEPHOTONICS  SFP+-T30         rev 10   sn HP202411041001   dc 241104
[  588.043211] hwmon hwmon2: temp1_input not attached to any thermal zone

With the OptiXstar S800E in the cage, this is the sfp status:

root@OpenWrt:~# cat /sys/kernel/debug/sfp1/state
Module state: error
Module probe attempts: 10 12
Device state: up
Main state: down
Fault recovery remaining retries: 0
PHY probe remaining retries: 0
Signalling rate: 10313 kBd
Rate select threshold: 0 kBd
moddef0: 1
rx_los: 0
tx_fault: 0
tx_disable: 1
rs0: 0
rs1: 0
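For anyone comparing states between modules, here is a minimal sketch that pulls a single field out of that debugfs file. sfp_state_field is a made-up helper name, and the only path assumed is the /sys/kernel/debug/sfp1/state one shown above. In the dump above it would report moddef0 as 1 while Module state is error, which I read as "module seen in the cage, but the probe failed".

```shell
#!/bin/sh
# Print the value of one "Key: value" line from an sfp debugfs state file.
# sfp_state_field is a hypothetical helper name, not part of any tool.
sfp_state_field() {
    file=$1
    key=$2
    # Split on ": " so values that contain spaces (e.g. "10 12") stay whole.
    awk -F': ' -v key="$key" '$1 == key { print $2 }' "$file"
}

# Example on the board (path taken from the post above):
#   sfp_state_field /sys/kernel/debug/sfp1/state "Module state"
```

Running it for "Module state" and "moddef0" after each experiment gives a quick before/after comparison without reading the whole dump.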

I’ve also tried the “factory” firmware that is available in the spi-nand. Same issue, worded differently:

[   74.024794] sfp sfp@0: failed to read EEPROM: -6
[   74.029412] sfp sfp@0: SM: exit error:up:down

I’ve also tried these with no success:

  • soft rebooting
  • holding down RST button, powering the board, then releasing RST after a minute, where the SFP stick should have booted before OpenWrt starts, in case the SFP stick was trying to emulate an EEPROM
  • using the i2csfp tool to manually poke the tx-disable gpio, and then restoring the ports, which i assume would probe the sfp device again

This appears to be a problem that is specific to the bpi-r4: this S800E SFP module runs with no problems and no hacks on my x86-based OpenWrt device (also 24.10-rcX) using a Mellanox connectx-3 card. I’d be very grateful if anyone can point me to how I can further troubleshoot this issue.

Maybe you need a quirk/fixup similar to the rollball delaying the probe

Setting sfp->module_t_wait (Maybe sfp->phy_t_retry too)

maybe you can set a boot delay for openwrt.

fw_setenv bootdelay 60

if it works shorten delay to 45 or 30 until point of failure.
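A rough sketch of how each step of that bisection could be checked after a reboot: filter the kernel log for the "failed to read EEPROM" message quoted earlier in the thread. sfp_eeprom_failures is a hypothetical helper name, and the pattern only assumes the message format seen in the logs here.

```shell
#!/bin/sh
# Read kernel log lines on stdin and print "<port> <error>" for every
# "failed to read EEPROM" message. Hypothetical helper, not a real tool.
sfp_eeprom_failures() {
    sed -n 's/.*sfp \([^:]*\): failed to read EEPROM: \(.*\)$/\1 \2/p'
}

# Example after changing bootdelay and rebooting:
#   dmesg | sfp_eeprom_failures
# No output means no EEPROM read failure was logged for any cage.
```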

fw_setenv changes the uboot config, imho. This sets how long uboot shows its boot menu, not how long the linux kernel waits for the sfp.

isn’t it the same? both give the sfp module enough time to boot up upon power up.

frank-w and glassdoor, thank you both for your suggestions:

I’ve just tried fw_setenv bootdelay 60, however even with the additional wait, the EEPROM is still unavailable:

[   68.232653] sfp sfp1: failed to read EEPROM: -ENXIO

I intend to try the suggestion from frank-w to temporarily change sfp->module_t_wait to a much higher default value. If I understand correctly, the key difference between this and changing bootdelay is that sfp_module_tx_enable(sfp); should first pull the tx-disable pin low before waiting. In the event that the S800E’s processor reset is directly tied to the tx-disable pin, this should allow the cpu to start up and properly emulate an EEPROM.

I currently do not have an OpenWrt build environment ready, so it might take me a while to test this and report back. (Is there a faster/easier way to do this without recompiling? Maybe swapping the gpios on the devicetree?)

You could try if your kernel has something like this, add that to the fixup, or add the fixup:

eicwoud, thank you for your suggestion. I’ll look into it.

Small update on a failed experiment: I removed R106 which connects GPIO70 to SFP1_TX_Disable. My assumption is that tx-disable will be permanently pulled to GND via R101, and therefore any installed SFP module will always be enabled regardless of the cpu/driver state. However this did not work, and I am still getting the same EEPROM error.


Do you still get the “slow to respond” error? Have you increased the bootup time via a quirk and verified that it was called (e.g. with a debug message in the function)?

Afair there was a way to tag the eeprom as broken (imho only for checksum errors), but here it seems the eeprom cannot be accessed at all.

just curious, did you just rip off the mosfet and resistor with pliers?

I have a build with the following quirk patch applied if you want to test.

SFP_QUIRK_F("HUAWEI", "S800E", sfp_fixup_long_startup),

frank-w: I have not tried rebuilding the kernel to insert the delay in sfp.c. Unfortunately my OpenWrt build environment is unavailable right now so I can’t try the fix that you suggested yet. I tried the resistor removal hack as it was a faster and easier alternative to check if tx-disable was causing any issues.

Adding this as a quirk will be difficult because the EEPROM is unavailable. The sfp_lookup_quirk function expects the EEPROM to at least provide the vendor_name and vendor_pn which isn’t available here with the S800E.

If I had to hack this delay in temporarily, I would probably call sfp_fixup_long_startup(sfp); after the quirk lookup.


glassdoor: Those components were desoldered. I am concerned that ripping the parts off might also tear off the underlying traces.

On your original post on whether a software based approach is possible, I would cautiously suggest modifying the devicetree so that the pin at moddef0 is not used by sfp1. You might then be able to use pin 82 as a gpio output and manually drive it low so that mosfet Q12 can deliver power to the module.

Edit: bear in mind that the sfp driver only starts probing for the module when the configured moddef0 pin is grounded, so you’ll also have to figure out how to tell the kernel to start probing for the inserted sfp module

Thanks for the offer on your custom build. I am not sure if that will work as the current quirk implementation expects the EEPROM to at least work (discussed in this same post in my reply to frank-w)

deleting line 44 will mean that sfp1 module will always be assumed to be present. https://www.kernel.org/doc/Documentation/devicetree/bindings/net/sff%2Csfp.txt

How do you manually drive pin 82 low in the devicetree? How do you change it from input to output? That is what I am really interested in. Any suggestions?

Edit: from cat /sys/kernel/debug/gpio

gpiochip0: GPIOs 512-595, parent: platform/1001f000.pinctrl, pinctrl_moore:
 gpio-512 (                    |tx-disable          ) out lo 
 gpio-513 (                    |tx-fault            ) in  lo IRQ 
 gpio-514 (                    |los                 ) in  lo IRQ 
 gpio-515 (                    |rate-select0        ) in  hi ACTIVE LOW
 gpio-517 (                    |reset               ) out hi ACTIVE LOW
 gpio-524 (                    |cd                  ) in  lo IRQ ACTIVE LOW
 gpio-526 (                    |WPS                 ) in  hi IRQ ACTIVE LOW
 gpio-533 (                    |rate-select0        ) in  lo ACTIVE LOW
 gpio-566 (                    |los                 ) in  lo IRQ 
 gpio-575 (                    |blue:wps            ) out lo 
 gpio-581 (                    |tx-fault            ) in  lo IRQ 
 gpio-582 (                    |tx-disable          ) out lo 
 gpio-591 (                    |green:status        ) out hi 
 gpio-594 (                    |mod-def0            ) in  lo IRQ ACTIVE LOW
 gpio-595 (                    |mod-def0            ) in  lo IRQ ACTIVE LOW
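To pick the numbers for one signal out of a dump like this, a small filter can help. This is only a sketch: gpio_by_label is a made-up name, and it assumes the "gpio-N ( |label ) ..." line layout shown above. The label alone does not say which cage a pin belongs to, so two mod-def0 results still have to be matched to sfp1/sfp2 by other means.

```shell
#!/bin/sh
# Print the global GPIO numbers whose consumer label matches exactly.
# gpio_by_label is a hypothetical helper; the second argument lets you
# point it at a saved copy of the dump instead of the live debugfs file.
gpio_by_label() {
    label=$1
    file=${2:-/sys/kernel/debug/gpio}
    awk -v want="$label" '
        /^ *gpio-[0-9]+/ {
            num = $1
            sub(/^gpio-/, "", num)      # "gpio-594" -> "594"
            split($0, parts, "|")       # parts[2] starts with the label
            name = parts[2]
            sub(/ *\).*/, "", name)     # drop padding, ")" and the rest
            if (name == want) print num
        }' "$file"
}

# Example: gpio_by_label mod-def0
```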

glassdoor: Yes you’re right about that, I added an edit shortly after posting that you might have to find a way to get the sfp driver to probe after a certain period since the moddef0 pin would be driven as an output and would not be able to tell if a module is actually inserted into the cage.

The guide at https://openwrt.org/docs/techref/hardware/port.gpio might be able to steer you in the right direction. I originally tried this approach but I could not export the gpio as it was already in use by the sfp driver. That was when I decided to go for a hardware mod to bypass the mosfet.

In your logs, gpio-594 ( |mod-def0 ) in lo IRQ ACTIVE LOW appears to be the moddef0 gpio for your sfp1.
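The mapping can be sanity-checked with simple arithmetic, assuming the sysfs number is the chip base (512, from the "GPIOs 512-595" line) plus the pin number. That assumption fits the numbers in this thread: pin 82 lands on gpio-594 (mod-def0) and GPIO70 (SFP1_TX_Disable) lands on gpio-582 (tx-disable). gpio_global is just an illustrative helper name.

```shell
#!/bin/sh
# Convert a pin offset on a gpiochip into the global number used under
# /sys/class/gpio. gpio_global is a hypothetical helper name.
gpio_global() {
    base=$1      # first GPIO of the chip, e.g. 512 from "GPIOs 512-595"
    offset=$2    # pin number on that chip, e.g. 82
    echo $((base + offset))
}

# Example: gpio_global 512 82
```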

Edit: I tried manually driving my sfp2 (since my sfp1 was already bypassed). The sfp driver was first unloaded to free the pins. When the pin was driven low (“0”), I could see the green power LED light up, so at least that seems to physically work. In your case, you will want to replace 595 with 594 for sfp1.

root@OpenWrt:~# cat /sys/kernel/debug/gpio
gpiochip0: GPIOs 512-595, parent: platform/1001f000.pinctrl, pinctrl_moore:
 gpio-512 (                    |tx-disable          ) in  lo
 gpio-513 (                    |tx-fault            ) in  hi IRQ
 gpio-514 (                    |los                 ) in  hi IRQ
 gpio-515 (                    |rate-select0        ) in  hi ACTIVE LOW
 gpio-517 (                    |reset               ) out hi ACTIVE LOW
 gpio-524 (                    |cd                  ) in  lo IRQ ACTIVE LOW
 gpio-526 (                    |WPS                 ) in  hi IRQ ACTIVE LOW
 gpio-533 (                    |rate-select0        ) in  lo ACTIVE LOW
 gpio-566 (                    |los                 ) in  hi IRQ
 gpio-575 (                    |blue:wps            ) out lo
 gpio-581 (                    |tx-fault            ) in  hi IRQ
 gpio-582 (                    |tx-disable          ) in  hi
 gpio-591 (                    |green:status        ) out hi
 gpio-594 (                    |mod-def0            ) in  hi IRQ ACTIVE LOW
 gpio-595 (                    |mod-def0            ) in  hi IRQ ACTIVE LOW
root@OpenWrt:~# rmmod sfp.ko
root@OpenWrt:~# cat /sys/kernel/debug/gpio
gpiochip0: GPIOs 512-595, parent: platform/1001f000.pinctrl, pinctrl_moore:
 gpio-517 (                    |reset               ) out hi ACTIVE LOW
 gpio-524 (                    |cd                  ) in  lo IRQ ACTIVE LOW
 gpio-526 (                    |WPS                 ) in  hi IRQ ACTIVE LOW
 gpio-575 (                    |blue:wps            ) out lo
 gpio-591 (                    |green:status        ) out hi

root@OpenWrt:~# echo "595" > /sys/class/gpio/export
root@OpenWrt:~# echo "out" > /sys/class/gpio/gpio595/direction
root@OpenWrt:~# echo "1" > /sys/class/gpio/gpio595/value
root@OpenWrt:~# echo "0" > /sys/class/gpio/gpio595/value
root@OpenWrt:~#
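The same steps wrapped into a function, as a sketch only. gpio_drive is a made-up name, and the optional third argument (an alternative sysfs root) exists purely so the function can be dry-run against a scratch directory; on the board you would leave it out. It still assumes the sfp driver has been unloaded first, as in the session above, otherwise the export is expected to fail.

```shell
#!/bin/sh
# Export a GPIO via the legacy sysfs interface, make it an output and
# write a value. gpio_drive is a hypothetical helper name.
gpio_drive() {
    gpio=$1
    value=$2
    root=${3:-/sys/class/gpio}   # override only for dry runs
    # Export only if the gpioN directory is not there yet.
    if [ ! -d "$root/gpio$gpio" ]; then
        echo "$gpio" > "$root/export" || return 1
    fi
    echo out > "$root/gpio$gpio/direction" || return 1
    echo "$value" > "$root/gpio$gpio/value"
}

# Example for sfp2 as in the session above: gpio_drive 595 0
```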

I can confirm that freeing the sfp1 mod-def0 pin allows it to be configured as an output and pulled low. The S800E now gets power. Removing the mod-def0 gpio definition from the device tree also seems ok, as other sfp modules still work.

However S800E still fails to work (previously nothing shows for sfp1 as it was not getting powered)

sfp sfp1: please wait, module slow to respond

…further down the bootlog

sfp sfp1: failed to read EEPROM: -ENXIO

The above applies with or without the quirk patch below. It seems your suspicion was right: the eeprom read fails, and as such the sfp quirk is never applied.

SFP_QUIRK_F("HUAWEI", "S800E", sfp_fixup_long_startup),

Ideas?