[BPI-R4] and SFP

Be careful with the order: both functions set sfp->id.base.extended_cc to a different value, so whichever runs last wins.

Maybe use sfp_fixup_rollball() instead of sfp_fixup_rollball_cc(), if it is sfp_fixup_10gbaset_30m() that sets the correct value.

You know it is really up when you get a ping across. For example, if the MAC is still configured wrong, you get link up on the PHY and advertisement, but still no ping.

As for the delay, you could still try: sfp_fixup_rollball_wait4s()

sfp->module_t_wait = msecs_to_jiffies(1000);

Try a shorter period here…

Or sfp->module_t_start_up = msecs_to_jiffies(1000);

Or similar

Or perhaps:

sfp->i2c_block_size = 1;

Will slow things down, but that is really hacky.

You know it is really up when you get a ping across

Yeah, that did work … for a while. After a few minutes, the module was “removed”, reinserted, and got no link. Do you know why the Aquantia driver says

Aquantia AQR113C i2c:sfp1:11: unrecognised serdes mode 7

Might it point to the reason it didn’t work?

nope, tried but failed again :frowning:

Maybe because of the different ext_cc, i.e. SFF8024_ECC_10GBASE_T_SR vs SFF8024_ECC_10GBASE_T_SFI?

Did you try setting module_t_start_up, instead of module_t_wait?

Okay, so I browsed the SFP function hacklist a bit, and this time I tried:

+static void sfp_fixup_oem_10gt(struct sfp *sfp) {
+       sfp_fixup_rollball(sfp);
+       sfp_fixup_10gbaset_30m(sfp);
+       sfp_fixup_long_startup(sfp);
+       sfp->i2c_block_size = 1;
+}

and it did not work, but I got more dynamic debug output for you this time, I hope it helps:

 i2csfp sfp1 restore
[  140.368072] sfp sfp1: SM: enter present:up:fail event dev_down
[  140.374149] sfp sfp1: tx disable 0 -> 1
[  140.378025] sfp sfp1: SM: exit present:down:down
[  140.382634] sfp sfp1: SM: enter present:down:down event dev_detach
[  140.388810] sfp sfp1: SM: exit waitdev:detached:down
[  140.393766] sfp sfp1: SM: enter waitdev:detached:down event remove
[  140.399944] sfp sfp1: module removed
[  140.403510] sfp sfp1: SM: exit empty:detached:down
[  140.408671] sfp sfp1: Host maximum power 3.0W
[  140.413056] sfp sfp1: tx disable 1 -> 1
[  140.416913] sfp sfp1: SM: enter empty:detached:down event insert
[  140.422911] sfp sfp1: SM: exit probe:detached:down
[  140.427953] sfp sfp1: SM: enter probe:detached:down event dev_attach
[  140.434299] sfp sfp1: SM: exit probe:down:down
[  140.438743] sfp sfp1: SM: enter probe:down:down event dev_up
[  140.444391] sfp sfp1: SM: exit probe:up:down
[  140.726458] sfp sfp1: SM: enter probe:up:down event timeout
[  140.743104] sfp sfp1: module OEM              ZK-10G-TX        rev 1    sn 2505010443       dc 250412  
[  140.752505] sfp sfp1: tx disable 1 -> 0
[  140.756356] sfp sfp1: SM: exit present:up:wait
[  140.776434] sfp sfp1: skipping hwmon device registration due to broken EEPROM
[  140.783559] sfp sfp1: diagnostic EEPROM area cannot be read atomically to guarantee data coherency
[  140.792526] sfp sfp1: los 0 -> 1
[  140.795746] sfp sfp1: SM: enter present:up:wait event los_high
[  140.801573] sfp sfp1: SM: exit present:up:wait
[  140.816433] sfp sfp1: SM: enter present:up:wait event timeout
[  140.822993] mdio_bus i2c:sfp1: probed
[  148.594698] mtk_soc_eth 15100000.ethernet sfp-wan: validation with support 00,00000000,00000000,00000000 failed: -EINVAL
[  148.605655] sfp sfp1: sfp_add_phy failed: -EINVAL
[  148.610365] sfp sfp1: SM: exit present:up:fail
[  152.011892] sfp sfp1: los 1 -> 0
[  152.015135] sfp sfp1: SM: enter present:up:fail event los_low
[  152.020898] sfp sfp1: SM: exit present:up:fail

If I read it correctly: if the broken EEPROM causes the validation to fail, all we need is a way to not fail the validation, am I correct?
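A quick way to pull this failure out of a saved log, as a minimal sketch: the helper below (zero_mask_failures is a made-up name, not part of any tool mentioned here) reads kernel log text on stdin and prints the interface of every phylink validation failure whose support mask is all zeros. It only parses text and assumes the message format shown in the log above; it says nothing about why the mask ended up empty.

```sh
#!/bin/sh
# Hypothetical helper: print the interface name for each
# "validation with support <mask> failed" line whose mask is all zeros.
# Reads kernel log text on stdin.
zero_mask_failures() {
    awk '
    {
        # Index relative to the word "support" so the timestamp width
        # (one or two leading fields) does not matter.
        for (i = 4; i < NF; i++) {
            if ($i == "support" && $(i - 1) == "with" && $(i - 2) == "validation") {
                iface = $(i - 3); sub(/:$/, "", iface)
                mask = $(i + 1);  gsub(/[0,]/, "", mask)
                # Nothing left after stripping zeros and commas: empty mask.
                if (mask == "") print iface
            }
        }
    }'
}
```

For example, dmesg | zero_mask_failures prints sfp-wan for the log above.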

EDIT: And is there ANY way to install a kmod module without completely rebuilding the whole OpenWrt image? I’m completely fed up with invoking sysupgrade for every experiment :frowning:

Not an expert, but wouldn’t you just mark the kernel driver as M (module) and ensure that loading modules is supported in the kernel config? Then copy over the built module and do insmod/rmmod?

can you help me hack my way into overwriting the broken eeprom? https://patchwork.ozlabs.org/project/openwrt/patch/[email protected]/#3258778

The eeprom does not contain the aquantia firmware…

Remove the sfp->i2c_block_size = 1; it was a bad idea of mine. It is the source of the broken EEPROM message.

The long startup function waits 1 minute; try something with 1 second or so.

as you said, time to do some debugging …

For this test, I tried two fixup functions: one where the link worked (using only rollball_cc) but the PHY is not doing anything, and one where it did not (using sfp_fixup_fs_10gt). I also added the driver verbose debugging option and a lot more functions to dynamic debug:

root@OpenWrt:~# echo "file drivers/net/* +p" > /sys/kernel/debug/dynamic_debug/control
root@OpenWrt:~# echo "file drivers/soc/* +p" >> /sys/kernel/debug/dynamic_debug/control
root@OpenWrt:~# echo "file drivers/base/* +p" >> /sys/kernel/debug/dynamic_debug/control
root@OpenWrt:~# echo "file drivers/pinctrl/* +p" >> /sys/kernel/debug/dynamic_debug/control
root@OpenWrt:~# echo "file drivers/platform/* +p" >> /sys/kernel/debug/dynamic_debug/control
root@OpenWrt:~# echo "file drivers/regulator/* +p" >> /sys/kernel/debug/dynamic_debug/control
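The same thing can be written as a small loop, as a sketch only: enable_dyndbg and the DD_CONTROL override are names made up here, and the default path is the one used in the commands above. It writes one "file … +p" query per directory, exactly like the individual echo lines.

```sh
#!/bin/sh
# Hypothetical helper: enable dynamic debug printing for several
# directories under drivers/. DD_CONTROL may point at a plain file
# for a dry run; by default the real control file is used.
enable_dyndbg() {
    ctl="${DD_CONTROL:-/sys/kernel/debug/dynamic_debug/control}"
    for dir in "$@"; do
        # One query per write; "+p" turns printing on for matching call sites.
        echo "file drivers/${dir}/* +p" >> "$ctl" || return 1
    done
}

# Usage (same directories as the commands above):
#   enable_dyndbg net soc base pinctrl platform regulator
```

Replacing +p with -p in the query would turn the output off again.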

time to compare:

  1. Working:
i2csfp sfp2 restore
[  366.065277] platform sfp2: bus: 'platform': __driver_probe_device: matched device with driver sfp
[  366.074200] platform sfp2: bus: 'platform': really_probe: probing driver sfp with device
[  366.082311] sfp sfp2: no pinctrl handle
[  366.086236] mt7988-pinctrl 1001f000.pinctrl: request pin 83 (UART1_RTS) for pinctrl_moore:595
[  366.094784] mt7988-pinctrl 1001f000.pinctrl: request pin 2 (UART2_CTS) for pinctrl_moore:514
[  366.103242] mt7988-pinctrl 1001f000.pinctrl: request pin 1 (UART2_TXD) for pinctrl_moore:513
[  366.111705] mt7988-pinctrl 1001f000.pinctrl: request pin 0 (UART2_RXD) for pinctrl_moore:512
[  366.120225] mt7988-pinctrl 1001f000.pinctrl: request pin 3 (UART2_RTS) for pinctrl_moore:515
[  366.128722] sfp sfp2: Host maximum power 3.0W
[  366.133107] sfp sfp2: tx disable 1 -> 1
[  366.136963] sfp sfp2: SM: enter empty:detached:down event insert
[  366.142964] sfp sfp2: SM: exit probe:detached:down
[  366.148133] sfp sfp2: SM: enter probe:detached:down event dev_attach
[  366.154482] sfp sfp2: SM: exit probe:down:down
[  366.158932] sfp sfp2: SM: enter probe:down:down event dev_up
[  366.164583] sfp sfp2: SM: exit probe:up:down
[  366.168897] sfp sfp2: driver: 'sfp': driver_bound: bound to device
[  366.175128] sfp sfp2: bus: 'platform': really_probe: bound device to driver sfp
[  366.445637] sfp sfp2: SM: enter probe:up:down event timeout
[  366.462307] sfp sfp2: module OEM              ZK-10G-TX        rev 1    sn 2505010444       dc 250412  
[  366.471716] mtk_soc_eth 15100000.ethernet sfp-lan: optical SFP: interfaces=[mac=1-4,22-24,27,29, sfp=27]
[  366.481201] mtk_soc_eth 15100000.ethernet sfp-lan:  interface 27 (10gbase-r) rate match none supports 6,10,13-14,43
[  366.491643] mtk_soc_eth 15100000.ethernet sfp-lan: optical SFP: chosen 10gbase-r interface
[  366.499904] mtk_soc_eth 15100000.ethernet sfp-lan: requesting link mode inband/10gbase-r with support 00,00000000,00000800,00006440
[  366.511725] mtk_soc_eth 15100000.ethernet sfp-lan: major config, requested inband/10gbase-r
[  366.520074] mtk_soc_eth 15100000.ethernet sfp-lan: major config, active inband/none/10gbase-r
[  366.528596] mtk_soc_eth 15100000.ethernet sfp-lan: phylink_mac_config: mode=inband/10gbase-r/none adv=00,00000000,00000800,00006440 pause=04
[  366.553666] sfp sfp2: tx disable 1 -> 0
[  366.557525] sfp sfp2: SM: exit present:up:wait
[  366.561983] sfp sfp2: los 0 -> 1
[  366.565212] sfp sfp2: SM: enter present:up:wait event los_high
[  366.571049] sfp sfp2: SM: exit present:up:wait
[  366.575544] device: 'hwmon2': device_add
[  366.579702] hwmon hwmon2: temp1_input not attached to any thermal zone
[  366.625630] sfp sfp2: SM: enter present:up:wait event timeout
[  366.631376] sfp sfp2: SM: exit present:up:wait_los
[  372.801072] sfp sfp2: los 1 -> 0
[  372.804310] sfp sfp2: SM: enter present:up:wait_los event los_low
[  372.810416] sfp sfp2: SM: exit present:up:link_up
[  372.842881] mtk_soc_eth 15100000.ethernet sfp-lan: Link is Up - 10Gbps/Full - flow control off
[  372.853076] br-lan: port 4(sfp-lan) entered blocking state
[  372.858569] br-lan: port 4(sfp-lan) entered forwarding state
[  372.864301] mtk_soc_eth 15100000.ethernet sfp-lan: bond_netdev_event received NETDEV_CHANGE
  2. Not working (using the 30m wait4s function, and rmmod & modprobe to investigate both modules):
modprobe sfp
[  535.622866] bus: 'platform': add driver sfp
[  535.627187] platform sfp1: bus: 'platform': __driver_probe_device: matched device with driver sfp
[  535.636080] platform sfp1: bus: 'platform': really_probe: probing driver sfp with device
[  535.644186] sfp sfp1: no pinctrl handle
[  535.648088] mt7988-pinctrl 1001f000.pinctrl: request pin 82 (UART1_CTS) for pinctrl_moore:594
[  535.656644] mt7988-pinctrl 1001f000.pinctrl: request pin 54 (PCM_MCK_I2S_MCLK) for pinctrl_moore:566
[  535.665797] mt7988-pinctrl 1001f000.pinctrl: request pin 69 (GPIO_B) for pinctrl_moore:581
[  535.674081] mt7988-pinctrl 1001f000.pinctrl: request pin 70 (GPIO_C) for pinctrl_moore:582
[  535.682362] mt7988-pinctrl 1001f000.pinctrl: request pin 21 (PWMD1) for pinctrl_moore:533
[  535.690556] sfp sfp1: Host maximum power 3.0W
[  535.695336] sfp sfp1: driver: 'sfp': driver_bound: bound to device
[  535.701677] sfp sfp1: bus: 'platform': really_probe: bound device to driver sfp
[  535.708994] platform sfp2: bus: 'platform': __driver_probe_device: matched device with driver sfp
[  535.717885] platform sfp2: bus: 'platform': really_probe: probing driver sfp with device
[  535.725993] sfp sfp2: no pinctrl handle
[  535.729891] mt7988-pinctrl 1001f000.pinctrl: request pin 83 (UART1_RTS) for pinctrl_moore:595
[  535.738452] mt7988-pinctrl 1001f000.pinctrl: request pin 2 (UART2_CTS) for pinctrl_moore:514
[  535.746920] mt7988-pinctrl 1001f000.pinctrl: request pin 1 (UART2_TXD) for pinctrl_moore:513
[  535.755375] mt7988-pinctrl 1001f000.pinctrl: request pin 0 (UART2_RXD) for pinctrl_moore:512
[  535.763826] mt7988-pinctrl 1001f000.pinctrl: request pin 3 (UART2_RTS) for pinctrl_moore:515
[  535.772281] sfp sfp2: Host maximum power 3.0W
[  535.777002] sfp sfp2: driver: 'sfp': driver_bound: bound to device
[  535.783243] sfp sfp2: bus: 'platform': really_probe: bound device to driver sfp
[  536.011660] sfp sfp1: module OEM              ZK-10G-TX        rev 1    sn 2505010443       dc 250412  
[  536.051062] device: 'hwmon1': device_add
[  536.055209] hwmon hwmon1: temp1_input not attached to any thermal zone
[  536.080945] device: 'i2c:sfp1': device_add
[  536.085213] mdio_bus i2c:sfp1: probed
[  536.422367] mdio_bus i2c:sfp1: poll timed out
[  536.752364] mdio_bus i2c:sfp1: poll timed out
[  537.092369] mdio_bus i2c:sfp1: poll timed out
[  537.722380] mdio_bus i2c:sfp1: poll timed out
[  538.052366] mdio_bus i2c:sfp1: poll timed out
[  538.442371] mdio_bus i2c:sfp1: poll timed out
[  538.802368] mdio_bus i2c:sfp1: poll timed out
[  539.162370] mdio_bus i2c:sfp1: poll timed out
[  539.522370] mdio_bus i2c:sfp1: poll timed out
[  539.882369] mdio_bus i2c:sfp1: poll timed out
[  540.242371] mdio_bus i2c:sfp1: poll timed out
[  540.602370] mdio_bus i2c:sfp1: poll timed out
[  540.962373] mdio_bus i2c:sfp1: poll timed out
[  541.322372] mdio_bus i2c:sfp1: poll timed out
[  541.682370] mdio_bus i2c:sfp1: poll timed out
[  542.072377] mdio_bus i2c:sfp1: poll timed out
[  542.402371] mdio_bus i2c:sfp1: poll timed out
[  542.792374] mdio_bus i2c:sfp1: poll timed out
[  543.152369] mdio_bus i2c:sfp1: poll timed out
[  543.630060] device: 'i2c:sfp1:11': device_add
[  543.634555] bus: 'mdio_bus': add device i2c:sfp1:11
[  544.242373] mdio_bus i2c:sfp1: poll timed out
[  544.572367] mdio_bus i2c:sfp1: poll timed out
[  544.962371] mdio_bus i2c:sfp1: poll timed out
[  544.992412] mtk_soc_eth 15100000.ethernet sfp-wan: validation with support 00,00000000,00000000,00000000 failed: -EINVAL
[  545.003341] bus: 'mdio_bus': remove device i2c:sfp1:11
[  545.008528] sfp sfp1: sfp_add_phy failed: -EINVAL
[  545.024588] sfp sfp2: module OEM              ZK-10G-TX        rev 1    sn 2505010444       dc 250412  
[  545.066792] device: 'hwmon2': device_add
[  545.070965] hwmon hwmon2: temp1_input not attached to any thermal zone
[  545.100953] device: 'i2c:sfp2': device_add
[  545.105224] mdio_bus i2c:sfp2: probed
[  545.462371] mdio_bus i2c:sfp2: poll timed out
[  545.792367] mdio_bus i2c:sfp2: poll timed out
[  546.122367] mdio_bus i2c:sfp2: poll timed out
[  546.752367] mdio_bus i2c:sfp2: poll timed out
[  547.112365] mdio_bus i2c:sfp2: poll timed out
[  547.472376] mdio_bus i2c:sfp2: poll timed out
[  547.862367] mdio_bus i2c:sfp2: poll timed out
[  548.192368] mdio_bus i2c:sfp2: poll timed out
[  548.582368] mdio_bus i2c:sfp2: poll timed out
[  548.922363] mdio_bus i2c:sfp2: poll timed out
[  549.282376] mdio_bus i2c:sfp2: poll timed out
[  549.672369] mdio_bus i2c:sfp2: poll timed out
[  550.002378] mdio_bus i2c:sfp2: poll timed out
[  550.392368] mdio_bus i2c:sfp2: poll timed out
[  550.732367] mdio_bus i2c:sfp2: poll timed out
[  551.092369] mdio_bus i2c:sfp2: poll timed out
[  551.482374] mdio_bus i2c:sfp2: poll timed out
[  551.812367] mdio_bus i2c:sfp2: poll timed out
[  552.202370] mdio_bus i2c:sfp2: poll timed out
[  552.562407] mdio_bus i2c:sfp2: poll timed out
[  552.922367] mdio_bus i2c:sfp2: poll timed out
[  553.305420] device: 'i2c:sfp2:11': device_add
[  553.309821] bus: 'mdio_bus': add device i2c:sfp2:11
[  554.012382] mdio_bus i2c:sfp2: poll timed out
[  554.372370] mdio_bus i2c:sfp2: poll timed out
[  554.732368] mdio_bus i2c:sfp2: poll timed out
[  554.736773] mtk_soc_eth 15100000.ethernet sfp-lan: validation with support 00,00000000,00000000,00000000 failed: -EINVAL
[  554.747703] bus: 'mdio_bus': remove device i2c:sfp2:11
[  554.752892] sfp sfp2: sfp_add_phy failed: -EINVAL
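To compare runs without scrolling through the whole log, the repeated timeouts can be tallied per MDIO bus. This is a minimal sketch: count_poll_timeouts is a hypothetical name, and it assumes the "mdio_bus <bus>: poll timed out" format seen in the log above.

```sh
#!/bin/sh
# Hypothetical helper: count "poll timed out" messages per MDIO bus.
# Reads kernel log text on stdin, prints "<bus> <count>" sorted by bus.
count_poll_timeouts() {
    awk '
    /poll timed out/ {
        for (i = 1; i < NF; i++) {
            if ($i == "mdio_bus") {
                # The bus name follows "mdio_bus" and ends with a colon.
                bus = $(i + 1); sub(/:$/, "", bus)
                count[bus]++
            }
        }
    }
    END {
        for (bus in count) print bus, count[bus]
    }' | sort
}
```

Usage: dmesg | count_poll_timeouts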

So, can I somehow access the registers on the MDIO bus myself to see why this happens?

There is the mdio-tools Linux package; my tool i2csfp also gives access to MDIO registers, but not while the driver has access.

i2csfp sfp2 restore

You may want to try physically inserting the module, to see if there is any difference. Perhaps my debugging tool messes something up, who knows. It is not an official way to reconnect.

I usually test it with the commands:

ip link set eth1 down
ip link set eth1 up

Which should be a supported method.
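For repeated tests the two commands can be wrapped in a function. A sketch only: bounce_link is a made-up name, and the default 2 second pause is an arbitrary choice, not something the driver requires.

```sh
#!/bin/sh
# Hypothetical helper: take an interface down and bring it back up
# using only "ip link", as suggested above.
bounce_link() {
    dev="$1"
    pause="${2:-2}"   # seconds to stay down (arbitrary default)
    ip link set "$dev" down || return 1
    sleep "$pause"
    ip link set "$dev" up
}

# Usage:
#   bounce_link eth1
```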

The OEM 2.5 Gbps module does not use it, only the FS modules, so if it isn’t giving any result, then maybe drop it.

Most likely you need only sfp_fixup_rollball_cc as fixup, no more.

Or try

+static void sfp_fixup_oem_10gt(struct sfp *sfp) {
+       sfp_fixup_rollball(sfp);
+       sfp_fixup_10gbaset_30m(sfp);
+}

But make sure to use only ip link down/up; there is no need for modprobe or i2csfp.

Yeah, that sounds like a power problem… My 12V adapter is only 3A instead of the 5A supported by the Banana Pi. With 3 WiFi cards on the device, might this be the underlying root cause?

EDIT: tried your simplified function, no luck

I always remove other hardware if there are some unexpected errors…

Keep in mind these are ‘cheap’ LED power supplies, and in the 50%-100% range of their nominal amperage they may not be as stable as needed. I always double the amperage for this reason.
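The doubling rule is easy to check with a one-liner. In this sketch supply_has_headroom is a hypothetical name and all values are in milliamps; the actual load of the board with its cards is not known from this thread, so any load figure passed in is the caller's own estimate.

```sh
#!/bin/sh
# Rule of thumb from the post above: the load should stay at or below
# half of the supply's nominal current. Arguments are in milliamps.
# Returns 0 (true) if the supply has that headroom, 1 otherwise.
supply_has_headroom() {
    supply_ma="$1"
    load_ma="$2"
    [ $((load_ma * 2)) -le "$supply_ma" ]
}

# Usage, with a made-up load estimate of 2500 mA:
#   supply_has_headroom 5000 2500 && echo "ok"
```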

all right, this definitely explains why the link state went up and down as well … to be continued after the weekend when adapters will be delivered :smiley: