looks like primary difference between the two is that tx-disable is pulled low about 2s into bootup on the mellanox.
as tx-disable is an input on the bpi-r4. question is what is causing S800E to pull tx-disable pin low on mellanox?
tx-disable switches between an input (high impedance) and output:
As far as I can tell, it is enabled once during initialization:
Today in hardware gore, I thought it would be interesting to stick an actual at24 eeprom directly on the bus, using values dumped from the S800E.
The EEPROM gets detected, and the sfp driver halts after detecting a checksum failure. Sure enough, after adding up all the bytes to verify, the original checksum from the S800E is indeed incorrect.
[ 182.279685] sfp sfp1: EEPROM base structure checksum failure: 0x98 != 0x5c
[ 182.286582] sfp EE: 00000000: 03 04 01 20 00 00 00 00 00 00 00 03 64 00 14 c8 ... ........d...
[ 182.295274] sfp EE: 00000010: 00 00 00 00 48 55 41 57 45 49 20 20 20 20 20 20 ....HUAWEI
[ 182.303963] sfp EE: 00000020: 20 20 20 20 00 00 00 00 53 38 30 30 45 00 00 00 ....S800E...
[ 182.312655] sfp EE: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 04 f6 00 5c ...............\
[ 182.321342] sfp EE: 00000040: 90 d2 8c 80 01 00 00 00 30 bd d0 80 c0 ff ff ff ........0.......
[ 182.330054] sfp EE: 00000050: 24 35 01 80 c0 ff ff ff 70 bd d0 80 c0 ff ff ff $5......p.......
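For anyone who wants to check the arithmetic, here is a minimal sketch of the base checksum, assuming the usual SFP rule that the byte at 0x3f holds the low 8 bits of the sum of bytes 0x00-0x3e. The bytes are copied from the first four rows of the dump above; the helper name `sfp_checksum` is made up for illustration.

```python
# Bytes 0x00-0x3f of the EEPROM, copied from the dmesg dump above.
BASE_ID = bytes.fromhex(
    "03 04 01 20 00 00 00 00 00 00 00 03 64 00 14 c8 "
    "00 00 00 00 48 55 41 57 45 49 20 20 20 20 20 20 "
    "20 20 20 20 00 00 00 00 53 38 30 30 45 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 04 f6 00 5c "
)


def sfp_checksum(data: bytes, start: int, end: int) -> int:
    """Low 8 bits of the sum of data[start..end], both ends inclusive."""
    return sum(data[start:end + 1]) & 0xFF


computed = sfp_checksum(BASE_ID, 0x00, 0x3E)  # what the checksum should be
stored = BASE_ID[0x3F]                        # what the module actually reports
print(f"stored 0x{stored:02x}, computed 0x{computed:02x}")
```

This gives a stored value of 0x5c against a computed 0x98, the same pair the kernel complains about. The extended checksum at 0x5f would be handled the same way over 0x40-0x5e, but the rows from 0x40 onward in this particular dump do not look like real module data, so they are left out here.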
I compared the values just in case the dump was wrong. On the mellanox, I ran ethtool -m eth0 hex on and compared the output with another dump generated from the raw i2c signals, and the two match. At least this can be quirked into the sfp driver, assuming the i2c issue ever finds a solution.
Anyway, the base structure and the extended structure both have checksum failures. I used i2csfp to fix them (thank you ericwoud)
root@OpenWrt:/tmp# ./i2csfp sfp1 eepromfix
Checksum 0x00-0x3e failed, set at 5c, but should be 98
Checksum 0x40-0x5e failed, set at 55, but should be f9
Error: i2c_transfer() failed: No such device or address
Error: i2c_transfer() failed: No such device or address
Error: i2c_transfer() failed: No such device or address
Error: i2c_transfer() failed: No such device or address
RollBall Password used: 0xfffffffa
Error: i2c_transfer() failed: No such device or address
Error: Cannot fill in password!
...
root@OpenWrt:/tmp# ./i2csfp sfp1 byte write 0x50 0x3F 0x98
root@OpenWrt:/tmp# ./i2csfp sfp1 byte write 0x50 0x5F 0xF9
root@OpenWrt:/tmp# ./i2csfp sfp1 restore
With the checksums fixed, the module goes further into the initialization!
[ 749.594781] sfp sfp1: module removed
[ 1060.452804] sfp sfp1: Host maximum power 3.0W
[ 1060.773374] sfp sfp1: module HUAWEI S800E rev sn 4857XXXXXXXXXXXX dc 24082602
[ 1060.803238] hwmon hwmon2: temp1_input not attached to any thermal zone
Strangely, it gets stuck waiting for LOS even though the line was already deasserted:
root@OpenWrt:/tmp# cat /sys/kernel/debug/sfp1/state
Module state: present
Module probe attempts: 0 0
Device state: up
Main state: wait_los
Fault recovery remaining retries: 5
PHY probe remaining retries: 12
Signalling rate: 10313 kBd
Rate select threshold: 0 kBd
moddef0: 1
rx_los: 0
tx_fault: 0
tx_disable: 0
rs0: 0
rs1: 0
Turns out, for some whack reason, the S800E specifies that it requires the LOS to be inverted, so the driver waits indefinitely.
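A rough sketch of how that field decodes, assuming it is the low options byte at EEPROM offset 0x41 and using the bit assignments as I understand them from SFF-8472 (bit 1 = LOS with normal polarity, bit 2 = LOS inverted, bit 3 = TX_FAULT, bit 4 = TX_DISABLE). The names and the `decode_options` helper are illustrative only; 0x1C is the value the module ships with and 0x1A is the replacement used in the next step.

```python
# Assumed bit meanings for the low options byte (offset 0x41).
OPTION_BITS = {
    1: "LOS implemented (normal polarity)",
    2: "LOS implemented (inverted)",
    3: "TX_FAULT implemented",
    4: "TX_DISABLE implemented",
}


def decode_options(value: int) -> list:
    """Return the names of all known option bits set in value."""
    return [name for bit, name in sorted(OPTION_BITS.items())
            if value & (1 << bit)]


for value in (0x1C, 0x1A):
    print(f"0x{value:02X}: {decode_options(value)}")
```

With 0x1C only the inverted LOS bit is set, so a low rx_los pin reads as "signal lost" and the state machine sits in wait_los. With 0x1A the normal-polarity bit is set instead and TX_FAULT/TX_DISABLE stay as they were.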
I changed that field from 1C to 1A, wrote it into the EEPROM, and fixed the checksum again:
root@OpenWrt:/tmp# ./i2csfp sfp1 byte write 0x50 0x41 0x1A
root@OpenWrt:/tmp# ./i2csfp sfp1 eepromfix
Checksum 0x00-0x3e matched 98
Checksum 0x40-0x5e failed, set at f9, but should be f7
...
root@OpenWrt:/tmp# ./i2csfp sfp1 byte write 0x50 0x5F 0xF7
root@OpenWrt:/tmp# ./i2csfp sfp1 restore
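The new extended checksum can also be worked out by hand without re-reading the whole range, since only one byte changed. A small sketch, assuming the checksum is a plain 8-bit sum (the helper name `updated_checksum` is hypothetical):

```python
def updated_checksum(old_cc: int, old_byte: int, new_byte: int) -> int:
    """Adjust an 8-bit additive checksum after a single byte is changed."""
    return (old_cc - old_byte + new_byte) & 0xFF


# Byte 0x41 went from 0x1C to 0x1A; the previous extended checksum was 0xF9.
new_cc = updated_checksum(0xF9, 0x1C, 0x1A)
print(f"new checksum: 0x{new_cc:02x}")
```

The byte dropped by 2, so the checksum drops by 2 as well, which lines up with the f9 -> f7 change reported by eepromfix.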
With that, I was greeted with this:
[ 499.274793] mtk_soc_eth 15100000.ethernet eth2: Link is Up - 10Gbps/Full - flow control off
[ 499.274823] br-wan: port 2(eth2) entered blocking state
[ 499.288355] br-wan: port 2(eth2) entered forwarding state
root@OpenWrt:/tmp# ifconfig
br-lan Link encap:Ethernet HWaddr 9A:90:63:9B:8C:BA
inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::9890:63ff:fe9b:8cba/64 Scope:Link
inet6 addr: fd44:c9c7:ddf::1/60 Scope:Global
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:14993 errors:0 dropped:0 overruns:0 frame:0
TX packets:7007 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2186957 (2.0 MiB) TX bytes:2191255 (2.0 MiB)
br-wan Link encap:Ethernet HWaddr 9A:90:63:9B:8C:BB
inet addr:202.XX.XX.XXX Bcast:202.XX.XX.255 Mask:255.255.255.0
inet6 addr: fe80::9890:63ff:fe9b:8cbb/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3021 errors:0 dropped:57 overruns:0 frame:0
TX packets:884 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:286989 (280.2 KiB) TX bytes:408756 (399.1 KiB)
This permanently “breaks” my sfp1 WAN port, but runs on stock OpenWrt and survives reboots. It’s not a real solution but is very cool to see the link up on a bpi-r4.
Edit: happy new year folks!
Happy New Year to you too… great to see some soldering porn on the 1st day of the year.
the 1st question that pops to mind: since the checksum of the S800E is wrong, why didn’t the mellanox flag it? which also raises the question of how important the eeprom data really is.
S800E is a SFP+ xgs-pon ONU. So for all intents and purposes, we don’t really care what the eeprom says as long as we see
kern.info kernel: [ 11.624377] mtk_soc_eth 15100000.ethernet eth2: switched to inband/10gbase-r link mode
so more like a fixed-phy than an sfp. which gave me a really stupid/crazy idea… going to test it out now.
Mellanox (specifically this connectx-3) runs off kmod-mlx4-core and does not appear to involve the current sfp driver that we are hacking on, so the implementation is probably different and possibly more forgiving.
IMO the EEPROM data is “nice to have” but not absolutely necessary. LOS is already exposed as a pin, so the only useful thing we lose is the DDMI data such as rx power level and module temperature. Most of the actual link configuration appears to be done in-band once the soc is aware that it is possible to do so.
Going to be fun to see what you are up to, if I had to guess, likely something to do with gmac1 on the devicetree? :^)
crazy/stupid idea worked!
going to test a few more things then clean things up. hotplug(as in plugin and plug out), reboots and cold power on. All works. No hardware hacks needed and only minor dts changes needed.
Details to follow.
Special thanks to @j_g and all who chipped in.
Here is how to get the S800E working on the bpi-r4 without any hardware modifications, on openwrt-24.10. Current mainline may have switched around the sfp1/sfp2 and/or gmac/eth definitions, so do your homework before you start.
Step one: Edit mt7988a-bananapi-bpi-r4.dtsi
Remove the following:
/* SFP1 cage (WAN) */
sfp1: sfp1 {
compatible = "sff,sfp";
i2c-bus = <&i2c_sfp1>;
los-gpios = <&pio 54 GPIO_ACTIVE_HIGH>;
mod-def0-gpios = <&pio 82 GPIO_ACTIVE_LOW>;
tx-disable-gpios = <&pio 70 GPIO_ACTIVE_HIGH>;
tx-fault-gpios = <&pio 69 GPIO_ACTIVE_HIGH>;
rate-select0-gpios = <&pio 21 GPIO_ACTIVE_LOW>;
maximum-power-milliwatt = <3000>;
};
Edit &gmac2 to the following. You must change usxgmii to 10gbase-r and remove the sfp reference.
&gmac2 {
managed = "in-band-status";
phy-mode = "10gbase-r";
status = "okay";
};
Step two: Add the following to the startup script /etc/rc.local to manually force power on to the S800E in the sfp1 slot.
echo 594 > /sys/class/gpio/export
echo out > /sys/class/gpio/gpio594/direction
echo 0 > /sys/class/gpio/gpio594/value
Step three: (optional) Edit eth2 device under /etc/config/network. Adjust to own taste.
config device
option name 'eth2'
option autoneg '0'
option speed '10000'
option duplex '1'
option rxpause '1'
option txpause '1'
In my testing, hotplugging works and the link goes up and down. Warm restarts, sysupgrade and cold boot all work.
One thing to note, the S800E does run warm. It’s reading 57 deg C and has been running for the last 12+ hrs, up from 50 deg C without the S800E module. This is with the factory casing (fully closed up) and a passive heatsink on the processor. And this is also overclocked from 1.8GHz to 2.2GHz at the same default voltage. All within safety margins I suppose.
Have fun and don’t burn down the house.
Nice, excellent to see that it is possible to get a working link without any physical mods!
Exciting, as the bpi-r4 is probably one of the best performance-to-value options for 10g-capable routing (assuming that the bpi-r4 can concurrently push line rates at both SFP+ ports).
For step two, I wonder if it is possible to move it into the devicetree, something like this:
&pio {
+ sfp1-module-power-hog {
+ gpio-hog;
+ gpios = <&pio 82 GPIO_ACTIVE_HIGH>;
+ output-low;
+ };
pwm0_pins: pwm0-pins {
mux {
groups = "pwm0";
function = "pwm";
};
};
};
If that works, it’ll keep all the changes within the devicetree and might make it easier to maintain a single dtso to hack in support for the S800E.
For step 3, how optional is that configuration? i.e. can I plug in the S800E and expect a working 10g link to come up automatically without touching /etc/config/network?
57 degC should be a reasonably comfortable temperature for the S800E. Assuming the EEPROM values are sane, the temperature should only be a cause for concern above 70 degC.
Module temperature : 58.23 degrees C / 136.82 degrees F
Module temperature high alarm threshold : 80.00 degrees C / 176.00 degrees F
Module temperature high warning threshold : 75.00 degrees C / 167.00 degrees F
Will test out your sfp1 moddef0 output-low dt changes. The advantage is that dt changes are applied almost immediately on boot, while rc.local happens much much later. So on a cold boot, the connection should come up much faster.
I expect speed and duplex to be negotiated successfully. But flow control will be on auto, which did not work for the sfp devices I tried previously. And given that in my case both sfp ports are connected to 10G, I would rather hardcode it.
No issues pushing line rate for forwarding, as there are 3 ppe engines, one for each gmac. I had previously tested pushing near line rate forwarding between both sfp+ ports on two relatively fast machines with iperf3 --bidir. For traffic terminating at the bpi-r4, rss/lro are not available yet. So I am maxing out at 6.5Gbps+ on iperf3 receive on the device itself.
Also mtk has been actively adding patches to the 6.6 kernel, so happy days. HQoS was recently added. And from my limited testing it does improve latency on a 10G fiber, though only modestly as latency was pretty low already.
@j_g the above change to DT did not work. builds but system fails to boot up after sysupgrade. strange…
I can’t tell if there is an issue with that entry, but maybe taking one step back would be to try with LEDs instead? Can probably change led_green to use the moddef0 pin instead, and switch it to GPIO_ACTIVE_LOW?
Also curious, in your working hacked devicetree, when the link is up, does anything respond on the i2c bus where the S800E should have been? Still scratching my head on why the S800E doesn’t respond on i2c, but at least we can eliminate some possibilities such as bad power supply, or module held in reset.
still scratching my head on gpio-hog, i double checked seems correct. Maybe someone else can chip in. @ericwoud @frank-w
I left the sfp1 i2c intact on the device tree. And I just checked. This is with the S800E active and link up. Nada… zip… nothing…
i2cdetect -y 3
0 1 2 3 4 5 6 7 8 9 a b c d e f
00: -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: UU -- -- -- -- -- -- --
You need to remove the gpio from the sfp part. With an overlay you cannot delete it, but you can overwrite the gpio of the sfp with an unused one. Then it is free to hog.
Check dmesg log for conflicts
i have deleted the entire sfp1. it is free as i can control it from userspace. Don’t understand what u mean by overlay cannot delete.
I meant with a .dtbo cannot delete from .dtb
question: why active_high instead of active_low? Also does it matter? since it will be set to output-low.
I don’t think the polarity is as critical in gpio-hog, ultimately output-low is what we want to keep the mosfet active:
- output-low A property specifying to set the GPIO direction as output with
the value low.
I picked active_high out of habit, in this context, active_low is semantically better since a p-channel mosfet gate/output is inverting in a high-side switch topology.
However, if you are going to try using leds instead of gpio-hog, GPIO_ACTIVE_LOW will make a difference since it provides the polarity context to default-state = "on";
managed to get gpio-hog added in. it should be:
gpios = <82 GPIO_ACTIVE_LOW>;
as it is already nested in &pio.
BUT it does not work as intended. this is with a warm boot with S800E already powered on and S800E lost power upon reboot.
gpio-594 ( |sfp1-mod-def0-power-) in hi ACTIVE LOW
And if i change to:
gpios = <82 GPIO_ACTIVE_HIGH>;
then
gpio-594 ( |sfp1-mod-def0-power-) in lo
the above is with a warm reboot with the S800E already powered up. the S800E continues to be powered up. I did not test a cold boot as I am pretty sure the S800E will fail to boot up, since the direction is still input.
what is clear is that “output-low” is not working as intended in both cases. Ideas?
current approach is to export gpio 594 and manage it from userland.
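On where 594 comes from: the debug output above shows the hog on pin 82 turning up as gpio-594, so the sysfs number looks like the gpiochip base plus the pin offset. A small sketch of that mapping, assuming a chip base of 512 (inferred from 594 - 82, not read from the board) and the standard gpiochip attributes under /sys/class/gpio. Both helper names are made up for illustration.

```python
from pathlib import Path


def list_gpiochips(sysfs_root="/sys/class/gpio"):
    """Return (label, base, ngpio) for every gpiochip under sysfs_root."""
    chips = []
    for chip in sorted(Path(sysfs_root).glob("gpiochip*")):
        label = (chip / "label").read_text().strip()
        base = int((chip / "base").read_text())
        ngpio = int((chip / "ngpio").read_text())
        chips.append((label, base, ngpio))
    return chips


def global_gpio(base: int, offset: int) -> int:
    """Legacy sysfs GPIO number for a pin offset on a chip with this base."""
    return base + offset


# Assumption: the pinctrl chip base is 512 on this kernel.
print(global_gpio(512, 82))
```

Running list_gpiochips() on the board should confirm the real base. If a different kernel moves it, the number in rc.local has to move with it, which is one more reason a devicetree-only solution would be nicer.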
You could try to set pinctrl states default and sleep.
Then the pin will change to the configuration set at sleep state when shutting down and set to the default state again after (re)boot.
https://www.kernel.org/doc/Documentation/devicetree/bindings/pinctrl/pinctrl-bindings.txt
If I’m not mistaken, it should also change to sleep mode at shutdown before reboot, but I cannot quickly find where this is documented.
@j_g can you help check: when the bpi-r4 boots up with your EEPROM hack, does the S800E report usxgmii mode or does it automatically switch to 10gbase-r?
[ 11.484218] mtk_soc_eth 15100000.ethernet eth1: switched to inband/10gbase-r link mode
[ 13.704939] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/10gbase-r link mode
[ 13.754367] mtk_soc_eth 15100000.ethernet eth2: configuring for inband/10gbase-r link mode
I had to change phy-mode for sfp1 from usxgmii to 10gbase-r for the link to be up and working.
With the EEPROM hack:
root@OpenWrt:~# dmesg | grep eth0
[ 4.236230] mtk_soc_eth 15100000.ethernet eth0: mediatek frame engine at 0xffffffc082700000, irq 106
[ 4.506944] mtk_soc_eth 15100000.ethernet eth0: entered promiscuous mode
[ 5.443141] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/internal link mode
[ 5.451281] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 10Gbps/Full - flow control rx/tx
[ 13.219595] mtk_soc_eth 15100000.ethernet eth0: Link is Down
[ 13.241731] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/internal link mode
[ 13.249895] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 10Gbps/Full - flow control rx/tx
[ 13.284181] mtk_soc_eth 15100000.ethernet eth0: entered allmulticast mode
Edit1: I can’t confirm if eth1 is the right interface, but here are the logs:
root@OpenWrt:~# dmesg | grep eth1
[ 4.245908] mtk_soc_eth 15100000.ethernet eth1: mediatek frame engine at 0xffffffc082700000, irq 106
[ 11.713076] mtk_soc_eth 15100000.ethernet eth1: switched to inband/10gbase-r link mode
[ 13.375314] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/10gbase-r link mode
Edit2: might be better to show both interfaces, here’s dmesg after a fresh reboot
root@OpenWrt:~# dmesg | grep mtk_soc_eth
[ 4.223854] mtk_soc_eth 15100000.ethernet: generated random MAC address 65:74:68:25:64:00
[ 4.236265] mtk_soc_eth 15100000.ethernet eth0: mediatek frame engine at 0xffffffc082780000, irq 106
[ 4.245957] mtk_soc_eth 15100000.ethernet eth1: mediatek frame engine at 0xffffffc082780000, irq 106
[ 4.255617] mtk_soc_eth 15100000.ethernet eth2: mediatek frame engine at 0xffffffc082780000, irq 106
[ 4.504808] mtk_soc_eth 15100000.ethernet eth0: entered promiscuous mode
[ 5.439328] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/internal link mode
[ 5.447496] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 10Gbps/Full - flow control rx/tx
[ 11.711799] mtk_soc_eth 15100000.ethernet eth2: switched to inband/10gbase-r link mode
[ 11.749734] mtk_soc_eth 15100000.ethernet eth1: switched to inband/10gbase-r link mode
[ 13.225263] mtk_soc_eth 15100000.ethernet eth0: Link is Down
[ 13.247536] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/internal link mode
[ 13.255730] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 10Gbps/Full - flow control rx/tx
[ 13.289913] mtk_soc_eth 15100000.ethernet eth0: entered allmulticast mode
[ 13.381128] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/10gbase-r link mode
[ 13.424071] mtk_soc_eth 15100000.ethernet eth1: entered allmulticast mode
[ 13.431098] mtk_soc_eth 15100000.ethernet eth1: entered promiscuous mode
[ 13.476204] mtk_soc_eth 15100000.ethernet eth2: configuring for inband/10gbase-r link mode
[ 13.518961] mtk_soc_eth 15100000.ethernet eth2: entered allmulticast mode
[ 13.525880] mtk_soc_eth 15100000.ethernet eth2: entered promiscuous mode
[ 15.624827] mtk_soc_eth 15100000.ethernet eth2: Link is Up - 10Gbps/Full - flow control off
wrt gpio-hog being stuck as an input, I can’t think of any answers at the moment. I don’t see the same pin used anywhere else, and output-low seems pretty explicit. I can only suggest an uglier hack (inserting it as a led in gpio-leds with default-state = "on"; and GPIO_ACTIVE_LOW)