Bananapi BPI-R3 periodically disconnects all clients

My Bananapi BPI-R3 is running as the main router (DHCP server) with adblock and samba4 (NVMe storage).

It had been running smoothly for the last few months, with a longest uptime of 30+ days. Just recently it started acting up: roughly every 2 days it disconnects all network clients and stops responding to ping, SSH, LuCI login, etc. A hard restart is the only thing that brings it back online.

I didn’t touch anything or modify any config since its last good run; it just suddenly started acting like this.

I tried looking at system.log, but it is missing the time window before the connection was lost.
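
One possible way to capture that window next time: by default OpenWrt keeps the log in a RAM ring buffer, so it is gone after a hard restart. The `log_file` and `log_size` options in /etc/config/system let the log also be written to a file. A minimal sketch, where the helper name `enable_file_log` and the NVMe path in the usage line are assumptions for illustration only:

```shell
# Sketch: point the system log at a file on persistent storage.
# $1 = log file path (hypothetical example: /mnt/nvme/log/system.log)
# $2 = log size in KiB (optional, defaults to 1024)
enable_file_log() {
    log_path="$1"
    size_kib="${2:-1024}"
    # Quote the section selector so the shell does not treat [0] as a glob.
    uci set "system.@system[0].log_file=$log_path" &&
    uci set "system.@system[0].log_size=$size_kib" &&
    uci commit system
}
```

Usage would be something like `enable_file_log /mnt/nvme/log/system.log && /etc/init.d/log restart`, after which the last lines before a hang should still be readable from that file after the reboot.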

Custom packages installed so far:

  • adblock (active)
  • samba4 (active)
  • transmission daemon (disabled)

Things I’ve tried that didn’t help:

  • Updated to the latest official snapshot.
  • Disabled adblock.
  • Removed IPv6.

Has anyone had the same experience, or can anyone give me an idea of what’s going on?

Are you experiencing this problem in both the 2.4GHz and 5GHz bands, and does it occur simultaneously in both bands?

How did you configure the AP interfaces? (share /etc/config/wireless please, remove passwords of course)

Which are your WiFi clients? (in the past even weird stuff like Sony PlayStation sending odd WMM action frames could crash WiFi firmware, so it does matter…)

I can’t say whether it is related to the wireless side (whether it is caused by 2.4GHz or 5GHz); my PC is connected directly to it through a CAT6 cable. When it happens I get an Ethernet adapter disconnection notification from the network manager, and the wireless devices also lose their connection and assigned IP.

It’s my main router at home. The WiFi clients are just family members’ cellphones and iPads (a maximum of 5 devices at the same time), nothing else. I have 3 other OpenWrt routers (2 ASUS-RT58, 1 Redmi AX6000) connected to the BPI-R3 as dumb APs, and all routers run 802.11r fast roaming around the house.

config wifi-device 'radio0'
	option type 'mac80211'
	option path 'platform/soc/18000000.wmac'
	option band '2g'
	option channel 'auto'
	option cell_density '0'
	option htmode 'HT40'
	option disabled '1'

config wifi-iface 'default_radio0'
	option device 'radio0'
	option network 'lan'
	option mode 'ap'
	option ssid '*******'
	option encryption 'psk2'
	option key '*******'
	option ieee80211r '1'
	option mobility_domain 'E168'
	option ft_over_ds '0'
	option ft_psk_generate_local '1'
	option macfilter 'allow'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	option disabled '1'

config wifi-device 'radio1'
	option type 'mac80211'
	option path 'platform/soc/18000000.wmac+1'
	option band '5g'
	option htmode 'HE80'
	option cell_density '0'
	option channel 'auto'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option network 'lan'
	option mode 'ap'
	option macfilter 'allow'
	option key '********'
	option ieee80211r '1'
	option mobility_domain 'E168'
	option ft_over_ds '0'
	option ft_psk_generate_local '1'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	list maclist '##:##:##:##:##:##'
	option encryption 'psk2+ccmp'
	option ssid '********** (5G)'

Oh, that’s even worse than I thought :frowning: I suppose you have already excluded common causes such as bad cabling, right?

Are you using hardware flow-offloading and WED? I’m asking because we have recently introduced a patch series taking care of resetting everything down to the Ethernet layer in case of the offloading engine crashing…
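
For reference, a rough sketch of how both settings could be checked from a shell. It assumes the flow offloading switches are the `flow_offloading` and `flow_offloading_hw` options in the firewall defaults section, and that WED is exposed as the `wed_enable` parameter of the `mt7915e` module; the function name `offload_status` is made up for illustration.

```shell
# Sketch: report software/hardware flow offloading and WED state.
# $1 = path of the WED module parameter (optional; the default is an assumption)
offload_status() {
    wed_param="${1:-/sys/module/mt7915e/parameters/wed_enable}"
    # An unset option makes `uci -q get` fail; treat that as "0" (disabled).
    sw=$(uci -q get "firewall.@defaults[0].flow_offloading") || sw=0
    hw=$(uci -q get "firewall.@defaults[0].flow_offloading_hw") || hw=0
    # The parameter file is missing when the module is not loaded.
    wed=$(cat "$wed_param" 2>/dev/null) || wed=unknown
    echo "software_offload=$sw hardware_offload=$hw wed=$wed"
}
```

Running `offload_status` on the router prints one line that can be pasted into the thread.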

The Cat6 cable is 100% not the problem.

No, I didn’t enable it. I tried enabling software/hardware flow offloading before on some older snapshots, and it seemed to make no difference to bufferbloat or ping test results.

One thing that really confuses me is the memory usage of the BPI-R3. After a fresh boot it uses about 200MiB+, but after a day or two the used memory reaches as much as 1.90GiB+ (98%). I understand that it is used for cached data, but why does it use so much for cache?

BPI-R3 Screenshot_20230133

AX6000 Screenshot_20230132

If I recall correctly, the disconnection issue started after I updated to a newer snapshot; it had been running stable with an uptime of 30+ days before. (I have no record of the snapshot versions to support this theory.)

Strange, the AX6000 is nearly identical hardware. Are you running additional services on the BPi-R3 and are you using the eMMC for that? Because almost 2GiB of cache are hard to explain in any other way. But also shouldn’t be a problem, using all available RAM for cache is the desired behavior in this case, and the AX6000 simply doesn’t have as much persistent storage which could be cached…
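
To see how much of the "used" figure is really cache, /proc/meminfo can be read directly. The sketch below is only an illustration (the function name `mem_summary` is hypothetical): it prints total memory, buffers plus page cache, and the kernel's own `MemAvailable` estimate, all in kB. A large `available_kB` next to a large `cache_kB` means the RAM can be reclaimed when processes need it.

```shell
# Sketch: summarise a meminfo-style file (defaults to /proc/meminfo).
mem_summary() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        /^Buffers:/      { buffers = $2 }
        /^Cached:/       { cached = $2 }   # anchored, so SwapCached is not matched
        END {
            printf "total_kB=%d cache_kB=%d available_kB=%d\n", total, buffers + cached, avail
        }
    ' "${1:-/proc/meminfo}"
}
```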

Anyway. Even an approximate indication of the date or git commit of the snapshot version which still worked fine, compared to the breaking one, would probably help to figure out what’s going on…

Thank you @dangowrt for the quick reply.

No, OpenWrt is running from NAND. The eMMC isn’t used, neither for boot nor for storage. I have a 512GB NVMe drive attached as storage, shared through Samba4.

I checked my own forum activity; the stable snapshot should be a master git build from before Feb. 2023.

Is there a way for me to pull a snapshot version from before February from git, so I can test it again to confirm whether the issue is firmware related?

Found the stable snapshot in my own post: Screenshot_20230105_000645

This explains the cache usage, and it should not be a problem.

Please use git bisect to find the breaking commit, if possible. You can start with

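git clone https://git.openwrt.org/openwrt/openwrt.git
cd openwrt
git checkout 9260027535
# fetch and install the package feeds before configuring
./scripts/feeds update -a
./scripts/feeds install -a
make menuconfig
# select device and packages
make -j$(nproc)
# now flash and test the image, then start bisecting, assuming that the bug is not present in this old build
git bisect start
git bisect bad 52dbb38469
# mark the old build you just tested as good
git bisect good 9260027535
# now rebuild and test again, each time reporting the result back to git via `git bisect good` or `git bisect bad`.
# After a couple of steps you will end up with exactly the commit which introduced the problem.
# When done, `git bisect reset` returns to the original checkout.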

thank you @dangowrt for the detailed instruction.

It’s been up for 2d 12h now after I removed the metal cover (to rule out overheating) and rebooted. I will try an older build once I get a disconnection again. IMG_2023

While still monitoring the running system, one thing caught my attention when checking system.log: is it normal for dnsmasq to stop logging after some time?

Fri Feb 24 10:59:13 2023 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.0.200 18:c0:4d:40:a2:79 James-PC
Fri Feb 24 11:00:08 2023 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.0.200 18:c0:4d:40:a2:79
Fri Feb 24 11:00:08 2023 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.0.200 18:c0:4d:40:a2:79 James-PC
Fri Feb 24 11:01:04 2023 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.0.200 18:c0:4d:40:a2:79
Fri Feb 24 11:01:04 2023 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.0.200 18:c0:4d:40:a2:79 James-PC
Fri Feb 24 11:02:00 2023 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.0.200 18:c0:4d:40:a2:79
Fri Feb 24 11:02:00 2023 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.0.200 18:c0:4d:40:a2:79 James-PC
Fri Feb 24 11:02:53 2023 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.0.200 18:c0:4d:40:a2:79
Fri Feb 24 11:02:53 2023 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.0.200 18:c0:4d:40:a2:79 James-PC
Fri Feb 24 11:03:28 2023 daemon.info dnsmasq[1]: read /etc/hosts - 12 names
Fri Feb 24 11:03:28 2023 daemon.info dnsmasq[1]: read /tmp/hosts/dhcp.cfg01411c - 10 names
Fri Feb 24 11:03:28 2023 daemon.info dnsmasq-dhcp[1]: read /etc/ethers - 0 addresses
Fri Feb 24 11:03:28 2023 daemon.info samba4-server: io_uring module found, enabling VFS io_uring. (also needs Kernel 5.4+ Support)
Fri Feb 24 11:03:28 2023 daemon.info samba4-server: io_uring module found, enabling VFS io_uring. (also needs Kernel 5.4+ Support)
Fri Feb 24 11:03:29 2023 daemon.info samba4-server: io_uring module found, enabling VFS io_uring. (also needs Kernel 5.4+ Support)
Fri Feb 24 11:03:29 2023 daemon.info samba4-server: io_uring module found, enabling VFS io_uring. (also needs Kernel 5.4+ Support)
Fri Feb 24 11:44:25 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 11:44:25 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: associated (aid 1)
Fri Feb 24 11:44:25 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 08:c7:29:b4:34:26 auth_alg=ft
Fri Feb 24 12:49:17 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 12:49:17 2023 daemon.info hostapd: phy1-ap0: STA 04:68:65:8b:ee:f8 IEEE 802.11: associated (aid 2)
Fri Feb 24 12:49:17 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 04:68:65:8b:ee:f8 auth_alg=ft
Fri Feb 24 13:25:50 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 08:c7:29:b4:34:26
Fri Feb 24 13:25:50 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: disassociated due to inactivity
Fri Feb 24 13:25:51 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Feb 24 13:46:41 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 04:68:65:8b:ee:f8
Fri Feb 24 13:46:41 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 13:46:41 2023 daemon.info hostapd: phy1-ap0: STA 04:68:65:8b:ee:f8 IEEE 802.11: associated (aid 2)
Fri Feb 24 13:46:41 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 04:68:65:8b:ee:f8 auth_alg=ft
Fri Feb 24 13:47:16 2023 kern.info kernel: [151165.682165] mt7530 mdio-bus:1f lan3: Link is Up - 1Gbps/Full - flow control rx/tx
Fri Feb 24 13:47:16 2023 kern.info kernel: [151165.689768] br-lan: port 3(lan3) entered blocking state
Fri Feb 24 13:47:16 2023 kern.info kernel: [151165.695068] br-lan: port 3(lan3) entered forwarding state
Fri Feb 24 13:47:16 2023 daemon.notice netifd: Network device 'lan3' link is up
Fri Feb 24 13:47:25 2023 kern.info kernel: [151175.037269] mt7530 mdio-bus:1f lan3: Link is Down
Fri Feb 24 13:47:25 2023 kern.info kernel: [151175.042227] br-lan: port 3(lan3) entered disabled state
Fri Feb 24 13:47:25 2023 daemon.notice netifd: Network device 'lan3' link is down
Fri Feb 24 13:47:28 2023 kern.info kernel: [151177.230332] mt7530 mdio-bus:1f lan3: Link is Up - 1Gbps/Full - flow control off
Fri Feb 24 13:47:28 2023 kern.info kernel: [151177.237752] br-lan: port 3(lan3) entered blocking state
Fri Feb 24 13:47:28 2023 kern.info kernel: [151177.243063] br-lan: port 3(lan3) entered forwarding state
Fri Feb 24 13:47:28 2023 daemon.notice netifd: Network device 'lan3' link is up
Fri Feb 24 15:30:48 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 15:30:48 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: associated (aid 1)
Fri Feb 24 15:30:48 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 08:c7:29:b4:34:26 auth_alg=ft
Fri Feb 24 15:35:45 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 08:c7:29:b4:34:26
Fri Feb 24 15:35:45 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 15:35:45 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: associated (aid 1)
Fri Feb 24 15:35:45 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 08:c7:29:b4:34:26 auth_alg=ft
Fri Feb 24 15:41:24 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 15:41:24 2023 daemon.info hostapd: phy1-ap0: STA d4:a3:3d:c1:03:e1 IEEE 802.11: associated (aid 3)
Fri Feb 24 15:41:24 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED d4:a3:3d:c1:03:e1 auth_alg=ft
Fri Feb 24 15:47:15 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 08:c7:29:b4:34:26
Fri Feb 24 15:47:15 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: disassociated due to inactivity
Fri Feb 24 15:47:16 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Feb 24 15:47:20 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED d4:a3:3d:c1:03:e1
Fri Feb 24 15:47:20 2023 daemon.info hostapd: phy1-ap0: STA d4:a3:3d:c1:03:e1 IEEE 802.11: disassociated due to inactivity
Fri Feb 24 15:47:21 2023 daemon.info hostapd: phy1-ap0: STA d4:a3:3d:c1:03:e1 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Feb 24 16:00:40 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 16:00:40 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: associated (aid 1)
Fri Feb 24 16:00:40 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 08:c7:29:b4:34:26 auth_alg=ft
Fri Feb 24 16:26:41 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 08:c7:29:b4:34:26
Fri Feb 24 16:26:41 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: disassociated due to inactivity
Fri Feb 24 16:26:42 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Feb 24 16:30:29 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 16:30:29 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: associated (aid 1)
Fri Feb 24 16:30:29 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 08:c7:29:b4:34:26 auth_alg=ft
Fri Feb 24 16:37:42 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 08:c7:29:b4:34:26
Fri Feb 24 16:37:42 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: disassociated due to inactivity
Fri Feb 24 16:37:43 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Feb 24 17:23:42 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 17:23:42 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: associated (aid 1)
Fri Feb 24 17:23:42 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 08:c7:29:b4:34:26 auth_alg=ft
Fri Feb 24 18:02:40 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 08:c7:29:b4:34:26
Fri Feb 24 18:02:40 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: disassociated due to inactivity
Fri Feb 24 18:02:41 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Feb 24 18:13:23 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 18:13:23 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: associated (aid 1)
Fri Feb 24 18:13:23 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 08:c7:29:b4:34:26 auth_alg=ft
Fri Feb 24 18:34:46 2023 daemon.err hostapd: nl80211: kernel reports: key addition failed
Fri Feb 24 18:34:46 2023 daemon.info hostapd: phy1-ap0: STA d4:a3:3d:c1:03:e1 IEEE 802.11: associated (aid 3)
Fri Feb 24 18:34:46 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED d4:a3:3d:c1:03:e1 auth_alg=ft
Fri Feb 24 18:35:39 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED d4:a3:3d:c1:03:e1
Fri Feb 24 18:35:39 2023 daemon.info hostapd: phy1-ap0: STA d4:a3:3d:c1:03:e1 IEEE 802.11: disassociated
Fri Feb 24 18:35:40 2023 daemon.info hostapd: phy1-ap0: STA d4:a3:3d:c1:03:e1 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Feb 24 19:45:41 2023 daemon.notice netifd: wan (2663): udhcpc: sending renew to server 192.168.1.1
Fri Feb 24 19:45:41 2023 daemon.notice netifd: wan (2663): udhcpc: lease of 192.168.1.2 obtained from 192.168.1.1, lease time 86400
Fri Feb 24 20:07:16 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 08:c7:29:b4:34:26
Fri Feb 24 20:07:16 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: disassociated due to inactivity
Fri Feb 24 20:07:17 2023 daemon.info hostapd: phy1-ap0: STA 08:c7:29:b4:34:26 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Feb 24 20:36:40 2023 daemon.err uhttpd[2227]: [info] luci: accepted login on / for root from 192.168.0.200
Fri Feb 24 20:38:11 2023 authpriv.info dropbear[11960]: Child connection from 192.168.0.200:38596
Fri Feb 24 20:38:13 2023 authpriv.notice dropbear[11960]: Pubkey auth succeeded for 'root' with ssh-rsa key SHA256:IxxZxaxzJ7QpaU6nVA262VDia2dNQPjMKQ+4+FccQcI from 192.168.0.200:38596
Fri Feb 24 20:40:10 2023 authpriv.info dropbear[11960]: Exit (root) from <192.168.0.200:38596>: Disconnect received
Fri Feb 24 21:20:19 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 04:68:65:8b:ee:f8
Fri Feb 24 21:20:19 2023 daemon.info hostapd: phy1-ap0: STA 04:68:65:8b:ee:f8 IEEE 802.11: disassociated due to inactivity
Fri Feb 24 21:20:20 2023 daemon.info hostapd: phy1-ap0: STA 04:68:65:8b:ee:f8 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Feb 24 22:30:04 2023 daemon.notice netifd: Network device 'lan3' link is down
Fri Feb 24 22:30:04 2023 kern.info kernel: [182500.263868] mt7530 mdio-bus:1f lan3: Link is Down
Fri Feb 24 22:30:04 2023 kern.info kernel: [182500.268829] br-lan: port 3(lan3) entered disabled state
Sat Feb 25 07:45:42 2023 daemon.notice netifd: wan (2663): udhcpc: sending renew to server 192.168.1.1
Sat Feb 25 07:45:42 2023 daemon.notice netifd: wan (2663): udhcpc: lease of 192.168.1.2 obtained from 192.168.1.1, lease time 86400
Sat Feb 25 08:39:41 2023 daemon.err uhttpd[2227]: [info] luci: accepted login on / for root from 192.168.0.200

UPDATE:

After removing the case cover, without modifying any configuration, it seems to be running stable, with no more disconnections for 4 days already. Possibly an over-temperature throttling issue; still observing, will update again.

Have you made any ventilation holes in the case?

By default the case is very closed, except for the hole for the boot switch and the other one to the right of it. I already mentioned that at least holes for a fan are needed, but the airflow also needs an entry (if the fan is used for outgoing flow).

There are two holes (for wall mounting) on the underside of the case, and the case sits directly on top of a (suction) fan of a laptop cooling pad.

I checked the temperature by touching the case from time to time at the beginning of this setup. The room is AC cooled to 26-28°C, and the case barely feels hot even without AC, just very slightly warm; it ran like that for months without problems. (Sorry, I can’t provide a proper temperature measurement for lack of tools.)

Although I’m not strongly convinced it’s caused by heat, the periodic disconnect issue has apparently gone away after taking off the top cover.
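
For anyone without measurement tools: the kernel already exposes chip temperatures, so a rough reading can be taken from the shell. A minimal sketch, assuming the sensors show up under /sys/class/hwmon with a `temp1_input` file in millidegrees Celsius (the standard hwmon layout); the function name `read_hwmon_temps` is only illustrative.

```shell
# Sketch: print "<sensor name> <whole degrees C>" for each hwmon device.
# $1 = hwmon root directory (optional, defaults to /sys/class/hwmon)
read_hwmon_temps() {
    root="${1:-/sys/class/hwmon}"
    for dev in "$root"/hwmon*; do
        # Skip devices without a temperature input (and an unmatched glob).
        [ -r "$dev/temp1_input" ] || continue
        name=$(cat "$dev/name" 2>/dev/null || echo unknown)
        milli=$(cat "$dev/temp1_input")
        echo "$name $((milli / 1000))"
    done
}
```

This reads the chip sensors, not the case surface, so it will usually report higher values than what a hand on the case suggests.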

It sounds like a hard issue to debug.

Cooling

The wall mounting holes are not enough for cooling. Frank really meant home-made holes, drilled manually. You also might want to add a fan pointing towards the components, rather than trying to blow air from underneath. As Frank stated, you might want to integrate an actual custom fan into the casing somehow.

Memory

I find the memory usage very high. Keep in mind the BPI-R3 has “only” 2GB of RAM, and you were using basically all of it (98%)! Assuming there is no swap partition, your router could crash if that memory is really held by processes rather than by reclaimable cache, since there would be no RAM left!

What is using so much RAM? Could you log in via SSH and run a command like top? Shift+M should sort the processes by memory usage, though I’m not so sure about that in my case: the top version on my OpenWRT seems old, and I can’t even use interactive mode…
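
If interactive top is not usable, the same information can be pulled from /proc. The following is a sketch only (the function name `top_rss` is hypothetical): it lists the processes with the largest resident memory (`VmRSS`, in kB). Note that page cache is not charged to any process, so cached data will not appear in this list.

```shell
# Sketch: print the N largest processes as "<VmRSS kB> <name>".
# $1 = how many to show (default 10), $2 = proc root (default /proc)
top_rss() {
    n="${1:-10}"
    root="${2:-/proc}"
    for status in "$root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        # Kernel threads have no VmRSS line and are skipped.
        awk '
            /^Name:/  { name = $2 }
            /^VmRSS:/ { rss = $2 }
            END       { if (rss > 0) print rss, name }
        ' "$status" 2>/dev/null
    done | sort -rn | head -n "$n"
}
```

For example, `top_rss 5` shows the five biggest memory users at that moment.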

Anyway… hopefully you are able to pinpoint the process that is using a lot of memory. I mean, look at my old router with only 56MB RAM, and I have more RAM available than you have…

OpenWRT

Last, I want to point out that snapshot versions aren’t as stable as you might want them to be. It’s not advised to run a snapshot OpenWRT version in production.

Thank you all for the concern and suggestions.

Cooling

After removing the top cover and running for a straight week without a disconnection, I put the cover back and rebooted, trying to reproduce the issue to see if it’s really caused by overheating. I really doubt that heat is the cause, given the following evidence.

  • Since the previous disconnections happened about every 2 days, a heat or ventilation issue should have reached the throttle point sooner than 48 hours.
  • With the help of the cooling pad the case is barely hot, only slightly warm to the touch (like checking someone’s forehead for a temperature).

Still, this is unconfirmed. I will drill holes in the top of the case to fit an exhaust fan if the disconnections happen again after putting the cover back.

Memory Usage

I checked the process list before, when the memory usage was high, and found no clue. The only suspect was the transmission daemon, which I had enabled to download torrents while my PC is off. Since disabling the transmission daemon I haven’t seen high memory usage so far (200MiB+ of 2GiB).

Still observing, will post updates… Screenshot_2023-03-05_100252-01

This bug seems related:

Checked the log, and the records are not looking good. The current highest is 78°C; I will check the records again tomorrow to see how high it goes after the AC automatically turns off.

Current settings

  • OEM metal case, fully closed.
  • Exhaust fan under the case pulling air from two mounting holes. (laptop cooling pad)
  • Room AC switched to FAN mode, and with OFF timer set at 2am.
  • Only 3 client devices connected, no heavy network usage. Probably only 1 iPad is streaming from Netflix; the other 2 phones are idle.

Sun Mar 5 23:50:37 PST 2023
mt7915_phy1-isa-18000000
Adapter: ISA adapter
temp1: +53.0°C (high = +120.0°C, crit = +110.0°C)

mt7915_phy0-isa-18000000
Adapter: ISA adapter
temp1: +74.0°C (high = +120.0°C, crit = +110.0°C)

Sun Mar 5 23:52:00 PST 2023
mt7915_phy1-isa-18000000
Adapter: ISA adapter
temp1: +53.0°C (high = +120.0°C, crit = +110.0°C)

mt7915_phy0-isa-18000000
Adapter: ISA adapter
temp1: +61.0°C (high = +120.0°C, crit = +110.0°C)

Sun Mar 5 23:53:00 PST 2023
mt7915_phy1-isa-18000000
Adapter: ISA adapter
temp1: +53.0°C (high = +120.0°C, crit = +110.0°C)

mt7915_phy0-isa-18000000
Adapter: ISA adapter
temp1: +78.0°C (high = +120.0°C, crit = +110.0°C)

Checked the log, and the records are not looking good.

Not looking good? Maybe you want to share your logging…?

It could indeed be an overheating issue, or it could be a software issue. We need to dig further.

Regarding memory usage, 200M is at least better than 2GB. So hopefully there is no additional sudden memory increase anymore… And consider a dedicated machine/SBC for running the torrent daemon.

The logging is done by a shell script that saves the output of the sensors command to a file every 30 seconds; I’m not sure how accurate the sensor readings are.
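
Roughly, such a script could look like the sketch below. This is not the exact script, just an illustration of the approach; the function name `log_reading` and the log path in the usage comment are assumptions.

```shell
# Sketch: append one timestamped reading to a log file.
# $1 = log file; any further arguments are the command to run
# (defaults to `sensors` when none is given).
log_reading() {
    logfile="$1"
    shift
    [ "$#" -gt 0 ] || set -- sensors
    {
        date
        "$@"
    } >> "$logfile" 2>&1
}

# Usage on the router, one reading every 30 seconds:
#   while true; do log_reading /tmp/sensors.log; sleep 30; done
```

Keep in mind that /tmp on OpenWrt lives in RAM, so a log kept there grows inside memory and is lost on reboot; a path on the NVMe drive avoids both.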

Got 78°C without heavy usage or many devices connected.

mt7915_phy0-isa-18000000
Adapter: ISA adapter
temp1: +78.0°C (high = +120.0°C, crit = +110.0°C)

I haven’t experienced periodic disconnections so far, with or without the case cover. But the high memory usage has come back. (1.7GiB of cached data?)

Screenshot_2023-03-08_161327-01

Screenshot_2023-03-08_161405-01

Screenshot_2023-03-08_161517-01

The log from the sensors command averages 50°C over 48 hours, with a sudden spike to 82°C (the highest) lasting less than a minute.
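
Figures like these can be computed from the log with a few lines of awk. A minimal sketch, assuming the log file contains the usual multi-line `sensors` output (chip name on one line, `temp1:` reading a few lines below); the function name `temp_stats` is hypothetical.

```shell
# Sketch: print maximum and average temp1 per chip from a sensors log.
# $1 = log file
temp_stats() {
    awk '
        /^mt7915_phy/ { chip = $1 }          # remember which chip follows
        /^temp1:/ {
            t = $2
            gsub(/[^0-9.]/, "", t)           # strip sign and unit from the reading
            t += 0
            n[chip]++
            sum[chip] += t
            if (t > max[chip]) max[chip] = t
        }
        END {
            for (c in n)
                printf "%s max=%.1f avg=%.1f\n", c, max[c], sum[c] / n[c]
        }
    ' "$1" | sort
}
```

A short spike shows up in `max` while barely moving `avg`, which matches the pattern described above.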