I have the following problem: from time to time my BPI-R3 crashes and reboots into a ‘rescue’ partition. Since that partition is not configured, the router is dead until I unplug it and plug it back in.
That raises two questions:
How do I see what the problem is? Ideally a kernel crash would be logged somewhere, but I cannot figure out any way other than having an RPi or something similar constantly connected to the serial port.
What is the proper way of keeping the ‘rescue’ partition up to date? Ideally it would have the last known working configuration, but I don’t want to copy it manually every time I change it. Alternatively, how can I disable the feature?
Logging the serial port continuously is the best way. The problem when a device crashes so badly that it reboots is that it no longer has the ability to write log files. Even log entries that have been “written” are often not committed to physical storage before the crash happens.
You can try setting up remote system logging. This may give you events closer to the crash, but still likely won’t give you the crash itself. The serial output is your best bet.
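For the receiving end of remote logging, here is a minimal sketch of a UDP syslog listener that could run on a PC or RPi on the LAN. It is only an illustration, not a replacement for a real syslog daemon. The names `format_entry` and `serve`, the file name `router.log`, and port 5140 are my own assumptions; the router has to be pointed at the same address and port (on OpenWrt I believe this is done with the `log_ip`, `log_port` and `log_proto` options in /etc/config/system, but check the documentation for your build).

```python
import os
import socket
import time


def format_entry(data: bytes, sender: str) -> str:
    """Turn one raw syslog datagram into a timestamped log line."""
    # Syslog payloads are usually text; replace undecodable bytes instead of failing.
    text = data.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return f"{stamp} {sender} {text}\n"


def serve(log_path: str, host: str = "0.0.0.0", port: int = 5140,
          max_messages=None) -> None:
    """Append every received datagram to log_path.

    Port 5140 is an assumption chosen so the script does not need root;
    the standard syslog port is 514. max_messages=None means run forever.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        received = 0
        with open(log_path, "a", encoding="utf-8") as log:
            while max_messages is None or received < max_messages:
                data, (sender, _sender_port) = sock.recvfrom(8192)
                log.write(format_entry(data, sender))
                # Push each line to disk right away, so the last messages
                # before a router crash are not lost in a buffer here.
                log.flush()
                os.fsync(log.fileno())
                received += 1
```

Calling `serve("router.log")` keeps listening until interrupted. As said above, this will likely capture the events leading up to the crash rather than the crash itself.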
What OS are you running? I am assuming either the modified OpenWrt that came on it, or vanilla OpenWrt snapshot.
In either of these cases, the rescue system is read-only. It is there only to diagnose the main system and determine whether it’s still usable. It’s not designed as a high-availability system with failover to a backup OS that keeps running your network. If you were to configure it as you suggest, then whatever is causing the crash and reboot on the primary would likely do the same thing on the backup. The best thing to do is to diagnose the crash as described above.
When running a vanilla OpenWrt snapshot, a reboot into the recovery firmware indicates that a log of a kernel crash is present in /sys/fs/pstore. Hence a constantly connected RPi logging the serial port is not needed; you can simply dump /sys/fs/pstore/* and share it here, so we can see what went wrong and why it crashed.
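If it helps, the dump can be gathered into a single piece of text for posting with something like the sketch below. It assumes Python 3 is available, which is not the case on a default OpenWrt install, so it may be easier to copy the files to a PC first and point the function at that copy. The function name `collect_pstore` and the `=====` separators are just my choice for illustration.

```python
from pathlib import Path


def collect_pstore(pstore_dir: str = "/sys/fs/pstore") -> str:
    """Concatenate all files in pstore_dir, each under a header with its name."""
    root = Path(pstore_dir)
    if not root.is_dir():
        # No pstore directory (or the copy is missing): nothing to report.
        return ""
    sections = []
    # Sort by name so the dump has a stable, predictable order.
    for entry in sorted(root.iterdir()):
        if not entry.is_file():
            continue
        # Crash records can contain stray bytes; do not fail on them.
        body = entry.read_bytes().decode("utf-8", errors="replace")
        sections.append(f"===== {entry.name} =====\n{body.rstrip()}\n")
    return "\n".join(sections)
```

Printing the result of `collect_pstore()` gives one block of text to paste into a reply. An empty string means the directory did not exist or held no files.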