-
Notifications
You must be signed in to change notification settings - Fork 979
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pico W disassociates from AP under some power modes but fails to detect it #2153
Comments
A search suggests "49" is WLC_E_ICV_ERROR, which I would only expect to see if using WEP. So that's a bit odd. |
There is definitely no WEP... Here is a packet capture: https://github.com/edmcman/rpi-temp-ser ... fice2.pcap Some events:
My interpretation is that my AP is buggy and did not tell the Pico to wake up when it should have via TIM field in beacons. (Oddly it does earlier in the pcap) I think the main Pico bug here is that the Pico should eventually detect that it is disconnected. |
You might be right. I will do some testing to see if I can reproduce any issue. But I would guess it's an interop problem. If I could reproduce it I would try editing cyw43_cb_process_async_event to handle the event. You could give this a go if you're feeling keen? Either just disconnect...
or see if reconnecting works (probably a long shot?).
|
That would also be my suspicion, that the AP does not work properly with STAs that are sleeping and a group rekey event, eg it's not waiting for the correct DTIM interval to send the rekey to the Pico W. |
Thanks @peterharperuk I will give this a shot. I just left for vacation, so I don't have physical access, but I do have remote access to the machine the Pico is plugged into. So I'll see if I can update the firmware remotely. |
Great, I have it running with the modified firmware. I started with the "long shot". |
First run, and it triggered the bug and appeared to work!
I am continuing to run more tests. |
So far I've run 45 tests, and none have failed. Previously, about 1/3 tests would fail. So I think it's safe to say this fixes the problem. Here is the (very minimal) patch against cyw43-driver that I've been using: diff --git a/src/cyw43_ctrl.c b/src/cyw43_ctrl.c
index a9db363..ad4a0ad 100644
--- a/src/cyw43_ctrl.c
+++ b/src/cyw43_ctrl.c
@@ -389,6 +389,10 @@ void cyw43_cb_process_async_event(void *cb_data, const cyw43_async_event_t *ev)
// Cannot authenticate
self->wifi_join_state = WIFI_JOIN_STATE_BADAUTH;
}
+ } else if (ev->event_type == 49) {
+ self->pend_rejoin = true;
+ self->pend_rejoin_wpa = false;
+ cyw43_schedule_internal_poll_dispatch(cyw43_poll_func);
} else if (ev->event_type == CYW43_EV_DEAUTH_IND) {
if (ev->status == 0 && ev->reason == 2) {
// Deauth, probably because password was wrong; disassociate |
Nice. Do you want to submit a PR over here https://github.com/georgerobotics/cyw43-driver ? |
PR started: georgerobotics/cyw43-driver#130 |
The high-level problem is that on my AP, under some power modes, when running a http server, the RPI eventually disconnects and becomes unreachable. It never recovers when this happens. I have reproduced this a lot using micropython, and more recently using this example from the pico-examples repository.
Port, board and/or hardware
RP2040 on Raspberry Pi Pico W
Reproduction
pipkin --port /dev/ttyACM0 install -r requirements.txt
Expected behaviour
The test.bash script tests whether the web server remains accessible after progressively longer periods of times with no requests. It is expected that the web server will remain available and the script will print "Test succeeded".
Observed behaviour
When run 20 times with a limit of 2048 second periods, 8 out of the 20 runs failed, meaning that the web server became inaccessible. Here is a breakdown of the power modes during failures:
And the overall distribution of tested pm values:
I am not sure why they are so uneven.
Here is a zip file of my collected logs:
logs-2048.zip
Other Information
When the Pico W becomes inaccessible, the logs contain the following messages repeated very often:
I believe the 49 message indicates crypto verification failure. Under normal operation, it appears to be correlated with PSK_SUP messages. But as soon as it starts repeating by itself the Pico W is offline.
Also, an asynchronous task that calls WLAN.isconnected() and WLAN.status() shows that the Pico W is connected, even though it does not seem to be:
High level problem
I think that under some power management modes, the Pico is missing rekeying and disassociate messages. The only way to detect that this situation has happened is to detect crypto failures and rejoin the network. I believe the 49 messages are crypto failures.
One way forward is to respond to crypto failures by rejoining the network, like most other wifi implementations. I did file an issue with the cyw43-driver, since it seems to me like the driver ought to be handling this.
Another way forward might be to change the default power mode in pico-sdk. Here, most of my testing is micropython-centric. I've found the infinite disconnection to happen during both the micro-python default power mode, and the power mode setting of 0xa11140 suggested in the Connecting to the Internet document.
Outside of micropython, the default power mode for the pico_w
httpd
example disconnects. I think this usesPM_PERFORMANCE
from perusing the source code, but I am not sure.Happy to help in any way that I can. I have enough information that I can simply work around the problem, but I'd prefer to see it fixed so that the next person doesn't have to deal with this.
The text was updated successfully, but these errors were encountered: