Thanks for your good explanations and ideas. As you’ve described, the Mayfly->MMW system in its current form would need additional engineering effort to be able to support a high-reliability communication system with built-in buffering, retries, etc.
At the moment, I’m trying to understand why our Mayfly systems’ reliability has dropped recently, apparently since we implemented code changes. We have loggers that have been in the field for 9 months, sampling every 5 minutes, with no data missing from MMW. Beginning in January, we loaded new firmware onto a handful of those boards (or simply swapped in new, reprogrammed boards) to take advantage of improvements made to ModularSensors since the boards were first deployed last summer/fall. Since installing the new firmware, a number of these Mayfly stations have begun to miss multiple datapoints per day and, in some cases, to go hours without data before resuming.
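To put numbers on "multiple missing datapoints per day", something like the following could be run against timestamps exported from MMW for one station. This is only a rough sketch: the function name `find_gaps`, the 30-second tolerance, and the sample timestamps are all made up for illustration, and it assumes the timestamps are available as ISO 8601 strings at the 5-minute sampling interval.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, interval=timedelta(minutes=5),
              tolerance=timedelta(seconds=30)):
    """Return (gap_start, gap_end, missing_count) for each gap in the record.

    A gap is any pair of consecutive timestamps spaced further apart than
    interval + tolerance. The tolerance is an assumed allowance for jitter.
    """
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    gaps = []
    for prev, curr in zip(times, times[1:]):
        delta = curr - prev
        if delta > interval + tolerance:
            # Expected samples that never arrived between prev and curr.
            missing = round(delta / interval) - 1
            gaps.append((prev, curr, missing))
    return gaps


# Made-up sample data: one skipped point, then an hour-long outage.
sample = [
    "2024-01-10T00:00:00",
    "2024-01-10T00:05:00",
    "2024-01-10T00:15:00",
    "2024-01-10T00:20:00",
    "2024-01-10T01:20:00",
]

for start, end, missing in find_gaps(sample):
    print(f"{start} -> {end}: {missing} missing datapoint(s)")
```

Comparing the gap counts before and after the firmware change for the same station would at least show whether the drop in reliability lines up with the reprogramming date.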
So what I’ve attempted to do is reestablish a known working baseline to see if I can identify any bug that I might have introduced. I began by cleaning up my development environment (removing all PlatformIO libraries from global storage), creating a new PlatformIO project from the logging_to_MMW example code, and making minimal changes (setting UUIDs, station identifier, etc.). The errors and missing data I’ve described have been the outcome of this testing so far. So I’m a bit stumped as to why I’ve been unable to get back to the higher reliability that we seem to have had previously.
It seems to me that, in the case of the 504 response codes, since the MMW REST endpoint is returning a response, it is receiving the messages but, perhaps for reasons internal to the server, is failing to save them to its database.
Thanks for reading. I’m kind of a one-man show as far as writing and testing this code, so I’m grateful for any and all suggestions!
P.S. I have to mention the irony that, when I went to have a look at your ModularSensors repo, GitHub spun for a while and then eventually returned a 504-Gateway Time-out page. 🙂 It has since recovered.