Communicating with your Crazyflie is an important pillar for its operation. As more robots are controlled, the reliability of this communication link becomes more and more important, as the probability that there is no failure at any of the Crazyflies decreases exponentially with the number of robots. We have written about the low-level radio link before. Today, we focus on past, on-going, and future improvements to make the communication more reliable.
Reliability Challenges
As part of doing research with the Crazyswarm, I noticed several issues:
- Sometimes commands do not seem to arrive at the Crazyflie, especially when using unicast (i.e., direct) communication with a specific Crazyflie.
- Sending ‘too much’ data while the Crazyflie is flying can cause catastrophic crashes. One example is trying to upload a trajectory, while flying in a motion capture space. However, this occasionally even happens if just trying to update a parameter.
The radio link has a feature called Safelink, which essentially guarantees that packets are send and received in order and no packet gets dropped. This feature never worked reliably in the crazyflie_cpp implementation and is therefore not used in the Crazyswarm. However, it is the default mode for cflib. The Crazy RealTime Protocol (CRTP) also has the notion of different communication ports in order to prioritize important messages such as control commands over less important ones such as trajectory upload. However, this prioritization was never implemented in any of the clients.
Native Link Implementations
Another, non reliability related, issue always was that the performance of cflib is not stellar when connected to many Crazyflies. This is mostly because (C)Python has a Global Interpreter Lock (GIL), which prevents true multi-threaded operation. A common solution to allow true parallelism is multiprocessing or implementations in a language that compiles to machine code (native code).
Crazyflie-link-Cpp
The first native implementation is written in C++ and includes Python bindings using pybind11. The overall API is simple: A connection can be created given a URI, and data can be send and received. Internally, this library implements Safelink to guarantee packets are in order, uses priority queues to prioritize messages on important CRTP ports, and handles all multi-radio and multi-thread related synchronization issues. Unlike prior implementations, the bandwidth is shared uniformly between all connections, i.e., there is no race between different Crazyflies communicating over the same radio. We measured a 50% lower CPU utilization when connecting with the CF client, a 20% higher bandwidth, and a 20% lower latency compared to the pure Python implementation.
For now, this feature is experimental and needs to be enabled using an environment variable. If you want to give it a try, you can use
pip install cflinkcpp USE_CFLINK=cpp cfclient
once everything is merged to master in the next couple of days.
Crazyflie-link-rs
The second native implementation is written in Rust. As it turns out, writing multi-threaded code in C++ is very difficult and error-prone, because it is up to the user to use the various synchronization primitives correctly. Rust, on the other hand, provides extensive compile-time checks to increase reliability by design. This implementation does not have quite as many features as the C++ version yet and requires a bit more work to be used with the client but it is going to be worked-on on Fridays. The goal is to reach functional parity with the other link implementations and be able to use the rust-implemented link with “USE_CFLINK=rs cfclient” in the future.
Future Work
As part of the development process, we also wrote many system tests that verify correctness and measure performance of the various link implementations. We already found and fixed several firmware bugs (681 and 688). The major open issue is that there is no flow control on the system link (the connection between the NRF51 and the STM32), which still causes packet loss in some cases.