Start with Mesh Positioning Framework

Hello,
I’m trying to get the Ultra Wide Band | Home Smart Mesh example working on DWM1001 boards. There are 2 versions of the firmware available for download on the website. Which one is used to send and receive commands from the network (like TWR, ping)? If I flash 2 devices with mp_node.hex, they decide who is coordinator and who is “no longer coordinator”, but neither reacts to commands. If one is flashed with mp_node and the other with simplemesh_cli.hex, I can type commands to the simplemesh_cli device, but the responses are wrong.
MP_node side:
Receive Frame Wait Timeout; User defined RX timeouts;
uwb>twr_command done

SimpleMesh_cli side:
sm{"uwb_cmd":"twr","initiator":1,"responder":0,"at_ms":1001}

sm/C74473027F774205{"error":"mp_receive_1_failed","initiator":1,"responder":0,"seq":0,"uwb_cmd":"twr"}
Thank you!

You’re right, 3 devices are needed to use twr from the command line:

  • cli (rf sniffer)
  • node 0
  • node 1

You can then paste commands into the cli, which broadcasts them to the nodes. Note also that the nodes do not need a uart connection: they take commands and reply through rf, so you can move them freely.

Let me know if it doesn’t work and we’ll go step by step.
Also, you can use the latest version if you can compile the zephyr projects; otherwise I’ll let you know as soon as I upload a newer version, but the available ones are good to get started.

Unfortunately, the Zephyr SDK doesn’t compile under Win10; I need a virtual machine or WSL. Thank you for the reply. I already have a question: how do I set up python to use

range_measure = uwb_twr(initiator, responder)

It seems there should be a library for doing this.

  • yes, it’s a library available in the same directory; see the usage example in main

Needed dependencies are here

  • for building with zephyr that is optional, but I provided steps here

Feel free to post a topic about the particular issue you had and I’ll help you.

Now I am working with the hex files from the website (Node and Cli). I have 1 cli, 1 coordinator and 7 nodes.
After setting up the system, the nodes were discovered:

node_id_get:C74473027F774205
node_id_set:C74473027F774205:01
sm/C74473027F774205{"shortid":1}

node_id_get:3096AF1FC4EE6B16
node_id_set:3096AF1FC4EE6B16:02
sm/3096AF1FC4EE6B16{"shortid":2}

node_id_get:284829E5EDB355A2
node_id_set:284829E5EDB355A2:04
sm/284829E5EDB355A2{"shortid":4}

node_id_get:9F268011C0AAA884
sm/9F268011C0AAA884{"shortid":3}

node_id_get:677A169E72D366B9
node_id_set:677A169E72D366B9:07
sm/677A169E72D366B9{"shortid":7}

node_id_get:A92BEB7F72C7CB5C
node_id_set:A92BEB7F72C7CB5C:06
sm/A92BEB7F72C7CB5C{"shortid":6}

node_id_get:A92BEB7F72C7CB5C
node_id_set:A92BEB7F72C7CB5C:06
sm/A92BEB7F72C7CB5C{"shortid":6}

node_id_get:1A0983A3B14F28DC
node_id_set:1A0983A3B14F28DC:05
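
The discovery output above can be turned into a uid to short id table with a few lines of Python. This is only a minimal sketch: the line shape (`sm/<uid>{"shortid":N}`) is assumed from the log in this post, and `short_id_map` is a hypothetical helper, not part of the framework's python library.

```python
import json
import re

# Assumed line shape, taken from the log above: sm/<16 hex chars>{json}
LINE_RE = re.compile(r"^sm/([0-9A-F]{16})(\{.*\})$")


def short_id_map(lines):
    """Return {uid: shortid} from 'sm/<uid>{"shortid":N}' lines."""
    mapping = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # node_id_get / node_id_set lines are skipped
        payload = json.loads(match.group(2))
        if "shortid" in payload:
            mapping[match.group(1)] = payload["shortid"]
    return mapping


log = [
    "node_id_get:C74473027F774205",
    "node_id_set:C74473027F774205:01",
    'sm/C74473027F774205{"shortid":1}',
    "node_id_get:9F268011C0AAA884",
    'sm/9F268011C0AAA884{"shortid":3}',
]
ids = short_id_map(log)
print(ids)
```

A uid that never shows up in the table (for example when only the node_id_set line was printed) is a node whose short id confirmation did not reach the cli.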

When I try to get information from 4 devices with the command:

sm{"uwb_cmd":"twr","initiator":0,"responders":[1,2,3,4],"at_ms":100,"step_ms":10,"count":2,"count_ms":50}

I receive back from client:

sm/C74473027F774205{"initiator":0,"range":"0.127","responder":1,"seq":0,"uwb_cmd":"twr"}
sm/3096AF1FC4EE6B16{"initiator":0,"range":"0.239","responder":2,"seq":1,"uwb_cmd":"twr"}
sm/9F268011C0AAA884{"initiator":0,"range":"0.450","responder":3,"seq":2,"uwb_cmd":"twr"}
sm/284829E5EDB355A2{"initiator":0,"range":"0.131","responder":4,"seq":3,"uwb_cmd":"twr"}

sm/C74473027F774205{"initiator":0,"range":"0.113","responder":1,"seq":4,"uwb_cmd":"twr"}
sm/3096AF1FC4EE6B16{"initiator":0,"range":"0.305","responder":2,"seq":5,"uwb_cmd":"twr"}
sm/9F268011C0AAA884{"initiator":0,"range":"0.450","responder":3,"seq":6,"uwb_cmd":"twr"}
sm/284829E5EDB355A2{"initiator":0,"range":"0.089","responder":4,"seq":7,"uwb_cmd":"twr"}

Seems fine.
When I try to get the same info from 7 devices:

sm{"uwb_cmd":"twr","initiator":0,"responders":[1,2,3,4,5,6,7],"at_ms":100,"step_ms":10,"count":1,"count_ms":50}

I have this response:

sm/C74473027F774205{"initiator":0,"range":"0.155","responder":1,"seq":0,"uwb_cmd":"twr"}
sm/3096AF1FC4EE6B16{"initiator":0,"range":"0.295","responder":2,"seq":1,"uwb_cmd":"twr"}
sm/9F268011C0AAA884{"initiator":0,"range":"0.427","responder":3,"seq":2,"uwb_cmd":"twr"}
sm/284829E5EDB355A2{"initiator":0,"range":"0.136","responder":4,"seq":3,"uwb_cmd":"twr"}
sm/677A169E72D366B9{"initiator":0,"responder":7,"seq":6,"uwb_cmd":"twr"}
sm/A92BEB7F72C7CB5C{"initiator":0,"range":"0.807","responder":6,"seq":5,"uwb_cmd":"twr"}
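
To see at a glance which responders are missing, the reports can be checked against the requested list. The sketch below assumes the report format shown in this post; `check_twr_reports` is a hypothetical helper name, not something from the framework.

```python
import json
import re

# Assumed report shape, copied from the log above: sm/<16 hex uid>{json}
REPORT_RE = re.compile(r"^sm/([0-9A-F]{16})(\{.*\})$")


def check_twr_reports(requested, lines):
    """Sort requested responder ids into ranged, range-less and silent."""
    ranges = {}    # responder id -> range as float
    no_range = []  # reported, but without a "range" field
    for line in lines:
        match = REPORT_RE.match(line.strip())
        if not match:
            continue  # not a node report, e.g. a debug print
        report = json.loads(match.group(2))
        if report.get("uwb_cmd") != "twr":
            continue
        responder = report["responder"]
        if "range" in report:
            ranges[responder] = float(report["range"])
        else:
            no_range.append(responder)
    silent = [r for r in requested if r not in ranges and r not in no_range]
    return ranges, no_range, silent


reports = [
    'sm/C74473027F774205{"initiator":0,"range":"0.155","responder":1,"seq":0,"uwb_cmd":"twr"}',
    'sm/3096AF1FC4EE6B16{"initiator":0,"range":"0.295","responder":2,"seq":1,"uwb_cmd":"twr"}',
    'sm/9F268011C0AAA884{"initiator":0,"range":"0.427","responder":3,"seq":2,"uwb_cmd":"twr"}',
    'sm/284829E5EDB355A2{"initiator":0,"range":"0.136","responder":4,"seq":3,"uwb_cmd":"twr"}',
    'sm/677A169E72D366B9{"initiator":0,"responder":7,"seq":6,"uwb_cmd":"twr"}',
    'sm/A92BEB7F72C7CB5C{"initiator":0,"range":"0.807","responder":6,"seq":5,"uwb_cmd":"twr"}',
]
ranges, no_range, silent = check_twr_reports([1, 2, 3, 4, 5, 6, 7], reports)
print(sorted(ranges), no_range, silent)
```

For the response above, responder 7 reports without a range and responder 5 does not report at all.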

At this moment the coordinator’s UART print says:

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (6195) us

initiator> success with frame 1

sm> /!\ target_ticks_delay missed by (6195) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (16693) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (17181) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (27740) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (38299) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (48828) us

Receive Frame Wait Timeout; User defined RX timeouts; 
uwb>twr_command done

What are the parameters “at_ms”:100, “step_ms”:10, “count_ms”:50? Is there a description of these values? Should they be adjusted when there are more nodes in the system?

I’m using code versions mp_node-f64b6e199dee on the nodes and simplemesh_cli-451cc6bf4da on the client. With these versions there is no response on the client side for

sm{"dwt_config":{"chan":5}}

Coordinator vUART is empty.

The simple twr command doesn’t work:

sm{"uwb_cmd":"twr","initiator":0,"responder":1,"at_ms":150} or sm{"uwb_cmd":"twr","initiator":0,"responder":1,"at_ms":100}

On coordinator vUart

Receive Frame Wait Timeout; User defined RX timeouts; 
uwb>twr_command done

Increasing at_ms to 200 ms leads to “exit” on the vUART, and then the coordinator stops.

Command

sm/3B9D70EB615FF217{"rf_diag":"target_ping","target":"3096AF1FC4EE6B16"}

where 3B9D70EB615FF217 is the coordinator and 3096AF1FC4EE6B16 is the device to ping, gets no response. The vUART on the coordinator doesn’t say anything.

Commands

sm{"uwb_cmd":"ping", "pinger":0,"target":5,"at_ms":100} and
sm{"uwb_cmd":"ping", "pinger":0,"target":7,"at_ms":100,"count":3,"count_ms":6}

work.

If uwb commands work, that’s the most important, rf_diag is only for simplemesh 2.4 GHz rf debug.

The topic sm/uid{payload} should use the uid of the pinger, not of the cli. The cli is transparent and only broadcasts and receives data.
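
As an illustration of the two line shapes used in this thread (`sm{payload}` for a broadcast and `sm/uid{payload}` for a node topic), here is a minimal sketch. `format_command` is a hypothetical helper, and the uids are just two of the nodes listed earlier, picked as an example of pinger and target.

```python
import json


def format_command(payload, uid=None):
    """Build a console line: 'sm{json}' or, with a uid, 'sm/<uid>{json}'."""
    body = json.dumps(payload, separators=(",", ":"))  # compact, no spaces
    topic = "sm" if uid is None else "sm/" + uid
    return topic + body


# uid of the pinging node goes in the topic, the target uid in the payload
line = format_command(
    {"rf_diag": "target_ping", "target": "3096AF1FC4EE6B16"},
    uid="C74473027F774205",
)
broadcast = format_command(
    {"uwb_cmd": "ping", "pinger": 0, "target": 5, "at_ms": 100}
)
print(line)
print(broadcast)
```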

I think I should document that better, at worst we can set a short workshop session during which I’ll help you.

Connecting both nodes’ uart and looking at their log can help.

It’s also good to have some insight into the code and to know where to add debug info, but for that, being able to build is important in the long run.

How many nodes can be in the network? UWB TWR for 7 nodes doesn’t fully work. I receive valid data for only a few nodes. For the rest I get

uwb>twr_command done (0)->(7)

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (6225) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (16693) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (27252) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (37811) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (48370) us

Receive Frame Wait Timeout; User defined RX timeouts; 
sm> /!\ target_ticks_delay missed by (58929) us

Receive Frame Wait Timeout; User defined RX timeouts; 
uwb>twr_command done

Simple uwb-twr from device to device responds with

Receive Frame Wait Timeout; User defined RX timeouts; 
uwb>twr_command done

There is no design limit on the maximum number of twr nodes.
Issues only happen with the rf command sync.
If you get that warning message, you just have to increase “at_ms”; otherwise the nodes won’t be running the twr sequence in sync and will not be able to respond to each other. If 100 ms generates misses, try 150 or 200 ms. That is only a single sync, after which many twr transactions can follow each other.
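
A small sketch of that advice in Python: scan the coordinator log for the warning and, if it is present, propose a larger at_ms. The regular expression matches the message exactly as printed in this thread; the 50 ms increment and the helper names are assumptions chosen to mirror the 100, 150, 200 suggestion.

```python
import re

# Matches the warning exactly as it appears in the coordinator log
MISS_RE = re.compile(r"target_ticks_delay missed by \((\d+)\) us")


def missed_delays_us(log_text):
    """Return every missed delay (in microseconds) found in the log."""
    return [int(value) for value in MISS_RE.findall(log_text)]


def next_at_ms(current_at_ms, log_text, increment_ms=50):
    """Suggest a larger at_ms if the log shows a missed sync (assumed step)."""
    if missed_delays_us(log_text):
        return current_at_ms + increment_ms
    return current_at_ms


coordinator_log = r"""Receive Frame Wait Timeout; User defined RX timeouts;
sm> /!\ target_ticks_delay missed by (6195) us
initiator> success with frame 1
sm> /!\ target_ticks_delay missed by (16693) us
"""
misses = missed_delays_us(coordinator_log)
suggested = next_at_ms(100, coordinator_log)
print(misses, suggested)
```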

The “at_ms” is very important for the transition from rtos jitter to a busy wait where all nodes are in sync.

To give a better feeling for this, I posted a pio debug that shows what happens

See screenshot in section

Or in this topic

It would be nice to have a uwb_ping dialogue for a list of targets. Using the CIR and first path, we could then calculate the UWB RSSI.

I still can’t make it work for 7 nodes. One or two don’t give a response. I tried various at_ms values from 100 to 240.

Thanks for the ideas and suggestions. I intend to keep adding features, as I only started developing this a few weeks ago, so it’s really just the beginning.

When you say dialogue, do you mean a sort of GUI where the user can select the target to ping? If so, then yes, though this might be a bit too early, as the framework is being developed in layers: the lowest is the uC firmware, in the middle is the python server code, and that code will then have an MQTT interface with a webapp. I think I will lean towards a 2D webapp, but I also tried a concept with a 3D webapp integration.

Regarding the number of nodes, I have a kit of 12 nodes so I’ll definitely be scaling up the tests with the number of nodes. The framework is still young and has lots of bugs and lacks robustness, so by running this experiment I’ll fix the bugs and provide a functional mesh for higher numbers.

I think that the weak spot is the handing over from RTOS scheduling to TWR task, so the less that happens the more robust is the system, ideally that should only happen when changing the working mode and assigning new roles but with the current commands, it is making a sync with every new command

For experimental purposes, don’t hesitate to set “at_ms” to 300 ms or 400 ms and see if you get a stable response.
Also important to note: a polluted RF channel at 2420 MHz will degrade the collection of responses. At the moment this is still fire and forget with no ack; that can be improved to ack-based repetition within the allowed time slot margin.

Also for info, “at_ms” is not supposed to increase with the number of samples, as it only relates to the RTOS jitter of every single node. I think I might fall back on another RF packet that does not require processing subject to such a big jitter, so it could be fixed.

Would it be easier to hard-code some kind of id? That way you can keep working on it and don’t get hung up on the commissioning part, which is not important in the early stages but can take a lot of time. The devices could automatically assign their own id (1, 2, 3, …) depending on which is activated first. What are the principles on which these devices operate using the firmware? Would it save you time to interface the uC firmware with an esp32, so that you can build your webapp and configure your uC directly through that interface, either just for configuration or for testing?

Each uwb module uses simplemesh as a back channel for configuration and data collection; its advantage over an esp is that it’s already on the same chip.
Short ids are a concept that allows short messages, especially for uwb where I use the same ids, but a size of 1-2 bytes restricts them to a small number, e.g. 255 or 65536, so they have to be assigned in addition to the long uid.
In the previous generation I used only fixed short ids, which I held in a database and automatically flashed into a data section by a script that read the long uid. Even with that process, assigning and flashing the short ids every time was annoying and brought manual overhead on a daily basis when experimenting.
I thought about creating a table in the code and flashing it, but that would also need a code update every time.
In simplemesh, the short id assignment is already done. I did it as the first thing, as a lesson learned, and it is working quite reliably; I have not had any issue with it so far.
I’m even considering using friendly names in the topic/json API and mapping them to the short ids automatically.
Otherwise, the server now holds a uid to friendly name mapping, and the uid to short id mapping is also collected automatically on startup. This way I get the best of efficiency, usability and ease of use, with a single firmware for all nodes and without custom config.

Hello, thank you for your reply. By ping dialogue I meant adding multiple responders to the ping command, for example a list [1,2,3,4] as for the TWR command. Data from this command is important for link analysis. A GUI and so on can easily be done by the user if there is an API for the client.
I ran 2 experiments with 7 nodes, all of which have ID numbers. I found a problem: the client doesn’t show any response for the ping command.

>sm{"uwb_cmd":"ping", "pinger":0,"target":5,"at_ms":140,"count":4,"count_ms":6}
>sm{"uwb_cmd":"ping", "pinger":0,"target":2,"at_ms":100,"count":1,"count_ms":6}
>sm{"uwb_cmd":"ping", "pinger":0,"target":2,"at_ms":120}
>sm{"uwb_cmd":"ping", "pinger":0,"target":3,"at_ms":120}

At the same time Coord said that everything is fine:

uwb>twr_command done (0)->(5)

uwb_ping> seq(69)

uwb>twr_command done (0)->(2)

uwb_ping> seq(70)

uwb>twr_command done (0)->(2)

uwb_ping> seq(71)

uwb>twr_command done (0)->(3)

Then I repositioned the nodes, moving them closer to the coordinator, and tried the same. The problem was the same but with different nodes.

Same problem with the twr command. I couldn’t find a pattern in that behaviour. Sometimes it can do twr for 6 nodes 4 times; other times it fails for all of them or succeeds only partly. But the physical environment has not changed: the place is the same and no one is walking around.
Mostly the coordinator says Receive Frame Wait Timeout; User defined RX timeouts; uwb>twr_command done or sm> /!\ target_ticks_delay missed by (16662) us Receive Frame Wait Timeout; User defined RX timeouts;

I also saw a negative result in TWR, but that was when I moved the coordinator.

How about interfacing an ssd1306 with each node? That way you can list the targets on each node, and you can also display logs there, which would help with debugging instead of seeing them all at once on a terminal. Or you can go up to the 1.8" tft for a bigger screen. Another idea for configuring or controlling a device: add an ir receiver to each unit and interface an ir transmitter with a mobile device, so that a webapp can address each device directly through the ir interface. That would save a lot of time, no?

no, this topic initial post is about “DWM1001 boards” not about a custom design hw.

@rather_simple thanks for pointing these out, I think we can address these points one by one.

  • For the RF ping with a list of pingers or targets, I’ll think about it, but it does not match the underlying layer, or at least does not bring a big advantage over a python loop. The advantage of passing a list in the uwb command is that it creates a TDMA scheme across all of those initiators or responders, which is executed “count” times without any new sync.
  • Regarding the failures when using more than 4 or 5 nodes, I suspect a current limitation of the gateway RF rx system, where packets are not consumed fast enough. A simple fifo increase could solve this; otherwise I’ll have to optimize the rx pipeline.

Just to be sure, let me know if you can narrow down the failing test cases: whether it only fails when using more than 5 nodes, or 6, or 7, and whether you can isolate that number in a systematic way. If you are able to debug the gateway with pio, you could help; otherwise, as soon as I manage to run tests with higher numbers, I’ll stabilize that and share an improved update.

The negative uwb ranges are purely a calibration issue of the uwb samples; simplemesh and meshposition do not interfere with that at all for the moment, but they also use the basic samples from decawave, not advanced ones, as those are the only open source ones available.

  1. I’m talking about the UWB ping command, where we work with data from the UWB chip, like fAmp1, fAmp2, etc.
    I have a problem with responders for one node. For example, in one test nodes 1, 4, 6, 7 respond; in another 1, 2, 3, 6. (Another test = reset of the system; positions are not changed.) At best I got responses from 6 nodes. The command pings UWB initiator → responder, one by one from the coordinator to each node. I don’t know what data is exchanged while the ping is executed. But maybe we could give the responders as a parameter and then ping them step by step, as a function, while there are items in the list, and then send the data to the client.
  2. Yes, I can debug with a logic analyser (next week). I will try to isolate the problem with the number of nodes.

If I put at_ms > 240, I receive ‘exit’ in the coordinator debug window. Then the coordinator needs to be restarted.

I see. Keep in mind that even if the ping command is an uwb_cmd the sequence is as follows :

  1. The user pastes the json request on the console.
  2. The gateway parses the text request and broadcasts it on the 2.4 GHz RF.
  3. All nodes receive the 2.4 GHz request packet at exactly the same time; that time, timestamped from the ISR, is used as the reference.
  4. Within the task thread, all nodes addressed in the list start a busy wait to get in sync, waiting for the time left relative to the timestamp of the request packet.
  5. During the request execution loop, every node checks whether it has to execute a responder task or an initiator task; otherwise it loops over, and the next loop keeps it waiting for the next sync delay.
  6. Every responder that executes a responder transaction ends by sending a report (mesh_bcas_json) through RF 2.4 GHz.
  7. The gateway simply listens to RF 2.4 GHz only. I have a UWB sniffer in another sample project, but it is currently not in use, as you would not get meta info anyway, so the info has to come from the node. Once a 2.4 GHz (simplemesh) packet is received, it is printed in the console, and the python script has a queue to receive it from the serial port.
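
The thread does not spell out what step_ms and count_ms mean, so the following is only one plausible reading, not confirmed behaviour of the firmware: each responder gets a slot step_ms after the previous one, the first slot starts at_ms after the request timestamp, and each repetition starts count_ms after the previous repetition. Under those assumptions, the wait each node would target in step 4 can be tabulated like this (`slot_offsets_ms` is a hypothetical name):

```python
def slot_offsets_ms(responders, at_ms, step_ms, count=1, count_ms=0):
    """List (round, responder, offset_ms) under the ASSUMED slot timing.

    Assumption (not confirmed by the firmware):
    offset = at_ms + round * count_ms + slot * step_ms,
    measured from the timestamp of the 2.4 GHz request packet.
    """
    schedule = []
    for rnd in range(count):
        for slot, responder in enumerate(responders):
            offset = at_ms + rnd * count_ms + slot * step_ms
            schedule.append((rnd, responder, offset))
    return schedule


# Values taken from the 4-responder twr command earlier in the thread
schedule = slot_offsets_ms([1, 2, 3, 4], at_ms=100, step_ms=10,
                           count=2, count_ms=50)
for rnd, responder, offset in schedule:
    print(f"round {rnd} responder {responder} at +{offset} ms")
```

If this reading is right, count_ms has to be at least the number of responders times step_ms, otherwise the rounds would overlap.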

So as you see, the current design limitation is a bottleneck in the RF packet sequence: if many packets are sent close to each other, the gateway is not yet optimized to receive them. An easy workaround might be to increase the fifo, but I would rather solve it differently by allowing lower-latency packet processing. I have a native USB dongle but had other issues with the sub serial console.

And as you see in the code, the command_ping function could have a list as input, I just do not see necessity to bring this at the firmware level other than speeding up the process, I prefer to run python loops wherever possible.
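
A python loop over targets could look like the sketch below. It only builds the command lines, in the same shape as the single-target ping commands used earlier in this thread; writing each line to the cli serial port and waiting for the report is left out, because the serial setup is not described here. `ping_commands` is a hypothetical helper, not the library's command_ping.

```python
import json


def ping_commands(pinger, targets, at_ms=100):
    """Yield one single-target uwb ping command line per target."""
    for target in targets:
        payload = {
            "uwb_cmd": "ping",
            "pinger": pinger,
            "target": target,
            "at_ms": at_ms,
        }
        yield "sm" + json.dumps(payload, separators=(",", ":"))


commands = list(ping_commands(0, [1, 2, 3, 4]))
for command in commands:
    print(command)  # in practice: send, then wait for the report
```

Each command triggers its own sync, so this is slower than a list handled in firmware, which is the trade-off mentioned above.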

I also started updating the simplemesh to add acknowledge and retransmission, as it is currently fire and forget, but I’m almost sure it’s gateway rx pipeline issue and not packet loss.

I just connected a pio to the gateway and started optimizing a new function for reliable transfer of large data, which will allow upload of the cir accumulator buffer and provide precious channel impulse response info. Once I’m done with that, I’ll make the TWR and uwb ping reports reliable as well and schedule at least one or two retransmissions to eliminate weak spots from the simplemesh backend; otherwise it does not make sense to diagnose a system through another unreliable medium. I also do not include mesh routing yet, so at the moment it’s all x to gateway communication. That’s something I had running before; it’s just that this time I’ll not use flooding but rather routing, so that forwarding of packets through a mesh can stay constrained by real-time scheduling. The future steps will be:

  1. Diagnose the 2.4 GHz simplemesh RF network between all nodes
  2. use UWB remote functions

@rather_simple the first thing I’d recommend is to set up a working environment where you can compile the latest zephyr samples. I can help you with that; I have my toolchain running on windows, and I recommend windows over wsl for direct flashing and faster development loops.
Then you’ll be able to test, and I’ll be providing patches on top of the latest versions.