Arduino's micros() method is taking over 500 cycles to execute on Nano 33 BLE Sense

I first posted this question on the Arduino forum, but it hasn't received much attention there lately, so here goes...

I worked hard to put together a simple sketch that demonstrates how slow the micros() method is on the Nano 33 BLE Sense:

#include "mbed.h"
#include "nrf_delay.h"
#include "nrf_gpio.h"

#define OUTPUT_PIN NRF_GPIO_PIN_MAP(1,11)

unsigned long startClock;
unsigned long stopClock;
unsigned long arithmeticMinuend = 4294967295;
unsigned long arithmeticSubtrahend = 2147483648;
unsigned long arithmeticDifference = 0;
unsigned long timeDifference1 = 0;
unsigned long timeDifference2 = 0;

//Assume 1 to 4 clock cycles (15.625 to 62.5 ns) is required for every...
//* call of micros();
//* subtraction of two unsigned longs.
//Therefore we should not be able to measure the time to execute either task by using the micros() - startClock technique, as...
//* micros() - startClock = 0 in either case.

void setup() {
  Serial.begin(115200);
  while (!Serial);

  nrf_gpio_cfg_output(OUTPUT_PIN);

  startClock = micros();
  stopClock = micros();
}

void loop() {
  timeDifference1 = microsecondsToSubtract2LongsOnce();
  timeDifference2 = microsecondsToSubtract2LongsTwice();
  Serial.print("timeDifference1 = ");
  Serial.print(timeDifference1);
  Serial.print(" us.\ntimeDifference2 = ");
  Serial.print(timeDifference2);
  Serial.println(" us.");
  
  microsecondsToSubtract2LongsOnceMeasuredWithScope();
  delay(100);
  microsecondsToSubtract2LongsTwiceMeasuredWithScope();
  delay(100);
  microsecondsToToggleOutputPinMeasuredWithScope();
  
  while(1) {
    
  }
}

//Prints 8 us:
unsigned long microsecondsToSubtract2LongsOnce() {
  startClock = micros();
  stopClock = micros();
  return stopClock - startClock;
}

//Prints 9 us:
unsigned long microsecondsToSubtract2LongsTwice() {
  startClock = micros();
  arithmeticDifference = arithmeticMinuend - arithmeticSubtrahend;
  stopClock = micros();
  return stopClock - startClock;
}

//Measured 18 us on scope:
void microsecondsToSubtract2LongsOnceMeasuredWithScope() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  startClock = micros();
  stopClock = micros();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

//Measured 18 us on scope:
void microsecondsToSubtract2LongsTwiceMeasuredWithScope() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  startClock = micros();
  arithmeticDifference = arithmeticMinuend - arithmeticSubtrahend;
  stopClock = micros();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

//Measured 500 ns on scope:
void microsecondsToToggleOutputPinMeasuredWithScope() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

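I am comparing the time measured with an oscilloscope against the time measured with micros(). The scope pulse brackets two micros() calls plus the pin-toggle overhead, so the number of clock cycles (CC) required for one micros() call = (0.5*(18000 - 500) ns)/15.625 ns = 560 CC!!!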
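The arithmetic above can be checked with a small host-side helper. This is only a sketch and not part of the original Arduino sketch: the name `cyclesPerCall` and its parameters are made up for illustration, and it assumes the 64 MHz core clock (15.625 ns per cycle) used in the calculation.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original sketch): estimates how many CPU
// cycles one call costs, given a scope measurement that brackets `calls`
// identical calls between two pin toggles.
//   totalNs    - pulse width seen on the scope, in ns
//   overheadNs - pulse width with no calls between the toggles, in ns
//   cpuHz      - core clock; 64 MHz is assumed for the Nano 33 BLE Sense
constexpr uint64_t cyclesPerCall(uint64_t totalNs, uint64_t overheadNs,
                                 uint64_t calls, uint64_t cpuHz = 64000000ULL) {
  // Time per call in ns, converted to cycles and rounded to the nearest integer.
  return (((totalNs - overheadNs) / calls) * cpuHz + 500000000ULL) / 1000000000ULL;
}
```

With the scope numbers above, `cyclesPerCall(18000, 500, 2)` evaluates to 560.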

Can anyone give an example of measuring time on this board that takes no more than 10 CC?

From what I've read, I think a faster (theoretically 1 CC) strategy would involve using nrf_drv_timer or the newer nrfx_timer (both strategies require setting the timer up to run in counter mode), but I can't find a concrete example that works on my Arduino's nRF52840.


Edit:

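I also tried mbed's us_ticker, via mbed::Timer and read_us(), to reduce the time, but got exactly the same timing results. Here is the code I used for that test: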

#include "mbed.h"
#include "nrf_delay.h"
#include "nrf_gpio.h"

#define OUTPUT_PIN NRF_GPIO_PIN_MAP(1,11)

mbed::Timer timer;

uint32_t startClock;
uint32_t stopClock;
uint32_t arithmeticMinuend = 4294967295;
uint32_t arithmeticSubtrahend = 2147483648;
uint32_t arithmeticDifference = 0;
uint32_t timeDifference1 = 0;
uint32_t timeDifference2 = 0;

//Assume 1 to 4 clock cycles (15.625 to 62.5 ns) is required for every...
//* call of micros();
//* subtraction of two unsigned longs.
//Therefore we should not be able to measure the time to execute either task by using the micros() - startClock technique, as...
//* micros() - startClock = 0 in either case.

void setup() {
  Serial.begin(115200);
  while (!Serial);

  nrf_gpio_cfg_output(OUTPUT_PIN);
  timer.start();
  startClock = timer.read_us();
  stopClock = timer.read_us();
}

void loop() {
  timeDifference1 = microsecondsToSubtract2LongsOnce();
  timeDifference2 = microsecondsToSubtract2LongsTwice();
  Serial.print("timeDifference1 = ");
  Serial.print(timeDifference1);
  Serial.print(" us.\ntimeDifference2 = ");
  Serial.print(timeDifference2);
  Serial.println(" us.");
  
  microsecondsToSubtract2LongsOnceMeasuredWithScope();
  timer.stop();
  delay(100);
  timer.start();
  microsecondsToSubtract2LongsTwiceMeasuredWithScope();
  timer.stop();
  delay(100);
  timer.start();
  microsecondsToToggleOutputPinMeasuredWithScope();
  timer.stop();
  
  while(1) {
    
  }
}

//Prints 8 us:
unsigned long microsecondsToSubtract2LongsOnce() {
  startClock = timer.read_us();
  stopClock = timer.read_us();
  return stopClock - startClock;
}

//Prints 9 us:
unsigned long microsecondsToSubtract2LongsTwice() {
  startClock = timer.read_us();
  arithmeticDifference = arithmeticMinuend - arithmeticSubtrahend;
  stopClock = timer.read_us();
  return stopClock - startClock;
}

//Measured 18 us on scope:
void microsecondsToSubtract2LongsOnceMeasuredWithScope() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  startClock = timer.read_us();
  stopClock = timer.read_us();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

//Measured 18 us on scope:
void microsecondsToSubtract2LongsTwiceMeasuredWithScope() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  startClock = timer.read_us();
  arithmeticDifference = arithmeticMinuend - arithmeticSubtrahend;
  stopClock = timer.read_us();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

//Measured 500 ns on scope:
void microsecondsToToggleOutputPinMeasuredWithScope() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

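Here is a strategy that calls mbed's us_ticker_read directly. It is more than an order of magnitude faster than Arduino's micros() or mbed's read_us(): about 650 ns per call, i.e. roughly 42 CC.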

#include "mbed.h"
#include "nrf_delay.h"
#include "nrf_gpio.h"

#define OUTPUT_PIN NRF_GPIO_PIN_MAP(1,11)

uint32_t startClock;
uint32_t stopClock;
uint32_t arithmeticMinuend = 4294967295;
uint32_t arithmeticSubtrahend = 2147483648;
uint32_t arithmeticDifference = 0;
uint32_t timeDifference1 = 0;
uint32_t timeDifference2 = 0;

//Assume 1 to 4 clock cycles (15.625 to 62.5 ns) is required for every...
//* call of micros();
//* subtraction of two unsigned longs.
//Therefore we should not be able to measure the time to execute either task by using the micros() - startClock technique, as...
//* micros() - startClock = 0 in either case.

void setup() {
  Serial.begin(115200);
  while (!Serial);

  nrf_gpio_cfg_output(OUTPUT_PIN);
  us_ticker_init();
  startClock = us_ticker_read();
  stopClock = us_ticker_read();
}

void loop() {
  Serial.println("Starting program:");
  timeDifference1 = microsecondsToSubtract2LongsOnce();
  timeDifference2 = microsecondsToSubtract2LongsTwice();
  us_ticker_free();
  Serial.print("timeDifference1 = ");
  Serial.print(timeDifference1);
  Serial.print(" us.\ntimeDifference2 = ");
  Serial.print(timeDifference2);
  Serial.println(" us.");

  us_ticker_init();
  microsecondsToSubtract2LongsOnceMeasuredWithScope();
  us_ticker_free();
  delay(10);
  us_ticker_init();
  microsecondsToSubtract2LongsTwiceMeasuredWithScope();
  us_ticker_free();
  delay(10);
  us_ticker_init();
  microsecondsToToggleOutputPinMeasuredWithScope();
  us_ticker_free();
  
  while(1) {
    
  }
}

//Prints 1 us:
uint32_t microsecondsToSubtract2LongsOnce() {
  startClock = us_ticker_read();
  stopClock = us_ticker_read();
  return stopClock - startClock;
}

//Prints 1 us:
uint32_t microsecondsToSubtract2LongsTwice() {
  startClock = us_ticker_read();
  arithmeticDifference = arithmeticMinuend - arithmeticSubtrahend;
  stopClock = us_ticker_read();
  return stopClock - startClock;
}

//Measured 1.8 us on scope:
void microsecondsToSubtract2LongsOnceMeasuredWithScope() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  startClock = us_ticker_read();
  stopClock = us_ticker_read();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

//Measured 2 us on scope:
void microsecondsToSubtract2LongsTwiceMeasuredWithScope() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  startClock = us_ticker_read();
  arithmeticDifference = arithmeticMinuend - arithmeticSubtrahend;
  stopClock = us_ticker_read();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

//Measured 500 ns on scope:
void microsecondsToToggleOutputPinMeasuredWithScope() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}
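One caveat when reading the printed values: a microsecond ticker cannot resolve an interval shorter than one tick, so the difference of two reads depends on where the reads fall relative to a tick boundary. The host-side sketch below illustrates this under the assumption that the ticker is a free-running counter that increments once per microsecond; `tickAt` and `tickDelta` are illustrative names, not mbed APIs.

```cpp
#include <cstdint>

// Illustrative model (assumption): the ticker value at absolute time tNs is
// the number of whole microseconds that have elapsed.
constexpr uint32_t tickAt(uint64_t tNs) {
  return static_cast<uint32_t>(tNs / 1000);
}

// Difference between two ticker reads taken at startNs and stopNs.
constexpr uint32_t tickDelta(uint64_t startNs, uint64_t stopNs) {
  return tickAt(stopNs) - tickAt(startNs);
}
```

For a 650 ns interval, `tickDelta(0, 650)` is 0 while `tickDelta(400, 1050)` is 1, so a printed 1 us is consistent with a call that takes less than a microsecond.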

If anyone can post an example with a faster time within the next few days, I'll accept their answer!


Edit (11/12/2021):

I owe thanks to westfw on the Arduino forum for pointing me in the right direction. The strategy he suggested is very accurate and takes only 1 or 2 CC to execute:

#include "mbed.h"
#include "nrf.h"
#include "nrf_delay.h"
#include "nrf_gpio.h"

#define OUTPUT_PIN NRF_GPIO_PIN_MAP(1,11)

const float MICROS_PER_CYCLE = 0.015625; //Nano 33 BLE Sense operates at 64 MHz -> 15.625 ns/cc.
const byte BYTE_TO_SEND = 170; //b'10101010'.
const int BYTES_PER_TRANSFER = 54;

byte buffer[BYTES_PER_TRANSFER];

volatile uint32_t stopWatchTime1;
volatile uint32_t stopWatchTime2;
volatile uint32_t stopWatchTime3;

void setup(){
  Serial.begin(1000000); //Does nothing on the Nano 33 BLE Sense.
  while (!Serial); //Wait for serial port to connect. Needed for native USB CDC on Nano 33 BLE Sense.

  stopWatchInit();
  
  nrf_gpio_cfg_output(OUTPUT_PIN); //Configure pin as digital output for measuring time of events on scope.
  
  memset(buffer, BYTE_TO_SEND, sizeof(buffer));

  //Measured 300 ns on scope and printed 190 ns.
  togglePinTwice();
  delay(10);

  //Measured 840 ns on scope and printed 730 ns:
  togglePinTwiceAndCallMicrosTwice();
  delay(10);

  //Measured ~68 us on scope and printed 67.69 us:
  togglePinTwiceCallMicrosTwiceAndWriteBytes(); //This result makes sense bc 88 - 9.2 = 78.8 us (which is basically 78 us).
  delay(10);

  //Measured ~68 us on scope:
  togglePinTwiceAndWriteBytes();

  Serial.println("");
  Serial.print(MICROS_PER_CYCLE*((float)stopWatchTime1));
  Serial.println(" us");
  Serial.println("");
  Serial.print(MICROS_PER_CYCLE*((float)stopWatchTime2));
  Serial.println(" us");
  Serial.println("");
  Serial.print(MICROS_PER_CYCLE*((float)stopWatchTime3));
  Serial.println(" us");
}
  
void loop(){}

void togglePinTwice(){
  stopWatchStart();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  stopWatchTime1 = stopWatchGet();
}

void togglePinTwiceAndCallMicrosTwice() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  stopWatchStart();
  stopWatchTime2 = stopWatchGet();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

void togglePinTwiceCallMicrosTwiceAndWriteBytes() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  stopWatchStart();
  Serial.write(buffer, sizeof(buffer));
  Serial.flush(); //Patch current Serial.h to use or it will block forever.
  stopWatchTime3 = stopWatchGet();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

void togglePinTwiceAndWriteBytes() {
  nrf_gpio_pin_toggle(OUTPUT_PIN);
  Serial.write(buffer, sizeof(buffer));
  Serial.flush();
  nrf_gpio_pin_toggle(OUTPUT_PIN);
}

void stopWatchInit() {
  CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk; //Enable the use of DWT (same bit as 0x01000000).
}

void stopWatchStart() {
  DWT->CYCCNT = 0; //Reset cycle counter.
  DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk; //Enable cycle counter (same bit as 0x1).
}

uint32_t stopWatchGet() {
  return DWT->CYCCNT; //Number of clock cycles stored in count variable.
}
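Two details are worth keeping in mind when using the cycle counter this way: CYCCNT is a 32-bit register, and the difference between two readings stays correct across one wraparound as long as it is computed in unsigned 32-bit arithmetic. The sketch below runs on a host and only models that arithmetic; `elapsedCycles` and `cyclesToMicros` are illustrative names, and the 64 MHz clock is assumed as in the code above.

```cpp
#include <cstdint>

// Elapsed cycles between two CYCCNT-style readings. Unsigned subtraction is
// modulo 2^32, so the result is still correct if the counter wrapped once.
constexpr uint32_t elapsedCycles(uint32_t start, uint32_t stop) {
  return stop - start;
}

// Convert a cycle count to microseconds at an assumed 64 MHz core clock.
constexpr double cyclesToMicros(uint32_t cycles, double cpuMHz = 64.0) {
  return cycles / cpuMHz;
}
```

At 64 MHz a 32-bit counter wraps roughly every 67 s (2^32 / 64e6), so this approach suits short intervals. Taking a difference of two readings also means the counter can be left free-running instead of being reset to 0 before every measurement.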

My "hint", for reference:

For short intervals, you should be able to read the cycle count from the ARM core SysTick timer or from the ARM DWT cycle counter. (You may have to enable the DWT first.)

    startTime = SysTick->VAL; // 24-bit counter, probably resets at ~1 ms. Counts down!
    // or
    start = DWT->CYCCNT; // CYCCNT wraps at 32 bits; never resets, counts up.
    // With:
    // Somewhere in system init
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;
    DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;

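I don't know how these are affected by (or how they affect) low-power modes and the like. (In particular, the DWT may draw extra power, and I don't know whether it shuts off in power-down mode.)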
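Because SysTick counts down and reloads, the elapsed count has to be computed differently from the up-counting DWT case. Here is a minimal host-side sketch of that arithmetic, assuming at most one reload between the two reads and a `reload` argument equal to whatever `SysTick->LOAD` holds on the target; `sysTickElapsed` is an illustrative name, not a CMSIS function.

```cpp
#include <cstdint>

// Ticks elapsed between two reads of a down-counter that runs
// reload, reload-1, ..., 1, 0, reload, ... (period = reload + 1 ticks).
// Assumes the counter reloaded at most once between the two reads.
constexpr uint32_t sysTickElapsed(uint32_t startVal, uint32_t nowVal, uint32_t reload) {
  return (startVal >= nowVal) ? (startVal - nowVal)
                              : (startVal + (reload + 1u) - nowVal);
}
```

If SysTick were configured for a 1 ms period at 64 MHz, the reload value would be 63999 (an assumption about the system configuration, not something confirmed here); in that case `sysTickElapsed(10, 63990, 63999)` is 20. Equal readings are ambiguous: they can mean zero ticks or a full period.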