How will circular peripheral-to-memory DMA behave at the end of the transfer on an STM32?

Question

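I would like to ask how SPI RX DMA on an STM32 behaves in the following situation. I have a (for example) 96-byte array called A that stores the data received from SPI. I enable circular SPI DMA, which operates on each byte and is configured for 96 bytes. Is it possible that, once DMA has filled my 96-byte array, the transfer-complete interrupt fires and I can quickly copy the 96-byte array to another one, B, before the circular DMA starts writing to A again (and corrupts the data I am saving into B)? I want to send the data from B to the PC over USB quickly.

I am just thinking about how to pass a continuous SPI data stream through the STM32 to a PC over USB, because I think sending a 96-byte block over USB at regular intervals is easier than streaming SPI to USB in real time on the STM32. I don't even know whether this is possible.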

Answer 1

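For that to work, you would have to be able to guarantee that you can copy all the data before the next SPI byte is received and transferred to the start of the buffer. Whether that is possible depends on the processor clock speed and the SPI speed, and on being able to guarantee that no higher-priority interrupt occurs that might delay the copy. To be safe it would need an exceptionally slow SPI speed, in which case you probably would not need DMA at all.

All in all it is a bad idea and entirely unnecessary. The DMA controller has a "half-transfer" (HT) interrupt for exactly this purpose. You get the HT interrupt when the first 48 bytes have been transferred, and the DMA continues transferring the remaining 48 bytes while you copy the lower half of the buffer. When you get the transfer-complete (TC) interrupt, you copy the upper half. That extends the time you have to move the data from the reception time of a single byte to the reception time of 48 bytes.

If you actually need 96 bytes on each transfer, then simply make the buffer 192 bytes long (2 x 96).

In pseudocode: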
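To put rough numbers on that window, here is a minimal sketch. The clock rates, the 6-cycles-per-byte copy cost, the 24-cycle interrupt overhead and all helper names are illustrative assumptions, not measured values.

```cpp
#include <cassert>
#include <cstdint>

// CPU cycles that elapse while `bytes` bytes arrive over SPI.
// Assumes 8 SPI clocks per byte and no idle gaps between bytes.
inline std::uint64_t cyclesAvailable(std::uint64_t cpuHz,
                                     std::uint64_t spiHz,
                                     std::uint64_t bytes) {
    assert(spiHz > 0);
    return (cpuHz * 8u * bytes) / spiHz;
}

// Hypothetical cost model: a fixed cost per copied byte plus interrupt
// entry/exit overhead. Both figures are placeholders to be measured.
inline std::uint64_t copyCost(std::uint64_t bytes,
                              std::uint64_t cyclesPerByte,
                              std::uint64_t isrOverhead) {
    return bytes * cyclesPerByte + isrOverhead;
}

// True if copying `copyBytes` finishes within the time it takes
// `windowBytes` further bytes to arrive.
inline bool copyFits(std::uint64_t cpuHz, std::uint64_t spiHz,
                     std::uint64_t windowBytes, std::uint64_t copyBytes) {
    return copyCost(copyBytes, 6, 24) <= cyclesAvailable(cpuHz, spiHz, windowBytes);
}
```

Under these assumptions, a 72 MHz core with an 18 MHz SPI clock has only 32 cycles per byte, while copying 96 bytes costs about 600, so `copyFits(72000000, 18000000, 1, 96)` is false. At a 100 kHz SPI clock the one-byte window grows to 5760 cycles and the copy fits.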

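#include <string.h>   /* memcpy */
#include <stdint.h>

#define BUFFER_LENGTH 96
uint8_t DMA_Buffer[2][BUFFER_LENGTH] ;   /* circular DMA runs over all 2 x 96 bytes */
uint8_t B[BUFFER_LENGTH] ;               /* block handed on to USB */

/* DMA_IT_Flag() and Clear_IT_Flag() stand in for the device's flag access */
void DMA_IRQHandler()
{
    if( DMA_IT_Flag(DMA_HT) == SET )
    {
        /* first half is complete; DMA is now filling DMA_Buffer[1] */
        memcpy( B, DMA_Buffer[0], BUFFER_LENGTH ) ;
        Clear_IT_Flag(DMA_HT) ;
    }
    else if( DMA_IT_Flag(DMA_TC) == SET )
    {
        /* second half is complete; DMA has wrapped to DMA_Buffer[0] */
        memcpy( B, DMA_Buffer[1], BUFFER_LENGTH ) ;
        Clear_IT_Flag(DMA_TC) ;
    }
}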

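Regarding transferring the data to a PC over USB, first of all you need to be sure that your USB transfer rate is at least as fast as the SPI transfer rate. The USB transfer is likely to be less deterministic (because it is controlled by the PC host, that is, you can only output data on USB when the host explicitly asks for it), so even if the average transfer rate is sufficient, there may be latency that requires further buffering. Rather than simply copying from the DMA buffer A to a USB buffer B, you may need a circular buffer or FIFO queue to feed the USB. On the other hand, if you already have the buffers DMA_Buffer[0], DMA_Buffer[1] and B, you effectively already have a FIFO of three 96-byte blocks, which may be sufficient.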
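If more slack is needed, a FIFO of fixed-size blocks is one way to do it. The following is only a sketch: `BlockFifo` and its member names are hypothetical, and as written it is not interrupt-safe, so on a real target the shared count would need protecting (for example by briefly masking the DMA interrupt around `pop()`).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fixed-capacity FIFO of equally sized blocks (hypothetical helper).
template <std::size_t BlockSize, std::size_t Blocks>
class BlockFifo {
public:
    // Producer side (e.g. the DMA HT/TC handler): copy one block in.
    // Returns false and drops the block when the FIFO is full.
    bool push(const std::uint8_t* src) {
        if (count_ == Blocks) return false;
        std::memcpy(slots_[head_].data(), src, BlockSize);
        head_ = (head_ + 1) % Blocks;
        ++count_;
        return true;
    }

    // Consumer side (e.g. the USB IN handler): oldest block, or nullptr if empty.
    const std::uint8_t* front() const {
        return count_ ? slots_[tail_].data() : nullptr;
    }

    // Release the oldest block once it has been handed to USB.
    void pop() {
        assert(count_ > 0);
        tail_ = (tail_ + 1) % Blocks;
        --count_;
    }

    std::size_t size() const { return count_; }

private:
    std::array<std::array<std::uint8_t, BlockSize>, Blocks> slots_{};
    std::size_t head_ = 0;   // next slot to write
    std::size_t tail_ = 0;   // oldest unread slot
    std::size_t count_ = 0;  // blocks currently queued
};
```

A `BlockFifo<96, 4>` would absorb up to four blocks of host latency before `push()` starts returning false, at which point data is being lost and the average USB rate is simply too low.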

Answer 2

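I had a similar problem in one of my projects. The task was to transfer data from an external ADC chip (connected over SPI) to a PC over full-speed USB. The data was 8 channels x 16 bits, and I was asked to achieve the highest sampling frequency possible.

I ended up with a triple-buffer solution. A buffer can be in one of 4 possible states: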

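READY: the buffer is full of data and ready to be sent over USB
SENT: the buffer has already been sent and is outdated
IN_USE: DMA (requested by SPI) is currently filling this buffer
NEXT: the buffer is considered empty and will be used once IN_USE is full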

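Since the timing of USB requests cannot be synchronized with the SPI process, I believe a double-buffer solution would not work. Without a NEXT buffer, by the time you decide to send the READY buffer, DMA may finish filling the IN_USE buffer and start corrupting the READY buffer. In the triple-buffer solution, the READY buffer is safe to send over USB, because even when the current IN_USE buffer fills up, DMA moves on to the NEXT buffer rather than the READY one.

So the buffer states look like this as time passes: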

Buf0     Buf1      Buf2
====     ====      ====
READY    IN_USE    NEXT
SENT     IN_USE    NEXT
NEXT     READY     IN_USE
NEXT     SENT      IN_USE
IN_USE   NEXT      READY
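The rotation in the table can be sketched as a small state machine. This is an illustration only: the type and function names are hypothetical, the starting state (nothing ready yet) is an assumption, and a buffer is marked SENT as soon as it is handed out, which a real driver would likely defer until the USB transfer completes.

```cpp
#include <cassert>

enum class BufState { READY, SENT, IN_USE, NEXT };

struct TripleBuffer {
    // Assumed start-up state: Buf0 holds nothing useful, Buf1 is being filled.
    BufState state[3] = {BufState::SENT, BufState::IN_USE, BufState::NEXT};

    // Call when DMA has filled the IN_USE buffer.
    void onDmaComplete() {
        for (BufState& s : state) {
            switch (s) {
                case BufState::IN_USE: s = BufState::READY;  break;
                case BufState::NEXT:   s = BufState::IN_USE; break;
                // An unsent READY buffer is recycled too: that data is lost.
                case BufState::READY:
                case BufState::SENT:   s = BufState::NEXT;   break;
            }
        }
    }

    // Call on a USB IN request. Returns the buffer index to send,
    // or -1 when nothing is READY (the device would answer with a ZLP).
    int onUsbInRequest() {
        for (int i = 0; i < 3; ++i) {
            if (state[i] == BufState::READY) {
                state[i] = BufState::SENT;
                return i;
            }
        }
        return -1;
    }

    // Index of the buffer DMA is currently filling.
    int inUse() const {
        for (int i = 0; i < 3; ++i) {
            if (state[i] == BufState::IN_USE) return i;
        }
        assert(false && "exactly one buffer must be IN_USE");
        return -1;
    }
};
```

Each call to `onDmaComplete()` moves every buffer one step along the table, and `onUsbInRequest()` produces the READY to SENT transitions in between.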

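Of course, if the PC does not start USB requests fast enough, you may still lose a READY buffer when it is turned into NEXT (before it ever becomes SENT). The PC sends USB IN requests asynchronously, with no information about the current buffer states. If there is no READY buffer (it is in the SENT state), the STM32 responds with a ZLP (zero-length packet) and the PC tries again after a 1 ms delay.

For the implementation on the STM32, I use double-buffered mode and modify the M0AR and M1AR registers in the DMA transfer-complete ISR to address the 3 buffers.

By the way, I used (3 x 4000)-byte buffers and achieved a 32 kHz sampling frequency in the end. USB is configured as a vendor-specific class and uses bulk transfers.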
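A rough sketch of that register juggling follows, assuming a stream-based DMA controller with double-buffer mode (as on the F4 family), where the CT bit says which memory address register is currently in use and only the idle one may be rewritten. `FakeStream` is a stand-in for the real register block so that the logic can be run on a PC, and `Rotator` is a hypothetical helper; neither comes from the vendor headers.

```cpp
#include <cstdint>

// Stand-in for the DMA stream registers (assumption: not the vendor struct).
struct FakeStream {
    std::uintptr_t M0AR;  // memory 0 address
    std::uintptr_t M1AR;  // memory 1 address
    bool CT;              // current target: false = M0AR, true = M1AR
};

struct Rotator {
    std::uint8_t* bufs[3];
    int filling;  // index of the buffer DMA is writing right now

    // Call from the transfer-complete ISR, after hardware has toggled CT.
    // Returns the index of the buffer that has just been filled.
    int onTransferComplete(FakeStream& s) {
        const int done = filling;
        filling = (filling + 1) % 3;         // hardware has already moved here
        const int next = (filling + 1) % 3;  // to be used after `filling`
        const auto addr = reinterpret_cast<std::uintptr_t>(bufs[next]);
        // Rewrite only the register the DMA is NOT using at the moment.
        if (s.CT) {
            s.M0AR = addr;
        } else {
            s.M1AR = addr;
        }
        return done;
    }
};
```

Starting with M0AR pointing at buffer 0, M1AR at buffer 1 and CT clear, the first completion leaves DMA filling buffer 1 and reprograms M0AR to buffer 2; the second leaves it filling buffer 2 and reprograms M1AR to buffer 0, and so on round the three buffers.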

Answer 3

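Generally, circular DMA only works if you trigger on the half-full/half-empty event; otherwise you do not have enough time to copy the information out of the buffer.

I would recommend against copying the data out of the buffer during the interrupt. Instead, use the data directly from the buffer without an additional copy step.

If you do the copy in the interrupt, you block other lower-priority interrupts while it runs. On an STM32, a simple naive byte copy of 48 bytes may take an additional 48*6 ~ 300 clock cycles.

If you track the buffer's read and write positions independently, you only need to update a single pointer and post a deferred notification call to the consumer of the buffer.

If you want a longer period, then do not use circular DMA. Use normal DMA in 48-byte blocks instead and implement the circular byte buffer as a data structure.

I did this for a USART at 460k baud that receives variable-length packets asynchronously. If you ensure that the producer only updates the write pointer and the consumer only updates the read pointer, you can avoid most data races. Note that reads and writes of aligned variables of 32 bits or fewer are atomic on Cortex-M3/M4.

The included code is a simplified version of the DMA-capable circular buffer that I used. It is limited to buffer sizes of 2^n and uses templates and C++11 features, so it may not be suitable depending on your development/platform constraints.

To use the buffer, call getDmaReadBlock() or getDmaWriteBlock() to get the DMA memory address and block length. Once the DMA completes, use skipRead() / skipWrite() to advance the read or write pointer by the amount that was actually transferred.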
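As a minimal sketch of that idea for the half-buffer case: the interrupt only records which half has finished, and the main loop then works on the DMA buffer in place. All names here are hypothetical, the array stands in for the real DMA target, and the consumer is assumed to finish with a half before DMA wraps around to it again.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kHalf = 48;
std::uint8_t dmaBuffer[2 * kHalf];  // on target: filled by circular DMA
std::atomic<int> readyHalf{-1};     // -1: nothing pending, 0: lower, 1: upper

// Would be called from the HT (half = 0) and TC (half = 1) interrupts.
// No copying here, just a single store.
inline void onDmaHalfDone(int half) {
    assert(half == 0 || half == 1);
    readyHalf.store(half, std::memory_order_release);
}

// Main-loop side: returns the half that is ready to be consumed in place,
// or nullptr if nothing is pending.
inline const std::uint8_t* takeReadyHalf() {
    const int half = readyHalf.exchange(-1, std::memory_order_acquire);
    return half < 0 ? nullptr : &dmaBuffer[half * kHalf];
}
```

The interrupt now costs a handful of cycles regardless of block size, and the pointer returned by `takeReadyHalf()` can be passed straight to whatever consumes the data.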

  #include <cstdint>   // uint8_t, uint16_t, int32_t, uint32_t

 /**
   * Creates a circular buffer. There is a read pointer and a write pointer.
   * The buffer is full when the write pointer == read pointer - 1.
   */
 template<uint16_t SIZE=256>
  class CircularByteBuffer {
    public:
      struct MemBlock {
          uint8_t  *blockStart;
          uint16_t blockLength;
      };

    private:
      uint8_t *_data;
      uint16_t _readIndex;
      uint16_t _writeIndex;

      static constexpr uint16_t _mask = SIZE - 1;

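      // the buffer size must be a power of 2 so that indices can wrap with a mask
      // (C++11 static_assert requires a message argument)
      static_assert(SIZE != 0 && (SIZE & (SIZE - 1)) == 0, "SIZE must be a power of 2");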

    public:
      CircularByteBuffer &operator=(const CircularByteBuffer &) = default;

      CircularByteBuffer(uint8_t (&data)[SIZE]);

      CircularByteBuffer(const CircularByteBuffer &) = default;

      ~CircularByteBuffer() = default;

    private:
      static uint16_t wrapIndex(int32_t index);

    public:
      /**
       * The number of bytes available to be read. Writing bytes to the buffer can only increase this amount.
       */
      uint16_t readBytesAvail() const;

      /**
       * Return the number of bytes that can still be written. Reading bytes can only increase this amount.
       */
      uint16_t writeBytesAvail() const;

      /**
       * Read a byte from the buffer and increment the read pointer
       */
      uint8_t readByte();

      /**
       * Write a byte to the buffer and increment the write pointer. Throws away the byte if there is no space left.
       * @param byte
       */
      void writeByte(uint8_t byte);

      /**
       * Provide read-only access to the buffer without incrementing the pointer. Memory accesses outside the
       * allocated memory cannot be performed, but garbage can still be read if that byte does not contain valid data.
       * @param pos the offset from the current read pointer
       * @return the byte at the given offset in the buffer.
       */
      uint8_t operator[](uint32_t pos) const;

      /**
       * Increment the read pointer by a given amount
       */
      void skipRead(uint16_t amount);
      /**
       * Increment the write pointer by a given amount
       */
      void skipWrite(uint16_t amount);


      /**
       * Get the start and length of the memory block used for DMA writes into the queue.
       * @return
       */
      MemBlock getDmaWriteBlock();

      /**
       * Get the start and length of the memory block used for DMA reads from the queue.
       * @return
       */
      MemBlock getDmaReadBlock();

  };

  // CircularByteBuffer
  // ------------------
  template<uint16_t SIZE>
  inline CircularByteBuffer<SIZE>::CircularByteBuffer(uint8_t (&data)[SIZE]):
      _data(data),
      _readIndex(0),
      _writeIndex(0) {
  }

  template<uint16_t SIZE>
  inline uint16_t CircularByteBuffer<SIZE>::wrapIndex(int32_t index){
    return static_cast<uint16_t>(index & _mask);
  }

  template<uint16_t SIZE>
  inline uint16_t CircularByteBuffer<SIZE>::readBytesAvail() const {
    return wrapIndex(_writeIndex - _readIndex);
  }

  template<uint16_t SIZE>
  inline uint16_t CircularByteBuffer<SIZE>::writeBytesAvail() const {
    return wrapIndex(_readIndex - _writeIndex - 1);
  }

  template<uint16_t SIZE>
  inline uint8_t CircularByteBuffer<SIZE>::readByte() {
    if (readBytesAvail()) {
      uint8_t result = _data[_readIndex];
      _readIndex = wrapIndex(_readIndex+1);
      return result;
    } else {
      return 0;
    }
  }

  template<uint16_t SIZE>
  inline void CircularByteBuffer<SIZE>::writeByte(uint8_t byte) {
    if (writeBytesAvail()) {
      _data[_writeIndex] = byte;
      _writeIndex = wrapIndex(_writeIndex+1);
    }
  }

  template<uint16_t SIZE>
  inline uint8_t CircularByteBuffer<SIZE>::operator[](uint32_t pos) const {
    return _data[wrapIndex(_readIndex + pos)];
  }

  template<uint16_t SIZE>
  inline void CircularByteBuffer<SIZE>::skipRead(uint16_t amount) {
    _readIndex = wrapIndex(_readIndex+ amount);
  }

  template<uint16_t SIZE>
  inline void CircularByteBuffer<SIZE>::skipWrite(uint16_t amount) {
    _writeIndex = wrapIndex(_writeIndex+ amount);
  }

  template <uint16_t SIZE>
  inline typename CircularByteBuffer<SIZE>::MemBlock  CircularByteBuffer<SIZE>::getDmaWriteBlock(){
    uint16_t len = static_cast<uint16_t>(SIZE - _writeIndex);
   // full is  (write == (read -1)) so on wrap around we need to ensure that we stop 1 off from the read pointer.
    if( _readIndex == 0){
      len = static_cast<uint16_t>(len - 1);
    }
    if( _readIndex > _writeIndex){
      len = static_cast<uint16_t>(_readIndex - _writeIndex - 1);
    }
    return {&_data[_writeIndex], len};
  }

  template <uint16_t SIZE>
  inline typename CircularByteBuffer<SIZE>::MemBlock  CircularByteBuffer<SIZE>::getDmaReadBlock(){
    if( _readIndex > _writeIndex){
      return {&_data[_readIndex], static_cast<uint16_t>(SIZE- _readIndex)};
    } else {
      return {&_data[_readIndex], static_cast<uint16_t>(_writeIndex - _readIndex)};
    }
  }