Affectiva drops every second frame

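I am running Affectiva SDK 4.0 on a GoPro video recording, using a C++ program on Ubuntu 16.04. The GoPro video was recorded at 60 frames per second. The problem is that Affectiva only delivers results for about half of the frames (i.e. 30 fps). If I look at the timestamps provided by Affectiva, the last timestamp matches the video duration, which means that Affectiva somehow skips every second frame.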

Before running Affectiva, I ran ffmpeg with the following command to make sure the video has a constant frame rate of 60 fps:

ffmpeg -i in.MP4 -vf -y -vcodec libx264 -preset medium -r 60 -map_metadata 0:g -strict -2 out.MP4 </dev/null 2>&1

When I inspect the presentation timestamps with ffprobe -show_entries frame=pict_type,pkt_pts_time -of csv -select_streams v in.MP4, I get the following values for the original video:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/media/GoPro_concat/GoPro_concat.MP4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.20.100
  Duration: 01:14:46.75, start: 0.000000, bitrate: 15123 kb/s
    Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuvj420p(pc, bt709), 1280x720 [SAR 1:1 DAR 16:9], 14983 kb/s, 59.94 fps, 59.94 tbr, 60k tbn, 119.88 tbc (default)
    Metadata:
      handler_name    :  GoPro AVC
      timecode        : 13:17:26:44
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 127 kb/s (default)
    Metadata:
      handler_name    :  GoPro AAC
    Stream #0:2(eng): Data: none (tmcd / 0x64636D74)
    Metadata:
      handler_name    :  GoPro AVC
      timecode        : 13:17:26:44
Unsupported codec with id 0 for input stream 2
frame,0.000000,I
frame,0.016683,P
frame,0.033367,P
frame,0.050050,P
frame,0.066733,P
frame,0.083417,P
frame,0.100100,P
frame,0.116783,P
frame,0.133467,I
frame,0.150150,P
frame,0.166833,P
frame,0.183517,P
frame,0.200200,P
frame,0.216883,P
frame,0.233567,P
frame,0.250250,P
frame,0.266933,I
frame,0.283617,P
frame,0.300300,P
frame,0.316983,P
frame,0.333667,P
frame,0.350350,P
frame,0.367033,P
frame,0.383717,P
frame,0.400400,I
frame,0.417083,P
frame,0.433767,P
frame,0.450450,P
frame,0.467133,P
frame,0.483817,P
frame,0.500500,P
frame,0.517183,P
frame,0.533867,I
frame,0.550550,P
frame,0.567233,P
frame,0.583917,P
frame,0.600600,P
frame,0.617283,P
frame,0.633967,P
frame,0.650650,P
frame,0.667333,I
frame,0.684017,P
frame,0.700700,P
frame,0.717383,P
frame,0.734067,P
frame,0.750750,P
frame,0.767433,P
frame,0.784117,P
frame,0.800800,I
frame,0.817483,P
frame,0.834167,P
frame,0.850850,P
frame,0.867533,P
frame,0.884217,P
frame,0.900900,P
frame,0.917583,P
frame,0.934267,I
frame,0.950950,P
frame,0.967633,P
frame,0.984317,P
frame,1.001000,P
frame,1.017683,P
frame,1.034367,P
frame,1.051050,P
frame,1.067733,I
...

I have uploaded the full output to OneDrive.

If I run Affectiva on the original video (not processed by ffmpeg), I get the same frame-dropping problem. I am using Affectiva with affdex::VideoDetector detector(60);

Is there something wrong with the ffmpeg command, or with Affectiva?

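Edit: I think I have found where the problem is. It seems that Affectiva does not process the whole video but stops after a certain number of frames, without any error message. Below I have posted the C++ code I am using. In the onProcessingFinished() method I print something to the console once processing has finished. But this message is never printed, so Affectiva never finishes.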

Is there something wrong with my code? Or should I encode the video in a format other than MP4?

#include "VideoDetector.h"
#include "FrameDetector.h"

#include <iostream>
#include <fstream>
#include <mutex>
#include <condition_variable>

std::mutex m;
std::condition_variable conditional_variable;
bool processed = false;

class Listener : public affdex::ImageListener {
public:
  Listener(std::ofstream * fout) {
      this->fout = fout;
  }
  virtual void onImageCapture(affdex::Frame image){
      //std::cout << "called";
  }
  virtual void onImageResults(std::map<affdex::FaceId, affdex::Face> faces, affdex::Frame image){
      //std::cout << faces.size() << " faces detected:" << std::endl;

      for(auto& kv : faces){
        (*this->fout) << image.getTimestamp() << ",";
        (*this->fout) << kv.first << ",";
        (*this->fout) << kv.second.emotions.joy << ",";
        (*this->fout) << kv.second.emotions.fear << ",";
        (*this->fout) << kv.second.emotions.disgust << ",";
        (*this->fout) << kv.second.emotions.sadness << ",";
        (*this->fout) << kv.second.emotions.anger << ",";
        (*this->fout) << kv.second.emotions.surprise << ",";
        (*this->fout) << kv.second.emotions.contempt << ",";
        (*this->fout) << kv.second.emotions.valence << ",";
        (*this->fout) << kv.second.emotions.engagement << ",";
        (*this->fout) << kv.second.measurements.orientation.pitch << ",";
        (*this->fout) << kv.second.measurements.orientation.yaw << ",";
        (*this->fout) << kv.second.measurements.orientation.roll << ",";
        (*this->fout) << kv.second.faceQuality.brightness << std::endl;


        //std::cout <<  kv.second.emotions.fear << std::endl;
        //std::cout <<  kv.second.emotions.surprise  << std::endl;
        //std::cout <<  (int) kv.second.emojis.dominantEmoji;
      }
  }
private:
    std::ofstream * fout;
};

class ProcessListener : public affdex::ProcessStatusListener{
public:
    virtual void onProcessingException (affdex::AffdexException ex){
        std::cerr << "[Error] " << ex.getExceptionMessage();
    }
    virtual void onProcessingFinished (){
        {
            std::lock_guard<std::mutex> lk(m);
            processed = true;
            std::cout << "[Affectiva] Video processing finished." << std::endl;
        }
        conditional_variable.notify_one();
    }
};

int main(int argc, char ** argsv)
{
    affdex::VideoDetector detector(60, 1, affdex::FaceDetectorMode::SMALL_FACES);
    //affdex::VideoDetector detector(60, 1, affdex::FaceDetectorMode::LARGE_FACES);
    std::string classifierPath="/home/wrafael/affdex-sdk/data";
    detector.setClassifierPath(classifierPath);
    detector.setDetectAllEmotions(true);

    // Output
    std::ofstream fout(argsv[2]);
    fout << "timestamp" << ",";
    fout << "faceId" << ",";
    fout << "joy" << ",";
    fout << "fear" << ",";
    fout << "disgust" << ",";
    fout << "sadness" << ",";
    fout << "anger" << ",";
    fout << "surprise" << ",";
    fout << "contempt" << ",";
    fout << "valence" << ",";
    fout << "engagement"  << ",";
    fout << "pitch" << ",";
    fout << "yaw" << ",";
    fout << "roll" << ",";
    fout << "brightness" << std::endl;

    Listener l(&fout);
    ProcessListener pl;
    detector.setImageListener(&l);
    detector.setProcessStatusListener(&pl);

    detector.start();
    detector.process(argsv[1]);

    // wait for the worker
    {
        std::unique_lock<std::mutex> lk(m);
        conditional_variable.wait(lk, []{return processed;});
    }
    fout.flush();
    fout.close();
}

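Edit 2: I have now dug further into the problem and looked at only one GoPro file with a duration of 19 minutes 53 seconds (GoPro splits the recordings). When I run Affectiva with affdex::VideoDetector detector(60, 1, affdex::FaceDetectorMode::SMALL_FACES); on the raw video, the following file is produced. Affectiva stops after 906 seconds without any error message and without printing "[Affectiva] Video processing finished".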

When I now convert the video with ffmpeg -i raw.MP4 -y -vcodec libx264 -preset medium -r 60 -map_metadata 0:g -strict -2 out.MP4 and then run Affectiva with affdex::VideoDetector detector(60, 1, affdex::FaceDetectorMode::SMALL_FACES); on the converted video, Affectiva runs to the end and prints "[Affectiva] Video processing finished", but the frame rate is only 23 fps. Here is the file.

When I now run Affectiva with affdex::VideoDetector detector(62, 1, affdex::FaceDetectorMode::SMALL_FACES); on this converted file, Affectiva stops after 509 seconds and does not print "[Affectiva] Video processing finished". Here is the file.

If the video frame rate is 60, use a number greater than 60 to process all frames. IIRC, if you just use 61 or 62 you should get the correct number of frames.