OpenCV: reading frames from VideoCapture advances the video to bizarrely wrong location
(I will put a bounty of 500 reputation on this question as soon as it is eligible - unless the question gets closed.)

Problem in one sentence

Reading frames from a VideoCapture advances the video much further than it is supposed to.

Explanation

I need to read and analyze frames of a 100 fps (according to cv2 and VLC media player) video between certain time intervals. In the minimal example below I try to read all frames of the first ten seconds of a three-minute video.

I am creating a cv2.VideoCapture object from which I read frames until the desired position in milliseconds is reached. In my actual code every frame is analyzed, but that fact is irrelevant for demonstrating the error.

Checking the current frame and millisecond position of the VideoCapture after reading the frames yields the correct values, so the VideoCapture thinks it is at the right position - but it is not. Saving an image of the last read frame shows that my iteration drastically overshoots the target time by more than two minutes.

Even stranger: if I manually set the millisecond position of the capture with VideoCapture.set to 10 seconds (the very same value VideoCapture.get returns after reading the frames) and save an image, the video is at (almost) the right position!
Demo video file

If you want to run the MCVE, you need the demo.avi video file. It can be downloaded HERE.

MCVE

This MCVE is carefully crafted and commented. Leave a comment under the question if anything remains unclear.

If you are using OpenCV 3, you have to replace all instances of cv2.cv.CV_ with cv2.. (The problem occurs with both versions for me.)
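For a script that should run unchanged under both OpenCV versions, a small helper can look the constant up at runtime instead of hard-coding the prefix. This is a sketch of mine, not part of the original post; the helper name `cap_prop` is made up:

```python
def cap_prop(cv2_module, name):
    """Return the capture property id CV_CAP_PROP_<name> (OpenCV 2)
    or CAP_PROP_<name> (OpenCV 3+), whichever the module provides."""
    cv = getattr(cv2_module, 'cv', None)  # OpenCV 2 keeps constants in cv2.cv
    if cv is not None and hasattr(cv, 'CV_CAP_PROP_' + name):
        return getattr(cv, 'CV_CAP_PROP_' + name)
    return getattr(cv2_module, 'CAP_PROP_' + name)  # OpenCV 3+
```

With it, a line like `fps = cap.get(cv2.cv.CV_CAP_PROP_FPS)` becomes `fps = cap.get(cap_prop(cv2, 'FPS'))` and works on either version.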
import cv2

# set up capture and print properties
print 'cv2 version = {}'.format(cv2.__version__)
cap = cv2.VideoCapture('demo.avi')
fps = cap.get(cv2.cv.CV_CAP_PROP_FPS)
pos_msec = cap.get(cv2.cv.CV_CAP_PROP_POS_MSEC)
pos_frames = cap.get(cv2.cv.CV_CAP_PROP_POS_FRAMES)
print ('initial attributes: fps = {}, pos_msec = {}, pos_frames = {}'
       .format(fps, pos_msec, pos_frames))

# get first frame and save as picture
_, frame = cap.read()
cv2.imwrite('first_frame.png', frame)

# advance 10 seconds, that's 100*10 = 1000 frames at 100 fps
for _ in range(1000):
    _, frame = cap.read()
    # in the actual code, the frame is now analyzed

# save a picture of the current frame
cv2.imwrite('after_iteration.png', frame)

# print properties after iteration
pos_msec = cap.get(cv2.cv.CV_CAP_PROP_POS_MSEC)
pos_frames = cap.get(cv2.cv.CV_CAP_PROP_POS_FRAMES)
print ('attributes after iteration: pos_msec = {}, pos_frames = {}'
       .format(pos_msec, pos_frames))

# assert that the capture (thinks it) is where it is supposed to be
# (assertions succeed)
assert pos_frames == 1000 + 1  # (+1: iteration started with second frame)
assert pos_msec == 10000 + 10

# manually set the capture to msec position 10010
# note that this should change absolutely nothing in theory
cap.set(cv2.cv.CV_CAP_PROP_POS_MSEC, 10010)

# print properties again to be extra sure
pos_msec = cap.get(cv2.cv.CV_CAP_PROP_POS_MSEC)
pos_frames = cap.get(cv2.cv.CV_CAP_PROP_POS_FRAMES)
print ('attributes after setting msec pos manually: pos_msec = {}, pos_frames = {}'
       .format(pos_msec, pos_frames))

# save a picture of the next frame, should show the same clock as
# previously taken image - but does not
_, frame = cap.read()
cv2.imwrite('after_setting.png', frame)
MCVE output

The print statements produce the following output.
cv2 version = 2.4.9.1
initial attributes: fps = 100.0, pos_msec = 0.0, pos_frames = 0.0
attributes after reading: pos_msec = 10010.0, pos_frames = 1001.0
attributes after setting msec pos manually: pos_msec = 10010.0, pos_frames = 1001.0
As you can see, all attributes have the expected values.

imwrite saves the following pictures.
first_frame.png
after_iteration.png
after_setting.png
You can see the problem in the second picture. The target of 9:26:15 (the real-time clock in the picture) is missed by more than two minutes. Setting the target time manually (third picture) puts the video at the (almost) correct position.

What am I doing wrong and how can I fix it?
Tried so far

cv2 2.4.9.1 @ Ubuntu 16.04
cv2 2.4.13 @ Scientific Linux 7.3 (three computers)
cv2 3.1.0 @ Scientific Linux 7.3 (three computers)

Creating the capture with

cap = cv2.VideoCapture('demo.avi', apiPreference=cv2.CAP_FFMPEG)

and

cap = cv2.VideoCapture('demo.avi', apiPreference=cv2.CAP_GSTREAMER)

in OpenCV 3 (version 2 does not seem to have the apiPreference argument). Using cv2.CAP_GSTREAMER takes exceptionally long (around 2-3 minutes to run the MCVE), but both api preferences produce the same wrong images.

When using ffmpeg directly to read frames (credit to this tutorial) the correct output images are produced.
import numpy as np
import subprocess as sp
import pylab

# video properties
path = './demo.avi'
resolution = (593, 792)
framesize = resolution[0]*resolution[1]*3

# set up pipe
FFMPEG_BIN = "ffmpeg"
command = [FFMPEG_BIN,
           '-i', path,
           '-f', 'image2pipe',
           '-pix_fmt', 'rgb24',
           '-vcodec', 'rawvideo', '-']
pipe = sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)

# read first frame and save as image
raw_image = pipe.stdout.read(framesize)
image = np.fromstring(raw_image, dtype='uint8')
image = image.reshape(resolution[0], resolution[1], 3)
pylab.imshow(image)
pylab.savefig('first_frame_ffmpeg_only.png')
pipe.stdout.flush()

# forward 1000 frames
for _ in range(1000):
    raw_image = pipe.stdout.read(framesize)
    pipe.stdout.flush()

# save frame 1001
image = np.fromstring(raw_image, dtype='uint8')
image = image.reshape(resolution[0], resolution[1], 3)
pylab.imshow(image)
pylab.savefig('frame_1001_ffmpeg_only.png')
pipe.terminate()
This produces the correct result! (Correct time stamp 9:26:15.)

frame_1001_ffmpeg_only.png:

Additional information

In the comments I was asked for my cvconfig.h file. I only seem to have this file for cv2 version 3.1.0, under /opt/opencv/3.1.0/include/opencv2/cvconfig.h.

HERE is a paste of this file.

In case it helps, I was able to extract the following video information with VideoCapture.get.
brightness 0.0
contrast 0.0
convert_rgb 0.0
exposure 0.0
format 0.0
fourcc 1684633187.0
fps 100.0
frame_count 18000.0
frame_height 593.0
frame_width 792.0
gain 0.0
hue 0.0
mode 0.0
openni_baseline 0.0
openni_focal_length 0.0
openni_frame_max_depth 0.0
openni_output_mode 0.0
openni_registration 0.0
pos_avi_ratio 0.01
pos_frames 0.0
pos_msec 0.0
rectification 0.0
saturation 0.0
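The fourcc value above (1684633187.0) is the codec's four-character code packed into an integer and can be decoded with a few lines of Python. This snippet is mine, not from the original post:

```python
def decode_fourcc(value):
    """Unpack the float returned by cap.get(CV_CAP_PROP_FOURCC)
    into its four-character codec code (little-endian byte order)."""
    value = int(value)
    return ''.join(chr((value >> (8 * i)) & 0xFF) for i in range(4))

print(decode_fourcc(1684633187.0))  # -> 'cvid'
```

The result, 'cvid', is the Cinepak codec, which turns out to matter below.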
Your video file contains data for only 1313 non-duplicate frames (that is 7 to 8 frames per second of duration):
$ ffprobe -i demo.avi -loglevel fatal -show_streams -count_frames|grep frame
has_b_frames=0
r_frame_rate=100/1
avg_frame_rate=100/1
nb_frames=18000
nb_read_frames=1313 # !!!
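The same check can be automated. Here is a sketch of mine (not from the answer) that runs the ffprobe command above from Python and parses its key=value output; the field names nb_frames and nb_read_frames are ffprobe's own:

```python
import subprocess

def parse_stream_fields(text):
    """Parse the key=value lines of ffprobe -show_streams output into a dict."""
    fields = {}
    for line in text.splitlines():
        if '=' in line:
            key, _, value = line.partition('=')
            fields[key.strip()] = value.strip()
    return fields

def count_frames(path):
    """Return (container frame count, actually decoded frame count)."""
    out = subprocess.check_output(
        ['ffprobe', '-i', path, '-loglevel', 'fatal',
         '-show_streams', '-count_frames']).decode()
    fields = parse_stream_fields(out)
    return int(fields['nb_frames']), int(fields['nb_read_frames'])
```

For demo.avi the two counts differ drastically (18000 vs. 1313), which is exactly the symptom described here.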
Converting the avi file with ffmpeg reports 16697 duplicate frames (for some reason 10 extra frames are also added: 16697 = 18010 - 1313).
$ ffmpeg -i demo.avi demo.mp4
...
frame=18010 fps=417 Lsize=3705kB time=03:00.08 bitrate=168.6kbits/s dup=16697
# ^^^^^^^^^
...
By the way, the video converted this way (demo.mp4) is free of the problem being discussed, that is, OpenCV processes it correctly.
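That conversion can be scripted as a preprocessing step before opening the capture. This is a sketch of mine (the function names are made up); it mirrors the plain ffmpeg -i demo.avi demo.mp4 conversion shown above:

```python
import os
import subprocess

def build_reencode_cmd(src, dst=None):
    """Build the ffmpeg command that re-encodes src, materializing the
    duplicated frames so OpenCV can step through the result frame by frame."""
    if dst is None:
        dst = os.path.splitext(src)[0] + '.mp4'
    return ['ffmpeg', '-y', '-i', src, dst]

def reencode(src, dst=None):
    cmd = build_reencode_cmd(src, dst)
    subprocess.check_call(cmd)  # requires ffmpeg on PATH
    return cmd[-1]

# usage:
# cap = cv2.VideoCapture(reencode('demo.avi'))
```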
In this case, the duplicate frames are not actually present in the avi file. Instead, each duplicate frame is represented by an instruction to repeat the previous frame. This can be checked as follows:
$ ffplay -loglevel trace demo.avi
...
[ffplay_crop @ 0x7f4308003380] n:16 t:2.180000 pos:1311818.000000 x:0 y:0 x+w:792 y+h:592
[avi @ 0x7f4310009280] dts:574 offset:574 1/100 smpl_siz:0 base:1000000 st:0 size:81266
video: delay=0.130 A-V=0.000094
Last message repeated 9 times
video: delay=0.130 A-V=0.000095
video: delay=0.130 A-V=0.000094
video: delay=0.130 A-V=0.000095
[avi @ 0x7f4310009280] dts:587 offset:587 1/100 smpl_siz:0 base:1000000 st:0 size:81646
[ffplay_crop @ 0x7f4308003380] n:17 t:2.320000 pos:1393538.000000 x:0 y:0 x+w:792 y+h:592
video: delay=0.140 A-V=0.000091
Last message repeated 4 times
video: delay=0.140 A-V=0.000092
Last message repeated 1 times
video: delay=0.140 A-V=0.000091
Last message repeated 6 times
...
In the log above, frames with actual data are represented by the lines starting with "[avi @ 0xHHHHHHHHHHH]". The "video: delay=xxxxx A-V=yyyyy" messages indicate that the last frame must be displayed for xxxxx more seconds.
cv2.VideoCapture() skips such duplicate frames and only reads frames that have real data. Here is the corresponding (though slightly edited) code from the 2.4 branch of opencv (note, by the way, that ffmpeg is used under the hood, which I verified by running python under gdb and setting a breakpoint on [=20=]):
bool CvCapture_FFMPEG::grabFrame()
{
    ...
    int count_errs = 0;
    const int max_number_of_attempts = 1 << 9; // !!!
    ...
    // get the next frame
    while (!valid)
    {
        ...
        int ret = av_read_frame(ic, &packet);
        ...
        // Decode video frame
        avcodec_decode_video2(video_st->codec, picture, &got_picture, &packet);

        // Did we get a video frame?
        if (got_picture)
        {
            //picture_pts = picture->best_effort_timestamp;
            if (picture_pts == AV_NOPTS_VALUE_)
                picture_pts = packet.pts != AV_NOPTS_VALUE_ && packet.pts != 0 ? packet.pts : packet.dts;
            frame_number++;
            valid = true;
        }
        else
        {
            // So, if the next frame doesn't have picture data but is
            // merely a tiny instruction telling to repeat the previous
            // frame, then we get here, treat that situation as an error
            // and proceed unless the count of errors exceeds
            // max_number_of_attempts (1 << 9 = 512)!!!
            if (++count_errs > max_number_of_attempts)
                break;
        }
    }
    ...
}
In short: I reproduced your problem on an Ubuntu 12.04 machine with OpenCV 2.4.13, noticed that the codec used in your video (FourCC CVID) seems to be rather old (from 2011, according to this post), and after converting the video to the MJPG codec (a.k.a. M-JPEG or Motion JPEG) your MCVE worked. Of course, Leon (or others) may post a fix for OpenCV, which may be the better solution in your case.

I first tried the conversion with

ffmpeg -i demo.avi -vcodec mjpeg -an demo_mjpg.avi

and

avconv -i demo.avi -vcodec mjpeg -an demo_mjpg.avi

(both on a 16.04 box). Interestingly, both produced "broken" videos. For example, when jumping to frame 1000 with Avidemux, there is no real-time clock! Also, the converted videos were only about 1/6 of the original size, which is strange since M-JPEG is a very simple compression scheme. (Each frame is JPEG-compressed independently.)

Converting demo.avi to M-JPEG with Avidemux produced a video on which the MCVE works. (I used the Avidemux GUI for the conversion.) The converted video is about 3 times the size of the original. Of course, the original recording can also be made with a codec that is better supported on Linux. If you need to jump to specific frames of the video in your application, M-JPEG may be the best option. Otherwise H.264 compresses much better. In my experience both are well supported, and they are the only codecs I have seen implemented directly on webcams (H.264 only on high-end ones).
As you said:

When using ffmpeg directly to read frames (credit to this tutorial) the correct output images are produced.

Isn't that expected, since you define a

framesize = resolution[0]*resolution[1]*3

and then reuse it when reading:

pipe.stdout.read(framesize)

So in my opinion you would have to update each

_, frame = cap.read()

to

_, frame = cap.read(framesize)

Assuming the resolution stays the same, the final code version would be:
import cv2

# set up capture and print properties
print 'cv2 version = {}'.format(cv2.__version__)
cap = cv2.VideoCapture('demo.avi')
fps = cap.get(cv2.cv.CV_CAP_PROP_FPS)
pos_msec = cap.get(cv2.cv.CV_CAP_PROP_POS_MSEC)
pos_frames = cap.get(cv2.cv.CV_CAP_PROP_POS_FRAMES)
print ('initial attributes: fps = {}, pos_msec = {}, pos_frames = {}'
       .format(fps, pos_msec, pos_frames))

resolution = (593, 792)                    # here: resolution
framesize = resolution[0]*resolution[1]*3  # here: framesize

# get first frame and save as picture
_, frame = cap.read(framesize)  # update to get one frame
cv2.imwrite('first_frame.png', frame)

# advance 10 seconds, that's 100*10 = 1000 frames at 100 fps
for _ in range(1000):
    _, frame = cap.read(framesize)  # update to get one frame
    # in the actual code, the frame is now analyzed

# save a picture of the current frame
cv2.imwrite('after_iteration.png', frame)

# print properties after iteration
pos_msec = cap.get(cv2.cv.CV_CAP_PROP_POS_MSEC)
pos_frames = cap.get(cv2.cv.CV_CAP_PROP_POS_FRAMES)
print ('attributes after iteration: pos_msec = {}, pos_frames = {}'
       .format(pos_msec, pos_frames))

# assert that the capture (thinks it) is where it is supposed to be
# (assertions succeed)
assert pos_frames == 1000 + 1  # (+1: iteration started with second frame)
assert pos_msec == 10000 + 10

# manually set the capture to msec position 10010
# note that this should change absolutely nothing in theory
cap.set(cv2.cv.CV_CAP_PROP_POS_MSEC, 10010)

# print properties again to be extra sure
pos_msec = cap.get(cv2.cv.CV_CAP_PROP_POS_MSEC)
pos_frames = cap.get(cv2.cv.CV_CAP_PROP_POS_FRAMES)
print ('attributes after setting msec pos manually: pos_msec = {}, pos_frames = {}'
       .format(pos_msec, pos_frames))

# save a picture of the next frame, should show the same clock as
# previously taken image - but does not
_, frame = cap.read()
cv2.imwrite('after_setting.png', frame)