Extracting the audio channel from a video file to feed TensorFlow's decode_wav function

Question

I want to feed the audio channel of a video file to the following TensorFlow function:

tf.audio.decode_wav(
    contents,
    desired_channels=-1,
    desired_samples=-1,
    name=None
)

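where the parameters are:

contents: A Tensor of type string. The WAV-encoded audio, usually read from a file.
desired_channels: An optional int. Defaults to -1. The number of sample channels wanted.
desired_samples: An optional int. Defaults to -1. The length of audio requested.
name: A name for the operation (optional).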

Answer 1

You can extract the audio from the video like this, for example:

import subprocess

command = "ffmpeg -i C:/test.mp4 -ab 160k -ac 2 -ar 44100 -vn audio.wav"

subprocess.call(command, shell=True)

and then pass the contents of the *.wav file to tf.audio.decode_wav as a tensor:

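import tensorflow as tf

raw_audio = tf.io.read_file("audio.wav")
# decode_wav returns a pair: the audio tensor and the sample rate
waveform, sample_rate = tf.audio.decode_wav(raw_audio)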
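Before decoding, it may help to confirm that the extracted file really is 16-bit PCM, which is the format the TensorFlow documentation describes decode_wav as handling. The sketch below reads only the WAV header with Python's standard wave module; describe_wav and is_16bit_pcm are hypothetical helper names introduced for illustration, not part of TensorFlow or ffmpeg.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate, bits_per_sample, n_frames) from a WAV header."""
    with wave.open(path, "rb") as wf:
        return (
            wf.getnchannels(),
            wf.getframerate(),
            wf.getsampwidth() * 8,  # sample width is reported in bytes
            wf.getnframes(),
        )


def is_16bit_pcm(path):
    """Return True if the file opens as a PCM WAV file with 16-bit samples."""
    try:
        return describe_wav(path)[2] == 16
    except (wave.Error, EOFError):
        # wave raises wave.Error for non-PCM or malformed files,
        # and EOFError for truncated or empty ones
        return False
```

If is_16bit_pcm("audio.wav") returns False, re-running ffmpeg with an explicit 16-bit PCM codec is one thing to try before handing the file to decode_wav.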

References:

Python extract wav from video file

Answer 2

On Windows 10, I solved my problem this way:

I downloaded ffmpeg for Windows from here

I extracted WAV audio files from the videos like this:

import os
import subprocess

# Convert all video files in the current directory to WAV audio files
for f in os.listdir('.'):
    if f.endswith(('.wmv', '.mp4', '.avi')):
        wav_name = os.path.splitext(f)[0] + '.wav'
        # Pass the arguments as a list so file names containing spaces still work
        args = ['ffmpeg', '-i', f, '-ab', '160k', '-ac', '2', '-ar', '44100', '-vn', wav_name]
        print(args)
        # run() waits for ffmpeg to finish before moving on to the next file
        result = subprocess.run(args)
        print(result.returncode)  # 0 means success
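The conversion step above could also be factored into a small helper that builds the ffmpeg argument list separately from running it, which makes the command easy to inspect. This is only a sketch: build_ffmpeg_args and extract_audio are hypothetical names, and the added -y (overwrite) and -acodec pcm_s16le (16-bit PCM output) flags are choices made here, not something the answer itself used.

```python
import os
import subprocess


def build_ffmpeg_args(video_path, wav_path=None, channels=2, sample_rate=44100):
    """Build an ffmpeg argument list that writes 16-bit PCM WAV audio."""
    if wav_path is None:
        # Default: same name as the video, with a .wav extension
        wav_path = os.path.splitext(video_path)[0] + ".wav"
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", video_path,
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        "-ac", str(channels),
        "-ar", str(sample_rate),
        wav_path,
    ]


def extract_audio(video_path, **kwargs):
    """Run ffmpeg and return the WAV path; raises CalledProcessError on failure."""
    args = build_ffmpeg_args(video_path, **kwargs)
    subprocess.run(args, check=True)
    return args[-1]
```

For example, extract_audio("default.wmv", channels=1) would write default.wav as mono audio, assuming ffmpeg is on the PATH.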

I then decoded the audio files into waveforms like this:

import os
import tensorflow as tf

# Decode every WAV file in the current directory into a waveform tensor
for filename in os.listdir('.'):
    if filename.endswith('.wav'):
        print(filename)
        audio_binary = tf.io.read_file(filename)
        # decode_wav returns (audio, sample_rate); the sample rate is ignored here
        waveform, _ = tf.audio.decode_wav(
            audio_binary,
            desired_channels=-1,
            desired_samples=-1,
        )
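For intuition about what waveform holds: the TensorFlow documentation describes the decoded audio as floats scaled from the 16-bit range to -1.0 to 1.0, laid out as [length, channels]. The pure-Python sketch below mimics that layout without TensorFlow. read_wav_as_floats is a hypothetical helper, and dividing by 32768 is an assumption about the exact scaling, so treat it as an illustration rather than a drop-in replacement.

```python
import struct
import wave


def read_wav_as_floats(path):
    """Read a 16-bit PCM WAV file into ([frame][channel] floats, sample_rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        channels = wf.getnchannels()
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    # Samples are interleaved little-endian int16 values: L, R, L, R, ...
    count = len(raw) // 2
    ints = struct.unpack("<%dh" % count, raw)
    # Assumed scaling: map the int16 range onto roughly [-1.0, 1.0)
    floats = [v / 32768.0 for v in ints]
    # Group the interleaved samples into one list per frame
    frames = [floats[i:i + channels] for i in range(0, count, channels)]
    return frames, rate
```

Picking a single channel is then a matter of indexing, for example [frame[0] for frame in frames] for the first channel.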

python

wav

tensorflow