SFSpeechRecognizer not working on real device if playing audio from Apple Music
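I have implemented speech-to-text, and it works well when I speak into the microphone. However, I also want it to work when I pick audio from Apple Music.
I am using MPMediaPickerController to play the audio, and playback works perfectly. The problem is that the audio is not converted to text.
Here is my code: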
```swift
func startRecording() {
    // Clear all previous session data and cancel task
    if recognitionTask != nil {
        recognitionTask?.cancel()
        recognitionTask = nil
    }
    // Create instance of audio session to record voice
    let audioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setCategory(AVAudioSession.Category.record, mode: AVAudioSession.Mode.measurement, options: AVAudioSession.CategoryOptions.duckOthers)
        try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
    } catch {
        print("audioSession properties weren't set because of an error.")
    }
    self.recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
    let inputNode = audioEngine.inputNode
    guard let recognitionRequest = recognitionRequest else {
        fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
    }
    // Keep speech recognition data on device
    if #available(iOS 13, *) {
        recognitionRequest.requiresOnDeviceRecognition = true
    }
    recognitionRequest.shouldReportPartialResults = true
    self.recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
        var isFinal = false
        if result != nil {
            self.timer.invalidate()
            if self.count == 0 {
                self.textView.text = result!.bestTranscription.formattedString
            } else {
                self.textView.text = self.text + result!.bestTranscription.formattedString
            }
            isFinal = (result?.isFinal)!
        }
        else if result == nil || !isFinal {
            self.textView.text = "Press record button and say something, I'm listening!"
        }
        if isFinal {
            // this is to remove 1 minute limit.
            self.count = self.count + 1
            self.text = self.textView.text
            self.timer = Timer.scheduledTimer(timeInterval: TimeInterval(1), target: self, selector: #selector(self.againStartRec), userInfo: nil, repeats: false)
            self.audioEngine.stop()
            inputNode.removeTap(onBus: 0)
            self.recognitionRequest = nil
            self.recognitionTask = nil
            isFinal = false
            self.MicButton.isEnabled = true
        }
        if error != nil {
            URLCache.shared.removeAllCachedResponses()
            self.audioEngine.stop()
            inputNode.removeTap(onBus: 0)
            guard let task = self.recognitionTask else { return }
            task.cancel()
            task.finish()
        }
    })
    audioEngine.reset()
    inputNode.removeTap(onBus: 0)
    let recordingFormat = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
        self.analyzer.analyze(buffer, atAudioFramePosition: when.sampleTime)
        self.recognitionRequest?.append(buffer)
    }
    self.audioEngine.prepare()
    do {
        try self.audioEngine.start()
    } catch {
        print("audioEngine couldn't start because of an error.")
    }
}
```
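I found the answer and am posting it here in case anyone needs it in the future.
I was using SFSpeechAudioBufferRecognitionRequest() instead of SFSpeechURLRecognitionRequest().
If you pick media from the device, you need to get the URL of the selected audio file and pass it to SFSpeechURLRecognitionRequest(url: audioURL). This worked for me.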
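
For anyone who wants a starting point, here is a rough sketch of the URL-based approach. The likely reason the original code stays silent is that the tap on `audioEngine.inputNode` captures microphone input, so the picked track never reaches the request. The helper name `transcribe(_:locale:completion:)` and the `TranscriptionError` enum are my own inventions for illustration, not part of any framework. The sketch assumes speech recognition authorization has already been granted and that the item exposes a readable `assetURL` (it is `nil` for protected items or items that are not on the device). I have not checked every file type; if the recognizer cannot read the library URL directly, exporting the track to a local file first may be needed.

```swift
import Speech
import MediaPlayer

// Hypothetical error type for this sketch.
enum TranscriptionError: Error {
    case noAssetURL            // item has no readable URL (protected or not on the device)
    case recognizerUnavailable // no recognizer for the locale, or it is currently unavailable
}

/// Hypothetical helper: transcribes a media item picked with MPMediaPickerController.
/// Returns the running task so the caller can cancel it; returns nil if it could not start.
@discardableResult
func transcribe(_ item: MPMediaItem,
                locale: Locale = Locale(identifier: "en-US"),
                completion: @escaping (Result<String, Error>) -> Void) -> SFSpeechRecognitionTask? {
    // assetURL is optional, so bail out early when the file cannot be read.
    guard let audioURL = item.assetURL else {
        completion(.failure(TranscriptionError.noAssetURL))
        return nil
    }
    guard let recognizer = SFSpeechRecognizer(locale: locale), recognizer.isAvailable else {
        completion(.failure(TranscriptionError.recognizerUnavailable))
        return nil
    }

    // File-based request: no AVAudioEngine or input tap is involved.
    let request = SFSpeechURLRecognitionRequest(url: audioURL)
    request.shouldReportPartialResults = false

    return recognizer.recognitionTask(with: request) { result, error in
        if let error = error {
            completion(.failure(error))
        } else if let result = result, result.isFinal {
            completion(.success(result.bestTranscription.formattedString))
        }
    }
}
```

You would call this from `mediaPicker(_:didPickMediaItems:)` with the first picked item and set `textView.text` from the result on the main queue.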