Executing text-to-speech in order

Question

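I want to synthesize speech from text. I have an array of sentences and an array of pauses that I want between those sentences.

My original idea was: Synthesize -> start the timer, timer fires after provided time -> Synthesize -> start the timer -> Synt...

By accident, I noticed that the timers with the shorter intervals fire first, instead of the timers being set and executed one after another. The loop does not wait for the synthesizer to finish speaking; it just keeps running.

How can I make the synthesizer speak the sentences in order, with the provided pauses between them?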
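To make the observation above concrete: the `for` loop schedules every timer during the same pass, so each interval is measured from roughly the same start time rather than from the end of the previous sentence. The following is a minimal sketch in plain Swift (no timers, no speech) that only computes the resulting order; ties are broken by index purely for illustration, and the `firingOrder` and `cumulative` names are made up for this example.

```swift
// Each timer is scheduled at (almost) the same moment, so pauses[i]
// acts as an offset from the start, not from the previous sentence.
let pauses = [0.0, 20.0, 10.0, 5.0, 5.0]

// Order in which such timers would fire: indices sorted by interval
// (equal intervals are ordered by index here, as an assumption).
let firingOrder = pauses.indices.sorted { (pauses[$0], $0) < (pauses[$1], $1) }
print(firingOrder)  // [0, 3, 4, 2, 1]

// Offsets that would at least preserve array order: a running sum.
// These still ignore how long each sentence takes to speak.
let cumulative = pauses.reduce(into: [Double]()) { acc, pause in
    acc.append((acc.last ?? 0) + pause)
}
print(cumulative)  // [0.0, 20.0, 30.0, 35.0, 40.0]
```

Even with cumulative offsets the pauses would overlap with the speech itself, because the duration of each utterance is not known in advance.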

import SwiftUI

struct KingsSpeechView: View {
    @ObservedObject var speaker = Speaker()
    @State private var subtitles = ""

    @State private var currentStepIndex = 0

    let kingsSpeech = [
        "Hello. Let's start the Game! Let the hunger Games Begin...Whoa-Whoa. Here're are the rules on the screen.",
        "Okey, now that you know the rules, chill out. Let's play another game.",
        "You say Hi, I say Ho.",
        "Hooo",
        "Hooo"
    ]
    var pauses = [0.0, 20.0, 90.0, 40.0, 40.0]
    // Try changing it to this:
    // var pauses = [0.0, 20.0, 10.0, 5.0, 5.0]
    // The order of execution is completely different:
    // the entries with the smaller values execute first.
    // I expected them to execute in array order, but they run in whatever order they want
    // (or maybe there is just one timer for all of them).
    // How do I prevent the loop from moving on to the next iteration until the speech has been spoken?

    var body: some View {
        VStack {
            Text(subtitles)
                .padding(.bottom, 50)
                .padding(.horizontal, 20)
        
        
            Button("Play") {
                playSound()
            }
        }
    }

    func playSound() {

        for step in 0..<kingsSpeech.count {
            let timer = Timer.scheduledTimer(withTimeInterval: pauses[step], repeats: false) { timer in

                subtitles = kingsSpeech[step]
                speaker.speak("\(kingsSpeech[step])")
                print("I am out")
                currentStepIndex += 1


                // I tried to stop the loop from moving on before the speech had finished,
                // using some sort of condition, either by index or by checking whether the synthesizer is speaking,
                // but it turned out that the timers fire in a completely different order; see the pauses arrays above.
                // while speaker.semaphoreIndex == step {
                //     print("still waiting")
                // }
                // while speaker.synth.isSpeaking {
                //
                // }

            }
        }
    }
}

...

import AVFoundation
import Combine

class Speaker: NSObject, ObservableObject, AVSpeechSynthesizerDelegate {
    let synth = AVSpeechSynthesizer()
    // I started trying something with a semaphore, but didn't understand how to implement it
    var semaphore = DispatchSemaphore(value: 0)
    var semaphoreIndex = 0
    

    override init() {
        super.init()
        synth.delegate = self
    }

    func speak(_ string: String) {
        let utterance = AVSpeechUtterance(string: string)
        utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
        utterance.rate = 0.4
        synth.speak(utterance)
    }
    
}

extension Speaker {
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("all done")
        semaphore.signal()
        semaphoreIndex += 1
    }
}

Answer 1

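Just speak one utterance, receive the delegate method, wait for the desired interval in that method, and then go on to the next utterance and interval.

Here is a complete example. It uses a Cocoa project, not SwiftUI, but you can easily adapt it.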

import UIKit
import AVFoundation

func delay(_ delay:Double, closure:@escaping ()->()) {
    let when = DispatchTime.now() + delay
    DispatchQueue.main.asyncAfter(deadline: when, execute: closure)
}

class Speaker : NSObject, AVSpeechSynthesizerDelegate {
    var synth : AVSpeechSynthesizer!
    var sentences = [String]()
    var intervals = [Double]()
    func start(_ sentences: [String], _ intervals: [Double]) {
        self.sentences = sentences
        self.intervals = intervals
        self.synth = AVSpeechSynthesizer()
        synth.delegate = self
        self.sayOne()
    }
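    // Speak the next queued sentence, if there is one.
    func sayOne() {
        if let sentence = sentences.first {
            sentences.removeFirst()
            let utter = AVSpeechUtterance(string: sentence)
            self.synth.speak(utter)
        }
    }
    // Delegate callback: called when an utterance has finished speaking.
    // Wait for the next interval, then speak the next sentence.
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        if let interval = intervals.first {
            intervals.removeFirst()
            delay(interval) {
                self.sayOne()
            }
        }
    }
}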

class ViewController: UIViewController {
    let speaker = Speaker()
    override func viewDidLoad() {
        super.viewDidLoad()
        let sentences = [
            "I will speak again in one second",
            "I will speak again in five seconds",
            "I will speak again in one second",
            "Done"]
        let intervals = [1.0, 5.0, 1.0]
        self.speaker.start(sentences, intervals)
    }
}
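As a rough idea of how the same approach might be adapted to SwiftUI, here is a minimal sketch. The names `SequentialSpeaker`, `SpeechView`, `subtitle`, and `sayNext` are hypothetical and not from the code above, the pauses are taken to be the gaps between sentences (as in this answer), and the delegate callback is moved onto the main queue as a precaution, since it is an assumption that it always arrives there.

```swift
import SwiftUI
import AVFoundation

// Hypothetical SwiftUI-friendly variant of the Speaker class above.
final class SequentialSpeaker: NSObject, ObservableObject, AVSpeechSynthesizerDelegate {
    @Published var subtitle = ""          // text currently being spoken
    private let synth = AVSpeechSynthesizer()
    private var sentences: [String] = []
    private var pauses: [Double] = []     // gaps between sentences, in seconds

    override init() {
        super.init()
        synth.delegate = self
    }

    // Call from the main thread (for example, a button action).
    func start(_ sentences: [String], pauses: [Double]) {
        self.sentences = sentences
        self.pauses = pauses
        sayNext()
    }

    // Speak the next queued sentence and publish it as the subtitle.
    private func sayNext() {
        guard !sentences.isEmpty else { return }
        let sentence = sentences.removeFirst()
        subtitle = sentence
        synth.speak(AVSpeechUtterance(string: sentence))
    }

    // When an utterance finishes, wait for the next pause, then continue.
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        DispatchQueue.main.async { [weak self] in
            guard let self = self, !self.pauses.isEmpty else { return }
            let pause = self.pauses.removeFirst()
            DispatchQueue.main.asyncAfter(deadline: .now() + pause) { [weak self] in
                self?.sayNext()
            }
        }
    }
}

struct SpeechView: View {
    @StateObject private var speaker = SequentialSpeaker()

    var body: some View {
        VStack {
            Text(speaker.subtitle)
                .padding()
            Button("Play") {
                speaker.start(["First sentence", "Second sentence", "Done"],
                              pauses: [1.0, 5.0])
            }
        }
    }
}
```

Using `@StateObject` (iOS 14 or later) keeps a single speaker alive across view updates; nothing is scheduled for a sentence until the previous one has actually finished.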

Answer 2

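This is an attempt to answer the question I raised in the comments on the solution: for now, it can play and pause.

TODO: Now I have to figure out how to jump backward/forward between sentences. For that, I first need to stop the current speech task: speaker.synth.stopSpeaking(at: .word)

Then I should probably keep an index that tracks the current step. When I stop the task, I remember the index, so I can go backward or forward and start from index-1 or index+1 instead of from the beginning.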

  @State private var isPlaying = false
      ...
        // play button
        Button(action: {
            
            if isPlaying {
                isPlaying.toggle()
                speaker.synth.pauseSpeaking(at: .word)
            } else {
                isPlaying.toggle()
                // continue playing here if it was paused before, else ignite speech utterance
                if speaker.synth.isPaused {
                    speaker.synth.continueSpeaking()
                } else {
                    speaker.speak()
                }
                
            }
        }, label: {
            Image(systemName: (isPlaying ? "pause.fill" : "play.fill"))
                .resizable()
                .scaledToFit()
                .frame(width: 50, height: 50)
            
        })
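The index tracking described in the TODO above could be sketched as a small value type. `SentenceCursor` and its members are hypothetical names, and the sketch covers only the index bookkeeping, not the synthesizer calls.

```swift
// Hypothetical helper: remembers the current sentence and clamps
// backward/forward moves to the valid range of indices.
struct SentenceCursor {
    let sentences: [String]
    private(set) var index = 0

    init(sentences: [String]) {
        self.sentences = sentences
    }

    // Sentence at the current index, or nil if the list is empty.
    var current: String? {
        sentences.indices.contains(index) ? sentences[index] : nil
    }

    @discardableResult
    mutating func forward() -> String? { move(by: 1) }

    @discardableResult
    mutating func backward() -> String? { move(by: -1) }

    private mutating func move(by offset: Int) -> String? {
        guard !sentences.isEmpty else { return nil }
        index = min(max(index + offset, 0), sentences.count - 1)
        return sentences[index]
    }
}

var cursor = SentenceCursor(sentences: ["One", "Two", "Three"])
print(cursor.forward() ?? "")   // Two
print(cursor.backward() ?? "")  // One
print(cursor.backward() ?? "")  // One (already at the start, so clamped)
```

In a skip button, one would presumably call `speaker.synth.stopSpeaking(at: .immediate)` first and then speak the sentence returned by `forward()` or `backward()`. Note that stopping reports `didCancel` rather than `didFinish` to the delegate, so a chain driven by `didFinish` will not continue on its own after a stop.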

Tags: timer, avfoundation, avspeechsynthesizer, swift, avspeechutterance