Google Speech to Text API gives different results locally than the online demo
6-second MP3 audio file (download)

First I tested it directly at https://cloud.google.com/speech-to-text/, and the response was as expected:

"hello brother how are you doing I'm doing really well hope mom is doing well"

Then I created a Firebase function (see the code below):
const functions = require('firebase-functions')
const speech = require('@google-cloud/speech').v1p1beta1

exports.speechToText = functions.https.onRequest(async (req, res) => {
  try {
    // Creates a client
    const client = new speech.SpeechClient()

    const gcsUri = `gs://xxxxx.appspot.com/speech.mp3`

    const config = {
      encoding: 'MP3',
      languageCode: 'en-US',
      enableAutomaticPunctuation: false,
      enableWordTimeOffsets: false,
    }
    const audio = {
      uri: gcsUri,
    }
    const request = {
      config: config,
      audio: audio,
    }

    // Detects speech in the audio file
    const [response] = await client.recognize(request)
    const transcription = response.results
      .map(result => result.alternatives[0].transcript)
      .join('\n')
    console.log(`Transcription: ${transcription}`)
    res.send({ response })
  } catch (error) {
    console.log('error:', error)
    res.status(400).send({
      error,
    })
  }
})
I get the following incorrect response:
"hello brother, how are you doing hope all is doing well"
Update:

I get the same wrong response when running locally, so the Cloud Function is not the problem.
Update #2:

Setting model: 'video' or model: 'phone_call' in the config fixes the problem, i.e.
const config = {
  encoding: 'MP3',
  languageCode: 'en-US',
  enableAutomaticPunctuation: false,
  enableWordTimeOffsets: false,
  model: 'phone_call',
}
I guess the default model does not work well for certain types of audio.
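The fix above can be packaged as a small helper that always builds the request with an explicit recognition model, so the default model is never silently used. This is a minimal sketch; buildRecognizeRequest and the bucket path are my own naming, not part of the Speech API:

```javascript
// Hypothetical helper: builds a Speech-to-Text recognize request with an
// explicitly chosen recognition model ('phone_call', 'video', ...).
function buildRecognizeRequest(gcsUri, model = 'phone_call') {
  return {
    config: {
      encoding: 'MP3',
      languageCode: 'en-US',
      enableAutomaticPunctuation: false,
      enableWordTimeOffsets: false,
      model, // a non-default model is what fixed the transcription here
    },
    audio: { uri: gcsUri },
  }
}

// Usage (hypothetical bucket path):
const request = buildRecognizeRequest('gs://my-bucket/speech.mp3', 'video')
// then: const [response] = await client.recognize(request)
```

The only difference from the original code is that the model field is always set, which is the change that produced the correct transcript.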