How can I automate the form filling process for a user on my webpage with voice via their microphone?

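I have a web page with a Flask web form. Currently, users have to type their information into the page manually. It is then appended to a table on the page they are redirected to once they click submit. The setup is basically: a video autoplays and asks the user questions, the user fills in their answers by hand, and after clicking submit they see their answers appended to the table.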

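I want to reduce clutter on the page and let users answer the video's questions verbally. I have read about getUserMedia, WebSockets, and WebRTC, but I find them confusing. I have looked everywhere: YouTube, Reddit, and so on. Specifically, I looked here, here, and here, with no luck.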

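I was thinking of a simple for loop in which the speech recognizer fills the different variables in a dictionary and then passes the data along as is, but I'm not sure how to connect the microphone to the front end. Isn't the front end where all the data lives, so that it takes an HTTP request to fetch and analyze it? Here is my code: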

main.py:

from flask import render_template, Flask, request
import os
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer as SIA
import nltk
import io
from nltk.corpus import stopwords
import speech_recognition as sr

app = Flask(__name__, static_folder = 'static')

# # set the stopwords to be the english version
# stop_words = set(stopwords.words("english"))
# # create the recognizer
# r = sr.Recognizer()
# # define the microphone
# mic = sr.Microphone(device_index=0)
# r.energy_threshold = 300
# # vader sentiment analyzer for analyzing the sentiment of the text
# sid = SIA()
# user = []
# location = []
# state = []
# info = [user, location, state]
# # patient.name?


@app.route("/", methods=["GET", "POST"])
def home():
    user = request.values.get('name')
    location = request.values.get('location')
    state = request.values.get('state')
    # if request.method == "POST":
        # with mic as source:
        #     holder = []
        #     for x in info:
        #         audio_data = r.listen(source)
        #         r.adjust_for_ambient_noise(source)
        #         text = r.recognize_google(audio_data, language = 'en-IN')
        #         holder.append(text.lower())
        #         if x == "state":
        #             ss = sid.polarity_scores(holder)
        #             if ss == "neg":
        #                 x.append(str("sad"))
        #             else:
        #                 x.append(str("not sad"))
        #         else:
        #             filtered_words = [words for words in holder if not words in stop_words] # this filters out the stopwords
        #             x.append(filtered_words.lower())

        # return redirect(url_for('care', user = user))

    return render_template('index.html', user = user, location=location, state=state)

@app.route("/care", methods=["POST"])
def care():
    user = request.values.get('name')
    location = request.values.get('location')
    state = request.values.get('state')
    return render_template('list.html', user = user, location=location, state=state)


if __name__ == "__main__":
    #app.run(debug=True)    
    app.run(debug=True, threaded=True)

index.html:

{% extends "base.html" %}
{% block content %}

<!---------Therapist Section--------->
    <section id="therapist">
        <div class="container" id="therapist_container">
            <script>
              window.onload = function() {};
            </script>
            <div id="button">
              <button type="button" class="btn btn-primary" id="therapist-button" data-toggle="modal" data-target="#myModal">Talk with Delphi</button>
            </div>
            
            <!-- Modal -->
            <div class="modal fade" id="myModal" tabindex="-1" role="dialog" aria-labelledby="vid1Title" aria-hidden="true">
              <div class="modal-dialog modal-dialog-centered" role="document">
                <div class="modal-content">
                  <div class="modal-body">
                    <video width="100%" id="video1">
                      <source src="./static/movie.mp4" type="video/mp4">
                    </video>
                    <form action="/care" method="POST">
                      <input type="text" name="name" placeholder="what's your name?" id="name">
                      <input type="text" name="location" placeholder="Where are you?" id="location">
                      <input type="text" name="state" placeholder="how can I help?" id="state">
                      <input id="buttonInput" class="btn btn-success form-control" type="submit" value="Send">
                    </form>
                  </div>
                </div>
              </div>
            </div>
            <script>
              $('#myModal').on('shown.bs.modal', function () {
              $('#video1')[0].play();
              })
              $('#myModal').on('hidden.bs.modal', function () {
                $('#video1')[0].pause();
              })
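              var video = document.getElementById('video1');
              // When the video ends, go to the care page
              video.addEventListener('ended', function () {
                window.location.pathname = '/care';
              });

              // Receives the microphone stream once the user grants access
              function callback(stream) {
                  var AudioContextClass = window.AudioContext || window.webkitAudioContext;
                  var context = new AudioContextClass();
                  var mediaStreamSource = context.createMediaStreamSource(stream);
              }

              $(document).ready(function() {
                  // Promise-based replacement for the deprecated navigator.webkitGetUserMedia
                  navigator.mediaDevices.getUserMedia({audio: true})
                      .then(callback)
                      .catch(function (err) { console.error(err); });
              });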

            </script>
        </div>
    </section>
{% endblock content %}

list.html:

{% extends "base.html" %}
{% block content %}

<!----LIST------>
<section id="care_list">
    <div class="container" id="care_list_container">
        <h1 class="jumbotron text-center" id="care_list_title">{{ user }} Care Record</h1>
        <div class="container">
            <table class="table table-hover"> 
                <thead>
                  <tr>
                    <th scope="col">Session #</th>
                    <th scope="col">Length</th>
                    <th scope="col">Location</th>
                    <th scope="col">State</th> 
                  </tr>
                </thead>
                <tbody>
                  <tr>
                    <th scope="row">1</th>
                    <td>{{ length }}</td>
                    <td>{{ location }}</td>
                    <td>{{ state }}</td>
                  </tr>
                  <tr>
                    <th scope="row">2</th>
                    <td></td>
                    <td></td>
                    <td></td>
                  </tr>
                  <tr>
                    <th scope="row">3</th>
                    <td colspan="2"></td>
                    <td></td>
                  </tr>
                </tbody>
            </table>
        </div>
        <ul class="list-group list-group-flush" id="care_list">
            <li class="list-group-item">Please email tom@vrifyhealth.com for help.</li>
        </ul>
    </div>
</section>
{% endblock content %}

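You can't actually use Flask for the speech recognition itself. Flask is a back-end framework that runs on the server where you host it, so a server-side microphone (such as sr.Microphone in speech_recognition) would be the server's microphone, not the user's. Since you want to recognize speech spoken by the user, you need something on the client side, i.e. JavaScript. You can use this tutorial to accomplish your task.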
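Because the recognition has to happen in the browser, it can help to check first whether the browser exposes a speech recognition constructor at all, and keep the typed form as a fallback. The following is a minimal sketch of that check, not a definitive implementation. The function names getSpeechRecognition and chooseInputMode are hypothetical, and the globalObj parameter exists only so the check can be run outside a browser.

```javascript
// Returns the speech recognition constructor if the environment provides one,
// otherwise null. `globalObj` is normally `window`; it is a parameter here
// only so the function can be exercised outside a browser (an assumption
// made for this sketch).
function getSpeechRecognition(globalObj) {
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// Hypothetical helper: decides which input mode the page should offer.
function chooseInputMode(globalObj) {
  return getSpeechRecognition(globalObj) ? "voice" : "keyboard";
}

// In a browser you would call: chooseInputMode(window)
```

If the result is "keyboard", the existing text inputs still work as they do today, so nothing is lost for users whose browser lacks the API.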

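It's simpler than you might think. Just as we create new Array(), there is new SpeechRecognition(), which creates a speech-to-text converter in browsers that support it (often under the webkit prefix). No external library is needed. Here is the code: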

            /* JS comes here */
            function SpeechRecog() {
                var output = document.getElementById("output");
                var action = document.getElementById("action");
                // Use the standard constructor if present, else the webkit-prefixed one.
                // Reading from window avoids a ReferenceError when neither exists.
                var Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
                if (!Recognition) {
                    action.innerHTML = "<small>speech recognition is not supported in this browser</small>";
                    return;
                }
                var recognition = new Recognition();
            
                // This runs when the speech recognition service starts
                recognition.onstart = function() {
                    action.innerHTML = "<small>listening, please speak...</small>";
                };
                
                // This runs when the user stops speaking
                recognition.onspeechend = function() {
                    action.innerHTML = "<small>stopped listening, hope you are done...</small>";
                    recognition.stop();
                };
              
                // This runs when the speech recognition service returns a result
                recognition.onresult = function(event) {
                    // Take the top alternative of the first result
                    var transcript = event.results[0][0].transcript;
                    output.value = transcript;
                };
              
                // start recognition
                recognition.start();
            }

CSS:

button{
  color:white;
  background:blue;
  border:none;
  padding:10px;
  margin:5px;
  border-radius:1em;
}
input{
  padding:.5em;
  margin:.5em;
}

HTML:

<p>I'm Aakash1282,<br> Feeling lazy? Here is a voice writer for you.</p>
<p><button type="button" onclick="SpeechRecog()">Write By Voice</button> &nbsp; <span id="action"></span></p>
<input type="text" id="output">

This code had some problems when run inside Stack Overflow, but it works fine from a local file. Here is a working CodePen: https://codepen.io/aakash1282/pen/xxqeQyM

Refer to it and build your form as needed.
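To adapt this to the form in the question, one option is to listen once per field and fill the name, location, and state inputs in order before submitting. The sketch below is one way to do that under stated assumptions, not the only way. The helpers listenOnce and fillFieldsByVoice are hypothetical names, the "en-US" language tag is an assumption, and the recognition constructor is passed in as a parameter so the logic does not depend on a particular browser global.

```javascript
// Listens once and resolves with the transcript of the first result.
// `Recognition` is the constructor (window.SpeechRecognition or
// window.webkitSpeechRecognition in a browser).
function listenOnce(Recognition, lang) {
  return new Promise(function (resolve, reject) {
    var recognition = new Recognition();
    recognition.lang = lang;
    recognition.onresult = function (event) {
      resolve(event.results[0][0].transcript);
    };
    recognition.onerror = function (event) {
      reject(new Error(event.error));
    };
    // If the session ends without a result, reject. After a successful
    // resolve this call has no effect, because a promise settles only once.
    recognition.onend = function () {
      reject(new Error("no speech recognized"));
    };
    recognition.start();
  });
}

// Fills each field in order with one spoken answer.
// `fields` is an array of objects with a writable `value` property,
// such as the <input> elements with ids name, location, and state.
async function fillFieldsByVoice(fields, Recognition, lang) {
  for (var i = 0; i < fields.length; i++) {
    fields[i].value = (await listenOnce(Recognition, lang)).trim();
  }
  return fields;
}

// Browser usage (assumption: started from a click handler, since browsers
// ask the user for microphone permission before recording):
//   var Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
//   var inputs = ["name", "location", "state"].map(function (id) {
//     return document.getElementById(id);
//   });
//   fillFieldsByVoice(inputs, Recognition, "en-US").then(function () {
//     inputs[0].form.submit();
//   });
```

Submitting the filled form sends the same POST to /care that typing would, so the Flask routes do not need to change. In practice you would probably pace each listen to the question the video is asking, which this sketch does not attempt.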