How can the Amazon Echo catch errors?

Question

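I'm building an app in Python for the Amazon Echo. When I speak a bad utterance that the Amazon Echo doesn't recognize, my skill quits and returns me to the home screen. I want to prevent this and instead have the Echo repeat what it just said.

To partly achieve this, I tried calling a function that says something when the session ends or when bad input is detected.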

def on_session_ended(session_ended_request, session):
    """
    Called when the user ends the session.
    Is not called when the skill returns should_end_session=true
    """
    print("on_session_ended requestId=" + session_ended_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    return get_session_end_response()

However, I just get an error from the Echo -- the function on_session_ended is never entered.

So how do I do error catching and handling on the Amazon Echo?
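For context, as far as the Alexa request format goes, a SessionEndedRequest carries a reason field (USER_INITIATED, ERROR, or EXCEEDED_MAX_REPROMPTS) and, when the reason is ERROR, an error object with a type and message. A skill cannot speak in reply to this request, so it is mostly useful for logging why the session closed. The following is a minimal sketch of such logging; describe_session_end and the example payload are hypothetical names and values, not part of the original skill.

```python
def describe_session_end(session_ended_request):
    """Summarize why Alexa ended the session (for logging only)."""
    reason = session_ended_request.get("reason", "UNKNOWN")
    error = session_ended_request.get("error") or {}
    summary = "reason=" + reason
    if error:
        # The error object is only expected when reason is ERROR
        summary += (", error.type=" + str(error.get("type")) +
                    ", error.message=" + str(error.get("message")))
    return summary


# Hypothetical payload shaped like a SessionEndedRequest
example = {
    "type": "SessionEndedRequest",
    "requestId": "example-request-id",
    "reason": "ERROR",
    "error": {"type": "INVALID_RESPONSE", "message": "example message"},
}
print(describe_session_end(example))
```

Calling this from on_session_ended and printing the result should make the cause visible in the Lambda logs, which may be more informative than the generic error the device speaks.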

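Update 1: I have reduced the number of utterances and the number of intents with custom slots to one. Now a user should only say A, B, C, or D. If they say anything else, the intent is still triggered but with no slot value, so I can do some error checking based on whether the slot value is present. However, this doesn't seem like the best way to do it. When I try to add an intent with no slots and a corresponding utterance, anything that doesn't match either of my intents defaults to this new intent. How can I resolve these issues?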
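The slot check described in Update 1 could look roughly like the sketch below. It assumes the slot is named Answer (as in the code further down) and that the accepted values are the letters a to d; VALID_ANSWERS and extract_answer are hypothetical names introduced only for illustration.

```python
# Assumed set of accepted answers, based on "A, B, C, or D" in the text
VALID_ANSWERS = {"a", "b", "c", "d"}


def extract_answer(slots):
    """Return the normalized answer, or None if the slot is missing, empty, or not allowed."""
    if not slots:
        return None
    # The slot may be present without a "value" key when nothing matched
    value = (slots.get("Answer") or {}).get("value")
    if not value:
        return None
    value = value.strip().lower()
    return value if value in VALID_ANSWERS else None


print(extract_answer({"Answer": {"name": "Answer", "value": "B"}}))
print(extract_answer({"Answer": {"name": "Answer"}}))
print(extract_answer({}))
```

A handler can then treat a None result as "bad input" and re-prompt with shouldEndSession set to False, rather than ending the session.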

Update 2: Here are some relevant parts of my code.

Intent handler:

def lambda_handler(event, context):
    print("Python START -------------------------------")
    print("event.session.application.applicationId=" +
          event['session']['application']['applicationId'])

    if event['session']['new']:
        on_session_started({'requestId': event['request']['requestId']},
                           event['session'])

    if event['request']['type'] == "LaunchRequest":
        return on_launch(event['request'], event['session'])
    elif event['request']['type'] == "IntentRequest":
        return on_intent(event['request'], event['session'])
    elif event['request']['type'] == "SessionEndedRequest":
        return on_session_ended(event['request'], event['session'])


def on_session_started(session_started_request, session):
    print("on_session_started requestId=" + session_started_request['requestId']
          + ", sessionId=" + session['sessionId'])


def on_launch(launch_request, session):
    """ Called when the user launches the skill without specifying what they want """
    print("on_launch requestId=" + launch_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    # Dispatch to your skill's launch
    return create_new_user()


def on_intent(intent_request, session):
    """ Called when the user specifies an intent for this skill """

    print("on_intent requestId=" + intent_request['requestId'] +
          ", sessionId=" + session['sessionId'])

    intent = intent_request['intent']
    intent_name = intent['name']
    attributes = session["attributes"] if 'attributes' in session else None
    intent_slots = intent['slots'] if 'slots' in intent else None

    # Dispatch to skill's intent handlers

    # TODO : Authenticate users
    #   TODO : Start session in a different spot depending on where user left off

    if intent_name == "StartQuizIntent":
        return create_new_user()

    elif intent_name == "AnswerIntent":
        return get_answer_response(intent_slots, attributes)

    elif intent_name == "TestAudioIntent":
        return get_audio_response()

    elif intent_name == "AMAZON.HelpIntent":
        return get_help_response()

    elif intent_name == "AMAZON.CancelIntent":
        return get_session_end_response()

    elif intent_name == "AMAZON.StopIntent":
        return get_session_end_response()

    else:
        return get_session_end_response()


def on_session_ended(session_ended_request, session):
    """
    Called when the user ends the session.
    Is not called when the skill returns should_end_session=true
    """
    print("on_session_ended requestId=" + session_ended_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    return get_session_end_response()

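Then we have the functions that actually get called and the response builders. I have edited out some code for privacy. I haven't built out all of the display response text fields yet, and some uids are hardcoded so that I don't have to worry about authentication for now.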

import json
import time
import urllib.request

# --------------- Functions that control the skill's behavior ------------------

####### GLOBAL SETTINGS ########
utility_background_image = "https://i.imgur.com/XXXX.png"


def get_welcome_response():
    """ Returns the welcome message if a user invokes the skill without specifying an intent """
    session_attributes = {}
    card_title = ""
    speech_output = ("Hello and welcome ... quiz .... blah blah ...")
    reprompt_text = "Ask me to start and we will begin the test!"
    should_end_session = False

    # visual responses
    primary_text = ''  # TODO
    secondary_text = ''  # TODO

    return build_response(session_attributes,
                          build_speechlet_response(card_title, speech_output, reprompt_text,
                                                   should_end_session,
                                                   build_display_response(utility_background_image,
                                                                          card_title, primary_text,
                                                                          secondary_text)))


def get_session_end_response():
    """ Returns the ending message if a user errs or exits the skill """
    session_attributes = {}
    card_title = ""
    speech_output = "Thank you for your time!"
    reprompt_text = ''
    should_end_session = True

    # visual responses
    primary_text = ''  # TODO
    secondary_text = ''  # TODO

    return build_response(session_attributes,
                          build_speechlet_response(card_title, speech_output, reprompt_text,
                                                   should_end_session,
                                                   build_display_response(utility_background_image,
                                                                          card_title, primary_text,
                                                                          secondary_text)))


def get_audio_response():
    """ Tests the audio capabilities of the echo """
    session_attributes = {}
    card_title = ""  # TODO : keep no 'welcome'?
    speech_output = ""
    reprompt_text = ""
    should_end_session = False

    # visual responses
    primary_text = ''  # TODO
    secondary_text = ''  # TODO

    # FIXME: build_audio_response() requires a url argument; as written this call raises TypeError
    return build_response(session_attributes,
                          build_speechlet_response(card_title, speech_output, reprompt_text,
                                                   should_end_session, build_audio_response()))


def create_new_user():
    """ Creates a new user that the server will recognize and whose action will be stored in db """
    url = "http://XXXXXX:XXXX/create_user"
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode('utf8'))
    uuid = data["uuid"]
    return ask_question(uuid)


def query_server(uuid):
    """ Requests to get num_questions number of questions from the server """
    url = "http://XXXXXXXX:XXXX/get_question_json?uuid=%s" % (uuid)  # TODO : change needs to be uuid
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode('utf8'))

    if data["status"]:
        question = data["data"]["question"]
        quid = data["data"]["quid"]
        next_quid = data["data"]["next_quid"]  # TODO : will we need any of this?
        topic = data["data"]["topic"]
        type = data["data"]["type"]
        media_type = data["data"]["media_type"]  # either 'IMAGE', 'AUDIO', or 'VIDEO'
        answers = data["data"]["answer"]  # list of answers stored in order they should be spoken
        images = data["data"]["image"]  # list of images that correspond to order of answers list
        audio = data["data"]["audio"]
        video = data["data"]["video"]

        question_data = {"status": True, "data":{"question": question, "quid": quid, "answers": answers,
                         "media_type": media_type, "images": images, "audio": audio, "video": video}}
        if next_quid == "None":  # compare by value; `is` checks object identity
            return None
        return question_data
    else:
        return {"status": False}


def ask_question(uuid):
    """ Returns a quiz question to the user since they specified a QuizIntent """
    question_data = query_server(uuid)

    if question_data is None:
        return get_session_end_response()

    card_title = "Ask Question"
    speech_output = ""
    session_attributes = {}
    should_end_session = False
    reprompt_text = ""

    # visual responses
    display_title = ""
    primary_text = ""
    secondary_text = ""

    images = []
    answers = []
    question = ""  # default so the display title is defined if the server call failed

    if question_data["status"]:
        session_attributes = {
            "quid": question_data["data"]["quid"],
            "uuid": "df876c9d-cd41-4b9f-a3b9-3ccd1b441f24",
            "question_start_time": time.time()
        }

        question = question_data["data"]["question"]
        answers = question_data["data"]["answers"]  # answers are shuffled when pulled from server
        images = question_data["data"]["images"]
        # TODO : consider different media types

        speech_output += question
        reprompt_text += ("Please choose an answer using the official NATO alphabet. For example," +
                          " A is alpha, B is bravo, and C is charlie.")

    else:
        speech_output += "Oops! This is embarrassing. There seems to be a problem with the server."
        reprompt_text += "I don't exactly know where to go from here. I suggest restarting this skill."

    return build_response(session_attributes, build_speechlet_response(card_title, speech_output,
            reprompt_text, should_end_session,
             build_display_response_list_template2(title=question, image_urls=images, answers=answers)))


def send_quiz_responses_to_server(uuid, quid, time_used_for_question, answer_given):
    """ Sends the users responses back to the server to be stored in the database """
    url = ("http://XXXXXXXX:XXXX/send_answers?uuid=%s&quid=%s&time=%s&answer_given=%s" %
          (uuid, quid, time_used_for_question, answer_given))
    response = urllib.request.urlopen(url)
    data = json.loads(response.read().decode('utf8'))
    return data["status"]


def get_answer_response(slots, attributes):
    """ Returns a correct/incorrect message to the user depending on their AnswerIntent """

    # get time, quid, and uuid from attributes
    question_start_time = attributes["question_start_time"]
    quid = attributes["quid"]
    uuid = attributes["uuid"]

    # get answer from slots
    try:
        answer_given = slots["Answer"]["value"].lower()
    except KeyError:
        return get_session_end_response()

    # calculate a rough estimate of the time it took to answer question
    time_used_for_question = str(int(time.time() - question_start_time))

    # record response data by sending it to the server
    send_quiz_responses_to_server(uuid, quid, time_used_for_question, answer_given)

    return ask_question(uuid)


def get_help_response():
    """ Returns a help message to the user since they called AMAZON.HelpIntent """
    session_attributes = {}
    card_title = ""
    speech_output = "" # TODO
    reprompt_text = "" # TODO
    should_end_session = False

    return build_response(session_attributes,
            build_speechlet_response(card_title, speech_output, reprompt_text, should_end_session,
             build_display_response(utility_background_image, card_title)))


# --------------- Helpers that build all of the responses ----------------------


def build_hint_response(hint):
    """
    Builds the hint response for a display.

    For example, Try "Alexa, play number 1" where "play number 1" is the hint.
    """
    return {
        "type": "Hint",
        "hint": {
            "type": "RichText",
            "text": hint
        }
    }


def build_display_response(url='', title='', primary_text='', secondary_text='', tertiary_text=''):
    """
    Builds the display template for the echo show to display.

    Echo show screen is 1024px x 600px

    For additional image size requirements, see the display interface reference.
    """
    return [{
        "type": "Display.RenderTemplate",
        "template": {
            "type": "BodyTemplate1",
            "token": "question",
            "title": title,
            "backgroundImage": {
                "contentDescription": "Question",
                "sources": [
                    {
                        "url": url
                    }
                ]
            },
            "textContent": {
                "primaryText": {
                    "type": "RichText",
                    "text": primary_text
                },
                "secondaryText": {
                    "type": "RichText",
                    "text": secondary_text
                },
                "tertiaryText": {
                    "type": "RichText",
                    "text": tertiary_text
                }
            }
        }
    }]


def build_list_item(url='', primary_text='', secondary_text='', tertiary_text=''):
    return {
        "token": "question_item",
        "image": {
            "sources": [
                {
                    "url": url
                }
            ],
            "contentDescription": "Question Image"
        },
        "textContent": {
            "primaryText": {
                "type": "RichText",
                "text": primary_text
            },
            "secondaryText": {
                "text": secondary_text,
                "type": "PlainText"
            },
            "tertiaryText": {
                "text": tertiary_text,
                "type": "PlainText"
            }
        }
    }


def build_display_response_list_template2(title='', image_urls=[], answers=[]):
    list_items = []
    for image, answer in zip(image_urls, answers):
        list_items.append(build_list_item(url=image, primary_text=answer))

    return [{
        "type": "Display.RenderTemplate",
        "template": {
            "type": "ListTemplate2",
            "token": "question",
            "title": title,
            "backgroundImage": {
                "contentDescription": "Question Background",
                "sources": [
                    {
                        "url": "https://i.imgur.com/HkaPLrK.png"
                    }
                ]
            },
            "listItems": list_items
        }
    }]


def build_audio_response(url): # TODO add a display response here as well
    """ Builds audio response. I.e. plays back an audio file with zero offset """
    return [{
        "type": "AudioPlayer.Play",
        "playBehavior": "REPLACE_ALL",
        "audioItem": {
            "stream": {
                "token": "audio_clip",
                "url": url,
                "offsetInMilliseconds": 0
            }
        }
    }]


def build_speechlet_response(title, output, reprompt_text, should_end_session, directive=None):
    """ Builds speechlet response and puts display response inside """
    return {
        'outputSpeech': {
            'type': 'PlainText',
            'text': output
        },
        'card': {
            'type': 'Simple',
            'title': title,
            'content': output
        },
        'reprompt': {
            'outputSpeech': {
                'type': 'PlainText',
                'text': reprompt_text
            }
        },
        'shouldEndSession': should_end_session,
        'directives': directive
    }


def build_response(session_attributes, speechlet_response):
    """ Builds the complete response to send back to Alexa """
    return {
        'version': '1.0',
        'sessionAttributes': session_attributes,
        'response': speechlet_response
    }

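Update 3: I updated my intents so that there is now one custom intent that requires a custom slot and another custom intent with no slots. These custom intents also have their own sample utterances; the intents and utterances are listed below. When I start the skill, it works fine. Then, when I say/type "zoo zoo zoo" to test bad input, I get an error. The request and response for "zoo zoo zoo" are listed below. I am looking for a good way to catch this bad input and resume/revert the skill to its previous state.

Intents: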

...
{
      "intent": "TestAudioIntent"
},
{
      "slots": [
        {
          "name": "Answer",
          "type": "LETTER"
        }
      ],
      "intent": "AnswerIntent"
},
...

Sample utterances:

AnswerIntent {Answer}
AnswerIntent I think it is {Answer}
TestAudioIntent test the audio

Sample JSON request:

{
  "session": {
    "new": false,
    "sessionId": "SessionId.574f0b74-be17-4f79-bbd6-ce926a1bf856",
    "application": {
      "applicationId": "XXXXXXXX"
    },
    "attributes": {
      "quid": "7fa9fcbf-35db-4bbd-ac73-37977bcef563",
      "question_start_time": 1515691612.7381804,
      "uuid": "df876c9d-cd41-4b9f-a3b9-3ccd1b441f24"
    },
    "user": {
      "userId": "XXXXXXXX"
    }
  },
  "request": {
    "type": "IntentRequest",
    "requestId": "EdwRequestId.23765cb0-f327-4f52-a9a3-b9f92a375a5f",
    "intent": {
      "name": "TestAudioIntent",
      "slots": {}
    },
    "locale": "en-US",
    "timestamp": "2018-01-11T17:26:57Z"
  },
  "context": {
    "AudioPlayer": {
      "playerActivity": "IDLE"
    },
    "System": {
      "application": {
        "applicationId": "XXXXXXXX"
      },
      "user": {
        "userId": "XXXXXXXX"
      },
      "device": {
        "supportedInterfaces": {
          "Display": {
            "templateVersion": "1",
            "markupVersion": "1"
          }
        }
      }
    }
  },
  "version": "1.0"
}

I receive the following test error in response:

The remote endpoint could not be called, or the response it returned was invalid.
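One possible reading of the request above, which is unconfirmed: "zoo zoo zoo" was resolved to TestAudioIntent, and in the code shown that path calls build_audio_response() without its required url argument, which would raise a TypeError inside the Lambda before any response is returned. Whatever the actual cause, a general way to avoid this class of failure is to wrap the dispatcher so that any exception becomes a spoken re-prompt. The sketch below assumes this approach; safe_handler, broken_dispatch, and reprompt_fallback are hypothetical names, and broken_dispatch only simulates a failing handler.

```python
import traceback


def safe_handler(dispatch, fallback_response):
    """Wrap a dispatcher so an exception yields fallback_response(event) instead of a crash."""
    def handler(event, context):
        try:
            return dispatch(event, context)
        except Exception:
            # Log the full traceback so the real bug is still visible in the logs
            traceback.print_exc()
            return fallback_response(event)
    return handler


def broken_dispatch(event, context):
    # Simulates a handler that fails, e.g. because of a missing argument
    raise TypeError("simulated failure inside an intent handler")


def reprompt_fallback(event):
    """Keep the session open and preserve the session attributes."""
    text = "Sorry, I didn't catch that. Please answer again."
    return {
        "version": "1.0",
        "sessionAttributes": event.get("session", {}).get("attributes", {}),
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "reprompt": {"outputSpeech": {"type": "PlainText", "text": text}},
            "shouldEndSession": False,
        },
    }


handler = safe_handler(broken_dispatch, reprompt_fallback)
result = handler({"session": {"attributes": {"quid": "q1"}}}, None)
print(result["response"]["outputSpeech"]["text"])
```

In the skill itself, the existing lambda_handler body would take the place of broken_dispatch. The fallback keeps quid, uuid, and question_start_time in the session, so the quiz can continue from the same question.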

Answer 1

What I ended up doing was using something similar to Amazon's dialogue management system for all of my slots. If a user says something that doesn't fill a slot, I re-prompt them with that question. My goal is to record a user's statements/answers each time they speak, so I didn't use the built-in dialogue management. Additionally, I used Amazon's slot synonyms to make my model more robust.

I still don't know if this is the best way, but it is a starting point and seems to work OK.
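A rough sketch of the re-prompt idea described above, under the assumption that the last spoken prompt is kept in the session attributes so it can be repeated when a slot is not filled. The attribute keys last_speech and last_reprompt and both function names are hypothetical and do not appear in the original skill.

```python
def remember_prompt(session_attributes, speech_output, reprompt_text):
    """Return a copy of the session attributes with the current prompt saved for replay."""
    updated = dict(session_attributes or {})
    updated["last_speech"] = speech_output
    updated["last_reprompt"] = reprompt_text
    return updated


def repeat_last_prompt(attributes, apology="Sorry, I didn't understand that. "):
    """Build a response that repeats the saved prompt and keeps the session open."""
    attributes = attributes or {}
    speech = apology + attributes.get("last_speech", "Please say your answer again.")
    reprompt = attributes.get("last_reprompt") or speech
    return {
        "version": "1.0",
        "sessionAttributes": attributes,
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "reprompt": {"outputSpeech": {"type": "PlainText", "text": reprompt}},
            "shouldEndSession": False,
        },
    }


attrs = remember_prompt({"quid": "q1"}, "What is two plus two?",
                        "Choose alpha, bravo, or charlie.")
response = repeat_last_prompt(attrs)
print(response["response"]["outputSpeech"]["text"])
```

With something like this, the except KeyError branch in get_answer_response could return repeat_last_prompt(attributes) instead of ending the session.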

python

amazon-web-services

amazon-echo