NoSQL Injections with Rasa
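Security question
I have recently been working with Rasa, using MongoDB as the database.
I was wondering whether the input to Rasa should be preprocessed in some way to prevent NoSQL injection?
I tried adding a custom component as part of the Rasa NLU pipeline, but as soon as anything reaches the first element of the NLU pipeline, the original text seems to be saved in Mongo as well.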
domain_file
language: "de"
pipeline:
- name: "nlu_components.length_limiter.LengthLimiter"
- name: "tokenizer_whitespace"
- name: "intent_entity_featurizer_regex"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
length_limiter.py - note the "process" method
from rasa_nlu.components import Component

MAX_LENGTH = 300


class LengthLimiter(Component):
    """
    This component shortens the input message to MAX_LENGTH chars
    in order to prevent overloading the bot
    """

    # Name of the component to be used when integrating it in a
    # pipeline. E.g. ``[ComponentA, ComponentB]``
    # will be a proper pipeline definition where ``ComponentA``
    # is the name of the first component of the pipeline.
    name = "LengthLimiter"

    # Defines what attributes the pipeline component will
    # provide when called. The listed attributes
    # should be set by the component on the message object
    # during test and train, e.g.
    # ``message.set("entities", [...])``
    provides = []

    # Which attributes on a message are required by this
    # component. e.g. if requires contains "tokens", then a
    # previous component in the pipeline needs to have "tokens"
    # within the above described `provides` property.
    requires = []

    # Defines the default configuration parameters of a component
    # these values can be overwritten in the pipeline configuration
    # of the model. The component should choose sensible defaults
    # and should be able to create reasonable results with the defaults.
    defaults = {
        "MAX_LENGTH": MAX_LENGTH
    }

    # Defines what language(s) this component can handle.
    # This attribute is designed for instance method: `can_handle_language`.
    # Default value is None which means it can handle all languages.
    # This is an important feature for backwards compatibility of components.
    language_list = None

    def __init__(self, component_config=None):
        super(LengthLimiter, self).__init__(component_config)

    def train(self, training_data, cfg, **kwargs):
        """Train this component.

        This is the component's chance to train itself provided
        with the training data. The component can rely on
        any context attribute to be present, that gets created
        by a call to :meth:`components.Component.pipeline_init`
        of ANY component and
        on any context attributes created by a call to
        :meth:`components.Component.train`
        of components previous to this one."""
        pass

    def process(self, message, **kwargs):
        """Process an incoming message.

        This is the component's chance to process an incoming
        message. The component can rely on
        any context attribute to be present, that gets created
        by a call to :meth:`components.Component.pipeline_init`
        of ANY component and
        on any context attributes created by a call to
        :meth:`components.Component.process`
        of components previous to this one."""
        # Read the limit from the merged component config so that a value
        # set in the pipeline configuration overrides the class default.
        max_length = self.component_config["MAX_LENGTH"]
        message.text = message.text[:max_length]

    def persist(self, model_dir):
        """Persist this component to disk for future loading."""
        pass

    @classmethod
    def load(cls, model_dir=None, model_metadata=None, cached_component=None,
             **kwargs):
        """Load this component from file."""
        if cached_component:
            return cached_component
        else:
            component_config = model_metadata.for_component(cls.name)
            return cls(component_config)
I played around with the Mongo tracker store a bit, but could not inject anything.
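One plausible reason nothing could be injected: MongoDB operator injection generally requires untrusted input to arrive as a document (for example `{"$ne": null}`) instead of a plain string, whereas a chat message is a string that is stored as a value. The sketch below illustrates that distinction with plain dictionaries only. `contains_operator`, `build_sender_filter`, and the `sender_id` field name are assumptions made for illustration, not Rasa's actual code.

```python
def contains_operator(value):
    """Return True if value is, or contains, a dict key starting with '$'."""
    if isinstance(value, dict):
        return any(
            str(key).startswith("$") or contains_operator(inner)
            for key, inner in value.items()
        )
    if isinstance(value, (list, tuple)):
        return any(contains_operator(item) for item in value)
    return False


def build_sender_filter(sender_id):
    """Build a query filter, accepting only plain strings as the ID.

    Hypothetical helper: the "sender_id" field name is an assumption.
    """
    if not isinstance(sender_id, str):
        raise TypeError("sender_id must be a string")
    return {"sender_id": sender_id}


# A string that merely looks like an operator stays an ordinary value.
harmless = build_sender_filter('{"$ne": null}')

# A dict smuggled in through JSON parsing is what would change the query.
malicious = {"$ne": None}
```

Checking the type before building the filter is the important part: as long as user-controlled values are strings, `$`-prefixed text inside them is just data.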
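However, if you want to add your own component to be absolutely sure, you have to implement your own input channel. There you can alter the message before Rasa Core handles it.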
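A minimal sketch of the kind of preprocessing such an input channel could apply before passing the text on. It uses only the standard library; `sanitize_text`, `handle_incoming`, and the `forward` callback are hypothetical names, and the wiring into an actual Rasa input channel is deliberately left out because the channel API differs between versions.

```python
import unicodedata

MAX_LENGTH = 300  # same limit as the LengthLimiter component


def sanitize_text(text, max_length=MAX_LENGTH):
    """Drop control characters, trim whitespace, and cap the length."""
    if not isinstance(text, str):
        raise TypeError("message text must be a string")
    # Newlines and tabs are turned into spaces first, because they belong
    # to Unicode category "Cc" and would otherwise be removed outright.
    text = text.replace("\n", " ").replace("\t", " ")
    cleaned = "".join(ch for ch in text if unicodedata.category(ch) != "Cc")
    return cleaned.strip()[:max_length]


def handle_incoming(raw_text, forward):
    """Sanitize raw_text, then pass it to the forward callback."""
    return forward(sanitize_text(raw_text))


# Stand-in for "hand the message to Rasa Core": collect it in a list.
received = []
handle_incoming("  hello\x00 world\n" + "x" * 500, received.append)
```

Because the cleaning happens before anything is forwarded, whatever gets stored downstream is the sanitized text, not the raw input.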