在 SQL Server 如何使用 TYPE COLUMN 选项创建全文索引
In SQL Server How to create full text index with TYPE COLUMN option
我有一个 table 定义如下
CREATE TABLE [dbo].[Dialogs](
[ID] [int] IDENTITY(1,1) NOT NULL,
[DiscussionID] [int] NOT NULL,
[ApprovedByUserID] [int] NULL,
[AddedByUserID] [int] NULL,
[Text] [nvarchar](max) NULL,
[ApprovalStatus] [int] NULL,
[ApprovedOn] [datetime] NULL,
[AddedOn] [datetime] NOT NULL,
CONSTRAINT [PK_dbo.Dialogs] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Text
列包含用户输入的 html。我想在 Text
列上创建全文索引,我还需要支持 html 过滤器,这样当任何用户键入 <div>
、<p>
或任何其他 html 标签,那么它不会 return 任何结果。
创建索引我在下面使用SQL
CREATE FULLTEXT INDEX ON [Dialogs]
(
[Text] TYPE COLUMN '.html'
)
KEY INDEX [PK_dbo.Dialogs]
ON AOPRDefault;
但是 SQL Server
抛出错误
Incorrect syntax near '.html'.
谁能举例说明在创建全文索引时如何指定TYPE COLUMN
选项。
首先,问题是您应该指的是包含 '.html'
的列,而不是文字,因此您可能有类似的内容:
-- ADD COMPUTED COLUMN TO STORE FILE TYPE
ALTER TABLE dbo.Dialogs ADD FileExtension AS '.html';
CREATE FULLTEXT INDEX ON dbo.Dialogs ([Text] TYPE COLUMN FileExtension)
KEY INDEX [PK_dbo.Dialogs] ON AOPRDefault;
然而,根据 the documentation:
,您误解了 TYPE COLUMN
属性 的目的
TYPE COLUMN type_column_name
Specifies the name of a table column, type_column_name, that is used to hold the document type for a varbinary(max) or image document. This column, known as the type column, contains a user-supplied file extension (.doc, .pdf, .xls, and so forth). The type column must be of type char, nchar, varchar, or nvarchar.
Specify TYPE COLUMN type_column_name only if column_name specifies a varbinary(max) or image column, in which data is stored as binary data; otherwise, SQL Server returns an error.
Note
At indexing time, the Full-Text Engine uses the abbreviation in the type column of each table row to identify which full-text search filter to use for the document in column_name. The filter loads the document as a binary stream, removes the formatting information, and sends the text from the document to the word-breaker component. For more information, see Configure and Manage Filters for Search
由于您的索引位于文本列上,因此这不适用,并且创建索引语句会 return 出错。即使您将 html 文档存储为二进制数据,它仍然无法按预期工作,您所追求的是 html parsing,这是与全文索引不同的问题。
我有一个 table 定义如下
CREATE TABLE [dbo].[Dialogs](
[ID] [int] IDENTITY(1,1) NOT NULL,
[DiscussionID] [int] NOT NULL,
[ApprovedByUserID] [int] NULL,
[AddedByUserID] [int] NULL,
[Text] [nvarchar](max) NULL,
[ApprovalStatus] [int] NULL,
[ApprovedOn] [datetime] NULL,
[AddedOn] [datetime] NOT NULL,
CONSTRAINT [PK_dbo.Dialogs] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Text
列包含用户输入的 html。我想在 Text
列上创建全文索引,我还需要支持 html 过滤器,这样当任何用户键入 <div>
、<p>
或任何其他 html 标签,那么它不会 return 任何结果。
创建索引我在下面使用SQL
CREATE FULLTEXT INDEX ON [Dialogs]
(
[Text] TYPE COLUMN '.html'
)
KEY INDEX [PK_dbo.Dialogs]
ON AOPRDefault;
但是 SQL Server
抛出错误
Incorrect syntax near '.html'.
谁能举例说明在创建全文索引时如何指定TYPE COLUMN
选项。
首先,问题是您应该指的是包含 '.html'
的列,而不是文字,因此您可能有类似的内容:
-- ADD COMPUTED COLUMN TO STORE FILE TYPE
ALTER TABLE dbo.Dialogs ADD FileExtension AS '.html';
CREATE FULLTEXT INDEX ON dbo.Dialogs ([Text] TYPE COLUMN FileExtension)
KEY INDEX [PK_dbo.Dialogs] ON AOPRDefault;
然而,根据 the documentation:
,您误解了TYPE COLUMN
属性 的目的
TYPE COLUMN type_column_name
Specifies the name of a table column, type_column_name, that is used to hold the document type for a varbinary(max) or image document. This column, known as the type column, contains a user-supplied file extension (.doc, .pdf, .xls, and so forth). The type column must be of type char, nchar, varchar, or nvarchar.
Specify TYPE COLUMN type_column_name only if column_name specifies a varbinary(max) or image column, in which data is stored as binary data; otherwise, SQL Server returns an error.
Note
At indexing time, the Full-Text Engine uses the abbreviation in the type column of each table row to identify which full-text search filter to use for the document in column_name. The filter loads the document as a binary stream, removes the formatting information, and sends the text from the document to the word-breaker component. For more information, see Configure and Manage Filters for Search
由于您的索引位于文本列上,因此这不适用,并且创建索引语句会 return 出错。即使您将 html 文档存储为二进制数据,它仍然无法按预期工作,您所追求的是 html parsing,这是与全文索引不同的问题。