在 SQL Server 如何使用 TYPE COLUMN 选项创建全文索引

In SQL Server How to create full text index with TYPE COLUMN option

我有一个 table 定义如下

CREATE TABLE [dbo].[Dialogs](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [DiscussionID] [int] NOT NULL,
    [ApprovedByUserID] [int] NULL,
    [AddedByUserID] [int] NULL,
    [Text] [nvarchar](max) NULL,
    [ApprovalStatus] [int] NULL,
    [ApprovedOn] [datetime] NULL,
    [AddedOn] [datetime] NOT NULL,
 CONSTRAINT [PK_dbo.Dialogs] PRIMARY KEY CLUSTERED 
(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Text 列包含用户输入的 html。我想在 Text 列上创建全文索引,我还需要支持 html 过滤器,这样当任何用户键入 <div><p> 或任何其他 html 标签,那么它不会 return 任何结果。

创建索引我在下面使用SQL

CREATE FULLTEXT INDEX ON [Dialogs]
 ( 
  [Text] TYPE COLUMN '.html'
 ) 
  KEY INDEX [PK_dbo.Dialogs]
      ON AOPRDefault; 

但是 SQL Server 抛出错误

Incorrect syntax near '.html'.

谁能举例说明在创建全文索引时如何指定TYPE COLUMN选项。

首先,问题是您应该指的是包含 '.html' 的列,而不是文字,因此您可能有类似的内容:

-- ADD COMPUTED COLUMN TO STORE FILE TYPE
ALTER TABLE dbo.Dialogs ADD FileExtension AS '.html'; 

CREATE FULLTEXT INDEX ON dbo.Dialogs ([Text] TYPE COLUMN FileExtension)
    KEY INDEX [PK_dbo.Dialogs] ON AOPRDefault;

然而,根据 the documentation

,您误解了 TYPE COLUMN 属性 的目的

TYPE COLUMN type_column_name

Specifies the name of a table column, type_column_name, that is used to hold the document type for a varbinary(max) or image document. This column, known as the type column, contains a user-supplied file extension (.doc, .pdf, .xls, and so forth). The type column must be of type char, nchar, varchar, or nvarchar.

Specify TYPE COLUMN type_column_name only if column_name specifies a varbinary(max) or image column, in which data is stored as binary data; otherwise, SQL Server returns an error.

Note

At indexing time, the Full-Text Engine uses the abbreviation in the type column of each table row to identify which full-text search filter to use for the document in column_name. The filter loads the document as a binary stream, removes the formatting information, and sends the text from the document to the word-breaker component. For more information, see Configure and Manage Filters for Search

由于您的索引位于文本列上,因此这不适用,并且创建索引语句会 return 出错。即使您将 html 文档存储为二进制数据,它仍然无法按预期工作,您所追求的是 html parsing,这是与全文索引不同的问题。