Search for a string in a text column and list the count

Question

OK, this may be a very silly question, but somehow I can't find a way to do the following.

I have a table with two columns, as shown below:

+-------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| SL No |                                                                      Work                                                                       |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------+
|     1 | Identify Process Champs across all teams for BCUK processes                                                                                     |
|     2 | Impart short training on FMEA to all the Process Champs                                                                                         |
|     2 | List down all critical steps involved in the Process to ascertain the risk involved, feed the details back to FMEA template to analyze the risk |
|     3 | Prioritize the process steps based on Risk Priority Number                                                                                      |
|     4 | Identity the Process Gaps, suggest process improvement ideas to mitigate/mistake proof or reduce the risk involved in the process               |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------+

I also have another table that holds the "Key Words", as shown below:

+-------+----------+
| Sl No |   Tags   |
+-------+----------+
|     1 | BCUK     |
|     2 | FMEA     |
|     3 | Priority |
|     4 | Process  |
+-------+----------+

Now I want to search for strings in the first table based on the "Tags" in the second table and return something like this:

+----------+-------+
|   Tags   | Count |
+----------+-------+
| BCUK     |     1 |
| FMEA     |     2 |
| Priority |     1 |
| Process  |     8 |
+----------+-------+

Because the keyword "Process" appears eight times across multiple rows of the first table, the returned count is 8.

I am using SQL Server 2014 Express Edition.

Answer 1

Adam Machanic has a function, GetSubstringCount, for this kind of operation. I modified it a bit to fit your needs. More info: http://dataeducation.com/counting-occurrences-of-a-substring-within-a-string/

Sample data

CREATE TABLE MyTable(
    SLNo    INT,
    Work    VARCHAR(4000)
)
INSERT INTO MyTable VALUES
(1, 'Identify Process Champs across all teams for BCUK processes'),
(2, 'Impart short training on FMEA to all the Process Champs'),
(2, 'List down all critical steps involved in the Process to ascertain the risk involved, feed the details back to FMEA template to analyze the risk'),
(3, 'Prioritize the process steps based on Risk Priority Number'),
(4, 'Identity the Process Gaps, suggest process improvement ideas to mitigate/mistake proof or reduce the risk involved in the process');

CREATE TABLE KeyWord(
    SLNo    INT,
    Tag     VARCHAR(20)
)
INSERT INTO KeyWord VALUES
(1, 'BCUK'),
(2, 'FMEA'),
(3, 'Priority'),
(4, 'Process');

Solution

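-- Build a tally (numbers) table with stacked cross joins: 10 -> 100 -> 10,000 rows
;WITH E1(N) AS(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
,E2 AS(SELECT 1 AS N FROM E1 a, E1 b)   -- 100 rows
,E4 AS(SELECT 1 AS N FROM E2 a, E2 b)   -- 10,000 rows
,Tally(N) AS(
    -- One row per candidate start position; 11,000 is more than the VARCHAR(4000) column needs
    SELECT TOP(11000) ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) FROM E4 a, E4 b
)
SELECT
    k.Tag,
    [Count] = SUM(x.cc)
FROM KeyWord k
CROSS JOIN MyTable m    -- pair every keyword with every row of text
CROSS APPLY(
    -- Count the start positions at which the tag occurs in this row
    SELECT COUNT(*) AS cc
    FROM Tally
    WHERE
        SUBSTRING(m.Work, N, LEN(k.Tag)) = k.Tag
)x
GROUP BY k.Tag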

Result

Tag                  Count
-------------------- -----------
BCUK                 1
FMEA                 2
Priority             1
Process              8
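
To see what the tally-based comparison does for a single row, here is a minimal sketch that lists every start position where the tag matches. The variable names are illustrative, and using sys.all_objects as a quick row source is an assumption made for brevity (it only needs to hold at least as many rows as the string has characters); it is not part of the query above.

```sql
-- Hypothetical single-row walk-through of the tally technique.
DECLARE @work varchar(4000) = 'Identify Process Champs across all teams for BCUK processes';
DECLARE @tag  varchar(20)   = 'Process';

WITH Tally(N) AS (
    -- One number per character position in @work
    SELECT TOP (LEN(@work)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_objects
)
SELECT
    N AS start_position,
    SUBSTRING(@work, N, LEN(@tag)) AS matched_text
FROM Tally
WHERE SUBSTRING(@work, N, LEN(@tag)) = @tag   -- comparison follows the database collation
ORDER BY N;
```

Under a case-insensitive collation this should return two rows, at positions 10 ("Process") and 51 (the start of "processes"), which is why substrings inside longer words contribute to the total.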

Answer 2

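Rather than counting the matches directly, I replace each match with itself plus one extra character and compare the resulting length with the original length. This makes the count easy and fast.

Test tables and data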
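
As a rough illustration of that arithmetic on one string (the variable names here are hypothetical and not part of the answer's query): every match grows by exactly one character, so the length difference equals the number of matches.

```sql
-- Hypothetical single-string demonstration of the replace-and-compare-length idea.
DECLARE @work varchar(max) = 'Identify Process Champs across all teams for BCUK processes';
DECLARE @tag  varchar(max) = 'Process';

SELECT
    REPLACE(@work, @tag, @tag + '@')                   AS padded_text,     -- each match gets a trailing '@'
    LEN(@work)                                         AS original_length,
    LEN(REPLACE(@work, @tag, @tag + '@'))              AS padded_length,
    LEN(REPLACE(@work, @tag, @tag + '@')) - LEN(@work) AS occurrences;     -- one extra character per match
```

Because the appended '@' is never a space, the fact that LEN ignores trailing spaces should not distort the difference.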

DECLARE @texts table(SL_No int identity(1,1),Work varchar(max))

INSERT @texts VALUES
  ('Identify Process Champs across all teams for BCUK processes'),
  ('Impart short training on FMEA to all the Process Champs'),
  ('List down all critical steps involved in the Process to ascertain the risk involved, feed the details back to FMEA template to analyze the risk'),
  ('Prioritize the process steps based on Risk Priority Number'),
  ('Identity the Process Gaps, suggest process improvement ideas to mitigate/mistake proof or reduce the risk involved in the process')

DECLARE @searchvalues table(S1_No int identity(1,1),Tags varchar(max))

INSERT @searchvalues
VALUES('BCUK'),('FMEA'),('Priority'),('Process')

Query:

SELECT
  sum(len(replace(txt.work, sv.tags, sv.tags + '@')) - len(txt.work)) count,
  tags
FROM
  @texts txt
CROSS APPLY
  @searchvalues sv
WHERE charindex(sv.tags, txt.work) > 0
GROUP BY tags

Result:

count   tags
1   BCUK
2   FMEA
1   Priority
8   Process
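
Note that the count of 8 for "Process" includes lowercase occurrences such as "process" and "processes", so it relies on a case-insensitive comparison. The sketch below is one way to check how collation affects the result on a single row; the variable names are hypothetical, and the two Latin1_General collations are chosen only as examples.

```sql
-- Hypothetical single-row check of how collation changes the count.
DECLARE @work varchar(max) = 'Identify Process Champs across all teams for BCUK processes';
DECLARE @tag  varchar(max) = 'Process';

SELECT
    -- Uses the collation of the current database
    LEN(REPLACE(@work, @tag, @tag + '@')) - LEN(@work) AS default_collation_count,
    -- Forces a case-sensitive comparison: only 'Process' matches
    LEN(REPLACE(@work COLLATE Latin1_General_CS_AS, @tag, @tag + '@')) - LEN(@work) AS case_sensitive_count,
    -- Forces a case-insensitive comparison: 'Process' and 'processes' both match
    LEN(REPLACE(@work COLLATE Latin1_General_CI_AS, @tag, @tag + '@')) - LEN(@work) AS case_insensitive_count;
```

With this input the case-sensitive count should be 1 and the case-insensitive count 2. If the database uses a case-sensitive collation, applying an explicit COLLATE clause in the same way to the column in the main query is one option for reproducing the counts shown above.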

sql-server

sql-server-2014-express