Acquire patents' forward citation data from BigQuery by application number
I want to collect the data by application_number, like this. The real application number is CN 201510747352.
SELECT c.application_number AS Pub, COUNT(p.publication_number) AS CitedBy
FROM `patents-public-data.patents.publications` AS p, UNNEST(citation) AS c
WHERE c.application_number IN ('CN-201510747352-A')
GROUP BY c.application_number
But it doesn't work. The URL below is the patent's page. Can anyone help me? https://patents.google.com/patent/CN105233911B/zh?oq=CN201510747352.8
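My guess is that a patent can only be cited once it has the status of an application, so instead of the initial number CN-201510747352 you should use the app/pub number it has at that stage (CN-105233911 in the query below). You also need more than a plain distinct count: citations from the same application under different suffixes such as -A or -B should be counted only once, which is why the query uses the REGEXP_EXTRACT function.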
#standardSQL
SELECT
c.publication_number AS Pub,
COUNT(DISTINCT REGEXP_EXTRACT(p.publication_number, r'(.+-.+)-')) AS CitedByCount
FROM `patents-public-data.patents.publications` AS p,
UNNEST(citation) AS c
WHERE c.publication_number LIKE ('CN-105233911%')
GROUP BY c.publication_number
Result
Row Pub CitedBy
1 CN-105233911-A 10
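The suffix-stripping step in the query above can be sketched outside BigQuery to see what it does. This is a minimal Python illustration, assuming Python's `re` treats the pattern `(.+-.+)-` the same way BigQuery's REGEXP_EXTRACT does for numbers of this shape; the helper names and the sample citing numbers are hypothetical and not taken from the dataset.

```python
import re

# Same pattern as the REGEXP_EXTRACT call in the query: capture everything
# before the final "-<kind code>" suffix (for example "-A" or "-B").
KIND_SUFFIX = re.compile(r'(.+-.+)-')

def strip_kind_code(publication_number):
    """Return the number without its trailing kind code, or None if no match."""
    match = KIND_SUFFIX.search(publication_number)
    return match.group(1) if match else None

def count_distinct_citers(citing_publication_numbers):
    """Count citing documents, treating -A/-B variants of one number as one."""
    bases = {strip_kind_code(n) for n in citing_publication_numbers}
    bases.discard(None)  # COUNT(DISTINCT ...) also ignores NULLs
    return len(bases)

# Hypothetical citing publication numbers, made up for illustration only.
sample = ["CN-100000001-A", "CN-100000001-B", "US-2000000002-A1"]
print(count_distinct_citers(sample))
```

Here the first two sample numbers collapse to the same base number, so they contribute a single citation to the count.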
... If I only have the application number, how can I do this?
#standardSQL
SELECT
c.publication_number AS Pub,
COUNT(DISTINCT REGEXP_EXTRACT(p.publication_number, r'(.+-.+)-')) AS CitedByCount
FROM `patents-public-data.patents.publications` AS p,
UNNEST(citation) AS c
WHERE c.publication_number IN (
SELECT publication_number
FROM `patents-public-data.patents.publications`
WHERE application_number IN ('CN-201510747352-A')
)
GROUP BY c.publication_number