BigQuery: extract numbers only from a string

Question

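My data looks like a 1x1000 vector in which each row holds a variable number of entries. Sometimes it is just the age, but sometimes a weight and a state ID are added as well.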

85 age
15 age; 68 Weight
25 age; 80 Weight; 02 Alaska
72 Weight; 50 Wyoming

The output I would like is just the numbers, i.e.

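I tried SPLIT without success, because it returned more than 2,000 rows instead of 1,000. So I am not sure how to proceed, unless SPLIT can be combined with something that tells me how many data points each row had before the split, i.e.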

and so on.

Answer 1

You can use REGEXP_REPLACE:

SELECT REGEXP_REPLACE("25 age; 80 Weight; 02 Alaska",'[^0-9 ]','')

Read more about Regular Expression functions in the BigQuery documentation.
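Note that the pattern '[^0-9 ]' keeps the spaces, so the query above returns "25  80  02 " with doubled and trailing spaces. If that matters, a possible alternative is sketched below. It assumes BigQuery standard SQL rather than the legacy dialect used elsewhere in this thread, and the inline `data` rows and the column name `x` are stand-ins for the real table. REGEXP_EXTRACT_ALL returns every run of digits as an array, so each input row stays a single output row.

```sql
-- Sketch in BigQuery standard SQL; `data` and `x` are placeholder names.
WITH data AS (
  SELECT '85 age' AS x UNION ALL
  SELECT '15 age; 68 Weight' UNION ALL
  SELECT '25 age; 80 Weight; 02 Alaska' UNION ALL
  SELECT '72 Weight; 50 Wyoming'
)
SELECT
  x,
  -- Every run of digits, as an ARRAY<STRING>; leading zeros are preserved.
  REGEXP_EXTRACT_ALL(x, r'[0-9]+') AS numbers,
  -- The same digits joined back into one space-separated string.
  ARRAY_TO_STRING(REGEXP_EXTRACT_ALL(x, r'[0-9]+'), ' ') AS numbers_text
FROM data
```

For the third row this gives numbers_text = "25 80 02". The values stay strings, which keeps "02" intact; cast them only if the leading zero is not needed.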

Answer 2

For completeness, here is how you can use SPLIT and still get the count of data points each row had before the split:

select left(xs, 2), count(xs) within record from(
select split(x, ";") xs from 
(select "85 age" as x),
(select "15 age; 68 Weight" as x),
(select "25 age; 80 Weight; 02 Alaska" as x),
(select "72 Weight; 50 Wyoming" as x))
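
The query above is written in legacy SQL (WITHIN RECORD, and a comma between subqueries meaning UNION ALL). As a rough equivalent, assuming standard SQL is available, the same idea can be sketched with UNNEST. The inline `data` rows and the names `x`, `part`, `number`, and `n_points` are illustrative only.

```sql
-- Sketch in BigQuery standard SQL; all names are placeholders.
WITH data AS (
  SELECT '85 age' AS x UNION ALL
  SELECT '15 age; 68 Weight' UNION ALL
  SELECT '25 age; 80 Weight; 02 Alaska' UNION ALL
  SELECT '72 Weight; 50 Wyoming'
)
SELECT
  x,
  -- First run of digits in this piece, e.g. '68' from ' 68 Weight'.
  REGEXP_EXTRACT(part, r'[0-9]+') AS number,
  -- How many pieces the original row had before it was split.
  ARRAY_LENGTH(SPLIT(x, ';')) AS n_points
FROM data, UNNEST(SPLIT(x, ';')) AS part
```

This still produces one output row per piece, as SPLIT did in the question, but each row now carries n_points, so the pieces can be grouped back to their source row.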

split

extract

google-bigquery