使用正则表达式排除值 - BigQuery

Exclude Values with Regex - BigQuery

如果我想从聚合中排除字段中的正则表达式匹配项,我会在标准 SQL 中使用什么。代码运行;然而,它 计算我在正则表达式中的值,我希望它排除 。我有这个代码:

SELECT
    channelGrouping, 
    date,
    SUM(totals.timeOnSite) AS Session_Duration,
    SUM(totals.visits) AS Visits,
    AVG(totals.timeonSite/totals.visits) AS Avg_Time_per_Session,
    SUM(totals.bounces) AS Bounce,
    (SUM(totals.bounces)/SUM(totals.visits)) AS Bounce_rate
FROM
    `93868086.ga_sessions_*`
WHERE
    _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 365 DAY))
AND
    FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY 
    date,
    channelGrouping,
    geoNetwork.networkLocation 
HAVING
    REGEXP_CONTAINS(geoNetwork.networkLocation,
    r"^(ovh \(nwk\)|hostwinds llc.|bhost inc|prisma networks llc|psychz networks|buyvm services|private customer|secure dragon llc.|vmpanel|netaction telecom srl-d|hostigation|frontlayer technologies inc.|digital energy technologies limited|owned-networks|rica web services|netaction telecom srl-d|hurricane electric inc.|private customer - host.howpick.com|ssdvirt|sway broadband|detect network|gorillaservers inc.|micfo llc.| netaction telecom srl|egihosting|zenlayer inc|intercom online inc.|gs1 argentine|ovh hosting inc.|vps cheap inc.|limeip networks|blackhost ltd.|amazon.com inc.)$")
ORDER BY
    date ASC

取决于你真正想要什么(问题不是 100% 清楚)你应该简单地添加 NOT 到你的 HAVING 语句

HAVING NOT REGEXP_CONTAINS ...   

或将排除逻辑移动到 WHERE 子句

SELECT
    channelGrouping, 
    date,
    SUM(totals.timeOnSite) AS Session_Duration,
    SUM(totals.visits) AS Visits,
    AVG(totals.timeonSite/totals.visits) AS Avg_Time_per_Session,
    SUM(totals.bounces) AS Bounce,
    (SUM(totals.bounces)/SUM(totals.visits)) AS Bounce_rate
FROM
    `93868086.ga_sessions_*`
WHERE
    _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 365 DAY))
AND
    FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
AND NOT
    REGEXP_CONTAINS(geoNetwork.networkLocation,
    r"^(ovh \(nwk\)|hostwinds llc.|bhost inc|prisma networks llc|psychz networks|buyvm services|private customer|secure dragon llc.|vmpanel|netaction telecom srl-d|hostigation|frontlayer technologies inc.|digital energy technologies limited|owned-networks|rica web services|netaction telecom srl-d|hurricane electric inc.|private customer - host.howpick.com|ssdvirt|sway broadband|detect network|gorillaservers inc.|micfo llc.| netaction telecom srl|egihosting|zenlayer inc|intercom online inc.|gs1 argentine|ovh hosting inc.|vps cheap inc.|limeip networks|blackhost ltd.|amazon.com inc.)$")
GROUP BY 
    date,
    channelGrouping,
    geoNetwork.networkLocation 
ORDER BY
    date ASC

另请注意:最终 SELECT 语句中缺少 networkLocation。不是致命的,但把它放在那里会更有意义