Group data by part of a string in Oracle SQL (Oracle9i Enterprise Edition Release 9.2.0.4.0)

I have the following query:

select referrer, count(distinct ad_id) as Adverts,
       sum(case f when 'Y' then hits else 0 end) as clicks,
       sum(case f when 'N' then hits else 0 end) as views
from advert_view_hits
where ad_id in ({$id_strings})
group by referrer

This returns data like the following:

 Referrer  Adverts  Clicks  Views
 Caterer     3       124     74
 Indeed      5       234     136

This is fine, but in some cases the referrer has been stored in the database like this:

user1@jwrecruitment.co.uk_200890,
user2@jwrecruitment.co.uk_200890

user1@gatewayjobs.co.uk_200890, 
user3@towngate-personnel.co.uk_2

How would I group the data by the company part of the user's email address when an email is used as the referrer?

So the data would look like this:

 Referrer             Adverts  Clicks  Views
 Caterer               3       124     74
 Indeed                5       234     136
 jwrecruitment.co.uk   8       456     782
 gatewayjobs.co.uk     9       897     959

That way, all the data for emails at a domain such as jwrecruitment.co.uk would be grouped together and shown as one row.

If I followed you correctly, you can use regexp_replace():

select 
    regexp_replace(referrer, '^.*@([^_]+).*$', '\1') referrer, 
    count(distinct ad_id) as Adverts,
    sum(case f when 'Y' then hits else 0 end) as clicks,
    sum(case f when 'N' then hits else 0 end) as views
from advert_view_hits
where ad_id in ({$id_strings})
group by regexp_replace(referrer, '^.*@([^_]+).*$', '\1')

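The regex matches referrers that contain an at sign (@) and captures the part after the @ and before the underscore; the whole value is then replaced by that captured part. If the referrer does not match the regex, it is left unchanged.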
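
To sanity-check the pattern outside the database, here is a small Python sketch that applies the same expression to the sample referrers from the question. It assumes Python's `re` module handles this particular pattern the same way Oracle's regex engine does, which should hold for something this simple, but treat it as an illustration and not as a substitute for running the query in Oracle. The helper name `company_of` is made up for this example.

```python
import re

# Same pattern as in the query: group 1 is the text between '@'
# and the first '_' that follows it.
PATTERN = re.compile(r'^.*@([^_]+).*$')

def company_of(referrer):
    # Mimics regexp_replace(referrer, pattern, backreference to group 1):
    # a value that does not match the pattern is returned unchanged.
    return PATTERN.sub(r'\1', referrer)

samples = [
    'Caterer',
    'user1@jwrecruitment.co.uk_200890',
    'user1@gatewayjobs.co.uk_200890',
]
for value in samples:
    print(value, '->', company_of(value))
```

Plain referrers such as Caterer pass through untouched, which is what lets them share a result set with the email-based ones.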

If I understand correctly, you want everything after the @. You can use regexp_substr():

select regexp_substr(referrer, '[^@]+$') as referrer, count(distinct ad_id) as Adverts,
       sum(case f when 'Y' then hits else 0 end) as clicks,
       sum(case f when 'N' then hits else 0 end) as views
from advert_view_hits
where ad_id in ({$id_strings})
group by regexp_substr(referrer, '[^@]+$') ;

You can replace the regexp_substr() logic with:

select substr(referrer, instr(referrer, '@') + 1) as referrer, count(distinct ad_id) as Adverts,
       sum(case f when 'Y' then hits else 0 end) as clicks,
       sum(case f when 'N' then hits else 0 end) as views
from advert_view_hits
where ad_id in ({$id_strings})
group by  substr(referrer, instr(referrer, '@') + 1) ;
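
For comparison, here is a rough Python emulation of the two expressions in this answer, run against values from the question. The function names are invented for the sketch, and it assumes the referrer contains at most one @; with several @ characters the regex version would start after the last one and the substr version after the first.

```python
import re

def after_at_regex(referrer):
    # Mimics regexp_substr(referrer, '[^@]+$'):
    # the trailing run of characters that are not '@'.
    match = re.search(r'[^@]+$', referrer)
    return match.group(0) if match else None

def after_at_substr(referrer):
    # Mimics substr(referrer, instr(referrer, '@') + 1).
    # str.find returns -1 when '@' is absent, so the slice starts at 0,
    # just as instr returns 0 and substr then starts at position 1.
    return referrer[referrer.find('@') + 1:]

for value in ['Indeed', 'user1@gatewayjobs.co.uk_200890']:
    print(value, '->', after_at_regex(value), '|', after_at_substr(value))
```

On this data both variants keep the trailing `_200890`, so two referrers from the same company with different suffixes would still end up in separate groups.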

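You can also use regexp_substr to extract the string between @ and _, as follows. Note that it returns NULL for referrers that contain no @, so wrap it in NVL(..., referrer) if those rows should keep their original value: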

select REGEXP_SUBSTR(referrer,'@([^_]+)',1,1,NULL,1) referrer, 
       count(distinct ad_id) as Adverts,
       sum(case f when 'Y' then hits else 0 end) as clicks,
       sum(case f when 'N' then hits else 0 end) as views
  from advert_view_hits
where ad_id in ({$id_strings})
group by REGEXP_SUBSTR(referrer,'@([^_]+)',1,1,NULL,1)
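
A quick way to see what this expression yields is to emulate it in Python. This is only a sketch: the function name is hypothetical, and Python's `None` stands in for Oracle's NULL.

```python
import re

def between_at_and_underscore(referrer):
    # Mimics REGEXP_SUBSTR(referrer, '@([^_]+)', 1, 1, NULL, 1):
    # return capture group 1 of the first match,
    # or None (NULL in Oracle) when nothing matches.
    match = re.search(r'@([^_]+)', referrer)
    return match.group(1) if match else None

for value in ['Caterer', 'user2@jwrecruitment.co.uk_200890']:
    print(value, '->', between_at_and_underscore(value))
```

The email-style referrer collapses to its domain, while a referrer without an @ produces no match at all.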

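If you are on an older version where regexp_substr is not available (the REGEXP functions were introduced in Oracle 10g, so this applies to 9i), use a combination of SUBSTR and INSTR instead, in both the select list and the group by: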

SUBSTR(referrer, 
       INSTR(referrer, '@') + 1, 
       DECODE(INSTR(referrer, '_', - 1), 
              0, 
              LENGTH(referrer) - INSTR(referrer, '@'), 
              INSTR(referrer, '_', - 1) - INSTR(referrer, '@') - 1)
      )
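
The position arithmetic above is easy to get wrong by one, so here is a Python sketch that walks through the same steps using Oracle's 1-based positions. The helper names are invented for the example, and `None` stands in for the NULL that SUBSTR returns when the requested length is less than 1.

```python
def instr_first(text, ch):
    # Oracle INSTR(text, ch): 1-based position of the first occurrence, 0 if absent.
    return text.find(ch) + 1

def instr_last(text, ch):
    # Oracle INSTR(text, ch, -1): 1-based position of the last occurrence, 0 if absent.
    return text.rfind(ch) + 1

def company_via_substr(referrer):
    at_pos = instr_first(referrer, '@')
    underscore_pos = instr_last(referrer, '_')
    if underscore_pos == 0:
        # DECODE branch for "no underscore": take everything after the '@'.
        length = len(referrer) - at_pos
    else:
        # Otherwise take the characters strictly between '@' and the last '_'.
        length = underscore_pos - at_pos - 1
    if length < 1:
        return None  # SUBSTR with a length below 1 returns NULL
    # Oracle position at_pos + 1 is Python index at_pos.
    return referrer[at_pos:at_pos + length]

for value in ['Caterer', 'user1@jwrecruitment.co.uk_200890']:
    print(value, '->', company_via_substr(value))
```

One case this makes visible: when the last underscore sits before the @ (something like a_b@example.com, which does not occur in the question's data), the computed length is negative and the expression yields NULL.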