Need to make a join use column data as a wildcard for a common match
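Hi everyone, I'm trying to use the data in a table that contains only URLs to check whether any URL in my main table is a variation of, or contains, one of them.
URL table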
url
---------------------------------------
.0.9.40.52
.00000000314.0000000265.00000225.0323
.001916.com
.00386.com
.00-5dj-ar4c.club
.007band.ru
.007crconcert-japan.com
.007pi.com
.00dt7myo.work
.00dzhqbghr.com
(10 rows)
Main table
user_id | campaign_id | url | send_time
---------+-------------+-----+---------------------
8468677 | 1004001 | http://twitter.com/share?url=http%3A%2F%2Fatwonline.com%2Faircraft-orders-deliveries%2Fnigeria-s-arik-air-replaces-boeing-747s-787s&text=Nigeria%E2%80%99s+Arik+Air+replaces+Boeing+747s+with+787s&count=none | 2017-01-28 13:01:28
8468677 | 1003945 | http://twitter.com/share?url=http%3A%2F%2Fatwonline.com%2Fairframes%2Fairbaltic-cs300-performance-exceeding-expectations&text=AirBaltic%3A+CS300+performance+%E2%80%98exceeding+expectations%E2%80%99&count=none | 2017-01-14 13:03:29
8468677 | 1004189 | http://twitter.com/share?url=http%3A%2F%2Fatwonline.com%2Fairframes%2Famerican-again-defers-a350-deliveries-first-pushed-back-2020&text=American+again+defers+A350+deliveries%3B+first+pushed+back+to+2020&count=none | 2017-05-02 12:02:04
8468677 | 1004057 | http://twitter.com/share?url=http%3A%2F%2Fatwonline.com%2Fairframes%2Fatlas-has-acquired-all-20-767s-be-operated-amazon&text=Atlas+has+acquired+all+20+767s+to+be+operated+for+Amazon&count=none | 2017-02-28 13:02:13
(4 rows)
I'm trying to run the following:
select t1.user_id,t1.campaign_id,t1.url
from madison_alldb as t1
inner join madison_url as t2
ON t1.url LIKE CONCAT('%',t1.url, '%');
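But when I come back to my tmux session, it just says Killed. I'm also not sure whether the approach above is valid.
My goal is to restrict the result to users who have a domain that wildcard-matches an entry in my URL table.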
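This type of query is very hard to optimize. Note first that the condition in the question compares t1.url with a pattern built from t1.url itself, so it is true for every pair of rows with a non-null URL and the join degenerates into a cross join; the pattern should be built from the URL table's column. Beyond that, one small optimization is to use exists instead of join: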
select ma.*
from madison_alldb ma
where exists (select 1
from madison_url mu
where ma.url like concat('%', mu.url, '%')
)
limit 10;
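This still has to do a nested-loop join, though. The only difference is that fewer comparisons are made when there is a match, because exists can stop at the first matching pattern instead of testing all of them.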
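To make the difference between the two forms concrete, here is a minimal sketch that runs both on a tiny in-memory SQLite database from Python. SQLite is only a stand-in for the database in the question, the sample rows are made up for illustration, and `||` is used for concatenation instead of concat() because it works across SQLite versions. It shows that the join returns a main-table row once per matching pattern, while exists returns it at most once.

```python
import sqlite3

# In-memory database; table and column names follow the question,
# but the rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE madison_url (url TEXT);
    CREATE TABLE madison_alldb (user_id INTEGER, campaign_id INTEGER, url TEXT);
""")
conn.executemany(
    "INSERT INTO madison_url VALUES (?)",
    [(".007band.ru",), ("band.ru",), (".001916.com",)],
)
conn.executemany(
    "INSERT INTO madison_alldb VALUES (?, ?, ?)",
    [
        (1, 100, "http://www.007band.ru/news"),      # matches two patterns
        (2, 200, "http://atwonline.com/airframes"),  # matches none
        (3, 300, "http://shop.001916.com/"),         # matches one pattern
    ],
)

# Join form: one output row per (main row, matching pattern) pair.
join_rows = conn.execute("""
    SELECT ma.user_id
    FROM madison_alldb ma
    JOIN madison_url mu ON ma.url LIKE '%' || mu.url || '%'
    ORDER BY ma.user_id
""").fetchall()

# Exists form: each main row appears at most once.
exists_rows = conn.execute("""
    SELECT ma.user_id
    FROM madison_alldb ma
    WHERE EXISTS (SELECT 1
                  FROM madison_url mu
                  WHERE ma.url LIKE '%' || mu.url || '%')
    ORDER BY ma.user_id
""").fetchall()

print("join:  ", join_rows)
print("exists:", exists_rows)
```

In this sample, user 1 appears twice in the join result and once in the exists result. Neither form can use an ordinary index for a pattern with a leading wildcard, so on large tables both remain expensive.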