SELECT DISTINCT, trying to pull from a join, need unique results

Question

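There are two tables in question: leads and contactAttempts.

I'm trying to pull the distinct leads that were contacted in 2012, matching specific keywords in a text column on a separate table.

The problem is that I get the same lead ID over and over, and the result set is so large that the query times out and crashes the site.

I have tried multiple variations, including putting everything in one SQL statement. Splitting it into two SQL statements is just the latest iteration.

SELECT DISTINCT currently does not work in the format I'm attempting. id is the primary key in both tables, and leadID joins leads to contactAttempts: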

<? $sql="SELECT * 
 FROM contactAttempts a
 JOIN leads l
 ON l.id = a.leadID  
 WHERE l.agentID = 2 
 AND l.leadType IN(0,2)
 AND a.timestamp BETWEEN '2012-01-01 00:00:00' AND '2012-12-31 23:59:59' 
 LIMIT 0,50";
$res=mysql_query($sql);
while($row=mysql_fetch_assoc($res)){
    $sql2="SELECT DISTINCT leadID FROM contactAttempts WHERE 
leadID='$row[id]' AND (contactAttempts.notes LIKE '%shown%' OR 
contactAttempts.notes LIKE '%showed%' OR contactAttempts.notes LIKE 
'%offer%' OR contactAttempts.notes LIKE '%inspection%' OR 
contactAttempts.notes LIKE '%appraisal%' OR contactAttempts.notes LIKE 
'%closing%' OR contactAttempts.notes LIKE '%drive%' OR 
contactAttempts.notes LIKE '%drove%' OR contactAttempts.notes LIKE '%car%' 
OR contactAttempts.notes LIKE '%preview%' OR contactAttempts.notes LIKE 
'%previewed%' OR contactAttempts.notes LIKE '%took pictures%') LIMIT 1";
    $res2=mysql_query($sql2);$x=0;
    while($row2=mysql_fetch_assoc($res2)){
        $x++;
        echo $x.' - '.$row2['leadID'];
        echo '<br />';
    }
} ?>

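Here is an example of the source data. Table leads:

id - 100, plus a bunch of other columns not useful to this script

id - 200, plus a bunch of other columns not useful to this script

Table contactAttempts:

id - 1, leadID - 100, notes - 'Showed house to customer, they liked it', timestamp - '2012-01-21 12:05:11'

id - 2, leadID - 100, notes - 'Showed house to customer again, they liked it', timestamp - '2012-02-21 12:05:11'

id - 3, leadID - 200, notes - 'Showed house to a different customer, they hated it', timestamp - '2012-01-21 12:05:11'

Right now, the result would be: 100, 100, 200. I need the result to be 100, 200. The script needs to omit the repeated occurrence of leadID 100.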

Answer 1

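Before I (or someone else) get into the actual problem, some observations on your first query.

First, that query is (I believe) logically identical to the following, which I find easier to read: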

SELECT * 
  FROM contactAttempts a
  JOIN leads l
    ON l.id = a.leadID  
 WHERE l.agentID = 2 
   AND l.leadType IN(0,2)
   AND YEAR(a.timestamp) = 2012 
 LIMIT 0,50

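Second, YEAR() prevents the use of an index, so on larger data sets it will slow things down considerably. a.timestamp BETWEEN '2012-01-01 00:00:00' AND '2012-12-31 23:59:59' may look cumbersome, but it will be much faster on indexed data.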
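As an illustration of the index-friendly form, here is a minimal sketch using a half-open range, which avoids having to spell out the 23:59:59 boundary. It assumes an index on contactAttempts.timestamp; the question does not say whether one exists, and the index name idx_attempt_time is hypothetical.

```sql
-- Hypothetical index; skip this if contactAttempts.timestamp is already indexed.
CREATE INDEX idx_attempt_time ON contactAttempts (`timestamp`);

-- Half-open range: covers all of 2012 and excludes 2013-01-01 onward.
-- The column is compared directly (no function wrapped around it),
-- so the optimizer is free to use the index for a range scan.
SELECT a.id, a.leadID
  FROM contactAttempts a
 WHERE a.`timestamp` >= '2012-01-01 00:00:00'
   AND a.`timestamp` <  '2013-01-01 00:00:00';
```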

Third, LIMIT without ORDER BY is fairly meaningless, because the rows come back in no guaranteed order.

Likewise, LIKE '%...' cannot use an index, although LIKE '...%' can.
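One way to check this on your own data is to compare the plans with EXPLAIN. The sketch below assumes notes is a TEXT column with a prefix index on it; neither the column type nor any such index is stated in the question, and the index name idx_notes_prefix and the prefix length 20 are hypothetical.

```sql
-- Hypothetical prefix index on the notes column (TEXT columns need a prefix length).
CREATE INDEX idx_notes_prefix ON contactAttempts (notes(20));

-- Leading wildcard: the match can start anywhere in the string,
-- so the index on notes cannot narrow the search.
EXPLAIN SELECT id FROM contactAttempts WHERE notes LIKE '%showed%';

-- Constant prefix: the index can be used to range-scan values starting with 'Showed'.
EXPLAIN SELECT id FROM contactAttempts WHERE notes LIKE 'Showed%';
```

The keywords in the question can appear anywhere in the note, so the second form is not a substitute for the first here; it only shows why the keyword search itself will not benefit from an ordinary index.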

Answer 2

<? 
$sql="
 SELECT 
    l.id as id,
    a.id as attempt_id,
    a.leadID as leadID
 FROM leads l
 INNER JOIN (
     SELECT id, leadID
     FROM contactAttempts 
     WHERE contactAttempts.notes REGEXP 'shown|showed|offer|inspection|appraisal|closing|drive|drove|car|preview|previewed|took pictures' 
     AND timestamp BETWEEN '2012-01-01 00:00:00' AND '2012-12-31 23:59:59'
    ) as a
 ON a.leadID = l.id
 WHERE l.agentID = 2 
   AND l.leadType IN(0,2)
 ORDER BY l.id, a.id
 ";
$res=mysql_query($sql);
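$currLead = 0; // leadID of the previous row (assumes no real lead has id 0)
$x=0;          // running count of matching attempts for the current lead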
while($row=mysql_fetch_assoc($res)){
    if ($currLead != $row['leadID']){
        echo 'NEW DISTINCT LEAD = '.$row['leadID'].'<br />';
        $x=0;
    }
    $x++;
    echo $x.' of lead '.$row['leadID'].'  attempt '.$row['attempt_id'];
    echo '<br />';
    $currLead = $row['leadID'];
} ?>
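If only the distinct lead IDs are needed (100, 200 for the question's sample data) rather than every matching attempt, one option is to let MySQL collapse the duplicates instead of doing it in the PHP loop. The following is a sketch under that assumption, reusing the same filters as above; it has not been run against the asker's schema.

```sql
-- One row per lead that has at least one matching 2012 contact attempt.
SELECT DISTINCT l.id AS leadID
  FROM leads l
  JOIN contactAttempts a
    ON a.leadID = l.id
 WHERE l.agentID = 2
   AND l.leadType IN (0, 2)
   AND a.`timestamp` >= '2012-01-01 00:00:00'
   AND a.`timestamp` <  '2013-01-01 00:00:00'
   -- 'previewed' is omitted because 'preview' already matches it
   AND a.notes REGEXP 'shown|showed|offer|inspection|appraisal|closing|drive|drove|car|preview|took pictures'
 ORDER BY l.id;
```

Because only l.id is selected, DISTINCT removes the repeated rows that the join produces when a lead has several matching attempts. A LIMIT can be appended after the ORDER BY if the result needs paging.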

mysql

select

distinct