Syntax error in Hive query


I am trying to answer this question:

Of the right-handed batters who were born in October and died in
2011, which one had the most hits in his career?

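Here is my attempt at the query. Please ignore "total": it is supposed to be the sum of b.hits, but I don't know how to give that sum an alias.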

SELECT n.id, n.bmonth, n.dyear,n.bats, SUM(b.hits) FROM master n
JOIN (SELECT b.id , b.hits FROM batting GROUP BY id) o
WHERE n.bmonth == 10 AND n.dyear == 2011) x
ON x.id=n.id 
ORDER BY total DESC;

In case anyone needs the schema of the two tables used, see below.

INSERT OVERWRITE DIRECTORY '/home/hduser/hivetest/answer4' 
SELECT n.id, n.bmonth, n.dyear,n.bats, SUM(b.hits) FROM master n
JOIN (SELECT b.id , b.hits FROM batting GROUP BY id) o
WHERE n.bmonth == 10 AND n.dyear == 2011) x
ON x.id=n.id 
ORDER BY total DESC;

First, although Hive accepts ==, that doesn't mean you should use it. The standard SQL equality operator is simply =. There is no reason to use the synonym.

I suspect the problem is several-fold:

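  • Missing GROUP BY for the aggregate.
  • Misuse of aggregate functions.
  • Missing aliases.
  • SQL query clauses out of order (the WHERE comes before the ON).
  • Unbalanced parentheses.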
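The alias and clause-order points can be seen in a small sketch. It uses only the table and column names from the question and assumes that batting.id matches master.id, as the attempted join suggests. It is meant as an illustration of the syntax, not as the final answer.

```sql
-- Clause order: SELECT, FROM, JOIN ... ON, WHERE, GROUP BY, ORDER BY.
SELECT m.id, SUM(b.hits) AS total        -- AS names the aggregate so ORDER BY can refer to it
FROM master m
JOIN batting b
  ON b.id = m.id                         -- ON belongs to the JOIN and comes before WHERE
WHERE m.bmonth = 10 AND m.dyear = 2011   -- plain = for equality
GROUP BY m.id                            -- every non-aggregated SELECT column is grouped
ORDER BY total DESC;
```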

In other words, the query is just a mess. You need to review the basics of query syntax. Does this work?

SELECT m.id, m.bmonth, m.dyear, m.bats, b.hits as total
FROM master m JOIN
     (SELECT b.id, SUM(b.hits) as hits
      FROM batting b
      GROUP BY id
     ) b
     ON b.id = m.id 
WHERE m.bmonth = 10 AND m.dyear = 2011
ORDER BY total DESC;
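
Note that this ranks every player born in October who died in 2011. To answer the question as stated, you would probably also want to filter on handedness and keep only the top row. A sketch, assuming bats stores 'R' for right-handed batters (the question does not show that column's values, so adjust the literal to your data):

```sql
-- Assumption: master.bats = 'R' marks right-handed batters.
SELECT m.id, m.bmonth, m.dyear, m.bats, b.hits AS total
FROM master m JOIN
     (SELECT id, SUM(hits) AS hits       -- career hits per player
      FROM batting
      GROUP BY id
     ) b
     ON b.id = m.id
WHERE m.bmonth = 10 AND m.dyear = 2011
  AND m.bats = 'R'                       -- hypothetical value, see note above
ORDER BY total DESC
LIMIT 1;                                 -- keep only the player with the most hits
```

If two players tie for the most hits, LIMIT 1 returns just one of them.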