Hive 查询中的语法错误
syntax error in Hive Query
我正在尝试回答这个问题
Of the right-handed batters who were born in October and died in
2011, which one had the most hits in his career?
我尝试获取查询,请忽略总数,它应该是 b.hits 的总和,不知道如何给它起别名。
SELECT n.id, n.bmonth, n.dyear,n.bats, SUM(b.hits) FROM master n
JOIN (SELECT b.id , b.hits FROM batting GROUP BY id) o
WHERE n.bmonth == 10 AND n.dyear == 2011) x
ON x.id=n.id
ORDER BY total DESC;
如果有人需要所用两个表的架构,请看下面。
INSERT OVERWRITE DIRECTORY '/home/hduser/hivetest/answer4'
SELECT n.id, n.bmonth, n.dyear,n.bats, SUM(b.hits) FROM master n
JOIN (SELECT b.id , b.hits FROM batting GROUP BY id) o
WHERE n.bmonth == 10 AND n.dyear == 2011) x
ON x.id=n.id
ORDER BY total DESC;
首先,尽管 Hive 接受 ==
,但这并不意味着您应该使用它。标准的 SQL 相等运算符就是 =
。没有理由使用同义词。
我怀疑问题出在几个方面:
- 缺少
group by
.
- 聚合函数的误用。
- 缺少别名
- SQL 以正确的顺序查询子句
- 括号不平衡
换句话说,查询只是一团糟。您需要复习查询语法的基础知识。这个有用吗?
SELECT m.id, m.bmonth, m.dyear, m.bats, b.hits as total
FROM master m JOIN
(SELECT b.id, SUM(b.hits) as hits
FROM batting b
GROUP BY id
) b
ON b.id = m.id
WHERE m.bmonth = 10 AND m.dyear = 2011
ORDER BY total DESC;
我正在尝试回答这个问题
Of the right-handed batters who were born in October and died in
2011, which one had the most hits in his career?
我尝试获取查询,请忽略总数,它应该是 b.hits 的总和,不知道如何给它起别名。
SELECT n.id, n.bmonth, n.dyear,n.bats, SUM(b.hits) FROM master n
JOIN (SELECT b.id , b.hits FROM batting GROUP BY id) o
WHERE n.bmonth == 10 AND n.dyear == 2011) x
ON x.id=n.id
ORDER BY total DESC;
如果有人需要所用两个表的架构,请看下面。
INSERT OVERWRITE DIRECTORY '/home/hduser/hivetest/answer4'
SELECT n.id, n.bmonth, n.dyear,n.bats, SUM(b.hits) FROM master n
JOIN (SELECT b.id , b.hits FROM batting GROUP BY id) o
WHERE n.bmonth == 10 AND n.dyear == 2011) x
ON x.id=n.id
ORDER BY total DESC;
首先,尽管 Hive 接受 ==
,但这并不意味着您应该使用它。标准的 SQL 相等运算符就是 =
。没有理由使用同义词。
我怀疑问题出在几个方面:
- 缺少
group by
. - 聚合函数的误用。
- 缺少别名
- SQL 以正确的顺序查询子句
- 括号不平衡
换句话说,查询只是一团糟。您需要复习查询语法的基础知识。这个有用吗?
SELECT m.id, m.bmonth, m.dyear, m.bats, b.hits as total
FROM master m JOIN
(SELECT b.id, SUM(b.hits) as hits
FROM batting b
GROUP BY id
) b
ON b.id = m.id
WHERE m.bmonth = 10 AND m.dyear = 2011
ORDER BY total DESC;