Invalid scalar projection in PIG
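The columns of my data in Pig are named:
keyword, campaign_id, date, time, display_site, was_clicked, cpc, country, placement
I want to find the keywords with a high click-through rate.
So I am trying to understand why the following code gives me an "Invalid scalar projection" error: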
grouped = GROUP data BY keyword;
by_keyword = FOREACH grouped
{
clicked = FILTER data BY was_clicked == 1;
total = COUNT(data.keyword);
GENERATE group, ((double)COUNT(clicked) / total) AS ctr;
}
The error I get:
37,632 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Pig script failed to parse:
<line 59, column 33> Invalid scalar projection: clicked : A column needs to be projected from a relation for it to be used as a scalar
Details at logfile: /home/cloudera/pig_1486224821223.log
Any help would be appreciated.
Edit:
data = LOAD '/user/cloudera/pig_demo/ad_data.txt' AS (keyword:chararray,campaign_id:chararray,
date:chararray, time:chararray,display_site:chararray, was_clicked:int,
cpc:int, country:chararray, placement:chararray);
Sample record:
tablet C6 5/1/2013 3:47:10 movienet.example.com 0 102 USA TOP
Pig version 0.15.
Input file data.txt:
tablet C6 5/1/2013 3:47:10 movienet.example.com 0 102 USA TOP
tablet C6 5/1/2013 3:47:10 movienet.example.com 0 102 USA TOP
tablet C6 5/1/2013 3:47:10 movienet.example.com 0 102 USA TOP
tablet C6 5/1/2013 3:47:10 movienet.example.com 1 102 USA TOP
Script:
data = LOAD '/path/data.txt' AS (keyword:chararray,campaign_id:chararray,
date:chararray, time:chararray,display_site:chararray, was_clicked:int,
cpc:int, country:chararray, placement:chararray);
grouped = GROUP data BY keyword;
by_keyword = FOREACH grouped
{
clicked = FILTER data BY was_clicked == 1;
total = COUNT(data.keyword);
GENERATE group, ((double)COUNT(clicked) / total) AS ctr;
}
DUMP by_keyword;
gives me the correct result:
(tablet,0.25)
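As a side note, the same ratio can probably be computed without the nested block at all. The following is only a sketch, and it assumes that was_clicked holds nothing but 0 or 1 (as in the sample rows above), so that summing the column gives the number of clicks. It reuses the data relation from the LOAD statement above; the AS keyword alias is added here for readability and is not part of the original script.

```pig
-- Assumes `data` was loaded with the schema shown above and that
-- was_clicked is always 0 or 1 (an assumption based on the sample rows).
grouped = GROUP data BY keyword;

-- SUM over the 0/1 column counts the clicks; COUNT(data) counts the rows
-- in each group. The cast to double avoids integer division.
by_keyword = FOREACH grouped GENERATE
    group AS keyword,
    (double)SUM(data.was_clicked) / COUNT(data) AS ctr;

DUMP by_keyword;
```

With no nested FILTER there is no inner alias such as clicked for the parser to misread as a scalar, which may make the script less sensitive to how the block is entered. Note that if was_clicked can be null or take other values, the nested FILTER version is the safer choice.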