How to join two relations in Pig on multiple fields
I have two CSV files:
1- Fertility.csv:
2- Life Expectency.csv:
I want to join them in Pig so that the result looks like this:
I am new to Pig and could not get the correct result, but here is my code:
fertility = LOAD 'fertility' USING org.apache.hcatalog.pig.HCatLoader();
lifeExpectency = LOAD 'lifeExpectency' USING org.apache.hcatalog.pig.HCatLoader();
A = JOIN fertility by country, lifeExpectency by country;
B = JOIN fertility by year, lifeExpectency by year;
C = UNION A,B;
DUMP C;
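This is the result of my code:
You have to join on both country and year in a single JOIN, then select the columns needed for the final output. Joining on each field separately and taking the UNION also pairs rows that match on only one of the two fields.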
fertility = LOAD 'fertility' USING org.apache.hcatalog.pig.HCatLoader();
lifeExpectency = LOAD 'lifeExpectency' USING org.apache.hcatalog.pig.HCatLoader();
A = JOIN fertility by (country,year), lifeExpectency by (country,year);
B = FOREACH A GENERATE fertility::country,fertility::year,fertility::fertility,lifeExpectency::lifeExpectency;
DUMP B;
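If the data is read straight from the CSV files rather than through HCatalog, the same composite-key join could look roughly like the sketch below. The file paths, the column types, and the absence of a header row are all assumptions made for illustration; only the field names come from the code above.

```pig
-- Assumption: comma-separated files with no header row (PigStorage does not skip headers).
-- Assumption: hypothetical paths and column types; adjust them to the real files.
fertility = LOAD 'Fertility.csv' USING PigStorage(',')
    AS (country:chararray, year:int, fertility:double);
lifeExpectency = LOAD 'LifeExpectency.csv' USING PigStorage(',')
    AS (country:chararray, year:int, lifeExpectency:double);

-- Inner join on the composite key: a row pair matches only when
-- both country and year are equal.
joined = JOIN fertility BY (country, year), lifeExpectency BY (country, year);

-- The join output carries both copies of the key columns, so keep one
-- copy and disambiguate every field with the relation::field prefix.
result = FOREACH joined GENERATE
    fertility::country AS country,
    fertility::year AS year,
    fertility::fertility AS fertility,
    lifeExpectency::lifeExpectency AS lifeExpectency;

DUMP result;
```

An inner join drops any (country, year) pair that appears in only one file. If those rows should be kept, a `LEFT OUTER`, `RIGHT OUTER`, or `FULL OUTER` keyword after the first `BY (country, year)` clause would retain them with nulls in the missing columns.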