STORE relation problem using pig -x local: failed to read data
First approach: using pig -x mapreduce
- The HBase table was created through the hbase shell:
hbase(main):003:0> list
TABLE
clientes
1 row(s)
Took 0.0047 seconds
=> ["clientes"]
- Load the data from clientes.txt into dados with this code (pig -x mapreduce):
grunt> dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt' USING PigStorage(',') AS (
id:chararray,
nome:chararray,
sobrenome:chararray,
idade:int,
funcao:chararray
);
- Checking dados with dump dados fails:
2021-03-07 19:00:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1615152557282_0002
2021-03-07 19:00:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:00:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:00:32,395 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:00:37,406 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:00:37,406 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1615152557282_0002 has failed! Stop running all dependent jobs
2021-03-07 19:00:37,406 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:00:37,410 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2021-03-07 19:00:37,492 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1615152557282_0002. Redirecting to job history server.
2021-03-07 19:00:37,595 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
2021-03-07 19:00:37,595 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:00:37,597 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:00:31 2021-03-07 19:00:37 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1615152557282_0002 dados MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Wrong FS: hdfs://localhost:9000/user/hadoop, expected: file:///
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:294)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
at org.apache.hadoop.mapreduce.Job.run(Job.java:1565)
at org.apache.hadoop.mapreduce.Job.run(Job.java:1562)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1562)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop.PigJobControl.run(PigJobControl.java:205)
at java.lang.Thread.run(Thread.java:748)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.run(MapReduceLauncher.java:301)
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/user/hadoop, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:737)
at org.apache.hadoop.fs.RawLocalFileSystem.setWorkingDirectory(RawLocalFileSystem.java:604)
at org.apache.hadoop.fs.FilterFileSystem.setWorkingDirectory(FilterFileSystem.java:307)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:250)
... 18 more
hdfs://localhost:9000/tmp/temp-1169299097/tmp-2103156722,
Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Failed to produce result in "hdfs://localhost:9000/tmp/temp-1169299097/tmp-2103156722"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1615152557282_0002
2021-03-07 19:00:37,597 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2021-03-07 19:00:37,601 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias dados. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
Details at logfile: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154395936.log
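The `Wrong FS: hdfs://localhost:9000/user/hadoop, expected: file:///` message suggests that in mapreduce mode the job's working directory is on HDFS while the input is a `file:///` path. One way to sidestep this, sketched here as an assumption rather than a confirmed fix, is to copy the file into HDFS first and load it from there. The destination path `/user/hadoop/clientes.txt` is a hypothetical choice based on the home directory shown in the error.

```shell
#!/bin/sh
# Local input file, taken from the LOAD statement above.
src='/mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt'
# Hypothetical HDFS destination under the home directory named in the error message.
dest='/user/hadoop/clientes.txt'

# Only touch HDFS when the hdfs client is on PATH and the source file exists.
if command -v hdfs >/dev/null 2>&1 && [ -f "$src" ]; then
    hdfs dfs -mkdir -p "$(dirname "$dest")"
    hdfs dfs -put -f "$src" "$dest"
fi

# LOAD statement to paste into `pig -x mapreduce` once the file is in HDFS.
load_stmt="dados = LOAD 'hdfs://localhost:9000$dest' USING PigStorage(',') AS (id:chararray, nome:chararray, sobrenome:chararray, idade:int, funcao:chararray);"
echo "$load_stmt"
```

The script only prints the adjusted LOAD statement; it does not start Pig itself.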
Second approach: using pig -x local (dump dados works)
grunt> dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt' USING PigStorage(',') AS (
>> id:chararray,
>> nome:chararray,
>> sobrenome:chararray,
>> idade:int,
>> funcao:chararray
>> );
2021-03-07 19:02:17,219 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2021-03-07 19:02:17,222 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt:0+794
2021-03-07 19:02:17,226 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 2
2021-03-07 19:02:17,226 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2021-03-07 19:02:17,241 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2021-03-07 19:02:17,243 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-03-07 19:02:17,253 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:02:17,266 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner -
2021-03-07 19:02:17,274 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task:attempt_local116575577_0001_m_000000_0 is done. And is in the process of committing
2021-03-07 19:02:17,280 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner -
2021-03-07 19:02:17,280 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task attempt_local116575577_0001_m_000000_0 is allowed to commit now
2021-03-07 19:02:17,285 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local116575577_0001_m_000000_0' to file:/tmp/temp2133275539/tmp1539690224
2021-03-07 19:02:17,286 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - map
2021-03-07 19:02:17,286 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local116575577_0001_m_000000_0' done.
2021-03-07 19:02:17,291 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Final Counters for attempt_local116575577_0001_m_000000_0: Counters: 16
File System Counters
FILE: Number of bytes read=1264
FILE: Number of bytes written=530456
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=20
Map output records=20
Input split bytes=414
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=0
Total committed heap usage (bytes)=311427072
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
org.apache.pig.PigWarning
FIELD_DISCARDED_TYPE_CONVERSION_FAILED=1
2021-03-07 19:02:17,291 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local116575577_0001_m_000000_0
2021-03-07 19:02:17,291 [Thread-7] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2021-03-07 19:02:17,485 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,492 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,492 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2021-03-07 19:02:17,492 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2021-03-07 19:02:17,493 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,536 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:02:17,540 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:02:16 2021-03-07 19:02:17 UNKNOWN
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local116575577_0001 1 0 n/a n/a n/a n/a 0 0 0 0 dados MAP_ONLY file:/tmp/temp2133275539/tmp1539690224,
Input(s):
Successfully read 20 records from: "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Successfully stored 20 records in: "file:/tmp/temp2133275539/tmp1539690224"
Counters:
Total records written : 20
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local116575577_0001
2021-03-07 19:02:17,542 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,544 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,551 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,558 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).
2021-03-07 19:02:17,558 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2021-03-07 19:02:17,563 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2021-03-07 19:02:17,563 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2021-03-07 19:02:17,570 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input files to process : 1
2021-03-07 19:02:17,570 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(id,nome,sobrenome,,funcao)
(c001,Josias,Silva,55,Analista de Mercado)
(1100002,Pedro,Malan,74,Professor)
(1100003,Maria,Maciel,34,Bombeiro)
(1100004,Suzana,Bustamante,66,Analista de TI)
(1100005,Karen,Moreira,74,Advogado)
(1100006,Patricio,Teixeira,42,Veterinario)
(1100007,Elisa,Haniero,43,Piloto)
(1100008,Mauro,Bender,63,Marceneiro)
(1100009,Mauricio,Wagner,39,Artista)
(1100010,Douglas,Macedo,60,Escritor)
(1100011,Francisco,McNamara,47,Cientista de Dados)
(1100012,Sidney,Raynor,26,Escritor)
(1100013,Maria,Moon,41,Gerente de Projetos)
(1100014,Bete,Balanaira,65,Musico)
(1100015,Julia,Peixoto,49,Especialista em TI)
(1100016,Jeronimo,Wallace,52,Engenheiro de Dados)
(1100017,Noeli,Laura,72,Cientista de Dados)
(1100018,Jean,Junior,45,Desenvolvedor RPA)
(1100019,Cristina,Garbim,63,Engenheiro Blockchain)
But STORE dados INTO 'hbase://clientes' or STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' fails:
grunt> STORE dados INTO 'hbase://clientes' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'dados_clientes:nome
>> dados_clientes:sobrenome
>> dados_clientes:idade
>> dados_clientes:funcao'
>> );
2021-03-07 19:03:51,347 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1289080477_0002
2021-03-07 19:03:51,347 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:03:51,347 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:03:51,349 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:03:51,349 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_local1289080477_0002]
2021-03-07 19:03:51,835 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for clientes
2021-03-07 19:03:51,839 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2021-03-07 19:03:51,839 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-03-07 19:03:51,843 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:03:51,860 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation - Closing zookeeper sessionid=0x1780e985b4d000f
2021-03-07 19:03:51,866 [LocalJobRunner Map Task Executor #0] INFO org.apache.zookeeper.ZooKeeper - Session: 0x1780e985b4d000f closed
2021-03-07 19:03:51,866 [LocalJobRunner Map Task Executor #0-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x1780e985b4d000f
2021-03-07 19:03:51,867 [Thread-10] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2021-03-07 19:03:51,870 [Thread-10] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1289080477_0002
java.lang.Exception: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:83)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:144)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:670)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:657)
at java.util.ArrayList.get(ArrayList.java:433)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:992)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:75)
... 18 more
2021-03-07 19:03:52,055 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:03:52,055 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1289080477_0002 has failed! Stop running all dependent jobs
2021-03-07 19:03:52,055 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:03:52,056 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:03:52,057 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:03:52,057 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:03:52,058 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:03:50 2021-03-07 19:03:52 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local1289080477_0002 dados MAP_ONLY Message: Job failed! hbase://clientes,
Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Failed to produce result in "hbase://clientes"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1289080477_0002
2021-03-07 19:03:52,058 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
grunt> STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'dados_clientes:nome
>> dados_clientes:sobrenome
>> dados_clientes:idade
>> dados_clientes:funcao'
>> );
java.lang.Exception: java.lang.IllegalArgumentException: Illegal character code:47, </> at 0. User-space table qualifiers can only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: ///home/hadoop/hadloop/pig_output
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.IllegalArgumentException: Illegal character code:47, </> at 0. User-space table qualifiers can only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: ///home/hadoop/hadloop/pig_output
at org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:196)
at org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:149)
at org.apache.hadoop.hbase.TableName.<init>(TableName.java:322)
at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:358)
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:449)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.<init>(TableOutputFormat.java:107)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.getRecordWriter(TableOutputFormat.java:153)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:83)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:659)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2021-03-07 19:05:10,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1458581109_0003
2021-03-07 19:05:10,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:05:10,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:05:10,477 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:05:10,477 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:05:10,477 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1458581109_0003 has failed! Stop running all dependent jobs
2021-03-07 19:05:10,478 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:05:10,478 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:05:10,479 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:05:10,480 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:05:10,480 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:05:10 2021-03-07 19:05:10 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local1458581109_0003 dados MAP_ONLY Message: Job failed! file:///home/hadoop/hadloop/pig_output,
Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Failed to produce result in "file:///home/hadoop/hadloop/pig_output"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1458581109_0003
2021-03-07 19:05:10,480 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
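Two details in these traces may be worth checking; both are readings of the stack traces, not confirmed causes. `IndexOutOfBoundsException: Index: 1, Size: 1` in `HBaseStorage.putNext` is consistent with the column list being parsed as a single column, which could happen if the list is split on commas and spaces only, so that names separated by newlines stay fused. The `Illegal character code:47` error arises because `HBaseStorage` treats the `file:///...` location as an HBase table name. The sketch below writes a Pig script that keeps the column list on one line and uses `PigStorage` for the file target; the script file name is a hypothetical choice.

```shell
#!/bin/sh
# Hypothetical location for the generated Pig script.
script="${TMPDIR:-/tmp}/store_clientes.pig"

# Quoted heredoc: nothing inside is expanded by the shell.
cat > "$script" <<'EOF'
dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt'
        USING PigStorage(',')
        AS (id:chararray, nome:chararray, sobrenome:chararray, idade:int, funcao:chararray);

-- Column list on a single line, separated by spaces; the first field (id) is used as the row key.
STORE dados INTO 'hbase://clientes' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
    'dados_clientes:nome dados_clientes:sobrenome dados_clientes:idade dados_clientes:funcao');

-- A plain file target needs a file-based storage function instead of HBaseStorage.
STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' USING PigStorage(',');
EOF

echo "wrote $script; run it with: pig -x local $script"
```

The table `clientes` and the column family `dados_clientes` are the names used in the STORE statements above; the family has to exist in the table for the write to succeed.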
Services running:
(base) [hadoop@dataserver 1-HBase]$ jps
4160 SecondaryNameNode
11666 Main
5413 HQuorumPeer
5766 HRegionServer
6966 JobHistoryServer
4631 NodeManager
4457 ResourceManager
5578 HMaster
3835 DataNode
12382 Jps
3615 NameNode
Hadoop version:
SUBCOMMAND may print help when invoked w/o parameters or with -h.
(base) [hadoop@dataserver 1-HBase]$ hadoop version
Hadoop 3.2.2
Source code repository Unknown -r 7a3bc90b05f257c8ace2f76d74264906f0f7a932
Compiled by hexiaoqiao on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 5a8f564f46624254b27f6a33126ff4
This command was run using /opt/hadoop/share/hadoop/common/hadoop-common-3.2.2.jar
HBase version:
(base) [hadoop@dataserver 1-HBase]$ hbase version
/opt/hadoop/libexec/hadoop-functions.sh: line 2366: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution
/opt/hadoop/libexec/hadoop-functions.sh: line 2461: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution
Error: Could not find or load main class org.apache.hadoop.hbase.util.GetJavaProperty
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase 2.2.0
Source code repository file:///opt/hbase-rm/output/hbase-2.2.0-bin revision=Unknown
Compiled by hbase-rm on Tue Jun 11 04:30:30 UTC 2019
From source with checksum 63a465554927aeea3f1f0bcae63decff
Pig version:
(base) [hadoop@dataserver 1-HBase]$ pig version
2021-03-07 19:08:50,197 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
2021-03-07 19:08:50,199 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
2021-03-07 19:08:50,199 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2021-03-07 19:08:50,263 [main] INFO org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2021-03-07 19:08:50,263 [main] INFO org.apache.pig.Main - Logging error messages to: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154930258.log
2021-03-07 19:08:50,536 [main] ERROR org.apache.pig.Main - ERROR 2997: Encountered IOException. File version does not exist
Details at logfile: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154930258.log
2021-03-07 19:08:50,557 [main] INFO org.apache.pig.Main - Pig script completed in 400 milliseconds (400 ms)
To solve this problem, you need to start the YARN service called the Job History Server.
Run the following command:
mr-jobhistory-daemon.sh start historyserver
Then check with the jps command that the following services are running:
13153 HQuorumPeer
13314 HMaster
**20242 JobHistoryServer**
5043 NameNode
6003 NodeManager
30163 Jps
5845 ResourceManager
5514 SecondaryNameNode
5227 DataNode
28510 RunJar
13519 HRegionServer
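The start-and-check steps above can be combined into one small script. This is a minimal sketch, assuming `jps` and `mapred` are on PATH; `has_service` is a hypothetical helper, and `mapred --daemon start historyserver` is the Hadoop 3.x form of the same start command.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the jps listing on stdin has a JVM
# whose main class name is exactly $1.
has_service() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Only touch the cluster when the JDK and Hadoop tools are available.
if command -v jps >/dev/null 2>&1 && command -v mapred >/dev/null 2>&1; then
    if jps | has_service JobHistoryServer; then
        echo "JobHistoryServer already running"
    else
        # Hadoop 3.x equivalent of mr-jobhistory-daemon.sh start historyserver.
        mapred --daemon start historyserver
    fi
fi
```

Matching on the second column of the jps listing avoids false positives from process IDs or from class names that merely contain the same text.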
第一种方法:使用 pig -x mapreduce
- Hbase table 通过 hbase shell 创建
Hbase table is created:
hbase(main):003:0> list
TABLE
clientes
1 row(s)
Took 0.0047 seconds
=> ["clientes"]
- 使用此代码将数据从 clientes.txt 加载到 dados (pig -x mapreduce)
grunt> dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt' USING PigStorage(',') AS (
id:chararray,
nome:chararray,
sobrenome:chararray,
idade:int,
funcao:chararray
);
- 使用 dump dados 检查 dados 但失败:
2021-03-07 19:00:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1615152557282_0002
2021-03-07 19:00:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:00:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:00:32,395 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:00:37,406 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:00:37,406 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1615152557282_0002 has failed! Stop running all dependent jobs
2021-03-07 19:00:37,406 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:00:37,410 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2021-03-07 19:00:37,492 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1615152557282_0002. Redirecting to job history server.
2021-03-07 19:00:37,595 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
2021-03-07 19:00:37,595 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:00:37,597 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:00:31 2021-03-07 19:00:37 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1615152557282_0002 dados MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Wrong FS: hdfs://localhost:9000/user/hadoop, expected: file:///
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:294)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
at org.apache.hadoop.mapreduce.Job.run(Job.java:1565)
at org.apache.hadoop.mapreduce.Job.run(Job.java:1562)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1562)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop.PigJobControl.run(PigJobControl.java:205)
at java.lang.Thread.run(Thread.java:748)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.run(MapReduceLauncher.java:301)
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/user/hadoop, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:737)
at org.apache.hadoop.fs.RawLocalFileSystem.setWorkingDirectory(RawLocalFileSystem.java:604)
at org.apache.hadoop.fs.FilterFileSystem.setWorkingDirectory(FilterFileSystem.java:307)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:250)
... 18 more
hdfs://localhost:9000/tmp/temp-1169299097/tmp-2103156722,
Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Failed to produce result in "hdfs://localhost:9000/tmp/temp-1169299097/tmp-2103156722"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1615152557282_0002
2021-03-07 19:00:37,597 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2021-03-07 19:00:37,601 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias dados. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
Details at logfile: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154395936.log
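Reading the trace above: the job runs with hdfs://localhost:9000 as its default filesystem, while the LOAD path uses the file:/// scheme, and that mismatch is what `Wrong FS ... expected: file:///` reports. A common workaround in mapreduce mode is to copy the input into HDFS first and load it from there. This is a minimal sketch, not a confirmed fix for this setup. The helper name stage_to_hdfs and the target directory /user/hadoop/clientes are illustrative assumptions; only the /user/hadoop prefix appears in the log.

```shell
# Copy a local file into HDFS so that `pig -x mapreduce` can read it.
# Assumption: the destination directory passed in is an illustrative choice.
stage_to_hdfs() {
    local src="$1" dest_dir="$2"
    hdfs dfs -mkdir -p "$dest_dir" &&
    hdfs dfs -put -f "$src" "$dest_dir/"
}

# Usage (not run here):
#   stage_to_hdfs clientes.txt /user/hadoop/clientes
# and then, inside grunt, point LOAD at the HDFS copy:
#   dados = LOAD '/user/hadoop/clientes/clientes.txt' USING PigStorage(',') AS (...);
```

With the file in HDFS, the LOAD path no longer needs the file:/// scheme, so the input and the job's working directory are on the same filesystem.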
Second approach: using pig -x local (dump dados works)
grunt> dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt' USING PigStorage(',') AS (
>> id:chararray,
>> nome:chararray,
>> sobrenome:chararray,
>> idade:int,
>> funcao:chararray
>> );
2021-03-07 19:02:17,219 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2021-03-07 19:02:17,222 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt:0+794
2021-03-07 19:02:17,226 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 2
2021-03-07 19:02:17,226 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2021-03-07 19:02:17,241 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2021-03-07 19:02:17,243 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-03-07 19:02:17,253 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:02:17,266 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner -
2021-03-07 19:02:17,274 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task:attempt_local116575577_0001_m_000000_0 is done. And is in the process of committing
2021-03-07 19:02:17,280 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner -
2021-03-07 19:02:17,280 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task attempt_local116575577_0001_m_000000_0 is allowed to commit now
2021-03-07 19:02:17,285 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local116575577_0001_m_000000_0' to file:/tmp/temp2133275539/tmp1539690224
2021-03-07 19:02:17,286 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - map
2021-03-07 19:02:17,286 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local116575577_0001_m_000000_0' done.
2021-03-07 19:02:17,291 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Final Counters for attempt_local116575577_0001_m_000000_0: Counters: 16
File System Counters
FILE: Number of bytes read=1264
FILE: Number of bytes written=530456
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=20
Map output records=20
Input split bytes=414
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=0
Total committed heap usage (bytes)=311427072
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
org.apache.pig.PigWarning
FIELD_DISCARDED_TYPE_CONVERSION_FAILED=1
2021-03-07 19:02:17,291 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local116575577_0001_m_000000_0
2021-03-07 19:02:17,291 [Thread-7] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2021-03-07 19:02:17,485 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,492 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,492 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2021-03-07 19:02:17,492 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2021-03-07 19:02:17,493 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,536 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:02:17,540 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:02:16 2021-03-07 19:02:17 UNKNOWN
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local116575577_0001 1 0 n/a n/a n/a n/a 0 0 0 0 dados MAP_ONLY file:/tmp/temp2133275539/tmp1539690224,
Input(s):
Successfully read 20 records from: "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Successfully stored 20 records in: "file:/tmp/temp2133275539/tmp1539690224"
Counters:
Total records written : 20
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local116575577_0001
2021-03-07 19:02:17,542 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,544 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,551 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,558 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).
2021-03-07 19:02:17,558 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2021-03-07 19:02:17,563 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2021-03-07 19:02:17,563 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2021-03-07 19:02:17,570 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input files to process : 1
2021-03-07 19:02:17,570 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(id,nome,sobrenome,,funcao)
(c001,Josias,Silva,55,Analista de Mercado)
(1100002,Pedro,Malan,74,Professor)
(1100003,Maria,Maciel,34,Bombeiro)
(1100004,Suzana,Bustamante,66,Analista de TI)
(1100005,Karen,Moreira,74,Advogado)
(1100006,Patricio,Teixeira,42,Veterinario)
(1100007,Elisa,Haniero,43,Piloto)
(1100008,Mauro,Bender,63,Marceneiro)
(1100009,Mauricio,Wagner,39,Artista)
(1100010,Douglas,Macedo,60,Escritor)
(1100011,Francisco,McNamara,47,Cientista de Dados)
(1100012,Sidney,Raynor,26,Escritor)
(1100013,Maria,Moon,41,Gerente de Projetos)
(1100014,Bete,Balanaira,65,Musico)
(1100015,Julia,Peixoto,49,Especialista em TI)
(1100016,Jeronimo,Wallace,52,Engenheiro de Dados)
(1100017,Noeli,Laura,72,Cientista de Dados)
(1100018,Jean,Junior,45,Desenvolvedor RPA)
(1100019,Cristina,Garbim,63,Engenheiro Blockchain)
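The first dumped tuple, (id,nome,sobrenome,,funcao), looks like the file's header line: its fourth field cannot be converted to int, which is consistent with the single FIELD_DISCARDED_TYPE_CONVERSION_FAILED warning and with 20 records read for 19 data rows. If the header should not be loaded as data, one option is to strip it before the LOAD. This is a sketch; the function name strip_header and the output file name are hypothetical.

```shell
# Write a copy of the input without its first (header) line.
# Assumption: both file names are supplied by the caller; none come from the post.
strip_header() {
    tail -n +2 "$1" > "$2"
}

# Usage (not run here):
#   strip_header clientes.txt clientes_noheader.txt
```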
But STORE dados INTO 'hbase://clientes' and STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' both fail:
grunt> STORE dados INTO 'hbase://clientes' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'dados_clientes:nome
>> dados_clientes:sobrenome
>> dados_clientes:idade
>> dados_clientes:funcao'
>> );
2021-03-07 19:03:51,347 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1289080477_0002
2021-03-07 19:03:51,347 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:03:51,347 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:03:51,349 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:03:51,349 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_local1289080477_0002]
2021-03-07 19:03:51,835 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for clientes
2021-03-07 19:03:51,839 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2021-03-07 19:03:51,839 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-03-07 19:03:51,843 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:03:51,860 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation - Closing zookeeper sessionid=0x1780e985b4d000f
2021-03-07 19:03:51,866 [LocalJobRunner Map Task Executor #0] INFO org.apache.zookeeper.ZooKeeper - Session: 0x1780e985b4d000f closed
2021-03-07 19:03:51,866 [LocalJobRunner Map Task Executor #0-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x1780e985b4d000f
2021-03-07 19:03:51,867 [Thread-10] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2021-03-07 19:03:51,870 [Thread-10] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1289080477_0002
java.lang.Exception: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:83)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:144)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:670)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:657)
at java.util.ArrayList.get(ArrayList.java:433)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:992)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:75)
... 18 more
2021-03-07 19:03:52,055 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:03:52,055 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1289080477_0002 has failed! Stop running all dependent jobs
2021-03-07 19:03:52,055 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:03:52,056 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:03:52,057 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:03:52,057 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:03:52,058 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:03:50 2021-03-07 19:03:52 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local1289080477_0002 dados MAP_ONLY Message: Job failed! hbase://clientes,
Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Failed to produce result in "hbase://clientes"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1289080477_0002
2021-03-07 19:03:52,058 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
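`Index: 1, Size: 1` raised from HBaseStorage.putNext suggests that the storage function ended up with a single column descriptor, while each tuple carries four fields after the row key. One possible cause, stated here as an assumption and not a confirmed diagnosis, is that the column list typed across several lines was not split at the line breaks. The sketch below writes the same LOAD and STORE to a script file with the column list on one line, separated by spaces. The script name store_clientes.pig is hypothetical.

```shell
# Build the column list on a single line:
# "dados_clientes:nome dados_clientes:sobrenome ..."
cols=""
for c in nome sobrenome idade funcao; do
    cols="${cols:+$cols }dados_clientes:$c"
done

# Write a Pig script (hypothetical file name) repeating the statements from the post.
cat > store_clientes.pig <<EOF
dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt'
    USING PigStorage(',')
    AS (id:chararray, nome:chararray, sobrenome:chararray, idade:int, funcao:chararray);
STORE dados INTO 'hbase://clientes'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('$cols');
EOF

# Run with (not executed here): pig -x local store_clientes.pig
```

The first field of each tuple (id) becomes the HBase row key, so the column list names only the remaining four fields.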
grunt> STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'dados_clientes:nome
>> dados_clientes:sobrenome
>> dados_clientes:idade
>> dados_clientes:funcao'
>> );
java.lang.Exception: java.lang.IllegalArgumentException: Illegal character code:47, </> at 0. User-space table qualifiers can only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: ///home/hadoop/hadloop/pig_output
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.IllegalArgumentException: Illegal character code:47, </> at 0. User-space table qualifiers can only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: ///home/hadoop/hadloop/pig_output
at org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:196)
at org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:149)
at org.apache.hadoop.hbase.TableName.<init>(TableName.java:322)
at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:358)
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:449)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.<init>(TableOutputFormat.java:107)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.getRecordWriter(TableOutputFormat.java:153)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:83)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:659)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2021-03-07 19:05:10,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1458581109_0003
2021-03-07 19:05:10,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:05:10,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:05:10,477 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:05:10,477 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:05:10,477 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1458581109_0003 has failed! Stop running all dependent jobs
2021-03-07 19:05:10,478 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:05:10,478 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:05:10,479 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:05:10,480 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:05:10,480 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:05:10 2021-03-07 19:05:10 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local1458581109_0003 dados MAP_ONLY Message: Job failed! file:///home/hadoop/hadloop/pig_output,
Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Failed to produce result in "file:///home/hadoop/hadloop/pig_output"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1458581109_0003
2021-03-07 19:05:10,480 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
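In this attempt HBaseStorage treats the target as an HBase table name, and a table name cannot contain `/`, which is what the IllegalArgumentException reports. Writing to a local directory calls for a file-based storage function such as PigStorage instead. This is a sketch, assuming the output directory does not exist yet (Pig refuses to overwrite an existing output location). The script name store_local.pig is hypothetical.

```shell
# Write a Pig script (hypothetical file name) that stores the relation as
# comma-separated text files in a local directory.
cat > store_local.pig <<'EOF'
dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt'
    USING PigStorage(',')
    AS (id:chararray, nome:chararray, sobrenome:chararray, idade:int, funcao:chararray);
STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' USING PigStorage(',');
EOF

# Run with (not executed here): pig -x local store_local.pig
```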
Services running:
(base) [hadoop@dataserver 1-HBase]$ jps
4160 SecondaryNameNode
11666 Main
5413 HQuorumPeer
5766 HRegionServer
6966 JobHistoryServer
4631 NodeManager
4457 ResourceManager
5578 HMaster
3835 DataNode
12382 Jps
3615 NameNode
Hadoop version:
SUBCOMMAND may print help when invoked w/o parameters or with -h.
(base) [hadoop@dataserver 1-HBase]$ hadoop version
Hadoop 3.2.2
Source code repository Unknown -r 7a3bc90b05f257c8ace2f76d74264906f0f7a932
Compiled by hexiaoqiao on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 5a8f564f46624254b27f6a33126ff4
This command was run using /opt/hadoop/share/hadoop/common/hadoop-common-3.2.2.jar
HBase version:
(base) [hadoop@dataserver 1-HBase]$ hbase version
/opt/hadoop/libexec/hadoop-functions.sh: line 2366: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution
/opt/hadoop/libexec/hadoop-functions.sh: line 2461: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution
Error: Could not find or load main class org.apache.hadoop.hbase.util.GetJavaProperty
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase 2.2.0
Source code repository file:///opt/hbase-rm/output/hbase-2.2.0-bin revision=Unknown
Compiled by hbase-rm on Tue Jun 11 04:30:30 UTC 2019
From source with checksum 63a465554927aeea3f1f0bcae63decff
Pig version:
(base) [hadoop@dataserver 1-HBase]$ pig version
2021-03-07 19:08:50,197 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
2021-03-07 19:08:50,199 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
2021-03-07 19:08:50,199 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2021-03-07 19:08:50,263 [main] INFO org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2021-03-07 19:08:50,263 [main] INFO org.apache.pig.Main - Logging error messages to: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154930258.log
2021-03-07 19:08:50,536 [main] ERROR org.apache.pig.Main - ERROR 2997: Encountered IOException. File version does not exist
Details at logfile: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154930258.log
2021-03-07 19:08:50,557 [main] INFO org.apache.pig.Main - Pig script completed in 400 milliseconds (400 ms)
To solve this, you need to start the YARN service called Job History Server.
Run the following command:
mr-jobhistory-daemon.sh start historyserver
Then use the jps command to check that the following services are running:
13153 HQuorumPeer
13314 HMaster
**20242 JobHistoryServer**
5043 NameNode
6003 NodeManager
30163 Jps
5845 ResourceManager
5514 SecondaryNameNode
5227 DataNode
28510 RunJar
13519 HRegionServer
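The check can also be scripted. The sketch below reads `jps` output on standard input and prints every expected daemon that is missing. The function name missing_daemons is illustrative, and the list of expected names is taken from the listing above. On Hadoop 3, `mapred --daemon start historyserver` is the current form of the start command; the older script still works but prints a deprecation warning.

```shell
# Print each expected daemon that does not appear in the jps output on stdin.
missing_daemons() {
    local listing name
    listing=$(cat)
    for name in NameNode DataNode ResourceManager NodeManager \
                JobHistoryServer HMaster HRegionServer; do
        # jps prints "<pid> <name>"; compare the whole second field so that
        # SecondaryNameNode is not mistaken for NameNode.
        printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' ||
            echo "$name"
    done
}

# Usage (not run here):
#   jps | missing_daemons
```

An empty result means every daemon in the list was found.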