Hive UDF Throws Class Not Found Exception in select
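I have been working with a UDF jar. I need to parse a simple UserAgent in my UDF. I found a popular UserAgent parser, http://www.bitwalker.eu/software/user-agent-utils, and included it in my project. The project uses Maven. I added all the dependencies, implemented everything, and tested it. It works fine on my local machine. Next, I ran a Maven clean install to build the jar. I use this jar in Hive via add jar {MyJarName}, then create a function with create temporary function {functionName} as {pathToUDFClass}, and get the following exception: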
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"bidid":"8326c0ec49e5746f1af03400f37e5797","tstamp":20131022185001163,"logtype":1
,"ipinyouid":"D89E8S5bwWz","useragent":"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; InfoPath.2)","ip":"61.138.253.*","regionid":374,"cityid":375,"adexchange":1
,"domain":"449a7568331085d43d5867de26ce1ee1","url":"5ecba5b62bafd3428cdc1398b40cf88f","anonymousurl":"null","adslotid":null,"adslotwidth":300,"adslotheight":250,"adslotvisibility":"Na","adslotformat":"Na","adslo
tfloorprice":0,"creativeid":"10722","biddingprice":294,"payingprice":135,"landingpageurl":"null","advertiserid":2821,"userprofileids":[10006,10110,10063]}
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"bidid":"8326c0ec49e5746f1af03400f37e5797","tstamp":20131022185001163,"logtype":1,"ipinyouid":"D89E8S5bwWz","
useragent":"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; InfoPath.2)","ip":"61.138.253.*","regionid":374,"cityid":375,"adexchange":1,"domain":"449a7568331085d43
d5867de26ce1ee1","url":"5ecba5b62bafd3428cdc1398b40cf88f","anonymousurl":"null","adslotid":null,"adslotwidth":300,"adslotheight":250,"adslotvisibility":"Na","adslotformat":"Na","adslotfloorprice":0,"creativeid":
"10722","biddingprice":294,"payingprice":135,"landingpageurl":"null","advertiserid":2821,"userprofileids":[10006,10110,10063]}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
... 17 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text hive.homework3.UserAgentDetector.evaluate(org.apache.hadoop.io.Text) on object hive.homewor
k3.UserAgentDetector@1b340ab of class hive.homework3.UserAgentDetector with arguments {Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; InfoPath.2):org.apache.hadoo
p.io.Text} of size 1
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:1019)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:182)
at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:186)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:81)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:133)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:170)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
... 18 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:995)
... 27 more
Caused by: java.lang.NoClassDefFoundError: eu/bitwalker/useragentutils/UserAgent
at hive.homework3.UserAgentDetector.formatter(UserAgentDetector.java:30)
at hive.homework3.UserAgentDetector.evaluate(UserAgentDetector.java:22)
... 32 more
Caused by: java.lang.ClassNotFoundException: eu.bitwalker.useragentutils.UserAgent
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 34 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:21, Vertex vertex_1501829365845_0009_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. fa
iledVertices:1 killedVertices:0
As I understand it, the most important part is:
Caused by: java.lang.ClassNotFoundException: eu.bitwalker.useragentutils.UserAgent
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 34 more
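This is the external library I use in my Maven project.
Here is the UDF. By the way, everything works locally and the tests pass, but it does not work in Hive. I suspect there is a problem with the library I am using, but is that possible if it works fine locally?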
package hive.homework3;

import eu.bitwalker.useragentutils.UserAgent;
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

@Description(
    name = "agentdetector",
    value = "_FUNC_(str) - detects a user-agent of user",
    extended = "Example:\n" +
        " > SELECT agent(line) FROM test ipy; \n"
)
public class UserAgentDetector extends UDF {

    // Hive calls evaluate() once per row; a null input yields a null result.
    public Text evaluate(Text text) {
        if (text == null) {
            return null;
        }
        return new Text(formatter(text));
    }

    // Parses the user-agent string and returns the browser name,
    // leaving the input Text untouched.
    private String formatter(Text text) {
        UserAgent userAgent = UserAgent.parseUserAgentString(text.toString());
        StringBuilder builder = new StringBuilder();
        builder.append("Browser : ").append(userAgent.getBrowser().getName()).append("\n");
        return builder.toString();
    }
}
The Maven dependency is:
<dependency>
    <groupId>eu.bitwalker</groupId>
    <artifactId>UserAgentUtils</artifactId>
    <version>1.20</version>
</dependency>
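To solve this, add the plugin below to your pom.xml. A plain Maven build packages only the project's own classes, so eu.bitwalker.useragentutils.UserAgent is missing from the classpath of the Hive tasks; the jar-with-dependencies descriptor bundles the dependencies into the jar.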
<plugin>
    <artifactId>maven-assembly-plugin</artifactId>
    <configuration>
        <archive>
            <manifest>
                <mainClass>{pathToMainClass}</mainClass>
            </manifest>
        </archive>
        <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
    </configuration>
</plugin>
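Then run the assembly:assembly goal of the Maven plugin (in newer versions of maven-assembly-plugin the equivalent goal is assembly:single), and add the resulting jar-with-dependencies jar in Hive instead of the plain one.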
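Before adding the assembled jar to Hive, it may help to confirm that the dependency classes really ended up inside it. The following is a minimal sketch of such a check using only the JDK; the class name JarContentsCheck and the method containsClass are hypothetical helpers introduced here for illustration, not part of Hive or Maven.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarContentsCheck {

    // Returns true if the jar at jarPath contains the .class entry for the
    // given fully qualified class name.
    public static boolean containsClass(String jarPath, String className) throws IOException {
        // Class files are stored under their package path, e.g.
        // eu/bitwalker/useragentutils/UserAgent.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java JarContentsCheck <jar> [className]");
            return;
        }
        // Default to the class named in the ClassNotFoundException above.
        String className = args.length > 1 ? args[1] : "eu.bitwalker.useragentutils.UserAgent";
        boolean found = containsClass(args[0], className);
        System.out.println(className + (found ? " is bundled in " : " is MISSING from ") + args[0]);
    }
}
```

Run it against the jar you pass to Hive. If the class is reported missing, Hive tasks will most likely fail with the same ClassNotFoundException.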