Is there a way to get the database name in a Hive UDF?
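I am writing a Hive UDF.
I need to get the name of the database in which the function is deployed. Then, depending on that database environment, I need to access some files from HDFS. Can you help me find a function that would let me run an HQL query from within a Hive UDF?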
- Write the UDF class and build the jar file
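package com.hiveudf.strmnp;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class MyHiveUdf extends UDF {
    // Prefixes the value with the database name passed in from the query.
    public Text evaluate(String text, String dbName) {
        if (text == null) {
            return null;
        }
        return new Text(dbName + "." + text);
    }
}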
- Use this UDF in a Hive query as shown below, passing current_database() as the second argument
hive> use mydb;
OK
Time taken: 0.454 seconds
hive> ADD jar /root/MyUdf.jar;
Added [/root/MyUdf.jar] to class path
Added resources: [/root/MyUdf.jar]
hive> create temporary function myUdfFunction as 'com.hiveudf.strmnp.MyHiveUdf';
OK
Time taken: 0.018 seconds
hive> select myUdfFunction(username,current_database()) from users;
Query ID = root_20170407151010_2ae29523-cd9f-4585-b334-e0b61db2c57b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1491484583384_0004, Tracking URL = http://mac127:8088/proxy/application_1491484583384_0004/
Kill Command = /opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/lib/hadoop/bin/hadoop job -kill job_1491484583384_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2017-04-07 15:11:11,376 Stage-1 map = 0%, reduce = 0%
2017-04-07 15:11:19,766 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.12 sec
MapReduce Total cumulative CPU time: 3 seconds 120 msec
Ended Job = job_1491484583384_0004
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 3.12 sec HDFS Read: 21659 HDFS Write: 381120 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 120 msec
OK
mydb.user1
mydb.user2
mydb.user3
Time taken: 2.137 seconds, Fetched: 3 row(s)
hive>
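Once the database name reaches the UDF as an argument, choosing a per-database file on HDFS comes down to building a path from it. The following is only a minimal sketch of that step: the class name `DbPathResolver`, the `/data/<database>/<file>` layout, and the file names are all hypothetical and not part of the original answer. It uses plain Java strings so it runs without Hive or Hadoop on the classpath.

```java
import java.util.Locale;

// Hypothetical helper (not from the original answer): maps a database
// name to an HDFS file path so the UDF can pick per-database resources.
final class DbPathResolver {

    // Assumed directory layout: /data/<database>/<file>. Adjust to your cluster.
    private static final String BASE_DIR = "/data";

    private DbPathResolver() {
        // Static utility class, not meant to be instantiated.
    }

    static String resolve(String dbName, String fileName) {
        if (dbName == null || dbName.isEmpty()) {
            throw new IllegalArgumentException("dbName must not be empty");
        }
        if (fileName == null || fileName.isEmpty()) {
            throw new IllegalArgumentException("fileName must not be empty");
        }
        // Normalize the database name to lower case so that "MyDB" and
        // "mydb" resolve to the same directory.
        return BASE_DIR + "/" + dbName.toLowerCase(Locale.ROOT) + "/" + fileName;
    }
}
```

Inside `evaluate`, the `dbName` argument could be handed to `DbPathResolver.resolve(dbName, "lookup.txt")` and the resulting string opened through the Hadoop `FileSystem` API. Doing that once and caching the result, rather than once per row, is probably the safer design, since `evaluate` is called for every input row.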