User HDFS quota management in Cosmos

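As far as we know, each Cosmos user in FIWARE Lab (cosmos.lab.fiware.org) has at most 5GB of space available in HDFS.
Nonetheless, when running our map-reduce Hadoop job we get a DSQuotaExceededException, even though the data the job generates does not exceed the 5GB quota.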

If we monitor HDFS usage while the map-reduce job is running, we get the following output:

Command: "while true; do date; hadoop fs -count -q . ; sleep 20; done"
Format:
DATE
QUOTA  REMAINING_QUOTA SPACE_QUOTA    REMAINING_SPACE_QUOTA DIR_COUNT  FILE_COUNT CONTENT_SIZE   FILE_NAME

jue jul 28 18:50:12 CEST 2016
        none             inf      5368709120      1197734302           19           46         1389627219 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:50:34 CEST 2016
        none             inf      5368709120      2678747494           16           26          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:50:57 CEST 2016
        none             inf      5368709120      2678747494           16           26          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:51:20 CEST 2016
        none             inf      5368709120      2678747494           16           26          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:51:44 CEST 2016
        none             inf      5368709120      2678747494           16           26          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:52:07 CEST 2016
        none             inf      5368709120      2678747494           16           26          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:52:28 CEST 2016
        none             inf      5368709120      1198032544           22           35         1389528792 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:52:50 CEST 2016
        none             inf      5368709120      1197738517           19           39         1389625814 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:53:11 CEST 2016
        none             inf      5368709120      2678747494           16           27          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:53:35 CEST 2016
        none             inf      5368709120      2678747494           16           27          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:53:59 CEST 2016
        none             inf      5368709120      2678747494           16           27          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:54:22 CEST 2016
        none             inf      5368709120      2678747494           16           27          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:54:46 CEST 2016
        none             inf      5368709120      2678747494           16           27          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:55:09 CEST 2016
        none             inf      5368709120      2477420902           17           28          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:55:31 CEST 2016
        none             inf      5368709120      1197738514           19           39         1389625815 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:55:55 CEST 2016
        none             inf      5368709120      1197738514           20           48         1389625815 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:56:17 CEST 2016
        none             inf      5368709120      2678747506           16           28          895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:56:40 CEST 2016
        none             inf      5368709120      2678747506           16           28          895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:57:04 CEST 2016
        none             inf      5368709120      2678747506           16           28          895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:57:28 CEST 2016
        none             inf      5368709120      2678747506           16           28          895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:57:51 CEST 2016
        none             inf      5368709120      2678747506           16           28          895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:58:13 CEST 2016
        none             inf      5368709120      1198032556           16           37         1389528788 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:58:34 CEST 2016
        none             inf      5368709120      1197738742           19           40         1389625760 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:58:56 CEST 2016
        none             inf      5368709120      2678747494           16           29          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:59:20 CEST 2016
        none             inf      5368709120      2678747494           16           29          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:59:43 CEST 2016
        none             inf      5368709120      2678747494           16           29          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:00:07 CEST 2016
        none             inf      5368709120      2678747494           16           29          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:00:31 CEST 2016
        none             inf      5368709120      2678747494           16           29          895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:00:54 CEST 2016
        none             inf      5368709120      1076586601           22           38         1228684181 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:01:18 CEST 2016
        none             inf      5368709120      1197724648           19           41         1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:01:41 CEST 2016
        none             inf      5368709120      1197724648           19           41         1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:02:05 CEST 2016
        none             inf      5368709120      1197724648           19           41         1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:02:29 CEST 2016
        none             inf      5368709120      1197724648           19           41         1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:02:53 CEST 2016
        none             inf      5368709120      1197724648           19           41         1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:03:14 CEST 2016
        none             inf      5368709120       364004107           19           46         1667537284 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:03:36 CEST 2016
        none             inf      5368709120       197959591           20           48         1722885456 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:03:57 CEST 2016
        none             inf      5368709120       201060881           18           44         1722549413 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:04:19 CEST 2016
        none             inf      5368709120       201060881           18           44         1722549413 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:04:40 CEST 2016
        none             inf      5368709120       201060881           18           44         1722549413 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:05:02 CEST 2016
        none             inf      5368709120       201060881           18           44         1722549413 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:05:23 CEST 2016
        none             inf      5368709120       201060881           18           44         1722549413 hdfs://cosmosmaster-gi/user/rbarriuso

After a while, the execution ends with the following exception:

16/07/28 19:03:11 INFO mapred.JobClient: Task Id : attempt_201604111313_157784_r_000006_0, Status : FAILED
org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/rbarriuso is exceeded: quota=5368709120 diskspace consumed=5.0g
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3778)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3640)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access00(DFSClient.java:2846)
    at org.apache.ha...

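As you can see at the end of the log above, the maximum HDFS usage is 1,722,549,413 bytes, with 201,060,881 bytes of quota remaining (according to hadoop fs -count -q), which does not add up to the 5GB of space available to the user.
In addition, the space consumed is not consistent with the remaining free space.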

How is the remaining space quota calculated?
Is there any way to avoid the DSQuotaExceededException?

Thanks in advance.

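You have to take into account the replication factor that HDFS applies to all data. By default it is 3, so your effective quota is 5GB/3 (roughly 1.67GB of actual file content). The quota can be increased by emailing the administrator (me :)).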
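
To check this against the numbers in the question, here is a small Python sketch that takes the last `hadoop fs -count -q` reading from the log and multiplies CONTENT_SIZE by the replication factor. The helper names (`parse_count_q`, `raw_usage`) are made up for illustration, and the factor of 3 is assumed to be the default in effect for these files.

```python
# Last reading from the monitoring output in the question (whitespace collapsed).
SAMPLE = ("none inf 5368709120 201060881 18 44 1722549413 "
          "hdfs://cosmosmaster-gi/user/rbarriuso")

REPLICATION = 3  # assumption: default HDFS replication factor


def parse_count_q(line):
    """Extract the space-related columns from one `hadoop fs -count -q` line.

    Column order: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA,
    DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME.
    """
    fields = line.split()
    return {
        "space_quota": int(fields[2]),
        "remaining_space_quota": int(fields[3]),
        "content_size": int(fields[6]),
    }


def raw_usage(content_size, replication=REPLICATION):
    """Bytes charged against the space quota: every replica counts."""
    return content_size * replication


row = parse_count_q(SAMPLE)
charged = raw_usage(row["content_size"])

print(charged)                                                       # 5167648239
print(charged + row["remaining_space_quota"] == row["space_quota"])  # True
print(row["space_quota"] // REPLICATION)                             # 1789569706
```

For that reading, 3 x 1,722,549,413 plus the 201,060,881 bytes remaining gives exactly 5,368,709,120, the full space quota. The last printed value is the approximate amount of file content that fits in the quota at replication 3.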