How does HBase internally parse an "hbase shell" command?
Suppose I run get 't1','r1' in the hbase shell.
How does HBase internally parse and execute this command?
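The command is a JRuby script, defined as one of the shell's set of commands. Because the shell runs on JRuby, a line such as get 't1', 'r1' is read by the Ruby interpreter as an ordinary method call with two arguments.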
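To make the "command is just a Ruby method" idea concrete, here is a minimal sketch of a command registry. It is a simplified model, not HBase's actual shell code: `MiniShell`, `register`, and `export_commands` as written here are hypothetical names chosen for illustration.

```ruby
# Simplified, hypothetical model of a shell command registry.
# This is NOT HBase's shell source; every name below is an assumption.
module MiniShell
  COMMANDS = {}

  class Command
    # Record the subclass under the name users will type at the prompt.
    def self.register(name)
      MiniShell::COMMANDS[name] = self
    end
  end

  class Get < Command
    register 'get'

    # Entry point invoked when the user types: get 't1', 'r1'
    def command(table, row, *args)
      "GET table=#{table} row=#{row} args=#{args.inspect}"
    end
  end

  # Define one method per registered command on the target object, so that
  # typing a command name is an ordinary Ruby method call.
  def self.export_commands(target)
    COMMANDS.each do |name, klass|
      target.define_singleton_method(name) { |*args| klass.new.command(*args) }
    end
  end
end

session = Object.new
MiniShell.export_commands(session)
puts session.get('t1', 'r1')
```

Nothing here parses the command text by hand: the Ruby interpreter does the parsing, and the registry only decides which class handles the call.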
To make this easier to understand, I will use a Java HashMap as an analogy.
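- On insert, your rowkey acts like the key in a Java HashMap; the row is stored on one of the region servers (in the HashMap case, these would be the evenly distributed buckets).
- On retrieval, HBase uses the rowkey to locate the specific region server and fetch the value from the table you named.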
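Unlike a HashMap, though, HBase does not hash the key for you: rows are stored sorted by rowkey, and each region serves a contiguous range of keys. That is why rowkey design matters so much in HBase. Techniques such as salting or prefixing the key with a hash (for example MurmurHash) help spread rows uniformly across region servers and prevent hot spotting.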
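As a rough illustration of salting, the sketch below prefixes each rowkey with a bucket number derived from a hash of the key. It is only a sketch under assumptions: `salted_rowkey` and `NUM_BUCKETS` are hypothetical names, and it uses MD5 from Ruby's standard library in place of MurmurHash, which the standard library does not provide.

```ruby
require 'digest'

# Hypothetical bucket count; in practice it would be chosen with the
# number of regions or region servers in mind.
NUM_BUCKETS = 16

# Prefix the rowkey with a two-digit bucket derived from its hash, so that
# sequential keys no longer sort next to each other.
# MD5 is used here only because it ships with Ruby; it stands in for MurmurHash.
def salted_rowkey(rowkey, buckets = NUM_BUCKETS)
  bucket = Digest::MD5.hexdigest(rowkey)[0, 8].to_i(16) % buckets
  format('%02d-%s', bucket, rowkey)
end

%w[user0001 user0002 user0003].each do |key|
  puts salted_rowkey(key)
end
```

The salt is deterministic, so a reader can recompute it from the original key before issuing a get. The trade-off is that a range scan over the original key order now has to visit every bucket.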
For details, look at get.rb:
module Shell
  module Commands
    class Get < Command
      def help
        return <<-EOF
Get row or cell contents; pass table name, row, and optionally
a dictionary of column(s), timestamp, timerange and versions. Examples:
hbase> get 'ns1:t1', 'r1'
hbase> get 't1', 'r1'
hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
hbase> get 't1', 'r1', {COLUMN => 'c1'}
hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> get 't1', 'r1', 'c1'
hbase> get 't1', 'r1', 'c1', 'c2'
hbase> get 't1', 'r1', ['c1', 'c2']
hbase> get 't1', 'r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
hbase> get 't1', 'r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE'}
hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
Besides the default 'toStringBinary' format, 'get' also supports custom formatting by
column. A user can define a FORMATTER by adding it to the column name in the get
specification. The FORMATTER can be stipulated:
1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
hbase> get 't1', 'r1', {COLUMN => ['cf:qualifier1:toInt',
'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifier). You cannot specify
a FORMATTER for all columns of a column family.
The same commands also can be run on a reference to a table (obtained via get_table or
create_table). Suppose you had a reference t to table 't1', the corresponding commands
would be:
hbase> t.get 'r1'
hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
hbase> t.get 'r1', {COLUMN => 'c1'}
hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> t.get 'r1', 'c1'
hbase> t.get 'r1', 'c1', 'c2'
hbase> t.get 'r1', ['c1', 'c2']
hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE'}
hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
EOF
      end

      def command(table, row, *args)
        get(table(table), row, *args)
      end

      def get(table, row, *args)
        @start_time = Time.now
        formatter.header(["COLUMN", "CELL"])

        count, is_stale = table._get_internal(row, *args) do |column, value|
          formatter.row([ column, value ])
        end

        formatter.footer(count, is_stale)
      end
    end
  end
end

#add get command to table
::Hbase::Table.add_shell_command('get')
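The get method above follows a simple pattern: print a header, let the table yield each (column, value) pair to a block, then print a footer with the count. The sketch below reproduces that control flow with stand-ins. `FakeTable` and its in-memory data are hypothetical; the real `_get_internal` talks to a region server and accepts richer arguments than a plain list of column names.

```ruby
# Hypothetical stand-in for the shell's table wrapper; it reads from an
# in-memory Hash instead of contacting a region server.
class FakeTable
  def initialize(rows)
    @rows = rows
  end

  # Yields each (column, value) pair of the row and returns [count, is_stale],
  # mirroring the shape of the call used in get.rb.
  def _get_internal(row, *columns)
    cells = @rows.fetch(row, {})
    cells = cells.select { |col, _| columns.include?(col) } unless columns.empty?
    cells.each { |col, val| yield(col, val) }
    [cells.size, false]
  end
end

# Same flow as Get#get: header, one line per yielded cell, footer.
# Lines are collected in an array here instead of going through a formatter.
def get(table, row, *args)
  lines = [format('%-12s %s', 'COLUMN', 'CELL')]
  count, is_stale = table._get_internal(row, *args) do |column, value|
    lines << format('%-12s %s', column, value)
  end
  lines << "#{count} cell(s)#{' (stale)' if is_stale}"
  lines
end

table = FakeTable.new({ 'r1' => { 'cf:a' => 'value=1', 'cf:b' => 'value=2' } })
puts get(table, 'r1', 'cf:a')
```

Passing the row formatting in as a block keeps the table code independent of how results are displayed, which is presumably why the shell command is structured this way.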
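Update based on your comment: if you want the same functionality in Java, that is, fetching a single record the way the hbase shell command does, you can follow the code snippet below.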
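import org.apache.hadoop.hbase.util.Bytes;

/**
 * Get a row and log every cell in it.
 * HBaseConn.getHBaseConfig() and getTable(...) are helpers of the surrounding class.
 */
@Override
public void getOneRecord(final String tableName, final String rowKey) throws IOException {
    final HTable table = new HTable(HBaseConn.getHBaseConfig(), getTable(tableName));
    try {
        final Get get = new Get(Bytes.toBytes(rowKey));
        final Result rs = table.get(get);
        for (final KeyValue kv : rs.raw()) {
            // Decode the byte[] fields; concatenating them directly prints array references.
            LOG.info(Bytes.toString(kv.getRow()) + " " + Bytes.toString(kv.getFamily()) + ":"
                    + Bytes.toString(kv.getQualifier()) + " " + kv.getTimestamp());
            LOG.info(Bytes.toString(kv.getValue()));
        }
    } finally {
        table.close(); // release the resources held by the table
    }
}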
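Note: the Java approach and the shell approach are two different approaches. Please do not mix them up. I have seen your other questions as well, and I think you are a bit confused about the two. If you want to write JRuby as I explained above, you can do that too, but it is not the common approach.
Hope this helps.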