Getting a column family's columns in Java: can we get all the column names from an HBase table? - 蒲公英云


红太狼 2022-08-29 15:45


Setup:

I have an HBase table with 100M+ rows and 1 million+ columns. Each row has data for only 2 to 5 columns. There is just one column family.

Problem:

I want to find out all the distinct qualifiers (columns) in this column family. Is there a quick way to do that?

I can think of scanning the whole table, getting the familyMap for each row, reading each qualifier, and adding it to a Set<>. But that would be awfully slow, as there are 100M+ rows.

Can we do any better?
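For reference, the single-client approach described in the question can be sketched in plain Java. This is only an illustration under stated assumptions: the `row` helper and the in-memory list are hypothetical stand-ins for the family maps a real scan would return, and no HBase classes are used. One detail worth noting is that qualifiers are `byte[]`, and a `HashSet<byte[]>` would not deduplicate them because Java arrays compare by identity, so the sketch uses a `TreeSet` with a byte-wise comparator (with the HBase client, `Bytes.BYTES_COMPARATOR` plays that role).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

class NaiveDistinctQualifiers {

    // Lexicographic, unsigned byte-wise ordering (requires Java 9+).
    static final Comparator<byte[]> BYTE_ORDER = (a, b) -> Arrays.compareUnsigned(a, b);

    // Hypothetical stand-in for one row's family map: qualifier bytes -> value bytes.
    static NavigableMap<byte[], byte[]> row(String... qualifiers) {
        NavigableMap<byte[], byte[]> familyMap = new TreeMap<>(BYTE_ORDER);
        for (String q : qualifiers) {
            familyMap.put(q.getBytes(StandardCharsets.UTF_8), new byte[0]);
        }
        return familyMap;
    }

    // Visit every row once and collect each qualifier into a sorted set.
    // Memory grows with the number of distinct qualifiers, time with the number of cells.
    static List<String> distinctQualifiers(List<NavigableMap<byte[], byte[]>> rows) {
        NavigableSet<byte[]> seen = new TreeSet<>(BYTE_ORDER);
        for (NavigableMap<byte[], byte[]> familyMap : rows) {
            seen.addAll(familyMap.keySet());
        }
        List<String> names = new ArrayList<>();
        for (byte[] q : seen) {
            names.add(new String(q, StandardCharsets.UTF_8));
        }
        return names;
    }
}
```

The work is linear in the number of cells, but it all runs through one client, which is the bottleneck the question is worried about.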

Solution

You can use a MapReduce job for this. Unlike a coprocessor, it does not require installing custom libraries on the HBase cluster.

Below is the code for creating the MapReduce job.

Job setup

    Job job = Job.getInstance(config);
    job.setJobName("Distinct columns");

    Scan scan = new Scan();
    scan.setBatch(500); // return at most 500 cells per Result
    scan.addFamily(YOU_COLUMN_FAMILY_NAME);
    scan.setFilter(new KeyOnlyFilter()); // scan only the key part of each KeyValue (row, column family, column)
    scan.setCacheBlocks(false); // don't set to true for MR jobs

    TableMapReduceUtil.initTableMapperJob(
        YOU_TABLE_NAME,
        scan,
        OnlyColumnNameMapper.class, // mapper
        Text.class, // mapper output key
        Text.class, // mapper output value
        job);

    job.setNumReduceTasks(1);
    job.setReducerClass(OnlyColumnNameReducer.class);

Mapper

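    import java.io.IOException;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellScanner;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.Text;

    public class OnlyColumnNameMapper extends TableMapper<Text, Text> {
        @Override
        protected void map(ImmutableBytesWritable key, Result value, final Context context) throws IOException, InterruptedException {
            CellScanner cellScanner = value.cellScanner();
            while (cellScanner.advance()) {
                Cell cell = cellScanner.current();
                byte[] q = Bytes.copy(cell.getQualifierArray(),
                        cell.getQualifierOffset(),
                        cell.getQualifierLength());
                // Emit the qualifier as the key; the value is unused.
                context.write(new Text(q), new Text());
            }
        }
    }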

Reducer

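    import java.io.IOException;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class OnlyColumnNameReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            // Called once per distinct qualifier; the grouped values are ignored.
            context.write(new Text(key), new Text());
        }
    }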
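To see why this job yields each column name exactly once, the data flow can be simulated in plain Java without Hadoop or HBase. This is a rough sketch under simplifying assumptions: the class and method names are hypothetical, qualifiers are modeled as `String` rather than `Text`, and the shuffle is an in-memory `TreeMap`. It mirrors the three steps above: the mapper emits one (qualifier, empty value) pair per cell, the framework groups pairs by key, and the reducer writes each key once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class DistinctColumnsDataFlow {

    // Map phase: one (qualifier, "") pair per cell, duplicates included.
    static List<Map.Entry<String, String>> map(List<List<String>> rows) {
        List<Map.Entry<String, String>> emitted = new ArrayList<>();
        for (List<String> rowQualifiers : rows) {
            for (String qualifier : rowQualifiers) {
                emitted.add(Map.entry(qualifier, ""));
            }
        }
        return emitted;
    }

    // Shuffle phase: group the emitted values by key, with keys in sorted order.
    static SortedMap<String, List<String>> shuffle(List<Map.Entry<String, String>> emitted) {
        SortedMap<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> pair : emitted) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: invoked once per distinct key; the values are ignored.
    static List<String> reduce(SortedMap<String, List<String>> grouped) {
        return new ArrayList<>(grouped.keySet());
    }
}
```

Deduplication happens in the grouping step, so the reducer itself does no comparison work. With a single reduce task, as configured in the job setup, all distinct names end up in one output.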
