HBase Advantages and Disadvantages

Submitted by shiksha.dahiya on

HBase Advantages:

  • It is schema-less
  • It is a column-oriented datastore
  • It is designed to store denormalized data
  • It contains wide and sparsely populated tables
  • It supports automatic partitioning
  • It is built for low-latency operations
  • It provides access to single rows from among billions of records
  • Data is accessed through shell commands or through client APIs in Java, REST, Avro, or Thrift.
  • HBase provides a Java API (including all Java packages, classes, interfaces, methods, fields, and constructors) that clients can use for parallel processing of huge data sets.
  • It integrates easily with Hadoop, both as a source and as a destination.
  • HBase is schema-less, i.e., it has no concept of a fixed column schema; a table defines only column families.
  • It runs on top of distributed storage such as HDFS.
  • HBase is designed for huge tables that store semi-structured as well as structured data.
  • It also supports LAN and WAN failover (switching to a redundant or standby server or network when a failure occurs) and recovery (restoring normal operation on a computer network).
  • HBase provides data replication across clusters for higher availability.
  • It is linearly scalable.
  • It provides random access to data: rows are stored in indexed HDFS files, sorted by row key, for faster lookups.
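
Several of the points above (column families without a fixed column schema, sparse rows, single-row access, row-key ordering) are easier to see in a small model. The following is only a toy in-memory sketch in Python, not the HBase client API; the class name `ToyTable`, its methods, and the sample row keys are all made up for illustration.

```python
from bisect import bisect_left, insort


class ToyTable:
    """Toy in-memory model of a sparse, column-family table (NOT the HBase API)."""

    def __init__(self, column_families):
        # Only column families are fixed up front; column qualifiers are free-form.
        self.column_families = set(column_families)
        self.rows = {}          # row key -> {"family:qualifier": value}
        self.sorted_keys = []   # row keys kept in sorted order

    def put(self, row_key, family, qualifier, value):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        if row_key not in self.rows:
            insort(self.sorted_keys, row_key)
            self.rows[row_key] = {}
        self.rows[row_key][f"{family}:{qualifier}"] = value

    def get(self, row_key):
        # Single-row lookup by row key.
        return dict(self.rows.get(row_key, {}))

    def scan(self, start_key, stop_key):
        # Range scan over [start_key, stop_key) in row-key order.
        i = bisect_left(self.sorted_keys, start_key)
        result = []
        while i < len(self.sorted_keys) and self.sorted_keys[i] < stop_key:
            key = self.sorted_keys[i]
            result.append((key, dict(self.rows[key])))
            i += 1
        return result


table = ToyTable(["info", "stats"])
table.put("user#001", "info", "name", "Asha")
table.put("user#001", "stats", "logins", 7)
table.put("user#002", "info", "email", "ravi@example.com")  # different columns: a sparse row
table.put("order#17", "info", "total", 250)
```

Here `table.get("user#001")` returns only the cells that row actually holds, and `table.scan("user#", "user$")` walks the two user rows in key order without touching `order#17`. Rows that lack a column simply store nothing for it, which is what makes wide, sparsely populated tables cheap.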

HBase Disadvantages:

  • HBase cannot be expected to fully replace traditional models; some features of traditional models are not supported by HBase.
  • HBase cannot perform SQL-like functions. It does not support SQL natively, so it has no query optimizer.
  • HBase is CPU- and memory-intensive, with large sequential input/output access, whereas MapReduce jobs are primarily input/output-bound with fixed memory. Running HBase together with MapReduce jobs can therefore result in unpredictable latencies.
  • Running Pig and Hive jobs against HBase sometimes causes memory issues on the cluster.
  • In a shared cluster environment, fewer task slots per node can be configured, because CPU has to be left free for HBase.
  • The HMaster can be a single point of failure: if it goes down and no backup master is configured, the cluster cannot keep operating normally.
  • HBase does not support multi-row transactions; atomicity is guaranteed only at the level of a single row.
  • Authentication and permissions are not enabled by default. For example, an unsecured HBase installation allows everyone to read from and write to all tables in the system. For many enterprise setups, this kind of policy is unacceptable.
  • Table joins and normalization are very difficult in HBase.
  • Indexing in HBase has to be done manually, which means writing several lines of code (LOC) or scripts; apart from the row key, HBase has no built-in indexes like those on traditional database tables.
  • It is very difficult to store large binary data in HBase.
  • HBase is very expensive in terms of hardware requirements and memory allocation. For example:
    Hardware and maintenance costs are high.
    It requires machines with a high-memory configuration.
  • HBase supports only one default sort order per table (by row key).
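
To illustrate what manual indexing and a single sort order mean in practice, here is a minimal Python sketch that uses plain dictionaries as stand-ins for two tables. It is not HBase code; the table names `users` and `users_by_city`, the `city|row key` index key layout, and both helper functions are assumptions made up for this example.

```python
from bisect import bisect_left

# Toy stand-ins for two tables (plain dicts, NOT the HBase API).
users = {}          # main table: row key -> columns
users_by_city = {}  # hand-maintained index table: "city|row key" -> row key


def put_user(row_key, name, city):
    """Write the row, then write the matching index entry ourselves."""
    old = users.get(row_key)
    if old is not None:
        # Remove the stale index entry when the indexed value changes.
        users_by_city.pop(f"{old['city']}|{row_key}", None)
    users[row_key] = {"name": name, "city": city}
    users_by_city[f"{city}|{row_key}"] = row_key


def find_by_city(city):
    """Prefix scan over the sorted index keys; returns matching row keys."""
    keys = sorted(users_by_city)
    prefix = f"{city}|"
    i = bisect_left(keys, prefix)
    matches = []
    while i < len(keys) and keys[i].startswith(prefix):
        matches.append(users_by_city[keys[i]])
        i += 1
    return matches


put_user("u1", "Asha", "Delhi")
put_user("u2", "Ravi", "Pune")
put_user("u3", "Meena", "Delhi")
put_user("u2", "Ravi", "Delhi")  # city change: the index entry has to be moved by hand
```

Because the main table is ordered only by its row key, looking rows up by city needs this second table, and every write has to keep both in step. In a real deployment those two writes would not be atomic, which is where the lack of multi-row transactions adds to the indexing burden.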