
Identifying Apache HBase and EMRFS tuning options

Apache HBase on Amazon S3 configuration properties

This section helps you optimize the components that support the read/write path for your application's access patterns by identifying the components and properties to configure. Specifically, the goal of tuning is to prepare the initial configurations that influence cluster behavior, storage footprint, and the behavior of the individual components that support the read and write paths.

The whitepaper covers only the HBase tuning properties that were critical to many HBase on Amazon S3 customers during migration. Make sure to test any additional HBase configuration properties that have been tuned on your HDFS-backed cluster but are not included in this section. You also need to tune EMRFS properties to prepare your cluster for scale.

This whitepaper should be used together with existing resources or materials on best practices and operational guidelines for HBase.

For a detailed description of the HBase properties mentioned in this document, refer to HBase default configurations and hbase-default.xml (HBase 1.4.6). For more details on the metrics mentioned in this document, refer to MetricsRegionServerSource.java (HBase 1.4.6). To monitor changes to some of the properties mentioned in this document, you need access to the logs for the HBase Master and specific RegionServers.

To access the HBase logs during tuning, you can use the HBase Web UI: select the HBase Master or the specific RegionServer, and then choose the “Local Logs” tab. Alternatively, you can SSH to the instance that hosts the RegionServer or HBase Master and view the last lines added to the logs under /var/log/hbase.

Next, we identify several settings on HBase, and later on EMRFS, to take into consideration during the tuning stage of the migration.

For some of the HBase properties, we propose a starting value or setting; for others, you will need to iterate on a combination of configurations during performance tests to find adequate values.

All of the configuration settings that you decide to apply can be passed to your Amazon EMR cluster via a configuration object that the Amazon EMR service uses to configure HBase and EMRFS when deploying a new cluster, as shown in the example that follows. For more details, refer to the Applying HBase and EMRFS configurations to the cluster section in this document.
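
As a point of reference, a minimal configuration object in the JSON format that Amazon EMR accepts might look like the following sketch. The hbase-site classification shown is a standard Amazon EMR classification (EMRFS properties use the emrfs-site classification), and the single property value is only a placeholder for the settings discussed in the rest of this section.

[
  {
    "Classification": "hbase-site",
    "Properties": {
      "hbase.master.handler.count": "120"
    }
  }
]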

Speeding up the cluster initialization

Use the properties that follow when you want to speed up the cluster’s startup time, speed up region assignments, and speed up region initialization time. These properties are associated with the HBase Master and the HBase RegionServer.

HBase master tuning:

hbase.master.handler.count
  • This property sets the number of RPC handlers spun up on the HBase Master.

  • The default value is 30.

  • If your cluster has thousands of regions, you will likely need to increase this value. Monitor the queue size (ipc.queue.size), min and max time in queue, total calls time, min and max processing time, and then iterate on this value to find the best value for your use case.

  • Customers at the 20000 region scale have configured this property to 4 times the default value.

HBase RegionServer tuning:

hbase.regionserver.handler.count
  • This property sets the number of RPC handlers created on RegionServers to serve requests. For more information about this configuration setting, refer to hbase.regionserver.handler.count.

  • The default value is 30.

  • Monitor the number of RPC Calls Queued, the 99th percentile latency for RPC calls to stay in queue, and RegionServer memory. Iterate on this value to find the best value for your use case.

  • Customers at the 20000 region scale have configured this property to 4 times the default value.

    hbase.regionserver.executor.openregion.threads
  • This property sets the number of concurrent threads for region opening.

  • The default value is 3.

  • Increase this value if the number of regions per RegionServer is high.

  • For clusters with thousands of regions, it is common to see this value at 10–20 times the default.
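
As an illustration, the hbase-site snippet that follows reflects the starting points discussed above for a cluster with thousands of regions (handler counts at 4 times their defaults and region-opening threads at 10 times the default). Treat these values as a sketch to iterate on during performance tests, not as recommendations.

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.master.handler.count": "120",
    "hbase.regionserver.handler.count": "120",
    "hbase.regionserver.executor.openregion.threads": "30"
  }
}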

Preventing initialization loops

The default values of the properties that follow may be too conservative for some use cases. Depending on the number of regions, number of RegionServers, and the settings you have chosen to control initialization and assignment times, the default values for the master timeout can prevent your cluster from starting up.

Relevant Master initialization timeouts:

hbase.master.initializationmonitor.timeout
  • This property sets the amount of time to sleep in milliseconds before checking the Master’s initialization status.

  • The default value is 900000 (15 minutes).

  • Monitor masterFinishedInitializationTime and the HBase Master logs for a “master failed to complete initialization” timeout message. Iterate on this value to find the best value for your use case.

    hbase.master.namespace.init.timeout
  • This property sets the time for the master to wait for the namespace table to initialize.

  • The default value is 300000 (5 minutes).

  • Monitor the HBase Master logs for a “waiting for namespace table to be assigned” timeout message. Iterate on this value to find the best value for your use case.
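
If you do observe these timeout messages, one hypothetical starting point is to double the default timeouts and iterate from there. The values below are purely illustrative and are not tested recommendations.

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.master.initializationmonitor.timeout": "1800000",
    "hbase.master.namespace.init.timeout": "600000"
  }
}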

Scaling to a high number of regions

The properties that follow allow the HBase Master to handle a high number of regions.

Getting a balanced cluster after initialization

To ensure that the HBase Master only allocates regions when a target number of RegionServers is available, tune the following properties:

hbase.master.wait.on.regionservers.mintostart hbase.master.wait.on.regionservers.maxtostart
  • These properties set the minimum and maximum number of RegionServers the HBase Master will wait for before starting to assign regions.

  • By default, hbase.master.wait.on.regionservers.mintostart is set to 1.

  • An adequate value for both the min and max is 90 to 100 percent of the total number of RegionServers. A high value for both min and max results in a more uniform distribution of regions across RegionServers.

    hbase.master.wait.on.regionservers.timeout hbase.master.wait.on.regionservers.interval
  • The timeout property sets the time, in milliseconds, the master will wait for RegionServers to check in. The default value is 4500.

  • The interval property sets the time period, in milliseconds, used by the master to decide if no new RegionServers have checked in. The default value is 1500.

  • These properties are especially relevant for a cluster with a large number of regions.

  • If your use case requires aggressive initialization times, these properties can be set to lower values so that the condition that is dependent on these properties is evaluated earlier.

  • Customers at the 20000 region scale with a requirement for low initialization time have set the timeout to 400 and the interval to 300.

  • For more information on the condition used by the master to trigger allocation, refer to HBASE-6389.
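
As a sketch, for a hypothetical cluster of 50 RegionServers with a requirement for aggressive initialization times, the guidance above might translate to the following values (the RegionServer count, and therefore the min and max values, are assumptions chosen for illustration):

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.master.wait.on.regionservers.mintostart": "45",
    "hbase.master.wait.on.regionservers.maxtostart": "50",
    "hbase.master.wait.on.regionservers.timeout": "400",
    "hbase.master.wait.on.regionservers.interval": "300"
  }
}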

Preventing timeouts during snapshot operations

Tune the following properties to prevent timeouts during snapshot operations:

hbase.snapshot.master.timeout.millis
  • This property sets the time the master will wait for a snapshot to conclude. This property is especially relevant for tables with a large number of regions.

  • The default value is 300000 (5 minutes).

  • Monitor the logs for snapshot timeout messages and iterate on this value.

  • Customers at the 20000 Region scale have set this property to 1800000 (30 minutes).

    hbase.snapshot.thread.pool.max
  • This property sets the number of threads used by the snapshot manifest loader operation.

  • Default value is 8.

  • This value depends on the instance type and the number of regions in your cluster. Monitor snapshot average time, CPU usage, and your application API to ensure the value you choose does not impact application requests.

  • Customers at the 20000 Region scale have used 2–8 times the default value for this property.

If you will be performing online snapshots while serving traffic, set the following properties to prevent timeouts during the online snapshot operation.

hbase.snapshot.region.timeout
  • This property sets how long RegionServers keep threads in the snapshot request pool waiting.

  • Default value is 300000 (5 minutes).

  • This property is especially relevant for tables with a large number of regions.

  • Monitor memory usage on the RegionServers, monitor the logs for snapshot timeout messages, and iterate on this value. Increasing this value consumes memory on the Region Servers.

  • Customers at the 20000 Region scale have used 1800000 (30 minutes) for this property.

    hbase.snapshot.region.pool.threads
  • This property sets the number of threads used for snapshotting regions on the RegionServer.

  • Default value is 10.

  • If you decide to increase the value for this property, consider setting a lower value for hbase.snapshot.region.timeout.

  • Monitor snapshot average time, CPU usage, and your application API to ensure the value that you choose does not impact application requests.
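
To tie the snapshot-related guidance together, a possible starting snippet combining the 30-minute timeouts mentioned above with a manifest loader thread pool at twice the default might look like the following; monitor snapshot average time and iterate.

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.snapshot.master.timeout.millis": "1800000",
    "hbase.snapshot.region.timeout": "1800000",
    "hbase.snapshot.thread.pool.max": "16"
  }
}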

Running the balancer for specific periods to minimize the impact of region movements on snapshots

Controlling the operation of the Balancer is crucial for smooth operation of the cluster. These properties provide control over the balancer.

hbase.balancer.period hbase.balancer.max.balancing
  • The hbase.balancer.period property configures when the balancer runs, and the hbase.balancer.max.balancing property configures how long the balancer runs.

  • These properties are especially relevant if you programmatically take snapshots of the data every few hours because the snapshot operation will fail if regions are being moved. You can monitor the snapshot average time to have more insight into the snapshot operation.

At a high level, balancing requires flushing data, closing the region, moving the region, and then opening it on a new RegionServer. For this reason, on busy clusters, consider running the balancer every couple of hours and configuring it to run for only one hour (see the example that follows).
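
For example, running the balancer every two hours for at most one hour corresponds to the following values in milliseconds; whether these periods suit your cluster depends on your snapshot schedule and traffic pattern.

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.balancer.period": "7200000",
    "hbase.balancer.max.balancing": "3600000"
  }
}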

Tuning the balancer

Consider the following additional properties when configuring the balancer:

  • hbase.master.loadbalancer.class

  • hbase.balancer.period

  • hbase.balancer.max.balancing

We recommend that you test your current LoadBalancer settings, and then iterate on the configurations. The default LoadBalancer is the Stochastic Balancer. If you choose to use the default LoadBalancer, refer to StochasticLoadBalancer for more details on the various factors and costs associated with this balancer. Most use cases can use the default values.

Access pattern considerations and read/write path tuning

This section covers tuning the diverse HBase components that support the read/update/write paths.

To properly tune the components that support the read/update/write paths, you start by identifying the overall access pattern of your application.

If the access pattern is read-heavy, then you can reduce the resources allocated to the write path. The same guidelines apply for write-heavy access patterns. For mixed access patterns, you should strive for a balance.

Tuning the read path

This section identifies the properties used when tuning the read path. The properties that follow are beneficial for both random-read and sequential-read access patterns.

Tuning the size of the BucketCache

The BucketCache is central to HBase on Amazon S3. The properties that follow set the overall size of the BucketCache per instance and allocate a percentage of the total size of the BucketCache to specialized areas, such as single access BucketCache, multiple access BucketCache, and in-memory BucketCache. For more details, refer to HBASE-18533.

The goal of this section is to configure the BucketCache to support your access pattern. For an access pattern of random reads and sequential reads, it is recommended to cache all data in the BucketCache, which is stored on disk. In other words, each instance allocates part of its disk to the BucketCache so that the total size of the BucketCache across all the instances in the cluster equals the size of the data on Amazon S3.

To configure the BucketCache, tune the following properties:

hbase.bucketcache.size
  • As a baseline, set the BucketCache to a value equal to the size of data you would like cached.

  • This property impacts Amazon S3 traffic. If the data is not in the cache, HBase must retrieve the data from Amazon S3.

  • If the BucketCache size is smaller than the amount of data being cached, it may cause many cache evictions, which will also increase overhead on GC. Moreover, it will increase Amazon S3 traffic. Set the BucketCache size to the total size of the dataset required for your application’s read access pattern.

  • Take into account the available disk resources when setting this property. HBase on Amazon S3 uses HDFS for the write path so the total disk available for the BucketCache must consider any storage required by Apache Hadoop/OS/HDFS. Refer to the Amazon EMR cluster setup section of this document for recommendations on sizing the cluster local storage for the BucketCache, choosing storage type and its mix (multiple disks versus a single large disk).

  • Monitor GC, cache evictions metrics, cache hit ratio, cache miss ratio (you can use HBase UI to quickly access these metrics) to support the process of choosing the best value. Moreover, consider the application SLA requirements to increase or decrease the BucketCache size. Iterate on this value to find the best value for your use case.

    hbase.bucketcache.single.factor hbase.bucketcache.multi.factor hbase.bucketcache.memory.factor
  • Note that the bucket areas follow the same layout as the LRU cache. A block initially read from Amazon S3 is placed in the single-access area (hbase.bucketcache.single.factor), and subsequent accesses promote that block into the multi-access area (hbase.bucketcache.multi.factor). The in-memory area is reserved for blocks loaded from column families flagged as IN_MEMORY (hbase.bucketcache.memory.factor).

  • By default, these areas are sized at 25%, 50%, 25% of the total BucketCache size, respectively.

  • Tune these values according to the access pattern of your application.

  • These properties impact Amazon S3 traffic. For example, if single access is more prevalent than multi access, you can reduce the size allocated to the multi-access area. If multi access is common, ensure that the multi-access area is large enough to avoid cache evictions.

    hbase.rs.cacheblocksonwrite
  • This property forces all blocks that are written to be added to the cache automatically. Set this property to true.

  • This property is especially relevant to read-heavy workloads and setting it to true will populate the cache and reduce Amazon S3 traffic when a read request to the data is issued. Setting this to false in read-heavy workloads will result in reduced read performance and increased Amazon S3 activity.

  • HBase on Amazon S3 uses the file-based BucketCache together with the on-heap cache, the BlockCache. This setup is commonly referred to as a combined cache. The BucketCache only stores data blocks, and the BlockCache stores bloom filters and indices. The physical location of the file-based BucketCache is the disk, and the location of the BlockCache is the heap.
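
As a rough sketch, assume (for illustration only) a cluster of 10 RegionServers caching a 10 TB dataset, so that each instance needs roughly 1 TB of BucketCache. Because hbase.bucketcache.size is interpreted as megabytes when greater than 1, that works out to approximately 1048576 MB per instance, combined here with the cache-on-write setting recommended above; the single/multi/in-memory factors are left at their defaults.

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.bucketcache.size": "1048576",
    "hbase.rs.cacheblocksonwrite": "true"
  }
}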

Pre-warming the BucketCache

HBase provides additional properties that control the prefetching of blocks when a region is opened. This is a form of cache pre-warming and is recommended for HBase on Amazon S3, especially for read access patterns. Pre-warming the BucketCache results in reduced Amazon S3 traffic for subsequent requests.

Disabling pre-warming in read-heavy workloads results in reduced read performance and increased Amazon S3 activity.

To configure HBase to prefetch blocks, set the following properties:

hbase.rs.prefetchblocksonopen
  • This property controls whether the server should asynchronously load all of the blocks when a store file is opened (data, meta, and index). Note that enabling this property contributes to the time the RegionServer takes to open a region and therefore initialize.

  • Set this value to true to apply the property to all tables. Prefetching can also be enabled per column family instead of using a global setting; prefer the per-column-family approach over enabling it cluster-wide.

  • If you set hbase.rs.prefetchblocksonopen to true, the properties that follow increase the number of threads and the size of the queue for the pre-fetch operation:

  • Set hbase.bucketcache.writer.queuelength to 1024 as a starting value. The default value is 64.

  • Set hbase.bucketcache.writer.threads to 6 as a starting value. The default value is 3.

  • The values should be configured together, taking into account the instance type of the RegionServer and the number of regions per RegionServer. By increasing the number of threads, you may be able to choose a lower value for hbase.bucketcache.writer.queuelength.

  • Together with hbase.rs.prefetchblocksonopen, these properties control how quickly data is retrieved from Amazon S3 during the prefetch.

  • Monitor HBase logs to understand how fast the bucket cache is being initialized and monitor cluster resources to see the impact of the properties on memory and CPU. Iterate on these values to find the best value for your use case.

  • For more details, refer to HBASE-15240.
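
Putting the prefetch-related starting values above together, applied globally rather than per column family, might look like the following sketch:

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.rs.prefetchblocksonopen": "true",
    "hbase.bucketcache.writer.queuelength": "1024",
    "hbase.bucketcache.writer.threads": "6"
  }
}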

Modifying the table schema to support pre-warming

Finally, prefetching can be applied globally or per column family. In addition, the IN_MEMORY region of the BucketCache can be applied per column family.

You must change the schema of the tables to set these properties. For each column family, and based on your access patterns, identify which column families should always be placed in memory and which column families benefit from prefetching. For example, if one column family is never read by the HBase read path (only read by an ETL job), you can save resources on the cluster and avoid using the PREFETCH_BLOCKS_ON_OPEN key or the IN_MEMORY key for that column family. To modify an existing table to use the PREFETCH_BLOCKS_ON_OPEN or IN_MEMORY keys, see the following examples:

hbase shell
hbase(main):001:0> alter 'MyTable', NAME => 'myCF', PREFETCH_BLOCKS_ON_OPEN => 'true'
hbase(main):002:0> alter 'MyTable', NAME => 'myCF2', IN_MEMORY => 'true'

Tuning the updates/write path

This section shows you how to tune and size the MemStore to avoid having frequent flushes and small HFiles. As a result, the frequency of compactions and Amazon S3 traffic is reduced.

hbase.regionserver.global.memstore.size
  • This property sets the maximum size of all MemStores in a RegionServer.

  • The memory allocated to the MemStores is kept in the main memory of the RegionServers.

  • If the value of hbase.regionserver.global.memstore.size is exceeded, updates are blocked, and flushes are forced until the total size of all the MemStores in a RegionServer is at or below the value of hbase.regionserver.global.memstore.size.lower.limit.

  • The default value is 0.4 (40% of the heap).

  • For write-heavy access patterns, you can increase this value to increase the heap area dedicated to all MemStores.

  • Consider the number of regions per Region Server and the number of column families with high write activity when setting this value.

  • For read-heavy access patterns, this setting can be decreased to free up resources that support the read path.

    hbase.hregion.memstore.flush.size
  • This property sets the flush size per MemStore.

  • Depending on the SLA of your API, the flush size may need to be higher than the flush size configured on the source cluster.

  • This setting impacts the traffic to Amazon S3, the size of HFiles, and the impact of compactions in your cluster. The higher you set the value, the fewer Amazon S3 operations are required, and the higher the size of each resulting HFile.

  • This value is dependent on the total number of regions per RegionServer and the number of column families with high write activity.

  • Use 536870912 bytes (512 MB) as the starting value, then monitor the frequency of flushes, the MemStore flush queue size, and application API response times. If the frequency of flushes and the queue size are high, increase this setting.

    hbase.regionserver.global.memstore.size.lower.limit
  • When the size of all Memstores exceeds this value, flushes are forced. This property prevents the Memstore from being blocked for updates.

  • By default, this property is set to 0.95, 95% of the value set for hbase.regionserver.global.memstore.size.

  • This value depends on the number of Regions per RegionServer and the write activity in your cluster.

  • You might want to decrease this value if, as soon as the MemStores reach this safety threshold, write activity quickly fills the remaining 5% and the MemStore is blocked for writes.

  • Monitor the frequency of flushes, the MemStore flush queue size, and application API response times. If the frequency and queue size are high, increase this setting.

    hbase.hregion.memstore.block.multiplier
  • This property is a safety threshold and controls spikes in write traffic.

  • Specifically, this property sets the threshold at which writes are blocked: if the MemStore reaches hbase.hregion.memstore.block.multiplier times hbase.hregion.memstore.flush.size bytes, writes are blocked.

  • In case of spikes in traffic, this property prevents the Memstore from continuing to grow and in this way prevents HFiles with large sizes.

  • The default value is 4.

  • If your traffic has a constant pattern, consider keeping the default value for this property and tune only hbase.hregion.memstore.flush.size.

    hbase.hregion.percolumnfamilyflush.size.lower.bound.min
  • For the tables that have multiple column families, this property forces HBase to flush only the MemStores of each column family that exceed hbase.hregion.percolumnfamilyflush.size.lower.bound.min.

  • The default value for this property is 16777216 bytes.

  • This setting impacts the traffic to Amazon S3. A higher value reduces traffic to Amazon S3.

  • For write-heavy access patterns with multiple column families, this property should be changed to a value higher than the default of 16777216 bytes but less than half of the value of hbase.hregion.memstore.flush.size.

    hfile.block.cache.size
  • This property sets the percentage of the heap to be allocated to the BlockCache.

  • Use the default value of 0.4 for this property.

  • By default, the BucketCache stores data blocks, and the BlockCache stores bloom filters and indices.

  • You will need to allocate enough of the heap to cache indices and bloom filters, if applicable. To measure HFile indices and bloom filters sizes, access the web UI of one of the RegionServers.

  • Iterate on this value to find the best value for your use case.

    hbase.hstore.flusher.count
  • This property controls the number of threads available to flush writes from memory to Amazon S3.

  • The default value is 2.

  • This setting impacts the traffic to Amazon S3. A higher value reduces the MemStore flush queue and speeds up writes to Amazon S3. This setting is valuable for write-heavy environments. The value is dependent on the instance type used by the cluster.

  • Test the value and gradually increase it to 10.

  • Monitor the MemStore flush queue size, the 99th percentile for flush time, and application API response times. Iterate on this value to find the best value for your use case.
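
As one possible starting point for a write-heavy workload with multiple column families, the sketch below uses the 512 MB flush size suggested above, a per-column-family lower bound of 128 MB (an illustrative value below half of the flush size), and a modestly increased flusher count to be iterated on; these are assumptions to test, not recommendations.

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.hregion.memstore.flush.size": "536870912",
    "hbase.hregion.percolumnfamilyflush.size.lower.bound.min": "134217728",
    "hbase.hstore.flusher.count": "4"
  }
}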

Note

Small clusters with high region density and high write activity should also tune HDFS properties that allow the HDFS NameNode and the HDFS DataNode to scale. Specifically, properties dfs.datanode.handler.count and dfs.namenode.handler.count should be increased to at least 3x their default value of 10.
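
For the case described in this note, a minimal hdfs-site snippet along the suggested lines (3 times the default handler counts) could be:

{
  "Classification": "hdfs-site",
  "Properties": {
    "dfs.namenode.handler.count": "30",
    "dfs.datanode.handler.count": "30"
  }
}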

Region size considerations

Since this process is a migration, set the region size to the same region size as on your HDFS-backed cluster.

As a reference, on HBase on Amazon S3, customer regions fall into these categories: small-sized regions (1-10 GB), mid-sized regions (10-50 GB), and large-sized regions (50-100 GB).

Controlling the Size of Regions and Region Splits

The hbase.hregion.max.filesize property sets the maximum size of the regions in your cluster. It should be configured together with the property hbase.regionserver.region.split.policy, which is not covered in detail in this whitepaper.

  • Use your current cluster’s value for hbase.hregion.max.filesize

  • As a starting point you can use the value in your HDFS backed cluster.

  • Set hbase.regionserver.region.split.policy to the same policy in your HDFS backed cluster.

  • This property controls when a region should be split.

  • The default value is org.apache.hadoop.hbase.regionserver.SteppingSplitPolicy.

  • Set hbase.regionserver.regionSplitLimit to the same value in your HDFS backed cluster.

  • This property acts as a guideline/limit for the RegionServer to stop splitting.

  • Its default value is 1000.

Tuning HBase compactions

This section shows you how to configure properties that control major compactions, reduce the frequency of minor compactions, and control the size of HFiles to reduce Amazon S3 traffic.

Controlling major compactions

In production environments, we recommend that you disable automatic major compactions. However, there should always be a process to run major compactions. Some customers opt to have an application that incrementally runs major compactions in the background, on a table or RegionServer basis.

Set hbase.hregion.majorcompaction to 0 to disable automatically scheduled major compactions.

Reduce the frequency of minor compactions and control the size of HFiles to reduce Amazon S3 traffic

The following properties are dependent on the MemStore size, flush size, and the need to control the frequency of minor compactions.

The properties that follow should be set according to response time needs and average size of generated StoreFiles during a MemStore flush.

To control the behavior of minor compactions, tune these properties:

hbase.hstore.compaction.min.size

  • If a StoreFile is smaller than the value set by this property, the StoreFile will be selected for compaction. If StoreFiles have a size equal to or larger than the value of hbase.hstore.compaction.min.size, hbase.hstore.compaction.ratio is used to determine whether the files are eligible for compaction.

  • By default, this value is set to 134217728 bytes.

  • This setting depends on the frequency of flushes, the size of StoreFiles generated by flushes, and hbase.hregion.memstore.flush.size.

  • This setting impacts the traffic to Amazon S3. The higher you set the value, the more frequently minor compactions occur, which in turn triggers Amazon S3 activity.

  • For write-heavy environments with many small StoreFiles, you may want to decrease this value to reduce the frequency of minor compactions and therefore Amazon S3 activity.

  • Monitor the frequency of compactions, the overall StoreFile size, and iterate on this value to find the best value for your use case.

    hbase.hstore.compaction.max.size
  • If a StoreFile is larger than the value set by this property, the StoreFile is not selected for compaction.

  • This value setting depends on the size of the HFiles generated by flushes and the frequency of minor compactions.

  • If you increase this value, you will have fewer, larger StoreFiles and increased Amazon S3 activity. If you decrease this value, you will have more, smaller StoreFiles, and reduced Amazon S3 activity.

  • Monitor the frequency of compactions, the compaction output size, the overall StoreFile size, and iterate on this value.

    hbase.hstore.compaction.ratio

Accept the default of 1.2 as a starting value for this property. For more details on this property, refer to hbase-default.xml.

hbase.hstore.compactionThreshold
  • If the number of StoreFiles in a store reaches hbase.hstore.compactionThreshold, a minor compaction is run to re-write the StoreFiles into one.

  • A high value will result in less frequent minor compactions, larger StoreFiles, longer minor compactions, and less Amazon S3 activity.

  • The default value is 3.

  • Start with a value of 6, monitor Compaction Frequency, the average size of StoreFiles, compaction output size, compaction time, and iterate on this value to get the best value for your use case.

    hbase.hstore.blockingStoreFiles
  • This property sets the total number of StoreFiles a single store can have before updates are blocked for this region. If this value is exceeded, updates are blocked until a compaction concludes or hbase.hstore.blockingWaitTime is exceeded.

  • For write-heavy workloads, use two to three times the default value as a starting value.

  • The default value is 16.

  • Monitor the frequency of flushes, blocked requests count, and compaction time to set the proper value for this property.
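
To tie the compaction guidance together, the sketch below disables scheduled major compactions, starts the compaction threshold at 6, and doubles blockingStoreFiles for a write-heavy workload; the minimum and maximum compaction sizes are left at their defaults so they can be tuned from observed StoreFile sizes.

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.hregion.majorcompaction": "0",
    "hbase.hstore.compactionThreshold": "6",
    "hbase.hstore.blockingStoreFiles": "32"
  }
}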

Minor and major compactions will flush the BucketCache. For more details, refer to HBASE-1597.

Controlling the storage footprint locally and on Amazon S3

At a high level, with HBase on Amazon S3, WALs are stored on HDFS. When a compaction occurs, the previous HFiles are moved to the archive and are deleted only if they are not associated with a snapshot. HBase relies on a Cleaner chore that is responsible for deleting unnecessary HFiles and expired WALs.

Ensuring the Cleaner Chore is always running

With HBase 1.4.6 (Amazon EMR version 5.17.0 and later), we recommend that you deploy the cluster with the cleaner enabled. This is the default behavior. The property that sets this behavior is hbase.master.cleaner.interval.

We recommend that you use the latest Amazon EMR release. For versions prior to Amazon EMR 5.17.0, see the Operational Considerations section for the HBase shell commands that control the cleaner behavior.

To enable the cleaner globally, set the hbase.master.cleaner.interval to 1.

Speeding up the cleaner chore

HBASE-20555, HBASE-20352, and HBASE-17215 add additional control to the Cleaner Chore that deletes expired WALs (HLogCleaner) and expired archived HFiles (HFileCleaner). These controls are available on HBase 1.4.6 (Amazon EMR version 5.17.0) and later.

The number of threads allocated to the properties that follow should be configured together, taking into consideration the capacity and instance type of the Amazon EMR master node.

hbase.regionserver.hfilecleaner.large.thread.count
  • This property sets the number of threads allocated to clean expired large HFiles.

  • hbase.regionserver.thread.hfilecleaner.throttle sets the size that distinguishes between a small and large file. The default value is 64 MB.

  • The value for this property is dependent on the number of flushes, write activity in the cluster, and snapshot deletion frequency.

  • The higher the write and snapshot deletion activity, the higher the value should be.

  • By default, this property is set to 1.

  • Monitor the size of the HBase root directory on Amazon S3 and iterate on this value to find the best value for your use case. Consider the Amazon EMR Master CPU resources and the values set for the other configuration properties identified in this section. For more information, see the Enabling Amazon S3 metrics for the HBase on Amazon S3 root directory section of this document.

    hbase.regionserver.hfilecleaner.small.thread.count
  • This property sets the number of threads allocated to clean expired small HFiles.

  • The value for this property is dependent on the number of flushes, write activity in the cluster, and snapshot deletion frequency.

  • By default, this property is set to 1.

  • The higher the write and snapshot deletion activity, the higher the value should be.

  • Monitor the size of the HBase root directory on Amazon S3 and iterate on this value to find the best value for your use case. Consider the Amazon EMR Master CPU resources and the values set for the other configuration properties identified in this section.

    hbase.cleaner.scan.dir.concurrent.size
  • This property sets the number of threads to scan the oldWALs directories.

  • The value for this property is dependent on the write activity in the cluster.

  • By default, this property is set to ¼ of all available cores.

  • Monitor HDFS use and iterate on this value to find the best value for your use case. Consider the Amazon EMR Master CPU resources and the values set for the other configuration properties identified in this section.

    hbase.oldwals.cleaner.thread.size
  • This property sets the number of threads to clean the WALs under the oldWALs directory.

  • The value for this property is dependent on the write activity in the cluster.

  • By default, this property is set to 2.

  • Monitor HDFS use and iterate on this value to find the best value for your use case. Consider the Amazon EMR Master CPU resources and the values set for the other configuration properties identified in this section.

For more details on how to set the configuration properties to clean expired WALs, refer to HBASE-20352.
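
For a cluster with heavy write and snapshot-deletion activity, one hypothetical starting point is to raise the cleaner thread counts above their defaults, as in the sketch below, and then watch the size of the HBase root directory on Amazon S3 and the master node CPU; the values are illustrative only.

{
  "Classification": "hbase-site",
  "Properties": {
    "hbase.regionserver.hfilecleaner.large.thread.count": "2",
    "hbase.regionserver.hfilecleaner.small.thread.count": "2",
    "hbase.oldwals.cleaner.thread.size": "4"
  }
}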

Porting existing settings to HBase on Amazon S3

Some properties that you have tuned in your on-premises cluster were not included in the Apache HBase tuning section. These configurations include the heap size for HBase and supporting Apache Hadoop components, the Apache HBase split policy, Apache ZooKeeper timeouts, and so on. For these configuration properties, use the value from your HDFS-backed cluster as a starting point. Follow the same process to iterate and determine the best value that supports your use case.

EMRFS configuration properties

Starting December 1, 2020, Amazon S3 delivers strong read-after-write consistency automatically for all applications. Therefore, there is no need to enable EMRFS consistent view and other consistent view-related configurations as detailed in Configure consistent view in the Amazon EMR Management Guide. For more details on Amazon S3 strong read-after-write consistency, refer to Amazon S3 now delivers strong read-after-write consistency automatically for all applications.

Setting the total number of connections used by EMRFS to read/write data from/to Amazon S3

With HBase on Amazon S3, access to data is done via EMRFS. This means that tasks such as an Apache HBase region opening, MemStore flushes, and compactions all initiate requests to Amazon S3. To support workloads with a large number of regions and large datasets, you must tune the total number of connections to Amazon S3 that EMRFS can make (fs.s3.maxConnections).

To tune fs.s3.maxConnections, account for the average size of the HFiles, number of regions, the frequency of minor compactions, and the overall read and write throughput the cluster is experiencing.

fs.s3.maxConnections
  • The default value for HBase on Amazon S3 is 10000. This value should be set to more than 10000 for clusters with a large number of regions (2500+), large datasets (1 TB+), high minor compactions activity, and intense read/write activity.

  • Monitor the logs for the ERROR message “Unable to execute HTTP request: Timeout waiting for connection” and iterate on this value. See more details about this error message in the Troubleshooting section of this document.

  • Several customers at the 50 TB+/20,000 region scale set this property to 50000.
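
For example, an emrfs-site classification using the value reported by customers at that scale would look like the following; smaller clusters can stay at the default of 10000.

{
  "Classification": "emrfs-site",
  "Properties": {
    "fs.s3.maxConnections": "50000"
  }
}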