Use Cases - Comparing the Use of Amazon DynamoDB and Apache HBase for NoSQL

Use Cases

Amazon DynamoDB and Apache HBase are optimized to process massive amounts of data. Popular use cases for Amazon DynamoDB and Apache HBase include the following:

  • Serverless applications—Amazon DynamoDB provides a durable backend for storing data at any scale and is widely used to power web and mobile backends in the e-commerce/retail, education, and media verticals.

  • High volume special events—Special events and seasonal events, such as national electoral campaigns, are of relatively short duration and have variable workloads with the potential to consume large amounts of resources. Amazon DynamoDB lets you increase capacity when you need it and decrease it when you no longer do, which makes it a suitable choice for such events.

  • Social media applications—Community-based applications, such as online gaming, photo sharing, and location-aware applications, have unpredictable usage patterns with the potential to go viral at any time. The elasticity and flexibility of Amazon DynamoDB make it suitable for such high volume, variable workloads.

  • Regulatory and compliance requirements—Both Amazon DynamoDB and Amazon EMR (the managed service on which Apache HBase runs) are in scope of the AWS compliance efforts and are therefore suitable for healthcare and financial services workloads, as described in AWS Services in Scope by Compliance Program.

  • Batch-oriented processing—For large datasets, such as log data, weather data, and product catalogs, you may already have large amounts of historical data that you want to keep for trend analysis, while also needing to ingest and batch process current data for predictive purposes. For these types of workloads, Apache HBase is a good choice because of its high read and write throughput and efficient storage of sparse data.

  • Reporting—To process and report on high volume transactional data, such as daily stock market trades, Apache HBase is a good choice because it supports high-throughput writes and high update rates, which make it suitable for storing high frequency counters and computing complex aggregations.

  • Real-time analytics—The payload or message size in event data, such as tweets and e-commerce events, is relatively small compared with application logs. If you want to ingest streaming event data in real time for sentiment analysis, ad serving, trending analysis, and so on, Amazon DynamoDB lets you increase throughput capacity when you need it and decrease it when you are done, with no downtime. Apache HBase can handle real-time ingestion of data, such as application logs, with ease because of its high write throughput and efficient storage of sparse data. Combining this capability with Hadoop's ability to handle sequential reads and scans in a highly optimized way makes Apache HBase a powerful tool for real-time data analytics.
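
The capacity adjustments mentioned for special events and real-time analytics can be sketched roughly as follows. This is a minimal illustration, assuming a table in provisioned capacity mode; the table name `EventStream`, the request rates, and the 600-byte item size are hypothetical values chosen for the example, not figures from this paper. The sizing rules used are the standard ones: one write capacity unit covers one write per second for an item up to 1 KB, and one read capacity unit covers one strongly consistent read per second (or two eventually consistent reads) for an item up to 4 KB. The sketch only builds the request parameters; sending them would require an AWS SDK such as boto3 and valid credentials.

```python
import math


def write_capacity_units(writes_per_second: int, item_size_bytes: int) -> int:
    """Estimate WCUs: one unit per write per second for each 1 KB (rounded up)."""
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return writes_per_second * units_per_write


def read_capacity_units(reads_per_second: int, item_size_bytes: int,
                        strongly_consistent: bool = True) -> int:
    """Estimate RCUs: one unit per strongly consistent read per second for
    each 4 KB (rounded up); eventually consistent reads cost half as much."""
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    total = reads_per_second * units_per_read
    return total if strongly_consistent else math.ceil(total / 2)


def update_table_kwargs(table_name: str, rcu: int, wcu: int) -> dict:
    """Build keyword arguments in the shape of the DynamoDB UpdateTable request."""
    return {
        "TableName": table_name,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        },
    }


# Hypothetical event workload: small (600-byte) items, eventually consistent reads.
peak_request = update_table_kwargs(
    "EventStream",
    rcu=read_capacity_units(2000, 600, strongly_consistent=False),
    wcu=write_capacity_units(5000, 600),
)
baseline_request = update_table_kwargs(
    "EventStream",
    rcu=read_capacity_units(100, 600, strongly_consistent=False),
    wcu=write_capacity_units(200, 600),
)

# With boto3 installed and credentials configured, the requests could be sent as:
#   import boto3
#   boto3.client("dynamodb").update_table(**peak_request)      # before the event
#   boto3.client("dynamodb").update_table(**baseline_request)  # after the event
```

Because the event items in this example fit within 1 KB, each write consumes a single capacity unit, so the required write capacity tracks the request rate directly. Tables in on-demand capacity mode do not need this calculation, since capacity is not provisioned ahead of time.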