Summary Prerequisites and limitations Architecture Tools Best practices Epics Related resources

Assess query performance for migrating SQL Server databases to MongoDB Atlas on AWS

Created by Battulga Purevragchaa (AWS), Krishnakumar Sathyanarayana (PeerIslands US Inc), and Babu Srinivasan (MongoDB)

Environment: PoC or pilot	Source: Microsoft SQL Server	Target: MongoDB Atlas or MongoDB Enterprise Advanced
R Type: Replatform	Workload: Microsoft	Technologies: Databases; Migration

Summary

This pattern provides guidance for loading MongoDB with near real-world data and assessing MongoDB query performance that is as close to the production scenario as possible. The assessment provides input to help you plan your migration to MongoDB from a relational database. The pattern uses PeerIslands Test Data Generator and Performance Analyzer to test query performance.

This pattern is particularly useful for Microsoft SQL Server migration to MongoDB, because performing schema transformations and loading data from current SQL Server instances to MongoDB can be very complex. Instead, you can load near real-world data into MongoDB, understand MongoDB performance, and fine-tune the schema design before you start the actual migration.

Prerequisites and limitations

Prerequisites

An active AWS account
Familiarity with MongoDB Atlas
Target MongoDB schema
Typical query patterns

Limitations

Data load times and performance will be limited by the MongoDB cluster instance size. We recommend that you choose instances that are recommended for production use to understand real-world performance.
PeerIslands Test Data Generator and Performance Analyzer currently supports only online data loads and queries. Offline batch processing (for example, loading data into MongoDB by using Spark connectors) isn’t yet supported.
PeerIslands Test Data Generator and Performance Analyzer supports field relations within a collection. It doesn’t support relationships across collections.

Product editions

This pattern supports both MongoDB Atlas and MongoDB Enterprise Advanced.

Architecture

Target technology stack

MongoDB Atlas or MongoDB Enterprise Advanced

Architecture

PeerIslands Test Data Generator and Performance Analyzer is built by using Java and Angular, and stores its generated data on Amazon Elastic Block Store (Amazon EBS). The tool consists of two workflows: test data generation and performance testing.

In test data generation, you create a template, which is the JSON representation of the data model that has to be generated. After you create the template, you can generate the data in a target collection, as defined by the load generation configuration.
In performance testing, you create a profile. A profile is a multi-stage testing scenario where you can configure create, read, update, and delete (CRUD) operations, aggregation pipelines, the weightage for each operation, and the duration of each stage. After you create the profile, you can run performance testing on the target database based on the configuration.

PeerIslands Test Data Generator and Performance Analyzer stores its data on Amazon EBS, so you can connect Amazon EBS to MongoDB by using any MongoDB-supported connection mechanism, including peering, allow lists, and private endpoints. By default, the tool doesn’t include operational components; however, it can be configured with Amazon Managed Service for Prometheus, Amazon Managed Grafana, Amazon CloudWatch, and AWS Secrets Manager if required.

Tools

PeerIslands Test Data Generator and Performance Analyzer includes two components. The Test Data Generator component helps you generate highly customer-specific, real-world data based on your MongoDB schema. The tool is fully UI-driven with a rich data library and can be used to quickly generate billions of records on MongoDB. The tool also provides capabilities to implement relationships between fields in the MongoDB schema. The Performance Analyzer component helps you generate highly customer-specific queries and aggregations, and perform realistic performance testing on MongoDB. You can use the Performance Analyzer to test MongoDB performance with rich load profiles and parameterized queries for your specific use case.

Best practices

See the following resources:

MongoDB Schema Design Best Practices (MongoDB Developer website)
Best Practices of Deploying MongoDB Atlas on AWS (MongoDB website)
Connecting Applications Securely to a MongoDB Atlas Data Plane with AWS PrivateLink (AWS blog post)
Best Practices Guide for MongoDB Performance (MongoDB website)

Epics

Task	Description	Skills required
Understand the database footprint of the current SQL Server source.	Understand your current SQL Server footprint. This can be achieved by running queries against the `INFORMATION` schema of the database. Determine the number of tables and size of each table. Analyze the index associated with each table. For more information about SQL analysis, see the blog post SQL2Mongo: Data Migration Journey on the PeerIslands website.	DBA
Understand the source schema.	Determine the table schema and the business representation of the data (for example, zip codes, names, and currency). Use your existing entity relationship (ER) diagram or generate the ER diagram from the existing database. For more information, see the blog post SQL2Mongo: Data Migration Journey on the PeerIslands website.	DBA
Understand query patterns.	Document the top 10 SQL queries you use. You can use the performance_schema.events_statements_summary_by_digest tables that are available in the database to understand the top queries. For more information, see the blog post SQL2Mongo: Data Migration Journey on the PeerIslands website.	DBA
Understand SLA commitments.	Document the target service-level agreements (SLAs) for database operations. Typical measures include query latency and queries per second. The measures and their targets are typically available in non-functional requirements (NFR) documents.	DBA

Task	Description	Skills required
Define the target schema.	Define various options for the target MongoDB schema. For more information, see Schemas in the MongoDB Atlas documentation. Consider the best practices and design patterns based on the table relations. See Data Model Examples and Patterns in the MongoDB documentation for details.	MongoDB engineer
Define target query patterns.	Define MongoDB queries and aggregation pipelines. These queries are the equivalent of the top queries you captured for your SQL Server workload. To understand how to construct MongoDB aggregation pipelines, see the MongoDB documentation.	MongoDB engineer
Define the MongoDB instance type.	Determine the size of the instance that you plan to use for testing. For guidance, see the MongoDB documentation.	MongoDB engineer

Task	Description	Skills required
Set up the MongoDB Atlas cluster.	To set up a MongoDB cluster on AWS, follow the instructions in the MongoDB documentation.	MongoDB engineer
Create users in the target database.	Configure the MongoDB Atlas cluster for access and network security by following the instructions in the MongoDB documentation.	MongoDB engineer
Create appropriate roles in AWS and configure role-based access control for Atlas.	If required, set up additional users by following the instructions in the MongoDB documentation. Configure authentication and authorization through AWS roles.	MongoDB engineer
Set up Compass for MongoDB Atlas access.	Set up the MongoDB Compass GUI utility for ease of navigation and access.	MongoDB engineer

Task	Description	Skills required
Install Test Data Generator.	Install PeerIsland Test Data Generator in your environment.	MongoDB engineer
Configure Test Data Generator to generate the appropriate data.	Create a template by using the data library to generate specific data for each field in the MongoDB schema. For more information, see the MongoDB Data Generator & Perf. Analyzer video.	MongoDB engineer
Horizontally scale Test Data Generator to generate the required load.	Use the template you created to start the load generation against the target collection by configuring the required parallelism. Determine the time frames and scale to generate the necessary data.	MongoDB engineer
Validate the load in MongoDB Atlas.	Check the data loaded into MongoDB Atlas.	MongoDB engineer
Generate required indexes on MongoDB.	Define indexes as required, based on query patterns. For best practices, see the MongoDB documentation.	MongoDB engineer

Task	Description	Skills required
Set up load profiles in Performance Analyzer.	Create a performance testing profile in Performance Analyzer by configuring specific queries and their corresponding weightage, duration of the test run, and stages. For more information, see the MongoDB Data Generator & Perf. Analyzer video.	MongoDB engineer
Run performance testing.	Use the performance testing profile you created to start the test against the target collection by configuring the required parallelism. Horizontally scale the performance test tool to run queries against MongoDB Atlas.	MongoDB engineer
Record test results.	Record P95, P99 latency for the queries.	MongoDB engineer
Tune your schema and query patterns.	Modify indexes and query patterns to address any performance issues.	MongoDB engineer

Task	Description	Skills required
Shut down temporary AWS resources.	Delete all temporary resources that you used for Test Data Generator and Performance Analyzer.	AWS administrator
Update performance test results.	Understand MongoDB query performance and compare it against your SLAs. If necessary, fine-tune the MongoDB schema and rerun the process.	MongoDB engineer
Conclude the project.	Close out the project and provide feedback.	MongoDB engineer

Related resources

GitHub repository: S3toAtlas
Schema: MongoDB Schema design
Aggregation pipelines : MongoDB aggregation pipelines
MongoDB Atlas sizing : Sizing tier selection
Video: MongoDB Data Generator & Perf. Analyzer
References: MongoDB documentation
Tutorials: MongoDB developer guide, MongoDB Jumpstart
AWS Marketplace: MongoDB Atlas on AWS Marketplace
AWS Partner Solutions: MongoDB Atlas on AWS Reference Deployment

Additional resources:

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Add HA to Oracle PeopleSoft on AWS

Automate failover and failback with DR Orchestrator Framework