SUS04-BP07 Minimize data movement across networks
This best practice was updated with new guidance on July 13th, 2023.
Use shared file systems or object storage to access common data and minimize the total networking resources required to support data movement for your workload.
Common anti-patterns:
- You store all data in the same AWS Region, regardless of where the users of that data are located.
- You do not optimize data size and format before moving it over the network.
Benefits of establishing this best practice: Optimizing data movement across the network reduces the total networking resources required for the workload and lowers its environmental impact.
Level of risk exposed if this best practice is not established: Medium
Implementation guidance
Moving data around your organization requires compute, networking, and storage resources. Use techniques to minimize data movement and improve the overall efficiency of your workload.
Implementation steps
- Consider proximity to the data or users as a decision factor when selecting a Region for your workload.
- Partition Regionally consumed services so that their Region-specific data is stored within the Region where it is consumed.
- Use efficient file formats (such as Parquet or ORC) and compress data before moving it over the network.
- Don't move unused data. Some examples that can help you avoid moving unused data:
  - Reduce API responses to only relevant data.
  - Aggregate data where detailed (record-level) information is not required.
  - See Well-Architected Lab - Optimize Data Pattern Using Amazon Redshift Data Sharing.
- Use services that can help you run code closer to users of your workload.
Service | When to use
Lambda@Edge | Use for compute-heavy operations that are run when objects are not in the cache.
CloudFront Functions | Use for simple use cases such as HTTP(s) request/response manipulations that can be initiated by short-lived functions.
AWS IoT Greengrass | Run local compute, messaging, and data caching for connected devices.
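The data-reduction steps above (returning only relevant fields, aggregating record-level data, and compressing before transfer) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sample records, the field names, and the helper functions `project`, `aggregate_by_region`, and `encode` are hypothetical and not part of any AWS API. It uses gzip-compressed JSON from the standard library rather than Parquet or ORC, which would need third-party libraries, so treat it as a rough way to compare payload sizes and not as a recommended format.

```python
import gzip
import json
from collections import defaultdict

# Hypothetical record-level data; field names are illustrative only.
RECORDS = [
    {
        "order_id": i,
        "region": "eu-west-1" if i % 2 else "us-east-1",
        "amount": 10.0 + i,
        "customer_notes": "x" * 200,  # bulky field most consumers do not need
    }
    for i in range(1000)
]


def project(records, fields):
    """Keep only the fields the consumer actually needs."""
    return [{f: r[f] for f in fields} for r in records]


def aggregate_by_region(records):
    """Replace record-level rows with one total per region."""
    totals = defaultdict(float)
    for r in records:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def encode(payload, compress=False):
    """Serialize to compact JSON bytes, optionally gzip-compressed."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw) if compress else raw


full = encode(RECORDS)
trimmed = encode(project(RECORDS, ["order_id", "region", "amount"]))
summary = encode(aggregate_by_region(RECORDS))
compressed = encode(RECORDS, compress=True)

# Compare how many bytes each variant would put on the network.
for name, blob in [
    ("full", full),
    ("trimmed", trimmed),
    ("aggregated", summary),
    ("full+gzip", compressed),
]:
    print(f"{name:>10}: {len(blob)} bytes")
```

The actual savings depend on the data: highly repetitive fields compress well, while aggregation only helps when consumers do not need record-level detail.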