Understanding image sets
Image sets are an AWS concept that serves as the foundation for AWS HealthImaging. Image sets are created when you import your DICOM data into HealthImaging, so a good understanding of them is required when working with the service.
Image sets were introduced for the following reasons:
- Support a wide variety of medical imaging workflows (clinical and nonclinical) through flexible APIs.
- Maximize patient safety by grouping only related data.
- Encourage data to be cleaned to help increase the visibility of inconsistencies. For more information, see Modifying image sets.
Important
Clinical use of DICOM data before it has been cleaned can result in patient harm.
The following sections describe image sets in further detail and provide examples and diagrams to help you understand their functionality and purpose in HealthImaging.
An image set is an AWS concept that defines an abstract grouping mechanism for optimizing related medical imaging data. When you import your DICOM P10 imaging data into an AWS HealthImaging data store, it is transformed into image sets comprised of metadata and image frames (pixel data).
Note
Image set metadata is normalized. In other words, one common set of attributes and values maps to the Patient, Study, and Series level elements listed in the Registry of DICOM Data Elements.
During import, some image sets retain their original transfer syntax encoding, while others are transcoded to High-Throughput JPEG 2000 (HTJ2K) lossless by default. Image frames (pixel data) that are encoded in HTJ2K must be decoded prior to viewing. For more information, see Supported transfer syntaxes and HTJ2K decoding libraries.
Image sets are AWS resources, so they are assigned Amazon Resource Names (ARNs). They can be tagged with up to 50 key-value pairs and granted role-based access control (RBAC) and attribute-based access control (ABAC) through IAM. In addition, image sets are versioned, so all changes are preserved and prior versions can be accessed.
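As a rough illustration of treating an image set as a taggable AWS resource, the following Python sketch builds an image set ARN and applies tags with boto3. The ARN layout, the helper names (image_set_arn, validate_tags, tag_image_set), and the example identifiers are assumptions made for illustration; check the HealthImaging API reference before relying on them.

```python
MAX_TAGS = 50  # tag limit stated in the text above


def image_set_arn(region, account_id, datastore_id, image_set_id):
    """Build an image set ARN (layout assumed, not confirmed by this page)."""
    return (
        f"arn:aws:medical-imaging:{region}:{account_id}:"
        f"datastore/{datastore_id}/imageset/{image_set_id}"
    )


def validate_tags(tags):
    """Reject tag dictionaries that exceed the 50 key-value pair limit."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags are allowed, got {len(tags)}")
    return tags


def tag_image_set(arn, tags):
    """Apply tags to an image set; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without the SDK

    client = boto3.client("medical-imaging")
    client.tag_resource(resourceArn=arn, tags=validate_tags(tags))
```

Validating the tag count locally avoids a round trip to the service for a request that would be rejected anyway.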
Importing DICOM P10 data results in image sets that contain DICOM metadata and image frames for one or more Service-Object Pair (SOP) instances in the same DICOM Series.
Note
DICOM import jobs:
- Always create new image sets and never update existing image sets.
- Do not deduplicate SOP Instance storage. Each import of the same SOP Instance uses additional storage.
- May create multiple image sets for a single DICOM Series, for example when a normalized metadata attribute varies across instances, such as a PatientName mismatch.
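The effect of a normalized attribute mismatch can be pictured with a toy model in Python. This is not the service's grouping algorithm: it assumes, for illustration only, that instances are grouped by Series Instance UID and PatientName, and the function name and input shape are hypothetical. The real import process may consider other attributes and may split a Series further for other reasons.

```python
from collections import defaultdict


def group_into_image_sets(instances):
    """Toy model: group SOP instances that share a Series UID and PatientName.

    Each input item is a dict with SeriesInstanceUID, PatientName, and
    SOPInstanceUID keys (a hypothetical shape chosen for this sketch).
    """
    groups = defaultdict(list)
    for inst in instances:
        key = (inst["SeriesInstanceUID"], inst["PatientName"])
        groups[key].append(inst["SOPInstanceUID"])
    # Each value stands in for one image set produced by the import.
    return list(groups.values())
```

In this model, four instances from one Series where two carry a different PatientName yield two groups of two instances each.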
Use the GetImageSetMetadata action to retrieve image set metadata. The returned metadata is compressed with gzip, so you must unzip it before viewing. For more information, see Getting image set metadata.
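A minimal Python sketch of retrieving and unzipping the metadata might look like the following. It assumes the boto3 medical-imaging client and that the response carries the compressed payload in a streaming imageSetMetadataBlob field; the helper names are hypothetical.

```python
import gzip
import json


def decode_metadata_blob(raw: bytes) -> dict:
    """Decompress a gzip payload and parse the JSON document inside it."""
    return json.loads(gzip.decompress(raw))


def fetch_image_set_metadata(datastore_id: str, image_set_id: str) -> dict:
    """Call GetImageSetMetadata and return the decoded metadata.

    Requires boto3 and AWS credentials; the response field name is an
    assumption based on the public SDK documentation.
    """
    import boto3  # imported lazily so decode_metadata_blob works without the SDK

    client = boto3.client("medical-imaging")
    response = client.get_image_set_metadata(
        datastoreId=datastore_id,
        imageSetId=image_set_id,
    )
    return decode_metadata_blob(response["imageSetMetadataBlob"].read())
```

Keeping the decode step separate from the network call makes it easy to test against a locally compressed JSON document.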
The following example shows the structure of image set metadata in JSON format.
{
  "SchemaVersion": "1.1",
  "DatastoreID": "2aa75d103f7f45ab977b0e93f00e6fe9",
  "ImageSetID": "46923b66d5522e4241615ecd64637584",
  "Patient": {
    "DICOM": {
      "PatientBirthDate": null,
      "PatientSex": null,
      "PatientID": "2178309",
      "PatientName": "MISTER^CT"
    }
  },
  "Study": {
    "DICOM": {
      "StudyTime": "083501",
      "PatientWeight": null
    },
    "Series": {
      "1.2.840.113619.2.30.1.1762295590.1623.978668949.887": {
        "DICOM": {
          "Modality": "CT",
          "PatientPosition": "FFS"
        },
        "Instances": {
          "1.2.840.113619.2.30.1.1762295590.1623.978668949.888": {
            "DICOM": {
              "SourceApplicationEntityTitle": null,
              "SOPClassUID": "1.2.840.10008.5.1.4.1.1.2",
              "HighBit": 15,
              "PixelData": null,
              "Exposure": "40",
              "RescaleSlope": "1"
            },
            "ImageFrames": [
              {
                "ID": "0d1c97c51b773198a3df44383a5fd306",
                "PixelDataChecksumFromBaseToFullResolution": [
                  { "Width": 256, "Height": 188, "Checksum": 2598394845 },
                  { "Width": 512, "Height": 375, "Checksum": 1227709180 }
                ],
                "MinPixelValue": 451,
                "MaxPixelValue": 1466,
                "FrameSizeInBytes": 384000
              }
            ]
          }
        }
      }
    }
  }
}
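To show how this hierarchy can be traversed, here is a short Python sketch that lists every image frame ID together with its Series and SOP Instance UIDs. It assumes the metadata has already been parsed into a dictionary and that ImageFrames sits alongside the DICOM object within each instance; the function name is hypothetical.

```python
from typing import Iterator, Tuple


def iter_image_frames(metadata: dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (series UID, SOP instance UID, image frame ID) triples."""
    series_map = metadata.get("Study", {}).get("Series", {})
    for series_uid, series in series_map.items():
        for sop_uid, instance in series.get("Instances", {}).items():
            # Assumption: ImageFrames is a sibling of the instance's DICOM object.
            for frame in instance.get("ImageFrames", []):
                yield series_uid, sop_uid, frame["ID"]
```

The frame IDs produced this way are the identifiers a caller would need when requesting pixel data for individual frames.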
The following example shows how multiple import jobs always create new image sets and never add to existing ones.
The following example shows a single import job creating two image sets because instances 1 and 2 have different patient names than instances 3 and 4.
The following example shows a single import job creating two image sets to improve throughput, even though the patient names match.