Creating a HealthOmics sequence store

HealthOmics sequence stores support storage of genomic files in the unaligned formats FASTQ (gzip-only) and uBAM. They also support the aligned formats BAM and CRAM.

Imported files are stored as read sets. You can add tags to read sets and use IAM policies to control access to read sets. Aligned read sets require a reference genome. For unaligned read sets, a reference genome is optional.

To store read sets, you first create a sequence store. When you create a sequence store, you can optionally specify an Amazon S3 bucket as a fallback location and a location where S3 access logs are stored. HealthOmics uses the fallback location to store any files that fail to create a read set during a direct upload. Fallback locations are available for sequence stores created after May 15, 2023.

You can also specify up to five read set tag keys for propagation to Amazon S3. When you create or update a read set with a tag key that matches one of these keys, the read set tags are propagated to the corresponding Amazon S3 object. System tags created by HealthOmics are propagated by default.
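The five-key limit and the key-matching rule can be sketched in a few lines of Python. This is an illustration only: the helper names `validate_propagated_tag_keys` and `tags_to_propagate` are hypothetical, the selection of only matching tags is an assumption, and system tags (which HealthOmics propagates by default) are not modeled.

```python
MAX_PROPAGATED_TAG_KEYS = 5  # limit stated in this section


def validate_propagated_tag_keys(keys):
    """Return the keys without duplicates, preserving their order.

    Raises ValueError for an empty key or for more than five distinct keys.
    """
    unique_keys = list(dict.fromkeys(keys))
    if any(not isinstance(key, str) or not key for key in unique_keys):
        raise ValueError("Tag keys must be non-empty strings")
    if len(unique_keys) > MAX_PROPAGATED_TAG_KEYS:
        raise ValueError(
            f"At most {MAX_PROPAGATED_TAG_KEYS} tag keys can be propagated, "
            f"got {len(unique_keys)}"
        )
    return unique_keys


def tags_to_propagate(read_set_tags, propagated_keys):
    """Select the read set tags whose keys are configured for propagation."""
    allowed = set(propagated_keys)
    return {key: value for key, value in read_set_tags.items() if key in allowed}
```

For example, with the keys `withdrawn` and `protocol` configured on the store, a read set tagged `protocol` and `owner` would contribute only its `protocol` tag under this model.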

Topics

Creating a sequence store using the console
Creating a sequence store using the CLI
Updating a sequence store
Updating read set tags for a sequence store
Importing genomic files

Creating a sequence store using the console

To create a sequence store

1. Open the HealthOmics console at https://console.aws.amazon.com/omics/.
2. In the left navigation pane, choose Sequence stores.
3. On the Create sequence store page, provide the following information:
   - Sequence store name - A unique name for this store.
   - Description (optional) - A description of this sequence store.
4. (Optional) For Fallback location in S3, specify an Amazon S3 location. HealthOmics uses the fallback location to store any files that fail to create a read set during a direct upload. You must grant the HealthOmics service write access to the Amazon S3 fallback location. For an example policy, see Configure a fallback location.

   Fallback locations aren't available for sequence stores created before May 16, 2023.
5. (Optional) For Read set tag keys for S3 propagation, enter up to five read set tag keys to propagate from a read set to the underlying S3 objects. Propagating tags from a read set to the S3 object lets you grant S3 access permissions based on tags, and lets end users see the propagated tags through the Amazon S3 GetObjectTagging API operation.
   1. Enter one key value in the text box. The console creates a new text box to add the next key.
   2. (Optional) Choose Remove to remove all the keys.
6. Under Data Encryption, select whether you want data encryption to be owned and managed by AWS or to use a customer managed key.
7. (Optional) Under S3 Data access, select whether to create a new role and policy to access the sequence store through Amazon S3.
8. (Optional) For S3 access logging, select Enabled if you want Amazon S3 to collect access log records.

   For Access logging location in S3, specify an Amazon S3 location to store the logs. This field is visible only if you enabled S3 access logging.
9. (Optional) For Tags, provide up to 50 tags for this sequence store. These tags are separate from the read set tags that are set during read set import or tag update.

After you create the store, you can import genomic files into it, as described in Importing genomic files.

Creating a sequence store using the CLI

In the following example, replace sequence store name with the name you chose for your sequence store. Keep the quotation marks if the name contains spaces.

aws omics create-sequence-store --name "sequence store name" --fallback-location "s3://amzn-s3-demo-bucket"

You receive the following response in JSON, which includes the ID number for your newly created sequence store.


{
    "id": "3936421177",
    "arn": "arn:aws:omics:us-west-2:111122223333:sequenceStore/3936421177",
    "name": "sequence_store_example_name",
    "creationTime": "2022-07-13T20:09:26.038Z",
    "fallbackLocation": "s3://amzn-s3-demo-bucket"
}
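The same request can be made from the AWS SDK for Python (boto3). The following is a minimal sketch, not a definitive implementation: the helper names `build_create_request` and `create_sequence_store` are hypothetical, the region is an assumption, and the code assumes that boto3 is installed and that credentials are configured. The request builder is kept separate from the API call so that it can be checked without contacting AWS.

```python
def build_create_request(name, fallback_location=None, description=None):
    """Build keyword arguments for the CreateSequenceStore operation."""
    if not name:
        raise ValueError("A sequence store name is required")
    request = {"name": name}
    if fallback_location is not None:
        if not fallback_location.startswith("s3://"):
            raise ValueError("The fallback location must be an s3:// URI")
        request["fallbackLocation"] = fallback_location
    if description is not None:
        request["description"] = description
    return request


def create_sequence_store(request, region_name="us-west-2"):
    """Create the store and return its ID (this calls AWS and needs credentials)."""
    import boto3  # imported here so the builder above works without the SDK

    client = boto3.client("omics", region_name=region_name)
    response = client.create_sequence_store(**request)
    return response["id"]
```

A call such as `create_sequence_store(build_create_request("MySequenceStore", "s3://amzn-s3-demo-bucket"))` would then return the ID of the new store.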

You can also view all sequence stores associated with your account by using the list-sequence-stores command, as shown in the following.


aws omics list-sequence-stores

You receive the following response.


{
    "sequenceStores": [
        {
            "arn": "arn:aws:omics:us-west-2:111122223333:sequenceStore/3936421177",
            "id": "3936421177",
            "name": "MySequenceStore",
            "creationTime": "2022-07-13T20:09:26.038Z",
            "updatedTime": "2024-09-13T04:11:31.242Z",
            "fallbackLocation": "s3://amzn-s3-demo-bucket",
            "status": "Active"
        }
    ]
}
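If you save this response, for example after parsing the CLI output as JSON, you can look up a store ID by name. The sketch below assumes only the response shape shown above. The function name `store_ids_by_name` and the sample data are illustrative.

```python
def store_ids_by_name(list_response, name, status=None):
    """Return the IDs of stores whose name (and, optionally, status) match."""
    matches = []
    for store in list_response.get("sequenceStores", []):
        if store.get("name") != name:
            continue
        if status is not None and store.get("status") != status:
            continue
        matches.append(store["id"])
    return matches


# Sample data shaped like the list-sequence-stores response.
sample_response = {
    "sequenceStores": [
        {"id": "3936421177", "name": "MySequenceStore", "status": "Active"},
    ]
}
print(store_ids_by_name(sample_response, "MySequenceStore"))
```

Because the function returns a list rather than a single ID, the caller has to decide what to do when no store, or more than one store, matches the name.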

You can use get-sequence-store to learn more about a sequence store by using its ID. In the following example, replace sequence store ID with the ID of your sequence store:

aws omics get-sequence-store --id "sequence store ID"

You receive the following response:


{
  "arn": "arn:aws:omics:us-west-2:123456789012:sequenceStore/sequencestoreID",
  "creationTime": "2024-01-12T04:45:29.857Z",
  "updatedTime": "2024-09-13T04:11:31.242Z",
  "description": null,
  "fallbackLocation": null,
  "id": "2015356892",
  "name": "MySequenceStore",
  "s3Access": {
      "s3AccessPointArn": "arn:aws:s3:us-west-2:123456789012:accesspoint/592761533288-2015356892",
      "s3Uri": "s3://592761533288-2015356892-ajdpi90jdas90a79fh9a8ja98jdfa9jf98-s3alias/592761533288/sequenceStore/2015356892/",
      "accessLogLocation": "s3://IAD-seq-store-log/2015356892/"
  },
  "sseConfig": {
      "keyArn": "arn:aws:kms:us-west-2:123456789012:key/eb2b30f5-635d-4b6d-b0f9-d3889fe0e648",
      "type": "KMS"
  },
  "status": "Active",
  "statusMessage": null,
  "setTagsToSync": ["withdrawn", "protocol"]
}
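The s3Access block is the part of this response that you would typically need in order to read the store through Amazon S3. The following sketch splits the s3Uri value into its access point alias and key prefix. It assumes only the field names shown in the response above. The helper names `split_s3_uri` and `s3_access_summary` are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split an s3:// URI into (bucket or alias, key prefix)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {s3_uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def s3_access_summary(store):
    """Collect the S3 access details from a get-sequence-store response."""
    s3_access = store.get("s3Access") or {}
    alias, prefix = split_s3_uri(s3_access["s3Uri"])
    return {
        "alias": alias,
        "prefix": prefix,
        "access_point_arn": s3_access.get("s3AccessPointArn"),
        "access_log_location": s3_access.get("accessLogLocation"),
    }
```

The alias and prefix can then be passed wherever an S3 client expects a bucket name and a key prefix.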

After you create a sequence store, you can update several of its parameters through the console or the UpdateSequenceStore API operation.

Updating a sequence store

To update a sequence store, follow these steps:

1. Open the HealthOmics console at https://console.aws.amazon.com/omics/.
2. In the left navigation pane, choose Sequence stores.
3. Choose the sequence store to update.
4. In the Details panel, choose Edit.
5. On the Edit details page, you can update the following fields:
   - Sequence store name - A unique name for this store.
   - Description - A description of this sequence store.
   - Fallback location in S3 - An Amazon S3 location. HealthOmics uses the fallback location to store any files that fail to create a read set during a direct upload.
   - Read set tag keys for S3 propagation - Up to five read set tag keys to propagate to Amazon S3.
   - S3 access logging (optional) - Select Enabled if you want Amazon S3 to collect access log records.

     For Access logging location in S3, specify an Amazon S3 location to store the logs. This field is visible only if you enabled S3 access logging.
   - Tags (optional) - Up to 50 tags for this sequence store.
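The same fields can be changed programmatically through the UpdateSequenceStore operation. The sketch below builds the request so that only the fields you pass are sent. The helper name `build_update_request` is hypothetical. The parameter names `propagatedSetLevelTags` and `s3AccessConfig` reflect my understanding of the API and are assumptions to confirm against the HealthOmics API reference before use.

```python
MAX_PROPAGATED_TAG_KEYS = 5  # limit stated in this section


def build_update_request(store_id, name=None, description=None,
                         fallback_location=None, propagated_tag_keys=None,
                         access_log_location=None):
    """Build keyword arguments for UpdateSequenceStore from the changed fields."""
    request = {"id": store_id}
    if name is not None:
        request["name"] = name
    if description is not None:
        request["description"] = description
    if fallback_location is not None:
        request["fallbackLocation"] = fallback_location
    if propagated_tag_keys is not None:
        keys = list(propagated_tag_keys)
        if len(keys) > MAX_PROPAGATED_TAG_KEYS:
            raise ValueError("At most five read set tag keys can be propagated")
        # Assumed parameter name; verify against the API reference.
        request["propagatedSetLevelTags"] = keys
    if access_log_location is not None:
        # Assumed parameter shape; verify against the API reference.
        request["s3AccessConfig"] = {"accessLogLocation": access_log_location}
    if len(request) == 1:
        raise ValueError("Specify at least one field to update")
    return request
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("omics").update_sequence_store(**request)`.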

Updating read set tags for a sequence store

To update read set tags or other fields for a sequence store, follow these steps:

1. Open the HealthOmics console at https://console.aws.amazon.com/omics/.
2. In the left navigation pane, choose Sequence stores.
3. Choose the sequence store that you want to update.
4. Choose the Details tab.
5. Choose Edit.
6. Add new read set tags or delete existing tags, as required.
7. Update the name, description, fallback location, or S3 data access, as required.
8. Choose Save changes.

Importing genomic files

To import genomic files to a sequence store, follow these steps:

1. Open the HealthOmics console at https://console.aws.amazon.com/omics/.
2. In the left navigation pane, choose Sequence stores.
3. On the Sequence stores page, choose the sequence store that you want to import your files into.
4. On the individual sequence store page, choose Import genomic files.
5. On the Specify import details page, provide the following information:
   - IAM role - The IAM role that can access the genomic files on Amazon S3.
   - Reference genome - The reference genome for this genomics data.
6. On the Specify import manifest page, specify the manifest file. The manifest file is a JSON or YAML file that describes essential information about your genomics data. For information about the manifest file, see Importing read sets into a HealthOmics sequence store.
7. Choose Create import job.
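An import job can also be started through the StartReadSetImportJob operation. The following sketch builds the request and enforces the rule from the start of this topic that aligned formats (BAM and CRAM) need a reference genome. The helper names are hypothetical, and the S3 paths, IDs, and ARNs that you pass in are your own values. Treat it as an outline under these assumptions rather than a complete import workflow.

```python
REFERENCE_REQUIRED_TYPES = {"BAM", "CRAM"}      # aligned formats
SUPPORTED_TYPES = REFERENCE_REQUIRED_TYPES | {"FASTQ", "UBAM"}


def build_import_source(source1, file_type, subject_id, sample_id,
                        reference_arn=None, source2=None):
    """Describe one read set to import (source2 is for paired FASTQ files)."""
    file_type = file_type.upper()
    if file_type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported file type: {file_type}")
    if file_type in REFERENCE_REQUIRED_TYPES and reference_arn is None:
        raise ValueError(f"{file_type} read sets require a reference genome")
    source_files = {"source1": source1}
    if source2 is not None:
        source_files["source2"] = source2
    source = {
        "sourceFiles": source_files,
        "sourceFileType": file_type,
        "subjectId": subject_id,
        "sampleId": sample_id,
    }
    if reference_arn is not None:
        source["referenceArn"] = reference_arn
    return source


def build_import_request(sequence_store_id, role_arn, sources):
    """Build keyword arguments for StartReadSetImportJob."""
    sources = list(sources)
    if not sources:
        raise ValueError("At least one source is required")
    return {"sequenceStoreId": sequence_store_id, "roleArn": role_arn, "sources": sources}
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("omics").start_read_set_import_job(**request)`.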
