Running queries on HealthOmics variant stores
You can query your variant store using Amazon Athena. Note that genomic coordinates in variant and annotation stores are represented as zero-based, half-open intervals: the start position is included and the end position is excluded.
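The coordinate convention matters when you compare query results with positions from a VCF file, which uses 1-based positions. The following is a minimal sketch of the conversion, not HealthOmics code. The function names are hypothetical, and it assumes that a record's interval spans its reference allele.

```python
from typing import Tuple


def vcf_to_zero_based_half_open(pos: int, ref: str) -> Tuple[int, int]:
    """Convert a 1-based VCF POS and REF allele to a zero-based, half-open (start, end).

    Assumption: the interval covers exactly the bases of the reference allele.
    """
    if pos < 1:
        raise ValueError("this sketch expects a 1-based POS of 1 or greater")
    if not ref:
        raise ValueError("REF allele must be non-empty")
    start = pos - 1         # shift from 1-based to zero-based
    end = start + len(ref)  # end is exclusive
    return start, end


def overlaps(start_a: int, end_a: int, start_b: int, end_b: int) -> bool:
    """Return True when two half-open intervals share at least one base."""
    return start_a < end_b and start_b < end_a


if __name__ == "__main__":
    print(vcf_to_zero_based_half_open(100, "A"))    # (99, 100)
    print(vcf_to_zero_based_half_open(100, "ACG"))  # (99, 102)
```

With half-open intervals, the length of a region is end minus start, and two adjacent regions such as (99, 100) and (100, 101) do not overlap.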
Run a simple query using the Athena console
The following example shows how to run a simple query.
-
Open the Athena Query editor.
-
Under Workgroup, select the workgroup that you created during setup.
-
Verify that Data source is AwsDataCatalog.
-
For Database, select the database resource link that you created during the Lake Formation setup.
-
Copy the following query into the Query Editor under the Query 1 tab:
SELECT * FROM omicsvariants LIMIT 10
-
Choose Run to run the query. The console populates the results table with 10 rows from the omicsvariants table.
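The console steps above can also be scripted. The following is a minimal sketch that assumes you pass in a boto3 Athena client (for example, boto3.client("athena")). The function name run_athena_query and the workgroup and database names in the usage comment are placeholders, not values from this guide.

```python
import time


def run_athena_query(client, sql, database, workgroup, poll_seconds=1.0):
    """Submit a query, wait for a terminal state, and return the first page of rows.

    `client` is assumed to be a boto3 Athena client created by the caller.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Catalog": "AwsDataCatalog", "Database": database},
        WorkGroup=workgroup,
    )
    query_id = started["QueryExecutionId"]

    # Poll until Athena reports that the query has finished.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    page = client.get_query_results(QueryExecutionId=query_id)
    rows = page["ResultSet"]["Rows"]
    # Each cell is a dict; "VarCharValue" is missing when the value is NULL.
    return [[cell.get("VarCharValue") for cell in row["Data"]] for row in rows]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   rows = run_athena_query(
#       boto3.client("athena"),
#       "SELECT * FROM omicsvariants LIMIT 10",
#       database="my_resource_link_database",  # placeholder name
#       workgroup="my_workgroup",              # placeholder name
#   )
```

For SELECT statements, the first returned row typically holds the column names, so a caller would usually treat it as a header. The sketch reads only the first page of results. Larger result sets need pagination.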
Run a complex query using the Athena console
The following example shows how to run a complex query. To run this query, first import ClinVar into the annotation store.
-
Open the Athena Query editor.
-
Under Workgroup, select the workgroup that you created during setup.
-
Verify that Data source is AwsDataCatalog.
-
For Database, select the database resource link that you created during the Lake Formation setup.
-
Choose the + at the top right to create a new query tab named Query 2.
-
Copy the following query into the Query Editor under the Query 2 tab:
SELECT variants.sampleid,
       variants.contigname,
       variants.start,
       variants."end",
       variants.referenceallele,
       variants.alternatealleles,
       variants.attributes AS variant_attributes,
       clinvar.attributes AS clinvar_attributes
FROM omicsvariants AS variants
INNER JOIN omicsannotations AS clinvar
  ON variants.contigname = CONCAT('chr', clinvar.contigname)
  AND variants.start = clinvar.start
  AND variants."end" = clinvar."end"
  AND variants.referenceallele = clinvar.referenceallele
  AND variants.alternatealleles = clinvar.alternatealleles
WHERE clinvar.attributes['CLNSIG'] = 'Likely_pathogenic'
-
Choose Run to run the query. The results list the variants that match a ClinVar entry whose CLNSIG attribute is Likely_pathogenic.
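To make the join logic easier to follow, here is a minimal in-memory sketch of what the complex query does. It is an illustration only, not how Athena runs the query. The function name and the row shapes (dictionaries with list-valued alternatealleles) are assumptions. As in the query, it assumes that variant contig names carry a chr prefix that the ClinVar contig names lack.

```python
def join_variants_with_clinvar(variants, annotations, clnsig="Likely_pathogenic"):
    """Mimic the INNER JOIN and WHERE clause on small lists of dicts."""
    # Index the annotations that pass the WHERE filter by their join key.
    index = {}
    for ann in annotations:
        if ann["attributes"].get("CLNSIG") != clnsig:
            continue
        key = (
            "chr" + ann["contigname"],  # same role as CONCAT('chr', clinvar.contigname)
            ann["start"],
            ann["end"],
            ann["referenceallele"],
            tuple(ann["alternatealleles"]),  # tuples are hashable, lists are not
        )
        index.setdefault(key, []).append(ann)

    # Emit one output row per matching (variant, annotation) pair.
    joined = []
    for var in variants:
        key = (
            var["contigname"],
            var["start"],
            var["end"],
            var["referenceallele"],
            tuple(var["alternatealleles"]),
        )
        for ann in index.get(key, []):
            joined.append({
                "sampleid": var["sampleid"],
                "contigname": var["contigname"],
                "start": var["start"],
                "end": var["end"],
                "referenceallele": var["referenceallele"],
                "alternatealleles": var["alternatealleles"],
                "variant_attributes": var["attributes"],
                "clinvar_attributes": ann["attributes"],
            })
    return joined
```

A variant appears in the output only when its contig, start, end, reference allele, and alternate alleles all match a ClinVar record exactly, which is why the coordinate convention of both stores has to agree.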