Example SQL queries with complex filtering - AWS HealthLake

The following examples demonstrate how to use Amazon Athena SQL queries with complex filtering to locate FHIR data from a HealthLake data store.

Example Create filtering criteria based on demographic data

Identifying the correct patient demographics is important when creating a patient cohort. This sample query demonstrates how you can use Trino dot notation and json_extract to filter data in your HealthLake data store.

SELECT id
     , CONCAT(name[1].family, ' ', name[1].given[1]) as name
     , (year(current_date) - year(date(birthdate))) as age
     , gender as gender
     , json_extract(extension[1], '$.valueString') as MothersMaidenName
     , json_extract(extension[2], '$.valueAddress.city') as birthPlace
     , maritalstatus.coding[1].display as maritalstatus
     , address[1].line[1] as addressline
     , address[1].city as city
     , address[1].district as district
     , address[1].state as state
     , address[1].postalcode as postalcode
     , address[1].country as country
     , json_extract(address[1].extension[1], '$.extension[0].valueDecimal') as latitude
     , json_extract(address[1].extension[1], '$.extension[1].valueDecimal') as longitude
     , telecom[1].value as telNumber
     , deceasedboolean as deceasedIndicator
     , deceaseddatetime
FROM database.patient;

You can use the Athena console to further sort the results or download them.
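The query above mixes two indexing conventions: SQL array subscripts such as name[1] start at 1, while JSON paths passed to json_extract, such as '$.extension[0]', start at 0. The following minimal sketch illustrates this on a single inline row. The sample_patient data, its values, and the geolocation column are made up for illustration, and the real HealthLake column types may differ. It also uses json_extract_scalar, a Trino function that returns the matched value as VARCHAR rather than as JSON.

```sql
-- Hypothetical inline row standing in for one Patient record; all values are made up.
WITH sample_patient AS (
    SELECT
        ARRAY[
            CAST(ROW('Doe', ARRAY['Jane']) AS ROW(family VARCHAR, given ARRAY(VARCHAR)))
        ] AS name,
        ARRAY[
            '{"valueString":"Smith"}',
            '{"valueAddress":{"city":"Boston"}}'
        ] AS extension,
        '{"extension":[{"valueDecimal":42.36},{"valueDecimal":-71.06}]}' AS geolocation
)
SELECT
    name[1].family                                             AS family,      -- SQL subscript, 1-based
    name[1].given[1]                                           AS given,
    json_extract(extension[1], '$.valueString')                AS maidenJson,  -- JSON-typed result
    json_extract_scalar(extension[1], '$.valueString')         AS maidenText,  -- VARCHAR result
    json_extract_scalar(extension[2], '$.valueAddress.city')   AS birthPlace,
    json_extract_scalar(geolocation, '$.extension[0].valueDecimal') AS latitude,  -- JSON path, 0-based
    json_extract_scalar(geolocation, '$.extension[1].valueDecimal') AS longitude
FROM sample_patient;
```

Whether json_extract or json_extract_scalar fits better depends on whether the downstream consumer expects a JSON value or a plain string.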

Example Create filters for a patient and their related conditions

The following example query demonstrates how you can find and sort all the related conditions for the patients found in a HealthLake data store.

SELECT patient.id as patientId
     , condition.id as conditionId
     , CONCAT(name[1].family, ' ', name[1].given[1]) as name
     , condition.meta.tag[1].display
     , json_extract(condition.modifierextension[1], '$.valueDecimal') AS confidenceScore
     , category[1].coding[1].code as categoryCode
     , category[1].coding[1].display as categoryDescription
     , code.coding[1].code as diagnosisCode
     , code.coding[1].display as diagnosisDescription
     , onsetdatetime
     , severity.coding[1].code as severityCode
     , severity.coding[1].display as severityDescription
     , verificationstatus.coding[1].display as verificationStatus
     , clinicalstatus.coding[1].display as clinicalStatus
     , encounter.reference as encounterId
     , encounter.type as encountertype
FROM database.patient, condition
WHERE CONCAT('Patient/', patient.id) = condition.subject.reference
ORDER BY name;

You can use the Athena console to further sort the results or download them for further analysis.
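The comma-separated FROM clause combined with the WHERE condition behaves like an inner join, so patients without any Condition resource do not appear in the results. As a sketch, the same match can be written with explicit JOIN syntax, and a LEFT JOIN keeps patients that have no conditions. The inline tables and the flattened subject_reference column below are hypothetical stand-ins for the HealthLake tables.

```sql
-- Hypothetical inline data: patient p2 has no related condition.
SELECT
    p.id     AS patientId,
    c.id     AS conditionId,
    p.family AS name
FROM (
    VALUES ('p1', 'Doe'), ('p2', 'Roe')
) AS p (id, family)
LEFT JOIN (
    VALUES ('c1', 'Patient/p1'), ('c2', 'Patient/p1')
) AS c (id, subject_reference)
    -- FHIR references take the form 'Patient/<id>', so prefix the patient id before comparing.
    ON c.subject_reference = CONCAT('Patient/', p.id)
ORDER BY p.family, c.id;
```

Replacing LEFT JOIN with INNER JOIN reproduces the behavior of the comma-style query.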

Example Create filters for patients and their related observations

The following example query demonstrates how to find and sort all related observations for patients found in a HealthLake data store.

SELECT patient.id as patientId
     , observation.id as observationId
     , CONCAT(name[1].family, ' ', name[1].given[1]) as name
     , observation.meta.tag[1].display
     , json_extract(observation.modifierextension[1], '$.valueDecimal') AS confidenceScore
     , status
     , category[1].coding[1].code as categoryCode
     , category[1].coding[1].display as categoryDescription
     , code.coding[1].code as observationCode
     , code.coding[1].display as observationDescription
     , effectivedatetime
     , CASE
         WHEN valuequantity.value IS NOT NULL THEN CONCAT(CAST(valuequantity.value AS VARCHAR), ' ', valuequantity.unit)
         WHEN valueCodeableConcept.coding[1].code IS NOT NULL THEN CAST(valueCodeableConcept.coding[1].code AS VARCHAR)
         WHEN valuestring IS NOT NULL THEN CAST(valuestring AS VARCHAR)
         WHEN valueboolean IS NOT NULL THEN CAST(valueboolean AS VARCHAR)
         WHEN valueinteger IS NOT NULL THEN CAST(valueinteger AS VARCHAR)
         WHEN valueratio IS NOT NULL THEN CONCAT(CAST(valueratio.numerator.value AS VARCHAR), '/', CAST(valueratio.denominator.value AS VARCHAR))
         WHEN valuerange IS NOT NULL THEN CONCAT(CAST(valuerange.low.value AS VARCHAR), '-', CAST(valuerange.high.value AS VARCHAR))
         WHEN valueSampledData IS NOT NULL THEN CAST(valueSampledData.data AS VARCHAR)
         WHEN valueTime IS NOT NULL THEN CAST(valueTime AS VARCHAR)
         WHEN valueDateTime IS NOT NULL THEN CAST(valueDateTime AS VARCHAR)
         WHEN valuePeriod IS NOT NULL THEN valuePeriod.start
         WHEN component[1] IS NOT NULL THEN CONCAT(CAST(component[2].valuequantity.value AS VARCHAR), ' ', CAST(component[2].valuequantity.unit AS VARCHAR), '/', CAST(component[1].valuequantity.value AS VARCHAR), ' ', CAST(component[1].valuequantity.unit AS VARCHAR))
       END AS observationvalue
     , encounter.reference as encounterId
     , encounter.type as encountertype
FROM database.patient, observation
WHERE CONCAT('Patient/', patient.id) = observation.subject.reference
ORDER BY name;

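The CASE expression in the observation query picks the first populated value[x] field and renders it as text. For illustration only, a similar first-non-null selection can be sketched with COALESCE. The flattened column names and rows below are hypothetical, not the HealthLake schema. One difference to keep in mind: CONCAT yields NULL when any argument is NULL, so a quantity without a unit falls through to the next candidate here, whereas the CASE version tests only the quantity value.

```sql
-- Hypothetical flattened observation rows; each row populates a different value column.
SELECT
    id,
    COALESCE(
        CONCAT(CAST(quantity_value AS VARCHAR), ' ', quantity_unit),  -- NULL if value or unit is NULL
        value_string,
        CAST(value_boolean AS VARCHAR)
    ) AS observationvalue
FROM (
    VALUES
        ('o1', DECIMAL '72.5', 'kg', NULL, NULL),
        ('o2', NULL, NULL, 'negative', NULL),
        ('o3', NULL, NULL, NULL, true)
) AS t (id, quantity_value, quantity_unit, value_string, value_boolean)
ORDER BY id;
```
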
Example Create filtering conditions for a patient and their related procedures

Connecting procedures to patients is an important aspect of healthcare. The following example query uses the FHIR Patient and Procedure resource types to return all patients and their related procedures found in your HealthLake data store.

SELECT patient.id as patientId
     , PROCEDURE.id as procedureId
     , CONCAT(name[1].family, ' ', name[1].given[1]) as name
     , status
     , category.coding[1].code as categoryCode
     , category.coding[1].display as categoryDescription
     , code.coding[1].code as procedureCode
     , code.coding[1].display as procedureDescription
     , performeddatetime
     , performer[1]
     , encounter.reference as encounterId
     , encounter.type as encountertype
FROM database.patient, procedure
WHERE CONCAT('Patient/', patient.id) = procedure.subject.reference
ORDER BY name;

You can use the Athena console to download the results for further analysis or sort them to better understand the results.
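The query above selects performer[1], which returns only the first entry of the performer array. If every performer is needed, one option is to expand the array into rows with CROSS JOIN UNNEST. The sketch below uses a simplified, hypothetical array of reference strings; the actual performer column in HealthLake is a nested structure whose field names are not shown in this section.

```sql
-- Hypothetical procedures, each with a simplified array of performer references.
SELECT
    p.id                 AS procedureId,
    perf.actor_reference AS performerReference
FROM (
    VALUES
        ('proc1', ARRAY['Practitioner/a', 'Practitioner/b']),
        ('proc2', ARRAY['Practitioner/c'])
) AS p (id, performer_refs)
-- UNNEST produces one output row per array element.
CROSS JOIN UNNEST(p.performer_refs) AS perf (actor_reference)
ORDER BY p.id, perf.actor_reference;
```

Note that CROSS JOIN UNNEST drops rows whose array is empty or NULL.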

Example Create filtering conditions for a patient and their related prescriptions

Seeing a current list of medications that patients are taking is important. Using Athena, you can write a SQL query that uses both the Patient and MedicationRequest resource types found in your HealthLake data store.

The following SQL query joins the Patient and MedicationRequest tables imported into Athena. It also organizes the prescriptions into their individual entries by using dot notation.

SELECT patient.id as patientId
     , medicationrequest.id as medicationrequestid
     , CONCAT(name[1].family, ' ', name[1].given[1]) as name
     , status
     , statusreason.coding[1].code as statusReasonCode
     , statusreason.coding[1].display as statusReasonDescription
     , category[1].coding[1].code as categoryCode
     , category[1].coding[1].display as categoryDescription
     , priority
     , donotperform
     , encounter.reference as encounterId
     , encounter.type as encountertype
     , medicationcodeableconcept.coding[1].code as medicationCode
     , medicationcodeableconcept.coding[1].display as medicationDescription
     , dosageinstruction[1].text as dosage
FROM database.patient, medicationrequest
WHERE CONCAT('Patient/', patient.id) = medicationrequest.subject.reference
ORDER BY name;

You can use the Athena console to sort the results or download them for further analysis.
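The query above returns every MedicationRequest regardless of its status. If only current prescriptions are of interest, one possible refinement is a filter on the status column; in FHIR R4 an ongoing MedicationRequest carries the status value active. The rows below are hypothetical and serve only to illustrate the filter.

```sql
-- Hypothetical flattened prescription rows with FHIR MedicationRequest status codes.
SELECT patient_id, medication, status
FROM (
    VALUES
        ('p1', 'metformin',   'active'),
        ('p1', 'amoxicillin', 'completed'),
        ('p2', 'lisinopril',  'stopped')
) AS t (patient_id, medication, status)
WHERE status = 'active'   -- keep only prescriptions that are still in effect
ORDER BY patient_id, medication;
```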

Example See medications found in the MedicationStatement resource type

The following example query shows you how to organize the nested JSON imported into Athena using SQL. The query uses the FHIR meta element to indicate when a medication has been added by HealthLake's integrated natural language processing (NLP). It also uses json_extract to search for data inside the array of JSON strings. For more information, see Natural language processing.

SELECT medicationcodeableconcept.coding[1].code as medicationCode
     , medicationcodeableconcept.coding[1].display as medicationDescription
     , meta.tag[1].display
     , json_extract(modifierextension[1], '$.valueDecimal') AS confidenceScore
FROM medicationstatement;

You can use the Athena console to download these results or sort them.
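To keep only the entries added by NLP, a filter on the tag display value can be added. The sketch below assumes the tag reads 'SYSTEM GENERATED', the value tested in the disease-type example on this page, and reads the confidence score as a number. The flattened column names and the sample rows are hypothetical.

```sql
-- Hypothetical flattened MedicationStatement rows; the second row has no NLP tag.
SELECT
    medication,
    tag_display,
    -- json_extract_scalar returns VARCHAR, so cast it to get a numeric score.
    CAST(json_extract_scalar(modifier_ext, '$.valueDecimal') AS DOUBLE) AS confidenceScore
FROM (
    VALUES
        ('aspirin',   'SYSTEM GENERATED', '{"valueDecimal":0.93}'),
        ('metformin', NULL,               NULL)
) AS t (medication, tag_display, modifier_ext)
WHERE tag_display = 'SYSTEM GENERATED';
```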

Example Filter for a specific disease type

The example shows how you can find a group of patients, aged 18 to 75, who have been diagnosed with diabetes.

SELECT patient.id as patientId,
       condition.id as conditionId,
       CONCAT(name[1].family, ' ', name[1].given[1]) as name,
       (year(current_date) - year(date(birthdate))) AS age,
       CASE
         WHEN condition.encounter.reference IS NOT NULL THEN condition.encounter.reference
         WHEN observation.encounter.reference IS NOT NULL THEN observation.encounter.reference
       END as encounterId,
       CASE
         WHEN condition.encounter.type IS NOT NULL THEN condition.encounter.type
         WHEN observation.encounter.type IS NOT NULL THEN observation.encounter.type
       END AS encountertype,
       condition.code.coding[1].code as diagnosisCode,
       condition.code.coding[1].display as diagnosisDescription,
       observation.category[1].coding[1].code as categoryCode,
       observation.category[1].coding[1].display as categoryDescription,
       observation.code.coding[1].code as observationCode,
       observation.code.coding[1].display as observationDescription,
       effectivedatetime AS observationDateTime,
       CASE
         WHEN valuequantity.value IS NOT NULL THEN CONCAT(CAST(valuequantity.value AS VARCHAR), ' ', valuequantity.unit)
         WHEN valueCodeableConcept.coding[1].code IS NOT NULL THEN CAST(valueCodeableConcept.coding[1].code AS VARCHAR)
         WHEN valuestring IS NOT NULL THEN CAST(valuestring AS VARCHAR)
         WHEN valueboolean IS NOT NULL THEN CAST(valueboolean AS VARCHAR)
         WHEN valueinteger IS NOT NULL THEN CAST(valueinteger AS VARCHAR)
         WHEN valueratio IS NOT NULL THEN CONCAT(CAST(valueratio.numerator.value AS VARCHAR), '/', CAST(valueratio.denominator.value AS VARCHAR))
         WHEN valuerange IS NOT NULL THEN CONCAT(CAST(valuerange.low.value AS VARCHAR), '-', CAST(valuerange.high.value AS VARCHAR))
         WHEN valueSampledData IS NOT NULL THEN CAST(valueSampledData.data AS VARCHAR)
         WHEN valueTime IS NOT NULL THEN CAST(valueTime AS VARCHAR)
         WHEN valueDateTime IS NOT NULL THEN CAST(valueDateTime AS VARCHAR)
         WHEN valuePeriod IS NOT NULL THEN valuePeriod.start
         WHEN component[1] IS NOT NULL THEN CONCAT(CAST(component[2].valuequantity.value AS VARCHAR), ' ', CAST(component[2].valuequantity.unit AS VARCHAR), '/', CAST(component[1].valuequantity.value AS VARCHAR), ' ', CAST(component[1].valuequantity.unit AS VARCHAR))
       END AS observationvalue,
       CASE
         WHEN condition.meta.tag[1].display = 'SYSTEM GENERATED' THEN 'YES'
         WHEN condition.meta.tag[1].display IS NULL THEN 'NO'
         WHEN observation.meta.tag[1].display = 'SYSTEM GENERATED' THEN 'YES'
         WHEN observation.meta.tag[1].display IS NULL THEN 'NO'
       END AS IsSystemGenerated,
       CAST(json_extract(condition.modifierextension[1], '$.valueDecimal') AS int) AS confidenceScore
FROM database.patient, database.condition, database.observation
WHERE CONCAT('Patient/', patient.id) = condition.subject.reference
  AND CONCAT('Patient/', patient.id) = observation.subject.reference
  AND (year(current_date) - year(date(birthdate))) >= 18
  AND (year(current_date) - year(date(birthdate))) <= 75
  AND condition.code.coding[1].display like ('%diabetes%');

Now you can use the Athena console to sort the results or download them for further analysis.
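Two details of the filter are worth noting. Subtracting calendar years ignores whether the birthday has already occurred in the current year, and LIKE is case-sensitive, so a display value such as 'Type 2 Diabetes Mellitus' would not match '%diabetes%'. The following sketch shows one possible variation that uses the Trino date_diff function, which should count only completed years, and lower() for a case-insensitive match. The inline rows and column names are hypothetical.

```sql
-- Hypothetical patients with a birth date string and a diagnosis display text.
SELECT
    id,
    year(current_date) - year(date(birthdate))       AS yearDifference,  -- approach used above
    date_diff('year', date(birthdate), current_date) AS age,             -- completed years
    diagnosis
FROM (
    VALUES
        ('p1', '1980-12-31', 'Type 2 Diabetes Mellitus'),
        ('p2', '2015-03-01', 'diabetes insipidus'),
        ('p3', '1975-07-15', 'Hypertension')
) AS t (id, birthdate, diagnosis)
WHERE date_diff('year', date(birthdate), current_date) BETWEEN 18 AND 75
  AND lower(diagnosis) LIKE '%diabetes%'   -- case-insensitive match
ORDER BY id;
```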