Save SQL query results in a pandas DataFrame
You can store the results of your SQL query in a pandas DataFrame. The easiest way to output query results to a DataFrame is to use the SQL editor features of the JupyterLab SQL extension query-result dropdown and choose the Pandas dataframe option.
Alternatively, you can add the parameter --output '{"format": "DATAFRAME",
"dataframe_name": "
to your connection
string.dataframe_name
"}'
For example, the following query extracts details of customers with the highest balance
from the Customer
table in Snowflake's TPCH_SF1
database, using
both pandas and SQL:
-
In this example, we extract all the data from the customer table and save then in a DataFrame named
all_customer_data
.%%sm_sql --output '{"format": "DATAFRAME", "dataframe_name": "all_customer_data"}' --metastore-id
snowflake-connection-name
--metastore-type GLUE_CONNECTION SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMERSaved results to all_customer_data
-
Next, we extract the details of the highest account balance from the DataFrame.
all_customer_data.loc[all_customer_data['C_ACCTBAL'].idxmax()].values
array([61453, 'Customer#000061453', 'RxNgWcyl5RZD4qOYnyT3', 15, '25-819-925-1077', Decimal('9999.99'), 'BUILDING','es. carefully regular requests among the blithely pending requests boost slyly alo'], dtype=object)