

 Amazon Redshift will no longer support the creation of new Python UDFs starting with Patch 198. Existing Python UDFs will continue to function until June 30, 2026. For more information, see the [blog post](https://aws.amazon.com/blogs/big-data/amazon-redshift-python-user-defined-functions-will-reach-end-of-support-after-june-30-2026/). 

# Amazon Redshift Python connector
<a name="python-redshift-driver"></a>

By using the Amazon Redshift connector for Python, you can integrate your Python applications with [the AWS SDK for Python (Boto3)](https://github.com/boto/boto3), pandas, and Numerical Python (NumPy). For more information on pandas, see the [pandas GitHub repository](https://github.com/pandas-dev/pandas). For more information on NumPy, see the [NumPy GitHub repository](https://github.com/numpy/numpy). 

The Amazon Redshift Python connector provides an open source solution. You can browse the source code, request enhancements, report issues, and provide contributions. 

To use the Amazon Redshift Python connector, make sure that you have Python version 3.6 or later. For the connector's license terms, see the [Amazon Redshift Python driver license agreement](https://github.com/aws/amazon-redshift-python-driver/blob/master/LICENSE). 
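As a quick check before installing, you can confirm that your interpreter meets this requirement. This snippet is a simple sketch, not part of the connector:

```python
import sys

# The connector requires Python 3.6 or later; fail fast on older interpreters.
if sys.version_info < (3, 6):
    raise RuntimeError("Python 3.6+ required, found " + sys.version.split()[0])
print("Python version OK:", sys.version.split()[0])
```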

The Amazon Redshift Python connector provides the following:
+ AWS Identity and Access Management (IAM) authentication. For more information, see [Identity and access management in Amazon Redshift](redshift-iam-authentication-access-control.md).
+ Identity provider authentication using federated API access. Federated API access is supported for corporate identity providers such as the following:
  + Azure AD. For more information, see the AWS Big Data blog post [Federate Amazon Redshift access with Microsoft Azure AD single sign-on](https://aws.amazon.com/blogs/big-data/federate-amazon-redshift-access-with-microsoft-azure-ad-single-sign-on/).
  + Active Directory Federation Services. For more information, see the AWS Big Data blog post [Federate access to your Amazon Redshift cluster with Active Directory Federation Services (AD FS): Part 1](https://aws.amazon.com/blogs/big-data/federate-access-to-your-amazon-redshift-cluster-with-active-directory-federation-services-ad-fs-part-1/). 
  + Okta. For more information, see the AWS Big Data blog post [Federate Amazon Redshift access with Okta as an identity provider](https://aws.amazon.com/blogs/big-data/federate-amazon-redshift-access-with-okta-as-an-identity-provider/).
  + PingFederate. For more information, see the [PingFederate site](https://www.pingidentity.com/en/software/pingfederate.html).
  + JumpCloud. For more information, see the [JumpCloud site](https://jumpcloud.com/).
+ Amazon Redshift data types.

The Amazon Redshift Python connector implements Python Database API Specification 2.0. For more information, see [PEP 249—Python Database API Specification v2.0](https://www.python.org/dev/peps/pep-0249/) on the Python website.
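Because the connector follows PEP 249, its connect/cursor/execute/fetch flow matches any other DB-API 2.0 driver. As a minimal sketch of that shape, the following uses the standard library's `sqlite3` module (which implements the same specification) so that it runs without a cluster:

```python
import sqlite3  # stand-in DB-API 2.0 driver from the standard library

# Every PEP 249 driver, including redshift_connector, exposes this shape:
# connect() -> Connection, Connection.cursor() -> Cursor,
# Cursor.execute()/fetchall() for queries.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE book (title TEXT, author TEXT)")
cursor.execute(
    "INSERT INTO book VALUES (?, ?)",
    ("A Brief History of Time", "Stephen Hawking"),
)
cursor.execute("SELECT author FROM book")
rows = cursor.fetchall()
print(rows)  # [('Stephen Hawking',)]
conn.close()
```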

**Topics**
+ [Installing the Amazon Redshift Python connector](python-driver-install.md)
+ [Configuration options for the Amazon Redshift Python connector](python-configuration-options.md)
+ [Importing the Python connector](python-start-import.md)
+ [Integrating the Python connector with NumPy](python-connect-integrate-numpy.md)
+ [Integrating the Python connector with pandas](python-connect-integrate-pandas.md)
+ [Using identity provider plugins](python-connect-identity-provider-plugins.md)
+ [Examples of using the Amazon Redshift Python connector](python-connect-examples.md)
+ [API reference for the Amazon Redshift Python connector](python-api-reference.md)

# Installing the Amazon Redshift Python connector
<a name="python-driver-install"></a>

You can use any of the following methods to install the Amazon Redshift Python connector:
+ Python Package Index (PyPI)
+ Conda
+ Cloning the GitHub repository

## Installing the Python connector from PyPI
<a name="python-pip-install-pypi"></a>

To install the Python connector from the Python Package Index (PyPI), you can use pip. To do this, run the following command.

```
$ pip install redshift_connector
```

You can also install the connector within a virtual environment. To do this, activate the virtual environment, and then run the same command.

```
$ pip install redshift_connector
```

Optionally, you can install pandas and NumPy with the connector.

```
$ pip install 'redshift_connector[full]'
```

For more information on pip, see the [pip site](https://pip.pypa.io/en/stable/).

## Installing the Python connector from Conda
<a name="python-pip-install-from-conda"></a>

You can install the Python connector from Anaconda.org.

```
$ conda install -c conda-forge redshift_connector
```

## Installing the Python connector by cloning the GitHub repository from AWS
<a name="python-pip-install-from-source"></a>

To install the Python connector from source, clone the GitHub repository from AWS. After you install Python and virtualenv, set up your environment and install the required dependencies by running the following commands.

```
$ git clone https://github.com/aws/amazon-redshift-python-driver.git
$ cd amazon-redshift-python-driver
$ virtualenv venv
$ . venv/bin/activate
$ python -m pip install -r requirements.txt
$ python -m pip install -e .
```

# Configuration options for the Amazon Redshift Python connector
<a name="python-configuration-options"></a>

Following, you can find descriptions for the options that you can specify for the Amazon Redshift Python connector. The options below apply to the latest available connector version unless specified otherwise.

## access_key_id
<a name="python-access-key-id-option"></a>
+ **Default value** – None
+ **Data type** – String

The access key for the IAM role or user configured for IAM database authentication. 

This parameter is optional.

## allow_db_user_override
<a name="python-allow-db-user-override-option"></a>
+ **Default value** – False
+ **Data type** – Boolean

True  
Specifies that the connector uses the `DbUser` value from the Security Assertion Markup Language (SAML) assertion.

False  
Specifies that the value in the `DbUser` connection parameter is used.

This parameter is optional.

## app_name
<a name="python-app-name-option"></a>
+ **Default value** – None
+ **Data type** – String

The name of the identity provider (IdP) application used for authentication. 

This parameter is optional.

## application_name
<a name="python-application_name-option"></a>
+ **Default value** – None
+ **Data type** – String

The name of the client application to pass to Amazon Redshift for audit purposes. The application name that you provide appears in the `application_name` column of the [SYS_CONNECTION_LOG](https://docs.aws.amazon.com/redshift/latest/dg/SYS_CONNECTION_LOG.html) table. This helps you track and troubleshoot connection sources when debugging issues.

This parameter is optional.

## auth_profile
<a name="python-auth-profile-option"></a>
+ **Default value** – None
+ **Data type** – String

The name of an Amazon Redshift authentication profile having connection properties as JSON. For more information about naming connection parameters, see the `RedshiftProperty` class. The `RedshiftProperty` class stores connection parameters provided by the end user and, if applicable, generated during the IAM authentication process (for example, temporary IAM credentials). For more information, see the [RedshiftProperty class](https://github.com/aws/amazon-redshift-python-driver/blob/master/redshift_connector/redshift_property.py#L9). 

This parameter is optional.

## auto_create
<a name="python-auto-create-option"></a>
+ **Default value** – False
+ **Data type** – Boolean

A value that indicates whether to create the user if the user doesn't exist. 

This parameter is optional.

## client_id
<a name="python-client-id-option"></a>
+ **Default value** – None
+ **Data type** – String

The client ID from Azure IdP. 

This parameter is optional.

## client_secret
<a name="python-client-secret-option"></a>
+ **Default value** – None
+ **Data type** – String

The client secret from Azure IdP. 

This parameter is optional.

## cluster_identifier
<a name="python-cluster-identifier-option"></a>
+ **Default value** – None
+ **Data type** – String

The cluster identifier of the Amazon Redshift cluster. 

This parameter is optional.

## credentials_provider
<a name="python-credential-provider-option"></a>
+ **Default value** – None
+ **Data type** – String

The IdP that is used for authenticating with Amazon Redshift. Following are valid values: 
+ `AdfsCredentialsProvider`
+ `AzureCredentialsProvider`
+ `BrowserAzureCredentialsProvider`
+ `BrowserAzureOAuth2CredentialsProvider`
+ `BrowserIdcAuthPlugin` – An authorization plugin using AWS IAM Identity Center.
+ `BrowserSamlCredentialsProvider`
+ `IdpTokenAuthPlugin` – An authorization plugin that accepts an AWS IAM Identity Center token or OpenID Connect (OIDC) JSON-based identity tokens (JWT) from any web identity provider linked to the AWS IAM Identity Center.
+ `PingCredentialsProvider`
+ `OktaCredentialsProvider`

This parameter is optional.

## database
<a name="python-database-option"></a>
+ **Default value** – None
+ **Data type** – String

The name of the database to which you want to connect. 

This parameter is required.

## database_metadata_current_db_only
<a name="python-database-metadata-current-db-only-option"></a>
+ **Default value** – True
+ **Data type** – Boolean

A value that indicates whether an application supports multidatabase datashare catalogs. The default value of True indicates that the application doesn't support multidatabase datashare catalogs for backward compatibility. 

This parameter is optional.

## db_groups
<a name="python-db-groups-option"></a>
+ **Default value** – None
+ **Data type** – String

A comma-separated list of existing database group names that the user indicated by DbUser joins for the current session. 

This parameter is optional.

## db_user
<a name="python-db-user-option"></a>
+ **Default value** – None
+ **Data type** – String

The user ID to use with Amazon Redshift. 

This parameter is optional.

## endpoint_url
<a name="python-endpoint-url-option"></a>
+ **Default value** – None
+ **Data type** – String

The Amazon Redshift endpoint URL. This option is only for AWS internal use. 

This parameter is optional.

## group_federation
<a name="python-group-federation-option"></a>
+ **Default value** – False
+ **Data type** – Boolean

This option specifies whether to use Amazon Redshift identity provider (IdP) groups.

**true**  
Use Amazon Redshift IdP groups.

**false**  
Use the STS API and GetClusterCredentials for user federation, and specify **db_groups** for the connection.

This parameter is optional.

## host
<a name="python-host-option"></a>
+ **Default value** – None
+ **Data type** – String

The hostname of the Amazon Redshift cluster. 

This parameter is optional.

## iam
<a name="python-iam-option"></a>
+ **Default value** – False
+ **Data type** – Boolean

A value that indicates whether IAM authentication is enabled. 

This parameter is required.

## iam_disable_cache
<a name="python-iam-disable-cache-option"></a>
+ **Default value** – False
+ **Data type** – Boolean

This option specifies whether the IAM credentials are cached. By default, the IAM credentials are cached. This improves performance when requests to the API gateway are throttled. 

This parameter is optional.

## idc_client_display_name
<a name="python-idc_client_display_name-option"></a>
+ **Default Value** – Amazon Redshift Python connector
+ **Data Type** – String

The display name to be used for the client that's using BrowserIdcAuthPlugin.

This parameter is optional.

## idc_region
<a name="python-idc_region"></a>
+ **Default Value** – None
+ **Data Type** – String

The AWS Region where the AWS IAM Identity Center instance is located.

This parameter is required only when authenticating using `BrowserIdcAuthPlugin` in the `credentials_provider` configuration option.

## idp_partition
<a name="python-idp_partition-option"></a>
+ **Default Value** – None
+ **Data Type** – String

Specifies the cloud partition where your identity provider (IdP) is configured. This determines which IdP authentication endpoint the driver connects to.

If this parameter is left blank, the driver defaults to the commercial partition. Possible values are:
+  `us-gov`: Use this value if your IdP is configured in Azure Government. For example, Azure AD Government uses the endpoint `login.microsoftonline.us`.
+  `cn`: Use this value if your IdP is configured in the China cloud partition. For example, Azure AD China uses the endpoint `login.chinacloudapi.cn`. 

This parameter is optional.

## idpPort
<a name="python-idp-port-option"></a>
+ **Default value** – 7890
+ **Data type** – Integer

The listen port to which the IdP sends the SAML assertion. 

This parameter is required.

## idp_response_timeout
<a name="python-idp-response-timeout-option"></a>
+ **Default value** – 120
+ **Data type** – Integer

The timeout for retrieving the SAML assertion from the IdP. 

This parameter is required.

## idp_tenant
<a name="python-idp-tenant-option"></a>
+ **Default value** – None
+ **Data type** – String

The IdP tenant. 

This parameter is optional.

## issuer_url
<a name="python-issuer_url"></a>
+ **Default Value** – None
+ **Data Type** – String

Points to the AWS IAM Identity Center server's instance endpoint.

This parameter is required only when authenticating using `BrowserIdcAuthPlugin` in the `credentials_provider` configuration option.

## listen_port
<a name="python-listen-port-option"></a>
+ **Default value** – 7890
+ **Data type** – Integer

The port that the driver uses to receive the SAML response from the identity provider or authorization code when using SAML, Azure AD, or AWS IAM Identity Center services through a browser plugin.

This parameter is optional.

## login_url
<a name="python-login-url-option"></a>
+ **Default value** – None
+ **Data type** – String

The single sign-on URL for the IdP. 

This parameter is optional.

## max_prepared_statements
<a name="python-max-prepared-statements-option"></a>
+ **Default value** – 1000
+ **Data type** – Integer

The maximum number of prepared statements that will be cached per connection. Setting this parameter to 0 disables the caching mechanism. Entering a negative number for this parameter sets it to the default value. 

This parameter is optional.

## numeric_to_float
<a name="python-numeric-to-float-option"></a>
+ **Default value** – False
+ **Data type** – Boolean

This option specifies whether the connector converts numeric data type values from `decimal.Decimal` to float. By default, the connector receives numeric data type values as `decimal.Decimal` and doesn't convert them. 

We don't recommend enabling `numeric_to_float` for use cases that require precision, because results might be rounded. 

For more information on decimal.Decimal and the tradeoffs between it and float, see [decimal — Decimal fixed point and floating point arithmetic](https://docs.python.org/3/library/decimal.html) on the Python website. 

This parameter is optional.
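To see why precision matters here, the following standalone sketch (using only the standard `decimal` module, with made-up values) shows the rounding that float conversion can introduce:

```python
from decimal import Decimal

# With the default (numeric_to_float off), NUMERIC values arrive as
# decimal.Decimal and keep their exact digits.
exact = Decimal("1234567890.123456789")

# Converting to float, as numeric_to_float=True would, can round the value,
# because a double holds only about 16 significant decimal digits.
approx = float(exact)

print(exact)                     # 1234567890.123456789
print(Decimal(approx) == exact)  # False: digits were lost in the conversion
```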

## partner_sp_id
<a name="python-partner-sp-id-option"></a>
+ **Default value** – None
+ **Data type** – String

The Partner SP ID used for authentication with Ping. 

This parameter is optional.

## password
<a name="python-password-option"></a>
+ **Default value** – None
+ **Data type** – String

The password to use for authentication. 

This parameter is optional.

## port
<a name="python-port-option"></a>
+ **Default value** – 5439
+ **Data type** – Integer

The port number of the Amazon Redshift cluster. 

This parameter is required.

## preferred_role
<a name="python-preferred-role-option"></a>
+ **Default value** – None
+ **Data type** – String

The IAM role preferred for the current connection. 

This parameter is optional.

## principal_arn
<a name="python-principal-arn-option"></a>
+ **Default value** – None
+ **Data type** – String

The Amazon Resource Name (ARN) of the user or IAM role for which you are generating a policy. We recommend that you attach a policy to a role and then assign the role to your user for access. 

This parameter is optional.

## profile
<a name="python-profile-option"></a>
+ **Default value** – None
+ **Data type** – String

The name of a profile in an AWS credentials file that contains AWS credentials. 

This parameter is optional.

## provider_name
<a name="python-provider_name-option"></a>
+ **Default value** – None
+ **Data type** – String

The name of the Redshift Native Authentication Provider. 

This parameter is optional.

## region
<a name="python-region-option"></a>
+ **Default value** – None
+ **Data type** – String

The AWS Region where the cluster is located. 

This parameter is optional.

## role_arn
<a name="python-role-arn-option"></a>
+ **Default value** – None
+ **Data type** – String

The Amazon Resource Name (ARN) of the role that the caller is assuming. This parameter is used by the provider indicated by `JwtCredentialsProvider`. 

For the `JwtCredentialsProvider` provider, this parameter is mandatory. Otherwise, this parameter is optional.

## role_session_name
<a name="python-role-session-name-option"></a>
+ **Default value** – jwt_redshift_session
+ **Data type** – String

An identifier for the assumed role session. Typically, you pass the name or identifier that is associated with the user who is using your application. The temporary security credentials that your application uses are associated with that user. This parameter is used by the provider indicated by `JwtCredentialsProvider`. 

This parameter is optional.

## scope
<a name="python-scope-option"></a>
+ **Default value** – None
+ **Data type** – String

A space-separated list of scopes to which the user can consent. You specify this parameter so that your application can get consent for APIs that you want to call. You can specify this parameter when you specify `BrowserAzureOAuth2CredentialsProvider` for the `credentials_provider` option.

This parameter is required for the `BrowserAzureOAuth2CredentialsProvider` plugin.

## secret_access_key_id
<a name="python-secret-access-key-id-option"></a>
+ **Default value** – None
+ **Data type** – String

The secret access key for the IAM role or user configured for IAM database authentication. 

This parameter is optional.

## session_token
<a name="python-session-token-option"></a>
+ **Default value** – None
+ **Data type** – String

The session token for the IAM role or user configured for IAM database authentication. This parameter is required if temporary AWS credentials are being used. 

This parameter is optional.

## serverless_acct_id
<a name="python-serverless-acct-id-option"></a>
+ **Default value** – None
+ **Data type** – String

The Amazon Redshift Serverless account ID.

This parameter is optional.

## serverless_work_group
<a name="python-serverless-work-group-option"></a>
+ **Default value** – None
+ **Data type** – String

The Amazon Redshift Serverless workgroup name.

This parameter is optional.

## ssl
<a name="python-ssl-option"></a>
+ **Default value** – True
+ **Data type** – Boolean

A value that indicates whether Secure Sockets Layer (SSL) is enabled. 

This parameter is required.

## ssl_insecure
<a name="python-ssl-insecure-option"></a>
+ **Default value** – False
+ **Data type** – Boolean

A value that specifies whether to disable the verification of the IdP host's server SSL certificate. Setting this parameter to True will disable the verification of the IdP host's server SSL certificate. We recommend that you keep the default value of False in production environments.

This parameter is optional.

## sslmode
<a name="python-sslmode-option"></a>
+ **Default value** – verify-ca
+ **Data type** – String

The security of the connection to Amazon Redshift. You can specify either of the following: 
+ verify-ca
+ verify-full

This parameter is required.

## tcp_keepalive
<a name="python-tcp_keepalive-option"></a>
+ **Default value** – True
+ **Data type** – Boolean

Whether to use TCP keepalives to keep connections from timing out. You can specify the following values:
+ True: The driver will use TCP keepalives to keep connections from timing out.
+ False: The driver won’t use TCP keepalives.

This parameter is optional.

## tcp_keepalive_count
<a name="python-tcp_keepalive_count-option"></a>
+ **Default value** – None
+ **Data type** – Integer

The number of unacknowledged probes to send before considering the connection inactive. For example, setting the value to 3 means that the driver will send 3 unanswered keepalive packets before determining that the connection is no longer active.

If this parameter is not specified, Amazon Redshift uses the system's default value.

This parameter is optional.

## tcp_keepalive_interval
<a name="python-tcp_keepalive_interval-option"></a>
+ **Default value** – None
+ **Data type** – Integer

The interval, in seconds, between subsequent keepalive probes when the driver doesn't receive an acknowledgment for the previous probe. If you specify this parameter, it must be a positive integer.

If this parameter is not specified, Amazon Redshift uses the system's default value.

This parameter is optional.

## tcp_keepalive_idle
<a name="python-tcp_keepalive_idle-option"></a>
+ **Default value** – None
+ **Data type** – Integer

The duration of inactivity, in seconds, after which the driver sends the first keepalive probe. For example, setting the value to 120 means that the driver will wait for 2 minutes of inactivity before sending the first keepalive packet. If you specify this parameter, it must be a positive integer. 

If this parameter is not specified, Amazon Redshift uses the system's default value.

This parameter is optional.
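The three keepalive settings above work together. As an illustrative sketch (the numeric values are assumptions, not recommendations), they might be combined into a single set of connection arguments:

```python
# Hypothetical keepalive settings for redshift_connector.connect();
# the numeric values below are illustrative assumptions.
keepalive_options = {
    "tcp_keepalive": True,         # enable keepalive probes
    "tcp_keepalive_idle": 120,     # wait 2 minutes of inactivity before the first probe
    "tcp_keepalive_interval": 30,  # then probe every 30 seconds
    "tcp_keepalive_count": 3,      # give up after 3 unanswered probes
}

# The connection would then be opened as (not run here):
# conn = redshift_connector.connect(host=..., database=..., **keepalive_options)
print(keepalive_options["tcp_keepalive_idle"])  # 120
```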

## timeout
<a name="python-timeout-option"></a>
+ **Default value** – None
+ **Data type** – Integer

The number of seconds before the connection to the server times out. 

This parameter is optional.

## token
<a name="python-token-option"></a>
+ **Default Value** – None
+ **Data Type** – String

An AWS IAM Identity Center provided access token or an OpenID Connect (OIDC) JSON Web Token (JWT) provided by a web identity provider that's linked with AWS IAM Identity Center. Your application must generate this token by authenticating the user of your application with AWS IAM Identity Center or an identity provider linked with AWS IAM Identity Center. 

This parameter works with `IdpTokenAuthPlugin`.

## token_type
<a name="python-token_type-option"></a>
+ **Default Value** – None
+ **Data Type** – String

The type of token that is being used in `IdpTokenAuthPlugin`.

You can specify the following values:

**ACCESS_TOKEN**  
Enter this if you use an AWS IAM Identity Center provided access token.

**EXT_JWT**  
Enter this if you use an OpenID Connect (OIDC) JSON Web Token (JWT) provided by a web-based identity provider that's integrated with AWS IAM Identity Center.

This parameter works with `IdpTokenAuthPlugin`.
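Putting `token` and `token_type` together, a hedged sketch of the `IdpTokenAuthPlugin` connection arguments might look like the following. The token string is a placeholder that your application must replace with a real token obtained at runtime:

```python
# Placeholder arguments for IdpTokenAuthPlugin-based authentication.
# The token value below is illustrative; obtain a real token from
# AWS IAM Identity Center (or a linked OIDC provider) at runtime.
idc_auth_params = {
    "credentials_provider": "IdpTokenAuthPlugin",
    "token": "<token-from-identity-center>",
    "token_type": "ACCESS_TOKEN",  # use "EXT_JWT" for an OIDC JWT instead
}

# The connection would then be opened as (not run here):
# conn = redshift_connector.connect(host=..., database=..., **idc_auth_params)
print(sorted(idc_auth_params))  # ['credentials_provider', 'token', 'token_type']
```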

## user
<a name="python-user-option"></a>
+ **Default value** – None
+ **Data type** – String

The user name to use for authentication. 

This parameter is optional.

## web_identity_token
<a name="python-web-identity-token-option"></a>
+ **Default value** – None
+ **Data type** – String

The OAuth 2.0 access token or OpenID Connect ID token that is provided by the identity provider. Make sure that your application gets this token by authenticating the user who is using your application with a web identity provider. The provider indicated by `JwtCredentialsProvider` uses this parameter. 

For the `JwtCredentialsProvider` provider, this parameter is mandatory. Otherwise, this parameter is optional.

# Importing the Python connector
<a name="python-start-import"></a>

To import the Python connector, run the following command.

```
>>> import redshift_connector
```

To connect to an Amazon Redshift cluster using AWS credentials, run the following command.

```
conn = redshift_connector.connect(
    host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
    port=5439,
    database='dev',
    user='awsuser',
    password='my_password'
 )
```

# Integrating the Python connector with NumPy
<a name="python-connect-integrate-numpy"></a>

Following is an example of integrating the Python connector with NumPy.

```
>>> import numpy

# Connect to the cluster
>>> import redshift_connector
>>> conn = redshift_connector.connect(
     host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
     port=5439,
     database='dev',
     user='awsuser',
     password='my_password'
  )

# Create a Cursor object
>>> cursor = conn.cursor()

# Query and receive result set
>>> cursor.execute("select * from book")

>>> result: numpy.ndarray = cursor.fetch_numpy_array()
>>> print(result)
```

Following is the result.

```
[['One Hundred Years of Solitude' 'Gabriel García Márquez']
['A Brief History of Time' 'Stephen Hawking']]
```

# Integrating the Python connector with pandas
<a name="python-connect-integrate-pandas"></a>

Following is an example of integrating the Python connector with pandas.

```
>>> import pandas

# Connect to the cluster
>>> import redshift_connector
>>> conn = redshift_connector.connect(
     host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
     port=5439,
     database='dev',
     user='awsuser',
     password='my_password'
  )

# Create a Cursor object
>>> cursor = conn.cursor()

# Query and receive result set
>>> cursor.execute("select * from book")
>>> result: pandas.DataFrame = cursor.fetch_dataframe()
>>> print(result)
```

# Using identity provider plugins
<a name="python-connect-identity-provider-plugins"></a>

For general information on how to use identity provider plugins, see [Options for providing IAM credentials](options-for-providing-iam-credentials.md). For more information about managing IAM identities, including best practices for IAM roles, see [Identity and access management in Amazon Redshift](redshift-iam-authentication-access-control.md).

## Authentication using the ADFS identity provider plugin
<a name="python-connect-identity-provider-active-dir"></a>

Following is an example of using the Active Directory Federation Service (ADFS) identity provider plugin to authenticate a user connecting to an Amazon Redshift database.

```
>>> con = redshift_connector.connect(
    iam=True,
    database='dev',
    host='my-testing-cluster.abc.us-east-2.redshift.amazonaws.com',
    cluster_identifier='my-testing-cluster',
    credentials_provider='AdfsCredentialsProvider',
    user='brooke@myadfshostname.com',
    password='Hunter2',
    idp_host='myadfshostname.com'
)
```

## Authentication using the Azure identity provider plugin
<a name="python-connect-identity-provider-azure"></a>

Following is an example of authentication using the Azure identity provider plugin. You can create values for a `client_id` and `client_secret` for an Azure Enterprise application as shown following. 

```
>>> con = redshift_connector.connect(
    iam=True,
    database='dev',
    host='my-testing-cluster.abc.us-east-2.redshift.amazonaws.com',
    cluster_identifier='my-testing-cluster',
    credentials_provider='AzureCredentialsProvider',
    user='brooke@myazure.org',
    password='Hunter2',
    idp_tenant='my_idp_tenant',
    client_id='my_client_id',
    client_secret='my_client_secret',
    preferred_role='arn:aws:iam::123:role/DataScientist'
)
```

## Authentication using the AWS IAM Identity Center identity provider plugin
<a name="python-connect-identity-provider-aws-idc"></a>

 Following is an example of authentication using the AWS IAM Identity Center identity provider plugin. 

```
with redshift_connector.connect(
    credentials_provider='BrowserIdcAuthPlugin',
    host='my-testing-cluster.abc.us-east-2.redshift.amazonaws.com',
    database='dev',
    idc_region='us-east-1',
    issuer_url='https://identitycenter.amazonaws.com/ssoins-790723ebe09c86f9',
    idp_response_timeout=60,
    listen_port=8100,
    idc_client_display_name='Test Display Name',
    # port value of 5439 is specified by default
) as conn:
    pass  # work with the connection here
```

## Authentication using Azure Browser identity provider plugin
<a name="python-connect-identity-provider-azure-browser"></a>

Following is an example of using the Azure Browser identity provider plugin to authenticate a user connecting to an Amazon Redshift database.

Multi-factor authentication occurs in the browser, where the sign-in credentials are provided by the user.

```
>>> con = redshift_connector.connect(
    iam=True,
    database='dev',
    host='my-testing-cluster.abc.us-east-2.redshift.amazonaws.com',
    cluster_identifier='my-testing-cluster',
    credentials_provider='BrowserAzureCredentialsProvider',
    idp_tenant='my_idp_tenant',
    client_id='my_client_id',
)
```

## Authentication using the Okta identity provider plugin
<a name="python-connect-identity-provider-okta"></a>

Following is an example of authentication using the Okta identity provider plugin. You can obtain the values for `idp_host`, `app_id` and `app_name` through the Okta application.

```
>>> con = redshift_connector.connect(
    iam=True,
    database='dev',
    host='my-testing-cluster.abc.us-east-2.redshift.amazonaws.com',
    cluster_identifier='my-testing-cluster',
    credentials_provider='OktaCredentialsProvider',
    user='brooke@myazure.org',
    password='hunter2',
    idp_host='my_idp_host',
    app_id='my_first_appetizer',
    app_name='dinner_party'
)
```

## Authentication using JumpCloud with a generic SAML browser identity provider plugin
<a name="python-connect-identity-provider-jumpcloud"></a>

Following is an example of using JumpCloud with a generic SAML browser identity provider plugin for authentication.

The password parameter is required. However, you don't have to enter this parameter because multi-factor authentication occurs in the browser.

```
>>> con = redshift_connector.connect(
    iam=True,
    database='dev',
    host='my-testing-cluster.abc.us-east-2.redshift.amazonaws.com',
    cluster_identifier='my-testing-cluster',
    credentials_provider='BrowserSamlCredentialsProvider',
    user='brooke@myjumpcloud.org',
    password='',
    login_url='https://sso.jumpcloud.com/saml2/plustwo_melody'
)
```

# Examples of using the Amazon Redshift Python connector
<a name="python-connect-examples"></a>

Following are examples of how to use the Amazon Redshift Python connector. To run them, you must first install the Python connector. For more information on installing the Amazon Redshift Python connector, see [Installing the Amazon Redshift Python connector](python-driver-install.md). For more information on configuration options you can use with the Python connector, see [Configuration options for the Amazon Redshift Python connector](python-configuration-options.md).

**Topics**
+ [Connecting to and querying an Amazon Redshift cluster using AWS credentials](#python-connect-cluster)
+ [Enabling autocommit](#python-connect-enable-autocommit)
+ [Configuring cursor paramstyle](#python-connect-config-paramstyle)
+ [Using COPY to copy data from an Amazon S3 bucket and UNLOAD to write data to it](#python-connect-copy-unload-s3)

## Connecting to and querying an Amazon Redshift cluster using AWS credentials
<a name="python-connect-cluster"></a>

The following example guides you through connecting to an Amazon Redshift cluster using your AWS credentials, then querying a table and retrieving the query results.

```
#Connect to the cluster
>>> import redshift_connector
>>> conn = redshift_connector.connect(
     host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
     database='dev',
     port=5439,
     user='awsuser',
     password='my_password'
  )
  
# Create a Cursor object
>>> cursor = conn.cursor()

# Query a table using the Cursor
>>> cursor.execute("select * from book")
                
#Retrieve the query result set
>>> result: tuple = cursor.fetchall()
>>> print(result)
 >> (['One Hundred Years of Solitude', 'Gabriel García Márquez'], ['A Brief History of Time', 'Stephen Hawking'])
```

## Enabling autocommit
<a name="python-connect-enable-autocommit"></a>

The autocommit property is off by default, following the Python Database API Specification. After running a rollback command to make sure that no transaction is in progress, you can use the following commands to turn on the connection's autocommit property.

```
#Connect to the cluster
>>> import redshift_connector
>>> conn = redshift_connector.connect(...)

# Run a rollback command
>>> conn.rollback()

# Turn on autocommit
>>> conn.autocommit = True
>>> conn.run("VACUUM")

# Turn off autocommit
>>> conn.autocommit = False
```
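If you toggle autocommit around maintenance commands such as `VACUUM` frequently, a small context manager can restore the previous setting automatically. The following is a hypothetical convenience helper, not part of the connector; it assumes any DB-API-style connection object that exposes an `autocommit` attribute and a `rollback()` method, as `redshift_connector` connections do.

```python
from contextlib import contextmanager

@contextmanager
def autocommit_on(conn):
    """Temporarily enable autocommit on a DB-API-style connection,
    restoring the previous setting on exit."""
    conn.rollback()                # make sure no transaction is in progress
    previous = conn.autocommit
    conn.autocommit = True
    try:
        yield conn
    finally:
        conn.autocommit = previous
```

With this helper, `with autocommit_on(conn): conn.run("VACUUM")` runs the command in autocommit mode and then returns the connection to its prior transactional behavior.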

## Configuring cursor paramstyle
<a name="python-connect-config-paramstyle"></a>

The paramstyle for a cursor can be modified via `cursor.paramstyle`. The default paramstyle used is `format`. Valid values for paramstyle are `qmark`, `numeric`, `named`, `format`, and `pyformat`.

The following are examples of using various paramstyles to pass parameters to a sample SQL statement.

```
# qmark
redshift_connector.paramstyle = 'qmark'
sql = 'insert into foo(bar, jar) VALUES(?, ?)'
cursor.execute(sql, (1, "hello world"))

# numeric
redshift_connector.paramstyle = 'numeric'
sql = 'insert into foo(bar, jar) VALUES(:1, :2)'
cursor.execute(sql, (1, "hello world"))

# named
redshift_connector.paramstyle = 'named'
sql = 'insert into foo(bar, jar) VALUES(:p1, :p2)'
cursor.execute(sql, {"p1":1, "p2":"hello world"})

# format
redshift_connector.paramstyle = 'format'
sql = 'insert into foo(bar, jar) VALUES(%s, %s)'
cursor.execute(sql, (1, "hello world"))

# pyformat
redshift_connector.paramstyle = 'pyformat'
sql = 'insert into foo(bar, jar) VALUES(%(bar)s, %(jar)s)'
cursor.execute(sql, {"bar": 1, "jar": "hello world"})
```
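As a side note on naming, the `format` and `pyformat` styles borrow Python's printf-style placeholder syntax. The following sketch only illustrates where the placeholder names come from by applying plain string formatting to sample values; the connector binds parameters itself, and you should never build real SQL with string formatting.

```python
# Illustration only: a 'pyformat' placeholder is a Python printf-style
# mapping token, so plain string formatting recognizes it. The connector
# performs the actual, safe parameter binding.
sql = 'insert into foo(bar, jar) VALUES(%(bar)s, %(jar)s)'
preview = sql % {"bar": 1, "jar": "'hello world'"}
# preview is now: insert into foo(bar, jar) VALUES(1, 'hello world')
```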

## Using COPY to copy data from an Amazon S3 bucket and UNLOAD to write data to it
<a name="python-connect-copy-unload-s3"></a>

The following example shows how to copy data from an Amazon S3 bucket into a table and then unload from that table back into the bucket.

A text file named `category_csv.txt` containing the following data is uploaded to an Amazon S3 bucket:

```
12,Shows,Musicals,Musical theatre
13,Shows,Plays,"All ""non-musical"" theatre"
14,Shows,Opera,"All opera, light, and ""rock"" opera"
15,Concerts,Classical,"All symphony, concerto, and choir concerts"
```

Following is an example of the Python code, which first connects to the Amazon Redshift database. It then creates a table called `category` and copies the CSV data from the S3 bucket into the table.

```
#Connect to the cluster and create a Cursor
>>> import redshift_connector
>>> with redshift_connector.connect(...) as conn:
...     with conn.cursor() as cursor:
...
...         #Create an empty table
...         cursor.execute("create table category (catid int, catgroup varchar, catname varchar, catdesc varchar)")
...
...         #Use COPY to copy the contents of the S3 bucket into the empty table
...         cursor.execute("copy category from 's3://testing/category_csv.txt' iam_role 'arn:aws:iam::123:role/RedshiftCopyUnload' csv;")
...
...         #Retrieve and print the contents of the table
...         cursor.execute("select * from category")
...         print(cursor.fetchall())
...
...         #Use UNLOAD to copy the contents of the table into the S3 bucket
...         cursor.execute("unload ('select * from category') to 's3://testing/unloaded_category_csv.txt'  iam_role 'arn:aws:iam::123:role/RedshiftCopyUnload' csv;")
 >> ([12, 'Shows', 'Musicals', 'Musical theatre'], [13, 'Shows', 'Plays', 'All "non-musical" theatre'], [14, 'Shows', 'Opera', 'All opera, light, and "rock" opera'], [15, 'Concerts', 'Classical', 'All symphony, concerto, and choir concerts'])
```

If you don't have `autocommit` set to true, commit with `conn.commit()` after running the `execute()` statements.

The data is unloaded into the file `unloaded_category_csv.text0000_part00` in the S3 bucket, with the following content:

```
12,Shows,Musicals,Musical theatre
13,Shows,Plays,"All ""non-musical"" theatre"
14,Shows,Opera,"All opera, light, and ""rock"" opera"
15,Concerts,Classical,"All symphony, concerto, and choir concerts"
```
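The doubled quotes in the unloaded file (for example, `""rock""`) follow standard CSV escaping, so if you download the file you can parse it with Python's built-in `csv` module. The following is a minimal sketch using inline sample data taken from the output above.

```python
import csv
import io

# Sample lines from the unloaded file. UNLOAD's CSV format escapes
# embedded quotes by doubling them, which csv's default dialect reads
# back directly.
unloaded = (
    '12,Shows,Musicals,Musical theatre\n'
    '13,Shows,Plays,"All ""non-musical"" theatre"\n'
)
rows = list(csv.reader(io.StringIO(unloaded)))
# rows[1][3] is now: All "non-musical" theatre
```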

# API reference for the Amazon Redshift Python connector
<a name="python-api-reference"></a>

Following, you can find a description of the Amazon Redshift Python connector API operations.

## redshift\_connector
<a name="python-api-redshift_connector"></a>

Following, you can find a description of the `redshift_connector` API operations.

`connect(user, database, password[, port, …])`  
Establishes a connection to an Amazon Redshift cluster. This function validates user input, optionally authenticates using an identity provider plugin, and then constructs a connection object.

`apilevel`  
The DBAPI level supported, currently "2.0".

`paramstyle`  
The database API parameter style to use globally.

## Connection
<a name="python-api-connection"></a>

Following, you can find a description of the connection API operations for the Amazon Redshift Python connector.

`__init__(user, password, database[, host, …])`  
Initializes a raw connection object.

`cursor`  
Creates a cursor object bound to this connection.

`commit`  
Commits the current database transaction.

`rollback`  
Rolls back the current database transaction.

`close`  
Closes the database connection.

`execute(cursor, operation, vals)`  
Runs the specified SQL command. You can provide the parameters as a sequence or as a mapping, depending upon the value of `redshift_connector.paramstyle`.

`run(sql[, stream])`  
Runs the specified SQL command. Optionally, you can provide a stream for use with the COPY command.

`xid(format_id, global_transaction_id, …)`  
Creates a transaction ID. Only the `global_transaction_id` parameter is used in postgres; `format_id` and `branch_qualifier` are not used in postgres. The `global_transaction_id` can be any string identifier supported by postgres. Returns a tuple (`format_id`, `global_transaction_id`, `branch_qualifier`).  

`tpc_begin(xid)`  
Begins a TPC transaction with a transaction ID `xid` consisting of a format ID, global transaction ID, and branch qualifier.  

`tpc_prepare`  
Performs the first phase of a transaction started with `.tpc_begin()`.  

`tpc_commit([xid])`  
When called with no arguments, `.tpc_commit()` commits a TPC transaction previously prepared with `.tpc_prepare()`.  

`tpc_rollback([xid])`  
When called with no arguments, `.tpc_rollback()` rolls back a TPC transaction.  

`tpc_recover`  
Returns a list of pending transaction IDs suitable for use with `.tpc_commit(xid)` or `.tpc_rollback(xid)`.  

## Cursor
<a name="python-api-cursor"></a>

Following, you can find a description of the cursor API operations.

`__init__(connection[, paramstyle])`  
Initializes a raw cursor object.

`insert_data_bulk(filename, table_name, parameter_indices, column_names, delimiter, batch_size)`  
Runs a bulk INSERT statement.

`execute(operation[, args, stream, …])`  
Runs a database operation.

`executemany(operation, param_sets)`  
Prepares a database operation, and then runs it for all parameter sequences or mappings provided.

`fetchone`  
Fetches the next row of a query result set.

`fetchmany([num])`  
Fetches the next set of rows of a query result.

`fetchall`  
Fetches all remaining rows of a query result.

`close`  
Closes the cursor now. 

`__iter__`  
A cursor object can be iterated to retrieve the rows from a query.

`fetch_dataframe([num])`  
Returns a dataframe of the last query results.

`write_dataframe(df, table)`  
Writes a dataframe with the same structure as the target table into an Amazon Redshift database.  

`fetch_numpy_array([num])`  
Returns a NumPy array of the last query results.

`get_catalogs`  
Amazon Redshift doesn't support multiple catalogs from a single connection. Amazon Redshift only returns the current catalog.

`get_tables([catalog, schema_pattern, …])`  
Returns the unique, user-defined public tables within the system.  

`get_columns([catalog, schema_pattern, …])`  
Returns a list of all columns in a specific table in an Amazon Redshift database.

## AdfsCredentialsProvider plugin
<a name="python-adfs-credentials-plugin"></a>

Following is the syntax for the AdfsCredentialsProvider plugin API operation for the Amazon Redshift Python connector. 

```
redshift_connector.plugin.AdfsCredentialsProvider()
```

## AzureCredentialsProvider plugin
<a name="python-azure-credentials-plugin"></a>

Following is the syntax for the AzureCredentialsProvider plugin API operation for the Amazon Redshift Python connector.

```
redshift_connector.plugin.AzureCredentialsProvider()
```

## BrowserAzureCredentialsProvider plugin
<a name="python-browser-azure-credentials-plugin"></a>

Following is the syntax for the BrowserAzureCredentialsProvider plugin API operation for the Amazon Redshift Python connector.

```
redshift_connector.plugin.BrowserAzureCredentialsProvider()
```

## BrowserSamlCredentialsProvider plugin
<a name="python-browser-saml-credentials-plugin"></a>

Following is the syntax for the BrowserSamlCredentialsProvider plugin API operation for the Amazon Redshift Python connector.

```
redshift_connector.plugin.BrowserSamlCredentialsProvider()
```

## OktaCredentialsProvider plugin
<a name="python-okta-credentials-plugin"></a>

Following is the syntax for the OktaCredentialsProvider plugin API operation for the Amazon Redshift Python connector.

```
redshift_connector.plugin.OktaCredentialsProvider()
```

## PingCredentialsProvider plugin
<a name="python-ping-credentials-plugin"></a>

Following is the syntax for the PingCredentialsProvider plugin API operation for the Amazon Redshift Python connector.

```
redshift_connector.plugin.PingCredentialsProvider()
```

## SamlCredentialsProvider plugin
<a name="python-saml-credentials-plugin"></a>

Following is the syntax for the SamlCredentialsProvider plugin API operation for the Amazon Redshift Python connector.

```
redshift_connector.plugin.SamlCredentialsProvider()
```