

 Amazon Redshift will no longer support the creation of new Python UDFs starting with Patch 198. Existing Python UDFs will continue to function until June 30, 2026. For more information, see the [blog post](https://aws.amazon.com/blogs/big-data/amazon-redshift-python-user-defined-functions-will-reach-end-of-support-after-june-30-2026/). 

# Managing concurrent write operations
<a name="c_Concurrent_writes"></a>

Some applications require not only concurrent querying and loading, but also the ability to write to multiple tables or the same table concurrently. In this context, *concurrently* means overlapping, not scheduled to run at precisely the same time. Two transactions are considered to be concurrent if the second one starts before the first commits. Concurrent operations can originate from different sessions that are controlled either by the same user or by different users. 

Amazon Redshift supports these types of applications by allowing tables to be read while they are being incrementally loaded or modified. Queries simply see the latest committed version, or *snapshot*, of the data, rather than waiting for the next version to be committed. If you want a particular query to wait for a commit from another write operation, you have to schedule it accordingly.

**Note**  
Amazon Redshift supports a default *automatic commit* behavior in which each separately run SQL command commits individually. If you enclose a set of commands in a transaction block (defined by [BEGIN](r_BEGIN.md) and [END](r_END.md) statements), the block commits as one transaction, so you can roll it back if necessary. Exceptions to this behavior are the TRUNCATE and VACUUM commands, which automatically commit all outstanding changes made in the current transaction.   
Some SQL clients issue BEGIN and COMMIT commands automatically, so the client controls whether a group of statements are run as a transaction or each individual statement is run as its own transaction. Check the documentation for the interface you are using. For example, when using the Amazon Redshift JDBC driver, a JDBC `PreparedStatement` with a query string that contains multiple (semicolon separated) SQL commands runs all the statements as a single transaction. In contrast, if you use SQL Workbench/J and set AUTO COMMIT ON, then if you run multiple statements, each statement runs as its own transaction. 
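
For example, the following sketch (the table and column names are hypothetical) contrasts the two behaviors:

```
-- Under automatic commit, each statement commits on its own; a failure in the
-- second statement leaves the first statement's changes in place.
delete from sales_staging where load_date < '2024-01-01';
insert into sales select * from sales_staging;

-- Inside a transaction block, both statements commit or roll back together.
begin;
delete from sales_staging where load_date < '2024-01-01';
insert into sales select * from sales_staging;
end;
```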

The following topics describe some of the key concepts and use cases that involve transactions, database snapshots, updates, and concurrent behavior.

**Topics**
+ [Isolation levels in Amazon Redshift](c_serial_isolation.md)
+ [Write and read/write operations](c_write_readwrite.md)
+ [Concurrent write examples](r_Serializable_isolation_example.md)
+ [Troubleshooting serializable isolation errors](c_serial_isolation-serializable-isolation-troubleshooting.md)

# Isolation levels in Amazon Redshift
<a name="c_serial_isolation"></a>

Concurrent write operations are supported in Amazon Redshift in a protective way, using write locks on tables and the principle of *serializable isolation*. Serializable isolation preserves the illusion that a transaction running against a table is the only transaction that is running against that table.

Amazon Redshift databases support concurrent write operations by having each transaction operate on the latest committed version, or snapshot, of the data as of the start of the transaction. A database snapshot is created within a transaction on the first occurrence of most SELECT statements, DML commands such as COPY, DELETE, INSERT, UPDATE, and TRUNCATE, and the following DDL commands:
+  ALTER TABLE (to add or drop columns) 
+  CREATE TABLE 
+  DROP TABLE 
+  TRUNCATE TABLE 

No other transaction is able to change this snapshot, meaning transactions are isolated from one another. That is, concurrent transactions are invisible to each other and can’t detect each other's changes.

Any concurrent execution of transactions must produce the same results as the serial execution of those transactions. If no serial execution of those transactions can produce the same results, the transaction that runs a statement that might break the ability to serialize is stopped and rolled back.

For example, suppose that a user attempts to run two concurrent transactions, T1 and T2. Running T1 and T2 must produce the same results as at least one of the following scenarios:
+ T1 and T2 run serially in that order.
+ T2 and T1 run serially in that order.

 Isolation levels in Amazon Redshift prevent the following problems: 
+  Dirty reads – A dirty read occurs when a transaction reads data that has not yet been committed. For example, suppose that transaction 1 (T1) updates a row. Transaction 2 (T2) reads the updated row before T1 commits the update. If T1 rolls back the change, T2 has read data in uncommitted rows that Amazon Redshift now considers to never have existed. 
+  Non-repeatable reads – A non-repeatable read occurs when a single transaction reads the same row twice but gets different data each time. For example, suppose that transaction 1 (T1) reads a row. Transaction 2 (T2) updates or deletes that row and commits the change. If T1 rereads the row, it retrieves different row values or discovers that the row has been deleted. 
+  Phantoms – A phantom is a row that matches the search criteria but isn't initially seen. For example, suppose that transaction 1 (T1) reads a set of rows that satisfies its search criteria. Transaction 2 (T2) then inserts or updates a row so that it matches the search criteria for T1. If T1 reruns its search statement, it gets a different set of rows. 
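
For illustration, the following two-session sketch (using a hypothetical SALES table, in the style of the pseudo-code examples later in this topic) shows the phantom scenario that serializable isolation prevents:

```
-- T1
begin;
select * from sales where qty > 100;   -- returns N rows

-- T2, in a concurrent session
begin;
insert into sales values (...);        -- new row with qty > 100
end;

-- T1
select * from sales where qty > 100;   -- without isolation, this would return
end;                                   -- N+1 rows, including a phantom
```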

## SNAPSHOT and SERIALIZABLE isolation
<a name="c_serial_isolation-snapshot_and_serializable"></a>

SNAPSHOT and SERIALIZABLE isolation are the two serializable isolation levels available in Amazon Redshift. 

SNAPSHOT isolation is the default isolation level for newly created provisioned clusters and serverless workgroups. It lets you process larger volumes of data in less time than SERIALIZABLE isolation.

SERIALIZABLE isolation takes more time but enforces stricter constraints on concurrent transactions. This isolation level prevents problems such as write-skew anomalies by allowing only one conflicting transaction to commit and canceling the other concurrent transactions with a serializable isolation violation error.

Following is a timeline example of how two concurrent write operations would be handled when using SNAPSHOT isolation. Each user’s UPDATE statement is allowed to commit because they don’t conflict by attempting to update the same rows.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/c_serial_isolation.html)

If the same scenario is run using serializable isolation, then Amazon Redshift terminates user 2 due to a serializable violation and returns error `1023`. For more information, see [Troubleshooting serializable isolation errors](c_serial_isolation-serializable-isolation-troubleshooting.md). In this case, only user 1 can commit successfully. 
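
As a sketch of the difference, the following two concurrent transactions (hypothetical INVENTORY table) update disjoint rows, so both can commit under SNAPSHOT isolation:

```
-- Session 1
begin;
update inventory set qty = qty - 1 where item_id = 1;
end;   -- commits

-- Session 2, concurrent with session 1
begin;
update inventory set qty = qty - 1 where item_id = 2;
end;   -- also commits under SNAPSHOT isolation; under SERIALIZABLE isolation,
       -- this transaction might instead be canceled with error 1023
```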

## Considerations
<a name="c_serial_isolation-considerations"></a>

When using isolation levels in Amazon Redshift, consider the following:
+  Query the STV\_DB\_ISOLATION\_LEVEL catalog view to see which isolation level your database is using. For more information, see [STV\_DB\_ISOLATION\_LEVEL](r_STV_DB_ISOLATION_LEVEL.md). 
+  Query the PG\_DATABASE\_INFO view to see how many concurrent transactions are supported for your database. For more information, see [PG\_DATABASE\_INFO](r_PG_DATABASE_INFO.md). 
+  System catalog tables (PG) and other Amazon Redshift system tables aren't locked in a transaction. Therefore, changes to database objects that arise from DDL and TRUNCATE operations are visible on commit to any concurrent transactions. 

   For example, suppose that table A exists in the database when two concurrent transactions, T1 and T2, start. Suppose that T2 returns a list of tables by selecting from the PG\_TABLES catalog table. Then T1 drops table A and commits, and then T2 lists the tables again. Table A is now no longer listed. If T2 tries to query the dropped table, Amazon Redshift returns a "relation does not exist" error. The catalog query that returns the list of tables to T2 or checks that table A exists isn't subject to the same isolation rules as operations performed on user tables. 

   Transactions for updates to these tables run in a read committed isolation mode. 
+  PG-prefix catalog tables don't support SNAPSHOT isolation. 
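
The first two considerations can be checked directly; for example:

```
-- Which isolation level is each database using?
select * from stv_db_isolation_level;

-- Concurrency information for your databases
select * from pg_database_info;
```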

# Write and read/write operations
<a name="c_write_readwrite"></a>

You can manage the specific behavior of concurrent write operations by deciding when and how to run different types of commands. The following commands are relevant to this discussion: 
+ COPY commands, which perform loads (initial or incremental)
+ INSERT commands that append one or more rows at a time
+ UPDATE commands, which modify existing rows
+ DELETE commands, which remove rows 

COPY and INSERT operations are pure write operations. DELETE and UPDATE operations are read/write operations (for rows to be deleted or updated, they have to be read first). The results of concurrent write operations depend on the specific commands that are being run concurrently. 

UPDATE and DELETE operations behave differently because they rely on an initial table read before they do any writes. Given that concurrent transactions are invisible to each other, both UPDATEs and DELETEs have to read a snapshot of the data from the last commit. When the first UPDATE or DELETE releases its lock, the second UPDATE or DELETE needs to determine whether the data that it is going to work with is potentially stale. It will not be stale, because the second transaction does not obtain its snapshot of data until after the first transaction has released its lock.

## Potential deadlock situation for concurrent write transactions involving multiple tables
<a name="c_write_readwrite-potential-deadlock"></a>

When transactions involve updates of more than one table, there is always the possibility of concurrently running transactions becoming deadlocked when they both try to write to the same set of tables. A transaction releases all of its table locks at once when it either commits or rolls back; it doesn't relinquish locks one at a time.

For example, suppose that transactions T1 and T2 start at roughly the same time. If T1 starts writing to table A and T2 starts writing to table B, both transactions can proceed without conflict. However, if T1 finishes writing to table A and needs to start writing to table B, it won’t be able to proceed because T2 still holds the lock on B. Similarly, if T2 finishes writing to table B and needs to start writing to table A, it will not be able to proceed either because T1 still holds the lock on A. Because neither transaction can release its locks until all its write operations are committed, neither transaction can proceed. To avoid this kind of deadlock, you need to schedule concurrent write operations carefully. For example, you should always update tables in the same order in transactions and, if specifying locks, lock tables in the same order before you perform any DML operations. 
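
One way to schedule such transactions safely is to acquire all table locks up front, in the same order in every transaction. The following is a sketch using the tables A and B from the example above (column values elided, as in the other pseudo-code examples):

```
-- Both T1 and T2 take locks in the same order before any DML, so neither
-- can hold the lock on one table while waiting for the other.
begin;
lock table_a, table_b;
update table_a set ...;
update table_b set ...;
end;
```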

## Potential deadlock situation for concurrent write transactions involving a single table
<a name="c_write_readwrite-potential-deadlock-single"></a>

In a snapshot isolation environment, deadlocks can occur when running concurrent write transactions on the same table. The snapshot isolation deadlock happens when concurrent INSERT or COPY statements are sharing a lock and making progress, and another statement needs to perform an operation (UPDATE, DELETE, MERGE, or DDL operation) that requires an exclusive lock on the same table. 

Consider the following scenario:

Transaction 1 (T1):

```
INSERT/COPY INTO table_A;
```

Transaction 2 (T2):

```
INSERT/COPY INTO table_A; 
            <UPDATE/DELETE/MERGE/DDL statement> table_A
```

A deadlock can occur when multiple transactions with INSERT or COPY operations are running concurrently on the same table with a shared lock, and one of those transactions follows its pure write operation with an operation that requires an exclusive lock, such as an UPDATE, MERGE, DELETE, or DDL statement.

To avoid the deadlock in these situations, you can move statements requiring an exclusive lock (UPDATE, MERGE, DELETE, or DDL statements) to a separate transaction, so that any INSERT/COPY statements can progress simultaneously and the statements requiring exclusive locks run after them. Alternatively, for transactions that combine INSERT/COPY statements with UPDATE/MERGE/DELETE statements on the same table, you can include retry logic in your applications to work around potential deadlocks. 
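
For example, transaction 2 from the scenario above can be restructured so that its exclusive-lock statement runs in its own transaction (pseudo-code, in the notation used above):

```
-- Restructured transaction 2: the pure write shares a lock
-- with other concurrent INSERT/COPY statements.
begin;
INSERT/COPY INTO table_A;
end;

-- Separate transaction: the exclusive-lock statement runs
-- after the concurrent loads have committed.
begin;
<UPDATE/DELETE/MERGE/DDL statement> table_A;
end;
```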

# Concurrent write examples
<a name="r_Serializable_isolation_example"></a>

The following pseudo-code examples demonstrate how transactions either proceed or wait when they are run concurrently.

## Concurrent write examples with serializable isolation
<a name="r_Serializable_isolation_example-serializable"></a>

### Concurrent COPY operations into the same table with serializable isolation
<a name="r_Serializable_isolation_example-concurrent-copy-operations-into-the-same-table"></a>

Transaction 1 copies rows into the LISTING table: 

```
begin;
copy listing from ...;
end;
```

Transaction 2 starts concurrently in a separate session and attempts to copy more rows into the LISTING table. Transaction 2 must wait until transaction 1 releases the write lock on the LISTING table, then it can proceed. 

```
begin;
[waits]
copy listing from ...;
end;
```

The same behavior would occur if one or both transactions contained an INSERT command instead of a COPY command.

### Concurrent DELETE operations from the same table with serializable isolation
<a name="r_Serializable_isolation_example-concurrent-delete-operations-from-the-same-table"></a>

Transaction 1 deletes rows from a table: 

```
begin;
delete from listing where ...;
end;
```

Transaction 2 starts concurrently and attempts to delete rows from the same table. It will succeed because it waits for transaction 1 to complete before attempting to delete rows.

```
begin;
[waits]
delete from listing where ...;
end;
```

The same behavior would occur if one or both transactions contained an UPDATE command to the same table instead of a DELETE command.

### Concurrent transactions with a mixture of read and write operations with serializable isolation
<a name="r_Serializable_isolation_example-concurrent-transactions"></a>

In this example, transaction 1 deletes rows from the USERS table, reloads the table, runs a COUNT(\*) query, and then runs ANALYZE, before committing: 

```
begin;
delete one row from USERS table;
copy users from ...;
select count(*) from users;
analyze users;
end;
```

Meanwhile, transaction 2 starts. This transaction attempts to copy additional rows into the USERS table, analyze the table, and then run the same COUNT(\*) query as the first transaction:

```
begin;
[waits]
copy users from ...;
select count(*) from users;
analyze;
end;
```

The second transaction will succeed because it must wait for the first to complete. Its COUNT query will return the count based on the load it has completed.

## Concurrent write examples with snapshot isolation
<a name="r_Serializable_isolation_example-snapshot"></a>

### Concurrent COPY operations into the same table with snapshot isolation
<a name="r_Serializable_isolation_example-concurrent-copy-operations-into-the-same-table-snapshot"></a>

Transaction 1 copies rows into the LISTING table:

```
begin;
copy listing from ...;
end;
```

Transaction 2 starts concurrently in a separate session and attempts to copy more rows into the LISTING table. Transaction 2 can progress simultaneously until either transaction needs to write data to the target table `listing`, at which point they will run sequentially. 

```
begin; 
//When the COPY statement from T1 needs to write data to the table, the COPY statement from T2 waits.
copy listing from ...; 
end;
```

The same behavior would occur if one or both transactions contained an INSERT command instead of a COPY command.

### Concurrent DELETE operations from the same table with snapshot isolation
<a name="r_Serializable_isolation_example-concurrent-delete-operations-from-the-same-table-snapshot"></a>

Concurrent DELETE or UPDATE operations from the same table with snapshot isolation run the same as operations run with serializable isolation.

### Concurrent transactions with a mixture of read and write operations with snapshot isolation
<a name="r_Serializable_isolation_example-concurrent-transactions-snapshot"></a>

Concurrent transactions that are run with mixes of operations with snapshot isolation run the same as transactions with mixes of operations that are run with serializable isolation.

# Troubleshooting serializable isolation errors
<a name="c_serial_isolation-serializable-isolation-troubleshooting"></a>

## ERROR:1023 DETAIL: Serializable isolation violation on a table in Redshift
<a name="c_serial_isolation-serialization-isolation-1023"></a>

When Amazon Redshift detects a serializable isolation error, you see an error message such as the following.

```
ERROR:1023 DETAIL: Serializable isolation violation on table in Redshift
```

To address a serializable isolation error, you can try the following methods:
+ Retry the canceled transaction.

   Amazon Redshift detected that a concurrent workload is not serializable. It suggests gaps in the application logic, which can usually be worked around by retrying the transaction that encountered the error. If the issue persists, try one of the other methods. 
+ Move any operations that don't have to be in the same atomic transaction outside of the transaction.

  This method applies when individual operations inside two transactions cross-reference each other in a way that can affect the outcome of the other transaction. For example, the following two sessions each start a transaction. 

  ```
  Session1_Redshift=# begin;
  ```

  ```
  Session2_Redshift=# begin;
  ```

  The result of a SELECT statement in each transaction might be affected by an INSERT statement in the other. In other words, suppose that you run the following statements serially, in any order. In every case, the result is one of the SELECT statements returning one more row than if the transactions were run concurrently. There is no order in which the operations can run serially that produces the same result as when run concurrently. Thus, the last operation that is run results in a serializable isolation error.

  ```
  Session1_Redshift=# select * from tab1;
  Session1_Redshift=# insert into tab2 values (1);
  ```

  ```
  Session2_Redshift=# insert into tab1 values (1);
  Session2_Redshift=# select * from tab2;
  ```

  In many cases, the result of the SELECT statements isn't important. In other words, the atomicity of the operations in the transactions isn't important. In these cases, move the SELECT statements outside of their transactions, as shown in the following examples.

  ```
  Session1_Redshift=# begin;
  Session1_Redshift=# insert into tab1 values (1);
  Session1_Redshift=# end;
  Session1_Redshift=# select * from tab2;
  ```

  ```
  Session2_Redshift=# select * from tab1;
  Session2_Redshift=# begin;
  Session2_Redshift=# insert into tab2 values (1);
  Session2_Redshift=# end;
  ```

  In these examples, there are no cross-references in the transactions. The two INSERT statements don't affect each other. In these examples, there is at least one order in which the transactions can run serially and produce the same result as if run concurrently. This means that the transactions are serializable.
+ Force serialization by locking all tables in each session.

  The [LOCK](r_LOCK.md) command blocks operations that can result in serializable isolation errors. When you use the LOCK command, be sure to do the following:
  + Lock all tables affected by the transaction, including those affected by read-only SELECT statements inside the transaction.
  + Lock tables in the same order, regardless of the order that operations are performed in.
  + Lock all tables at the beginning of the transaction, before performing any operations.
+ Use snapshot isolation for concurrent transactions.

  Use an ALTER DATABASE command with snapshot isolation. For more information about the SNAPSHOT parameter for ALTER DATABASE, see [Parameters](r_ALTER_DATABASE.md#r_ALTER_DATABASE-parameters).
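
The last two methods can be sketched as follows, using the tables from the example above and a hypothetical database name:

```
-- Force serialization: every session locks the same tables, in the same
-- order, at the start of the transaction.
begin;
lock tab1, tab2;
select * from tab1;
insert into tab2 values (1);
end;

-- Or switch the database to snapshot isolation. Connect to a different
-- database first; the change may require that no other sessions are
-- connected to the target database.
alter database mydb isolation level snapshot;
```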

## ERROR:1018 DETAIL: Relation does not exist
<a name="c_serial_isolation-serialization-isolation-1018"></a>

When you run concurrent Amazon Redshift operations in different sessions, you see an error message such as the following.

```
ERROR: 1018 DETAIL: Relation does not exist.
```

Transactions in Amazon Redshift follow snapshot isolation. After a transaction begins, Amazon Redshift takes a snapshot of the database. For the entire lifecycle of the transaction, the transaction operates on the state of the database as reflected in the snapshot. If the transaction reads from a table that doesn't exist in the snapshot, it throws the 1018 error message shown previously. Even when another concurrent transaction creates a table after the transaction has taken the snapshot, the transaction can't read from the newly created table.

To address this serialization isolation error, you can try to move the start of the transaction to a point where you know the table exists.

If the table is created by another transaction, this point is at least after that transaction has been committed. Also, ensure that no concurrent transaction has been committed that might have dropped the table.

```
session1 = # BEGIN;
session1 = # DROP TABLE A;
session1 = # COMMIT;
```

```
session2 = # BEGIN;
```

```
session3 = # BEGIN;
session3 = # CREATE TABLE A (id INT);
session3 = # COMMIT;
```

```
session2 = # SELECT * FROM A;
```

The last operation, the read operation run by session2, results in a serializable isolation error. This error happens because session2 took its snapshot after table A had already been dropped by the committed session1. In other words, even though the concurrent session3 has created the table, session2 doesn't see the table because it isn't in its snapshot.

To resolve this error, you can reorder the sessions as follows.

```
session1 = # BEGIN;
session1 = # DROP TABLE A;
session1 = # COMMIT;
```

```
session3 = # BEGIN;
session3 = # CREATE TABLE A (id INT);
session3 = # COMMIT;
```

```
session2 = # BEGIN;
session2 = # SELECT * FROM A;
```

Now when session2 takes its snapshot, session3 has already been committed, and the table is in the database. Session2 can read from the table without any error.