Amazon QLDB concurrency model
Important
End of support notice: Existing customers will be able to use Amazon QLDB until end of support on 07/31/2025. For more details, see
Migrate an Amazon QLDB Ledger to Amazon Aurora PostgreSQL
Amazon QLDB is intended to address the needs of high-performance online transaction processing (OLTP) workloads. QLDB supports SQL-like query capabilities, and delivers full ACID transactions. In addition, QLDB data items are documents, delivering schema flexibility and intuitive data modeling. With a journal at the core, you can use QLDB to access the complete and verifiable history of all changes to your data, and stream coherent transactions to other data services as needed.
Topics
Optimistic concurrency control
In QLDB, concurrency control is implemented using optimistic concurrency control (OCC). OCC operates on the principle that multiple transactions can frequently complete without interfering with each other.
Using OCC, transactions in QLDB don't acquire locks on database resources and operate with full serializable isolation. QLDB runs concurrent transactions in a serial manner, such that it produces the same effect as if those transactions were started serially.
Before committing, each transaction performs a validation check to ensure that no other committed transaction has modified the data that it's accessing. If this check reveals conflicting modifications, or the state of the data changes, the committing transaction is rejected. However, the transaction can be restarted.
When a transaction writes to QLDB, the validation checks of the OCC model are
implemented by QLDB itself. If a transaction can't be written to the journal due to a
failure in the verification phase of OCC, QLDB returns an
OccConflictException
to the application layer. The application software
is responsible for ensuring that the transaction is restarted. The application should
abort the rejected transaction and then retry the whole transaction from the
start.
To learn how the QLDB driver handles and retries OCC conflicts and other transient exceptions, see Understanding retry policy with the driver in Amazon QLDB.
Using indexes to avoid full table scans
In QLDB, every PartiQL statement (including every SELECT
query) is
processed in a transaction and is subject to a transaction
timeout limit.
As a best practice, you should run statements with a WHERE
predicate
clause that filters on an indexed field or a document ID. QLDB requires an
equality operator on an indexed field to efficiently look up a
document; for example, WHERE indexedField = 123
or WHERE indexedField
IN (456, 789)
.
Without this indexed lookup, QLDB needs to do a full table scan when reading documents. This can cause query latency and transaction timeouts, and also increases the chances of an OCC conflict with competing transactions.
For example, consider a table named Vehicle
that has an index on the
VIN
field only. It contains the following documents.
VIN | Make | Model | Color |
---|---|---|---|
"1N4AL11D75C109151" |
"Audi" |
"A5" |
"Silver" |
"KM8SRDHF6EU074761" |
"Tesla" |
"Model S" |
"Blue" |
"3HGGK5G53FM761765" |
"Ducati" |
"Monster 1200" |
"Yellow" |
"1HVBBAANXWH544237" |
"Ford" |
"F 150" |
"Black" |
"1C4RJFAG0FC625797" |
"Mercedes" |
"CLK 350" |
"White" |
Two concurrent users named Alice and Bob are working with the same table in a ledger. They want to update two different documents, as follows.
Alice:
UPDATE Vehicle AS v
SET v.Color = 'Blue'
WHERE v.VIN = '1N4AL11D75C109151'
Bob:
UPDATE Vehicle AS v
SET v.Color = 'Red'
WHERE v.Make = 'Tesla' AND v.Model = 'Model S'
Suppose that Alice and Bob start their transactions at the same time. Alice's
UPDATE
statement does an indexed lookup on the VIN
field,
so it only needs to read that one document. Alice finishes and successfully commits her
transaction first.
Bob's statement filters on non-indexed fields, so it does a table scan and encounters
an OccConflictException
. This is because Alice's committed transaction
modified the data that Bob's statement is accessing, which includes every
document in the table—not only the document that Bob is updating.
Insertion OCC conflicts
OCC conflicts can include documents that are newly inserted—not only documents that previously existed. Consider the following diagram, in which two concurrent users (Alice and Bob) are working with the same table in a ledger. They both want to insert a new document only under the condition that a predicate value does not yet exist.
In this example, both Alice and Bob run the following SELECT
and
INSERT
statements within a single transaction. Their application runs
the INSERT
statement only if the SELECT
statement returns no
results.
SELECT * FROM Vehicle v WHERE v.VIN = 'ABCDE12345EXAMPLE'
INSERT INTO Vehicle VALUE
{
'VIN' : 'ABCDE12345EXAMPLE',
'Type' : 'Wagon',
'Year' : 2019,
'Make' : 'Subaru',
'Model' : 'Outback',
'Color' : 'Gray'
}
Suppose that Alice and Bob start their transactions at the same time. Both of their
SELECT
queries return no existing document with a VIN
of
ABCDE12345EXAMPLE
. So, their applications proceed with the
INSERT
statement.
Alice finishes and successfully commits her transaction first. Then, Bob tries to
commit his transaction, but QLDB rejects it and throws an
OccConflictException
. This is because Alice's committed transaction
modified the result set of Bob's SELECT
query, and OCC detects this
conflict before committing Bob's transaction.
The SELECT
query is required for this transaction example to be idempotent. Bob can then retry his whole
transaction from the start. But his next SELECT
query will return the
document that Alice inserted, so Bob's application won't run the
INSERT
.
Making transactions idempotent
The insert transaction in the previous
section is also an example of an idempotent transaction.
In other words, running the same transaction multiple times produces identical results.
If Bob runs the INSERT
without first checking if a particular
VIN
already exists, the table might end up with documents that have
duplicate VIN
values.
Consider other retry scenarios in addition to OCC conflicts. For example, it's possible that QLDB successfully commits a transaction on the server side, but the client times out while waiting for a response. As a best practice, make your write transactions idempotent to avoid any unexpected side effects in the case of concurrency or retries.
Redaction OCC conflicts
QLDB prevents concurrent redactions of
revisions on the same journal block. Consider an example where two concurrent
users (Alice and Bob) want to redact two different document revisions that are committed
on the same block in a ledger. First, Alice requests the redaction of one revision by
running the REDACT_REVISION
stored procedure, as follows.
EXEC REDACT_REVISION `{strandId:"JdxjkR9bSYB5jMHWcI464T", sequenceNo:17}`, '5PLf9SXwndd63lPaSIa0O6', 'ADR2Ll1fGsU4Jr4EqTdnQF'
Then, while Alice's request is still processing, Bob requests the redaction of another revision, as follows.
EXEC REDACT_REVISION `{strandId:"JdxjkR9bSYB5jMHWcI464T", sequenceNo:17}`, '8F0TPCmdNQ6JTRpiLj2TmW', '05K8zpGYWynDlEOK5afDRc'
QLDB rejects Bob's request with an OccConflictException
even though
they're trying to redact two different document revisions. This is because Bob's
revision is located on the same block as the revision that Alice is redacting. After
Alice's request finishes processing, Bob can then retry his redaction request.
Similarly, if two concurrent transactions try to redact the same revision, only one request can be processed. The other request fails with an OCC conflict exception until the redaction is complete. Afterwards, any requests to redact the same revision will result in an error that indicates the revision is already redacted.
Managing concurrent sessions
If you have experience using a relational database management system (RDBMS), you might be familiar with concurrent connection limits. QLDB doesn't have the same concept of a traditional RDBMS connection because transactions are run with HTTP request and response messages.
In QLDB, the analogous concept is an active session. A session is conceptually similar to a user login—it manages information about your data transaction requests to a ledger. An active session is one that is actively running a transaction. It can also be a session that recently finished a transaction where the service anticipates it will start another transaction immediately. QLDB supports one actively running transaction per session.
The limit of concurrent active sessions per ledger is defined in Quotas and limits in Amazon QLDB. After this limit is reached, any
session that tries to start a transaction will result in an error
(LimitExceededException
).
For information about the lifecycle of a session and how the QLDB driver handles sessions when running data transactions, see Session management with the driver. For best practices for configuring a session pool in your application using the QLDB driver, see Configuring the QldbDriver object in Amazon QLDB driver recommendations.