sending data
The sending data
thread state indicates that a thread is reading and filtering rows for a query to
determine the correct result set. The name is misleading because it implies the state is transferring data, not
collecting and preparing data to be sent later.
Supported engine versions
This thread state information is supported for the following versions:
-
Aurora MySQL version 2 up to 2.09.2
Context
Many thread states are short-lasting. Operations occurring during sending
data
tend to perform large numbers of disk or cache reads. Therefore,
sending data
is often the longest-running state over the lifetime of a
given query. This state appears when Aurora MySQL is doing the following:
-
Reading and processing rows for a
SELECT
statement -
Performing a large number of reads from either disk or memory
-
Completing a full read of all data from a specific query
-
Reading data from a table, an index, or the work of a stored procedure
-
Sorting, grouping, or ordering data
After the sending data
state finishes preparing the data, the thread
state writing to net
indicates the return of data to the client. Typically,
writing to net
is captured only when the result set is very large or
severe network latency is slowing the transfer.
Likely causes of increased waits
The appearance of sending data
doesn't by itself indicate a problem. If
performance is poor, and you see frequent instances of sending data
, the
most likely causes are as follows.
Inefficient query
In most cases, what's responsible for this state is a query that isn't using an appropriate index to find the result set of a specific query. For example, consider a query reading a 10 million record table for all orders placed in California, where the state column isn't indexed or is poorly indexed. In the latter case, the index might exist, but the optimizer ignores it because of low cardinality.
Suboptimal server configuration
If several queries appear in the sending data
state, the database
server might be configured poorly. Specifically, the server might have the following
issues:
-
The database server doesn't have enough computing capacity: disk I/O, disk type and speed, CPU, or number of CPUs.
-
The server is starved for allocated resources, such as the InnoDB buffer pool for InnoDB tables or the key buffer for MyIsam tables.
-
Per-thread memory settings such as
sort_buffer
,read_buffer
, andjoin_buffer
consume more RAM than required, starving the physical server for memory resources.
Actions
The general guideline is to find queries that return large numbers of rows by checking the Performance Schema. If logging queries that don't use indexes is turned on, you can also examine the results from the slow logs.
Topics
Turn on the Performance Schema if it isn't turned on
Performance Insights reports thread states only if Performance Schema instruments aren't turned on. When Performance Schema instruments are turned on, Performance Insights reports wait events instead. Performance Schema instruments provide additional insights and better tools when you investigate potential performance problems. Therefore, we recommend that you turn on the Performance Schema. For more information, see Overview of the Performance Schema for Performance Insights on Aurora MySQL.
Examine memory settings
Examine the memory settings for the primary buffer pools. Make sure that these pools are appropriately sized for the workload. If your database uses multiple buffer pool instances, make sure that they aren't divided into many small buffer pools. Threads can only use one buffer pool at a time.
Make sure that the following memory settings used for each thread are properly sized:
-
read_buffer
-
read_rnd_buffer
-
sort_buffer
-
join_buffer
-
binlog_cache
Unless you have a specific reason to modify the settings, use the default values.
Examine the explain plans for index usage
For queries in the sending data
thread state, examine the plan to
determine whether appropriate indexes are used. If a query isn't using a useful
index, consider adding hints like USE INDEX
or FORCE
INDEX
. Hints can greatly increase or decrease the time it takes to run a
query, so use care before adding them.
Check the volume of data returned
Check the tables that are being queried and the amount of data that they contain. Can any of this data be archived? In many cases, the cause of poor query execution times isn't the result of the query plan, but the volume of data to be processed. Many developers are very efficient in adding data to a database but seldom consider dataset life cycle in the design and development phases.
Look for queries that perform well in low-volume databases but perform poorly in your current system. Sometimes developers who design specific queries might not realize that these queries are returning 350,000 rows. The developers might have developed the queries in a lower-volume environment with smaller datasets than production environments have.
Check for concurrency issues
Check whether multiple queries of the same type are running at the same time. Some forms of queries run efficiently when they run alone. However, if similar forms of query run together, or in high volume, they can cause concurrency issues. Often, these issues are caused when the database uses temp tables to render results. A restrictive transaction isolation level can also cause concurrency issues.
If tables are read and written to concurrently, the database might be using locks.
To help identify periods of poor performance, examine the use of databases through
large-scale batch processes. To see recent locks and rollbacks, examine the output
of the SHOW ENGINE INNODB STATUS
command.
Check the structure of your queries
Check whether captured queries from these states use subqueries. This type of query often leads to poor performance because the database compiles the results internally and then substitutes them back into the query to render data. This process is an extra step for the database. In many cases, this step can cause poor performance in a highly concurrent loading condition.
Also check whether your queries use large numbers of ORDER BY
and
GROUP BY
clauses. In such operations, often the database must first
form the entire dataset in memory. Then it must order or group it in a specific
manner before returning it to the client.