Data types - Amazon Redshift

Data types

Each value that Amazon Redshift stores or retrieves has a data type with a fixed set of associated properties. Data types are declared when tables are created. A data type constrains the set of values that a column or argument can contain.

The following table lists the data types that you can use in Amazon Redshift tables.

Data type Aliases Description
SMALLINT INT2 Signed two-byte integer
INTEGER INT, INT4 Signed four-byte integer
BIGINT INT8 Signed eight-byte integer
DECIMAL NUMERIC Exact numeric of selectable precision
REAL FLOAT4 Single precision floating-point number
DOUBLE PRECISION FLOAT8, FLOAT Double precision floating-point number
CHAR CHARACTER, NCHAR, BPCHAR Fixed-length character string
VARCHAR CHARACTER VARYING, NVARCHAR, TEXT Variable-length character string with a user-defined limit
DATE Calendar date (year, month, day)
TIME TIME WITHOUT TIME ZONE Time of day
TIMETZ TIME WITH TIME ZONE Time of day with time zone
TIMESTAMP TIMESTAMP WITHOUT TIME ZONE Date and time (without time zone)
TIMESTAMPTZ TIMESTAMP WITH TIME ZONE Date and time (with time zone)
INTERVAL YEAR TO MONTH Time duration in year to month order
INTERVAL DAY TO SECOND Time duration in day to second order
BOOLEAN BOOL Logical Boolean (true/false)
HLLSKETCH Type used with HyperLogLog sketches.
SUPER A superset data type that encompasses all scalar types of Amazon Redshift including complex types such as ARRAY and STRUCTS.
VARBYTE VARBINARY, BINARY VARYING Variable-length binary value
GEOMETRY Spatial data
GEOGRAPHY Spatial data
Note

For information about unsupported data types, such as "char" (notice that char is enclosed in quotation marks), see Unsupported PostgreSQL data types.

Multibyte characters

The VARCHAR data type supports UTF-8 multibyte characters up to a maximum of four bytes. Five-byte or longer characters are not supported. To calculate the size of a VARCHAR column that contains multibyte characters, multiply the number of characters by the number of bytes per character. For example, if a string has four Chinese characters, and each character is three bytes long, then you will need a VARCHAR(12) column to store the string.

The VARCHAR data type doesn't support the following invalid UTF-8 codepoints:

0xD800 – 0xDFFF (Byte sequences: ED A0 80ED BF BF)

The CHAR data type doesn't support multibyte characters.

Type compatibility and conversion

Following, you can find a discussion about how type conversion rules and data type compatibility work in Amazon Redshift.

Compatibility

Data type matching and matching of literal values and constants to data types occurs during various database operations, including the following:

  • Data manipulation language (DML) operations on tables

  • UNION, INTERSECT, and EXCEPT queries

  • CASE expressions

  • Evaluation of predicates, such as LIKE and IN

  • Evaluation of SQL functions that do comparisons or extractions of data

  • Comparisons with mathematical operators

The results of these operations depend on type conversion rules and data type compatibility. Compatibility implies that a one-to-one matching of a certain value and a certain data type is not always required. Because some data types are compatible, an implicit conversion, or coercion, is possible (for more information, see Implicit conversion types). When data types are incompatible, you can sometimes convert a value from one data type to another by using an explicit conversion function.

General compatibility and conversion rules

Note the following compatibility and conversion rules:

  • In general, data types that fall into the same type category (such as different numeric data types) are compatible and can be implicitly converted.

    For example, with implicit conversion you can insert a decimal value into an integer column. The decimal is rounded to produce a whole number. Or you can extract a numeric value, such as 2008, from a date and insert that value into an integer column.

  • Numeric data types enforce overflow conditions that occur when you attempt to insert out-of-range values. For example, a decimal value with a precision of 5 does not fit into a decimal column that was defined with a precision of 4. An integer or the whole part of a decimal is never truncated; however, the fractional part of a decimal can be rounded up or down, as appropriate. However, results of explicit casts of values selected from tables are not rounded.

  • Different types of character strings are compatible; VARCHAR column strings containing single-byte data and CHAR column strings are comparable and implicitly convertible. VARCHAR strings that contain multibyte data are not comparable. Also, you can convert a character string to a date, time, timestamp, or numeric value if the string is an appropriate literal value; any leading or trailing spaces are ignored. Conversely, you can convert a date, time, timestamp, or numeric value to a fixed-length or variable-length character string.

    Note

    A character string that you want to cast to a numeric type must contain a character representation of a number. For example, you can cast the strings '1.0' or '5.9' to decimal values, but you cannot cast the string 'ABC' to any numeric type.

  • If you compare DECIMAL values with character strings, Amazon Redshift attempts to convert the character string to a DECIMAL value. When comparing all other numeric values with character strings, the numeric values are converted to character strings. To enforce the opposite conversion (for example, converting character strings to integers, or converting DECIMAL values to character strings), use an explicit function, such as CAST.

  • To convert 64-bit DECIMAL or NUMERIC values to a higher precision, you must use an explicit conversion function such as the CAST or CONVERT functions.

  • When converting DATE or TIMESTAMP to TIMESTAMPTZ, or converting TIME to TIMETZ, the time zone is set to the current session time zone. The session time zone is UTC by default. For more information about setting the session time zone, see timezone.

  • Similarly, TIMESTAMPTZ is converted to DATE, TIME, or TIMESTAMP based on the current session time zone. The session time zone is UTC by default. After the conversion, time zone information is dropped.

  • Character strings that represent a timestamp with time zone specified are converted to TIMESTAMPTZ using the current session time zone, which is UTC by default. Likewise, character strings that represent a time with time zone specified are converted to TIMETZ using the current session time zone, which is UTC by default.

Implicit conversion types

There are two types of implicit conversions:

  • Implicit conversions in assignments, such as setting values in INSERT or UPDATE commands.

  • Implicit conversions in expressions, such as performing comparisons in the WHERE clause.

The table following lists the data types that can be converted implicitly in assignments or expressions. You can also use an explicit conversion function to perform these conversions.

From type To type
BIGINT (INT8) BOOLEAN
CHAR
DECIMAL (NUMERIC)
DOUBLE PRECISION (FLOAT8)
INTEGER (INT, INT4)
REAL (FLOAT4)
SMALLINT (INT2)
VARCHAR
CHAR VARCHAR
DATE CHAR
VARCHAR
TIMESTAMP
TIMESTAMPTZ
DECIMAL (NUMERIC) BIGINT (INT8)
CHAR
DOUBLE PRECISION (FLOAT8)
INTEGER (INT, INT4)
REAL (FLOAT4)
SMALLINT (INT2)
VARCHAR
DOUBLE PRECISION (FLOAT8) BIGINT (INT8)
CHAR
DECIMAL (NUMERIC)
INTEGER (INT, INT4)
REAL (FLOAT4)
SMALLINT (INT2)
VARCHAR
INTEGER (INT, INT4) BIGINT (INT8)
BOOLEAN
CHAR
DECIMAL (NUMERIC)
DOUBLE PRECISION (FLOAT8)
REAL (FLOAT4)
SMALLINT (INT2)
VARCHAR
REAL (FLOAT4) BIGINT (INT8)
CHAR
DECIMAL (NUMERIC)
INTEGER (INT, INT4)
SMALLINT (INT2)
VARCHAR
SMALLINT (INT2) BIGINT (INT8)
BOOLEAN
CHAR
DECIMAL (NUMERIC)
DOUBLE PRECISION (FLOAT8)
INTEGER (INT, INT4)
REAL (FLOAT4)
VARCHAR
TIMESTAMP CHAR
DATE
VARCHAR
TIMESTAMPTZ
TIME
TIMESTAMPTZ CHAR
DATE
VARCHAR
TIMESTAMP
TIMETZ
TIME VARCHAR
TIMETZ
INTERVAL DAY TO SECOND
TIMETZ VARCHAR
TIME
GEOMETRY GEOGRAPHY
GEOGRAPHY GEOMETRY
Note

Implicit conversions between TIMESTAMPTZ, TIMESTAMP, DATE, TIME, TIMETZ, or character strings use the current session time zone. For information about setting the current time zone, see timezone.

The GEOMETRY and GEOGRAPHY data types can't be implicitly converted to any other data type, except each other. For more information, see CAST function.

The VARBYTE data type can't be implicitly converted to any other data type. For more information, see CAST function.

Using dynamic typing for the SUPER data type

Amazon Redshift uses dynamic typing to process schemaless SUPER data without the need to declare the data types before you use them in your query. Dynamic typing uses the results of navigating into SUPER data columns without having to explicitly cast them into Amazon Redshift types. For more information about using dynamic typing for SUPER data type, see Dynamic typing.

You can cast SUPER values to and from other data types with some exceptions. For more information, see Limitations.