You can use Athena to query existing views in your external Apache Hive metastores. Athena translates your views for you on-the-fly at runtime without changing the original view or storing the translation.
For example, suppose you have a Hive view like the following that uses a syntax not
supported in Athena like LATERAL VIEW explode()
:
CREATE VIEW team_view AS
SELECT team, score
FROM matches
LATERAL VIEW explode(scores) m AS score
Athena translates the Hive view query string into a statement like the following that Athena can run:
SELECT team, score
FROM matches
CROSS JOIN UNNEST(scores) AS m (score)
For information about connecting an external Hive metastore to Athena, see Use an external Hive metastore.
Considerations and
limitations
When querying Hive views from Athena, consider the following points:
-
Athena does not support creating Hive views. You can create Hive views in your external Hive metastore, which you can then query from Athena.
-
Athena does not support custom UDFs for Hive views.
-
Due to a known issue in the Athena console, Hive views appear under the list of tables instead of the list of views.
-
Although the translation process is automatic, certain Hive functions are not supported for Hive views or require special handling. For more information, see the following section.
Hive function support
limitations
This section highlights the Hive functions that Athena does not support for Hive views
or that require special treatment. Currently, because Athena primarily supports functions
from Hive 2.2.0, functions that are available only in higher versions (such as Hive
4.0.0) are not available. For a full list of Hive functions, see Hive
language manual UDF
Aggregate functions
Aggregate functions that require special handling
The following aggregate function for Hive views requires special handling.
-
Avg – Instead of
avg(INT i)
, useavg(CAST(i AS DOUBLE))
.
Aggregate functions not supported
The following Hive aggregate functions are not supported in Athena for Hive views.
covar_pop histogram_numeric ntile percentile percentile_approx
Regression functions like regr_count
, regr_r2
, and
regr_sxx
are not supported in Athena for Hive views.
Date functions not supported
The following Hive date functions are not supported in Athena for Hive views.
date_format(date/timestamp/string ts, string fmt) day(string date) dayofmonth(date) extract(field FROM source) hour(string date) minute(string date) month(string date) quarter(date/timestamp/string) second(string date) weekofyear(string date) year(string date)
Masking functions not supported
Hive masking functions like mask()
, and mask_first_n()
are not supported in Athena for Hive views.
Miscellaneous functions
Miscellaneous functions that require special handling
The following miscellaneous functions for Hive views require special handling.
-
md5 – Athena supports
md5(binary)
but notmd5(varchar)
. -
Explode – Athena supports
explode
when it is used in the following syntax:LATERAL VIEW [OUTER] EXPLODE(
<argument>
) -
Posexplode – Athena supports
posexplode
when it is used in the following syntax:LATERAL VIEW [OUTER] POSEXPLODE(
<argument>
)In the
(pos, val)
output, Athena treats thepos
column asBIGINT
. Because of this, you may need to cast thepos
column toBIGINT
to avoid a stale view. The following example illustrates this technique.SELECT CAST(c AS BIGINT) AS c_bigint, d FROM table LATERAL VIEW POSEXPLODE(
<argument>
) t AS c, d
Miscellaneous functions not supported
The following Hive functions are not supported in Athena for Hive views.
aes_decrypt aes_encrypt current_database current_user inline java_method logged_in_user reflect sha/sha1/sha2 stack version
Operators
Operators that require special handling
The following operators for Hive views require special handling.
-
Mod operator (%) – Because the
DOUBLE
type implicitly casts toDECIMAL(x,y)
, the following syntax can cause aView is stale
error message:a_double % 1.0 AS column
To work around this issue, use
CAST
, as in the following example.CAST(a_double % 1.0 as DOUBLE) AS column
-
Division operator (/) – In Hive,
int
divided byint
produces adouble
. In Athena, the same operation produces a truncatedint
.
Operators not supported
Athena does not support the following operators for Hive views.
~A – bitwise NOT
A ^ b – bitwise XOR
A & b – bitwise
AND
A | b – bitwise OR
A <=> b – Returns same result as
the equals (=
) operator for non-null operands. Returns
TRUE
if both are NULL
, FALSE
if one
of them is NULL
.
String functions
String functions that require special handling
The following Hive string functions for Hive views require special handling.
-
chr(bigint|double a) – Hive allows negative arguments; Athena does not.
-
instr(string str, string substr) – Because the Athena mapping for the
instr
function returnsBIGINT
instead ofINT
, use the following syntax:CAST(instr(string str, string substr) as INT)
Without this step, the view will be considered stale.
-
length(string a) – Because the Athena mapping for the
length
function returnsBIGINT
instead ofINT
, use the following syntax so that the view will not be considered stale:CAST(length(string str) as INT)
String functions not supported
The following Hive string functions are not supported in Athena for Hive views.
ascii(string str) character_length(string str) decode(binary bin, string charset) encode(string src, string charset) elt(N int,str1 string,str2 string,str3 string,...) field(val T,val1 T,val2 T,val3 T,...) find_in_set(string str, string strList) initcap(string A) levenshtein(string A, string B) locate(string substr, string str[, int pos]) octet_length(string str) parse_url(string urlString, string partToExtract [, string keyToExtract]) printf(String format, Obj... args) quote(String text) regexp_extract(string subject, string pattern, int index) repeat(string str, int n) sentences(string str, string lang, string locale) soundex(string A) space(int n) str_to_map(text[, delimiter1, delimiter2]) substring_index(string A, string delim, int count)
XPath functions not supported
Hive XPath functions like xpath
, xpath_short
, and
xpath_int
are not supported in Athena for Hive views.
Troubleshooting
When you use Hive views in Athena, you may encounter the following issues:
-
View
<view name>
is stale – This message usually indicates a type mismatch between the view in Hive and Athena. If the same function in the Hive LanguageManual UDFand Presto functions and operators documentation has different signatures, try casting the mismatched data type. -
Function not registered – Athena does not currently support the function. For details, see the information earlier in this document.