Release 0.57#

Distinct aggregations#

The DISTINCT argument qualifier for aggregation functions is now fully supported. For example:

SELECT country, count(DISTINCT city), count(DISTINCT age)
FROM users
GROUP BY country

Примечание

approx_distinct() should be used in preference to this whenever an approximate answer is allowable as it is substantially faster and does not have any limits on the number of distinct items it can process. COUNT(DISTINCT ...) must transfer every item over the network and keep each distinct item in memory.

Hadoop 2.x#

Use the hive-hadoop2 connector to read Hive data from Hadoop 2.x.

Amazon S3#

All Hive connectors support reading data from Amazon S3. This requires two additional catalog properties for the Hive connector to specify your AWS Access Key ID and Secret Access Key:

hive.s3.aws-access-key=AKIAIOSFODNN7EXAMPLE
hive.s3.aws-secret-key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

Miscellaneous#

  • Allow specifying catalog and schema in the JDBC драйвер URL.

  • Implement more functionality in the JDBC driver.

  • Allow certain custom InputFormats to work by propagating Hive serialization properties to the RecordReader.

  • Many execution engine performance improvements.

  • Fix optimizer performance regression.

  • Fix weird MethodHandle exception.