Release 405 (28 Dec 2022)#

General#

  • Add Trino version to the output of EXPLAIN. (#15317)

  • Add task input/output size distribution to the output of EXPLAIN ANALYZE VERBOSE. (#15286)

  • Add stage skewness warnings to the output of EXPLAIN ANALYZE. (#15286)

  • Add support for the ALTER COLUMN ... SET DATA TYPE statement, as shown in the example after this list. (#11608)

  • Allow configuring a refresh interval for the database resource group manager with the resource-groups.refresh-interval configuration property. (#14514)

  • Improve performance of queries that compare date columns with timestamp(n) with time zone literals. (#5798)

  • Improve performance and resource utilization when inserting into tables. (#14718, #14874)

  • Improve performance for INSERT queries when fault-tolerant execution is enabled. (#14735)

  • Improve planning performance for queries with many GROUP BY clauses. (#15292)

  • Improve query performance for large clusters and skewed queries. (#15369)

  • Rename the node-scheduler.max-pending-splits-per-task configuration property to node-scheduler.min-pending-splits-per-task. (#15168)

  • Ensure that the configured number of task retries is not larger than 126. (#14459)

  • Fix incorrect rounding of time(n) and time(n) with time zone values near the top of the range of allowed values. (#15138)

  • Fix incorrect results for queries involving window functions without a PARTITION BY clause followed by the evaluation of window functions with a PARTITION BY and ORDER BY clause. (#15203)

  • Fix incorrect results when adding or subtracting an interval from a timestamp with time zone. (#15103)

  • Fix potential incorrect results when joining tables on indexed and non-indexed columns at the same time. (#15334)

  • Fix potential failure of queries involving MATCH_RECOGNIZE. (#15343)

  • Fix incorrect reporting of Projection CPU time in the output of EXPLAIN ANALYZE VERBOSE. (#15364)

  • Fix SET TIME ZONE LOCAL to correctly reset to the initial time zone of the client session. (#15314)
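
  For example, the new column syntax changes a column's declared type in place, and SET TIME ZONE LOCAL resets the session time zone; the table and column names below are hypothetical:

      ALTER TABLE orders ALTER COLUMN price SET DATA TYPE decimal(12, 2);

      SET TIME ZONE LOCAL;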

Security#

  • Add support for string replacement as part of impersonation rules, as sketched in the example after this list. (#14962)

  • Add support for fetching access control rules via HTTPS. (#14008)

  • Fix some system.metadata tables improperly showing the names of catalogs which the user cannot access. (#14000)

  • Fix USE statement improperly disclosing the names of catalogs and schemas which the user cannot access. (#14208)

  • Fix improper HTTP redirect after OAuth 2.0 token refresh. (#15336)
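
  A minimal sketch of an impersonation rule using string replacement in a file-based access control configuration; the rule shape follows the documented original_user/new_user fields, and the user names are hypothetical:

      {
        "impersonation": [
          {
            "original_user": "accounting_(.*)",
            "new_user": "$1",
            "allow": true
          }
        ]
      }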

Web UI#

  • Display operator CPU time in the "Stage Performance" tab. (#15339)

JDBC driver#

  • Return correct values in NULLABLE columns of the DatabaseMetaData.getColumns result. (#15214)

BigQuery connector#

  • Improve read performance with experimental support for Apache Arrow serialization when reading from BigQuery. This can be enabled with the bigquery.experimental.arrow-serialization.enabled catalog configuration property, as shown in the example after this list. (#14972)

  • Fix queries incorrectly executing with the project ID specified in the credentials instead of the project ID specified in the bigquery.project-id catalog property. (#14083)
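
  A sketch of a catalog properties file with the experimental Arrow serialization enabled; only the property named in the first item above comes from this release, the rest is a standard catalog file:

      connector.name=bigquery
      bigquery.experimental.arrow-serialization.enabled=true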

Delta Lake connector#

  • Add support for views. (#11609)

  • Add support for configuring batch size for reads on Parquet files using the parquet.max-read-block-row-count configuration property or the parquet_max_read_block_row_count session property. (#15474)

  • Improve performance and reduce storage requirements when running the vacuum procedure on S3-compatible storage. (#15072)

  • Improve memory accounting for INSERT, MERGE, and CREATE TABLE ... AS SELECT queries. (#14407)

  • Improve performance of reading Parquet files for boolean, tinyint, short, int, long, float, double, short decimal, UUID, time, decimal, varchar, and char data types. This optimization can be disabled with the parquet.optimized-reader.enabled catalog configuration property. (#14423, #14667)

  • Improve query performance when the nulls fraction statistic is not available for some columns. (#15132)

  • Improve performance when reading Parquet files. (#15257, #15474)

  • Improve performance of reading Parquet files for queries with filters. (#15268)

  • Improve DROP TABLE performance for tables stored on AWS S3. (#13974)

  • Improve performance of reading Parquet files for timestamp and timestamp with time zone data types. (#15204)

  • Improve performance of queries that read a small number of columns and queries that process tables with large Parquet row groups or ORC stripes. (#15168)

  • Improve stability and reduce peak memory requirements when reading from Parquet files. (#15374)

  • Allow registering existing table files in the metastore with the new register_table procedure, as shown in the example after this list. (#13568)

  • Deprecate creating a new table with existing table content. This can be re-enabled using the delta.legacy-create-table-with-existing-location.enabled configuration property or the legacy_create_table_with_existing_location_enabled session property. (#13568)

  • Fix query failure when reading Parquet files with large row groups. (#5729)

  • Fix DROP TABLE leaving files behind when using managed tables stored on S3 and created by the Databricks runtime. (#13017)

  • Fix query failure when the path contains special characters. (#15183)

  • Fix potential INSERT failure for tables stored on S3. (#15476)
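
  A minimal sketch of the new register_table procedure, assuming a catalog named delta; the schema, table name, and location are hypothetical:

      CALL delta.system.register_table(
          schema_name => 'default',
          table_name => 'event_data',
          table_location => 's3://my-bucket/path/to/table');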

Google Sheets connector#

  • Add support for setting a read timeout with the gsheets.read-timeout configuration property. (#15322)

  • Add support for base64-encoded credentials using the gsheets.credentials-key configuration property. (#15477)

  • Rename the credentials-path configuration property to gsheets.credentials-path, metadata-sheet-id to gsheets.metadata-sheet-id, sheets-data-max-cache-size to gsheets.max-data-cache-size, and sheets-data-expire-after-write to gsheets.data-cache-ttl. See the example after this list. (#15042)
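
  A sketch of a catalog properties file using the renamed properties and the new read timeout; all values are placeholders:

      connector.name=gsheets
      gsheets.credentials-path=/path/to/credentials.json
      gsheets.metadata-sheet-id=exampleSheetId
      gsheets.read-timeout=1m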

Hive connector#

  • Add support for referencing nested fields in columns with the UNIONTYPE Hive type. (#15278)

  • Add support for configuring batch size for reads on Parquet files using the parquet.max-read-block-row-count configuration property or the parquet_max_read_block_row_count session property, as shown in the example after this list. (#15474)

  • Improve memory accounting for INSERT, MERGE, and CREATE TABLE ... AS SELECT queries. (#14407)

  • Improve performance of reading Parquet files for boolean, tinyint, short, int, long, float, double, short decimal, UUID, time, decimal, varchar, and char data types. This optimization can be disabled with the parquet.optimized-reader.enabled catalog configuration property. (#14423, #14667)

  • Improve performance for queries which write data into multiple partitions. (#15241, #15066)

  • Improve performance when reading Parquet files. (#15257, #15474)

  • Improve performance of reading Parquet files for queries with filters. (#15268)

  • Improve DROP TABLE performance for tables stored on AWS S3. (#13974)

  • Improve performance of reading Parquet files for timestamp and timestamp with time zone data types. (#15204)

  • Improve performance of queries that read a small number of columns and queries that process tables with large Parquet row groups or ORC stripes. (#15168)

  • Improve stability and reduce peak memory requirements when reading from Parquet files. (#15374)

  • Disallow creating transactional tables when not using the Hive metastore. (#14673)

  • Fix query failure when reading Parquet files with large row groups. (#5729)

  • Fix incorrect "schema already exists" error caused by a client timeout when creating a new schema. (#15174)

  • Fix failure when an access denied exception happens while listing tables or views in a Glue metastore. (#14746)

  • Fix INSERT failure on ORC ACID tables when Apache Hive 3.1.2 is used as a metastore. (#7310)

  • Fix failure when reading Hive views with char types. (#15470)

  • Fix potential INSERT failure for tables stored on S3. (#15476)
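
  The new Parquet batch size can be adjusted per query with the session property named above, as in this sketch; the catalog name hive and the value shown are illustrative:

      SET SESSION hive.parquet_max_read_block_row_count = 1024;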

Hudi connector#

  • Improve performance of reading Parquet files for boolean, tinyint, short, int, long, float, double, short decimal, UUID, time, decimal, varchar, and char data types. This optimization can be disabled with the parquet.optimized-reader.enabled catalog configuration property. (#14423, #14667)

  • Improve performance of reading Parquet files for queries with filters. (#15268)

  • Improve performance of reading Parquet files for timestamp and timestamp with time zone data types. (#15204)

  • Improve performance of queries that read a small number of columns and queries that process tables with large Parquet row groups or ORC stripes. (#15168)

  • Improve stability and reduce peak memory requirements when reading from Parquet files. (#15374)

  • Fix query failure when reading Parquet files with large row groups. (#5729)

Iceberg connector#

  • Add support for configuring batch size for reads on Parquet files using the parquet.max-read-block-row-count configuration property or the parquet_max_read_block_row_count session property. (#15474)

  • Add support for the Iceberg REST catalog, as sketched in the example after this list. (#13294)

  • Improve memory accounting for INSERT, MERGE, and CREATE TABLE ... AS SELECT queries. (#14407)

  • Improve performance of reading Parquet files for boolean, tinyint, short, int, long, float, double, short decimal, UUID, time, decimal, varchar, and char data types. This optimization can be disabled with the parquet.optimized-reader.enabled catalog configuration property. (#14423, #14667)

  • Improve performance when reading Parquet files. (#15257, #15474)

  • Improve performance of reading Parquet files for queries with filters. (#15268)

  • Improve DROP TABLE performance for tables stored on AWS S3. (#13974)

  • Improve performance of reading Parquet files for timestamp and timestamp with time zone data types. (#15204)

  • Improve performance of queries that read a small number of columns and queries that process tables with large Parquet row groups or ORC stripes. (#15168)

  • Improve stability and reduce peak memory requirements when reading from Parquet files. (#15374)

  • Fix incorrect results when predicates over row columns on Parquet files are pushed into the connector. (#15408)

  • Fix query failure when reading Parquet files with large row groups. (#5729)

  • Fix REFRESH MATERIALIZED VIEW failure when the materialized view is based on non-Iceberg tables. (#13131)

  • Fix failure when an access denied exception happens while listing tables or views in a Glue metastore. (#14971)

  • Fix potential INSERT failure for tables stored on S3. (#15476)
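
  A minimal sketch of a catalog properties file using the new REST catalog support; the iceberg.rest-catalog.uri property name and the server address are assumptions for illustration:

      connector.name=iceberg
      iceberg.catalog.type=rest
      iceberg.rest-catalog.uri=http://example.net:8181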

MongoDB connector#

  • Add support for fault-tolerant execution. (#15062)

  • Add support for setting a file path and password for the truststore and keystore, as sketched in the example after this list. (#15240)

  • Add support for case-insensitive name-matching in the query table function. (#15329)

  • Rename the mongodb.ssl.enabled configuration property to mongodb.tls.enabled. (#15240)

  • Upgrade minimum required MongoDB version to 4.2. (#15062)

  • Delete a MongoDB field from collections when dropping a column. Previously, the connector deleted only metadata. (#15226)

  • Remove deprecated mongodb.seeds and mongodb.credentials configuration properties. (#15263)

  • Fix failure when an unauthorized exception happens while listing schemas or tables. (#1398)

  • Fix NullPointerException when a column name contains uppercase characters in the query table function. (#15294)

  • Fix potential incorrect results when the objectid function is used more than once within a single query. (#15426)
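
  A sketch of the new TLS configuration, assuming the truststore properties follow the mongodb.tls prefix used by the renamed property; the path and password are placeholders:

      mongodb.tls.enabled=true
      mongodb.tls.truststore-path=/etc/trino/mongodb-truststore.jks
      mongodb.tls.truststore-password=changeit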

MySQL connector#

  • Fix failure when the query table function contains a WITH clause, as in the example below. (#15332)
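
  For example, a query table function invocation containing a WITH clause, which previously failed; the catalog name mysql is hypothetical:

      SELECT *
      FROM TABLE(
          mysql.system.query(
              query => 'WITH t AS (SELECT 1 AS x) SELECT x FROM t'));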

PostgreSQL connector#

  • Fix query failure when a FULL JOIN is pushed down. (#14841)

Redshift connector#

  • Add support for aggregation, join, and ORDER BY ... LIMIT pushdown. (#15365)

  • Add support for DELETE. (#15365)

  • Add schema, table, and column name length checks. (#15365)

  • Add full type mapping for Redshift types. The previous behavior can be restored via the redshift.use-legacy-type-mapping configuration property, as shown in the example after this list. (#15365)
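
  To restore the previous type mapping, the catalog properties file can set the property named above, as in this sketch:

      connector.name=redshift
      redshift.use-legacy-type-mapping=true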

SPI#

  • Remove deprecated ConnectorNodePartitioningProvider.getBucketNodeMap() method. (#14067)

  • Use the MERGE APIs in the engine to execute DELETE and UPDATE. Require connectors to implement beginMerge() and related APIs. Deprecate beginDelete(), beginUpdate(), and UpdatablePageSource, which are unused and no longer need to be implemented. (#13926)