Object storage

Note

The following is the original Trino documentation. A Russian translation with additional practical examples is planned.

Object storage systems are commonly used to create data lakes or data lakehouses. These systems provide methods to store objects in a structured manner and means to access them, for example using an API over HTTP. The objects are files in various formats, including ORC, Parquet, and others. Object storage systems are available as a service from public cloud providers and other vendors, or can be self-hosted using commercial as well as open source offerings.

Object storage connectors

Trino accesses files directly on object storage and remote file system storage. The following connectors use this direct approach to read and write data files: Delta Lake, Hive, Hudi, and Iceberg.

The connectors all support a variety of protocols and formats used on these object storage systems, and have separate requirements for metadata availability.

Configuration

By default, no file system support is activated for your catalog. You must activate and configure one of the following properties to select the file system support for the catalog. Each catalog can use only one file system implementation.

File system support properties

| Property | Description |
| --- | --- |
| fs.native-azure.enabled | Activate the native implementation for Azure Storage support. Defaults to false. |
| fs.native-gcs.enabled | Activate the native implementation for Google Cloud Storage support. Defaults to false. |
| fs.native-s3.enabled | Activate the native implementation for S3 storage support. Defaults to false. |
| fs.hadoop.enabled | Activate support for HDFS and legacy support for other file systems using the HDFS libraries. Defaults to false. |
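As an illustration of how one of these properties is set, the following is a minimal sketch of a catalog properties file that activates the native S3 implementation. The catalog file name, the metastore address, and the region are placeholders chosen for this example, and the choice of the Hive connector is an assumption; any of the object storage connectors could be used instead.

```properties
# etc/catalog/example.properties (hypothetical catalog file name)
connector.name=hive
# Placeholder metastore address; replace with your own.
hive.metastore.uri=thrift://metastore.example.com:9083

# Activate exactly one file system implementation for this catalog.
fs.native-s3.enabled=true
# Placeholder region for the S3 file system.
s3.region=us-east-1
```

Only one of the fs.*.enabled properties is set here, in line with the rule that each catalog uses a single file system implementation.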

Native file system support

Trino includes optimized implementations to access the following systems, and compatible replacements: Azure Storage, Google Cloud Storage, and S3.

The native support is available in all four connectors, but must be activated for use.

Legacy file system support

The legacy support uses libraries that originate from the Hadoop ecosystem and, like the native implementations, must be activated explicitly with fs.hadoop.enabled. It should only be used for accessing the Hadoop Distributed File System (HDFS).
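A minimal sketch of a catalog that activates the legacy support for HDFS might look as follows. The catalog file name, the metastore address, and the Hadoop configuration paths are placeholders for this example; hive.config.resources is optional and only needed when the cluster requires explicit Hadoop configuration files.

```properties
# etc/catalog/example_hdfs.properties (hypothetical catalog file name)
connector.name=hive
# Placeholder metastore address; replace with your own.
hive.metastore.uri=thrift://metastore.example.com:9083

# Activate the legacy Hadoop-based support, intended for HDFS only.
fs.hadoop.enabled=true
# Optional: Hadoop configuration files; these paths are placeholders.
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
```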

All four connectors can use the related hive.* properties to access other object storage systems as legacy support. Additional documentation is available with the Hive connector and on the relevant dedicated pages.

Other object storage support

Trino also provides the following additional support and features for object storage: