Query management properties#

Note

The original Trino documentation is reproduced below. A Russian translation with practical examples is planned.

query.client.timeout#

Configures how long the cluster runs without contact from the client application, such as the CLI, before it abandons and cancels its work.
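As a minimal sketch, the property could be set in the coordinator's etc/config.properties. The file location assumes a standard Trino installation layout, and the 10m value is purely illustrative, not a recommendation.

```properties
# Illustrative value: abandon queries whose client has been silent for 10 minutes
query.client.timeout=10m
```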

query.execution-policy#

  • Type: string

  • Default value: phased

  • Session property: execution_policy

Configures the algorithm to organize the processing of all of the stages of a query. You can use the following execution policies:

  • phased schedules stages in a sequence to avoid blockages because of inter-stage dependencies. This policy maximizes cluster resource utilization and provides the lowest query wall time.

  • all-at-once schedules all of the stages of a query at one time. As a result, cluster resource utilization is initially high, but inter-stage dependencies typically prevent full processing and cause longer queue times, which increases the overall query wall time.

  • legacy-phased has similar functionality to phased, but can increase the query wall time as it attempts to minimize the number of running stages.
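For illustration, the policy can be chosen cluster-wide in etc/config.properties (assuming a standard Trino layout). The same choice can be made for a single session through the execution_policy session property, for example with SET SESSION execution_policy = 'all-at-once'.

```properties
# Cluster-wide setting; phased is already the documented default,
# so this line only makes the choice explicit
query.execution-policy=phased
```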

query.hash-partition-count#

  • Type: integer

  • Default value: 100

  • Session property: hash_partition_count

The number of partitions to use for processing distributed operations, such as joins, aggregations, partitioned window functions and others.
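A hedged example of overriding the default. The value 200 is an arbitrary illustration; a suitable number depends on cluster size and workload and should be validated by measurement.

```properties
# Illustrative override of the documented default of 100
query.hash-partition-count=200
```

A single session can be adjusted instead through the hash_partition_count session property.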

query.low-memory-killer.policy#

  • Type: string

  • Default value: total-reservation-on-blocked-nodes

Configures the behavior to handle killing running queries in the event of low memory availability. Supports the following values:

  • none - Do not kill any queries in the event of low memory.

  • total-reservation - Kill the query currently using the most total memory.

  • total-reservation-on-blocked-nodes - Kill the query currently using the most memory specifically on nodes that are now out of memory.

Note

Only applies to queries with task-level retries disabled (retry-policy set to NONE or QUERY).
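A minimal sketch of a configuration in which the query-level killer is in effect. Choosing total-reservation here is only an example of a non-default value.

```properties
# Task-level retries are disabled, so query.low-memory-killer.policy applies
retry-policy=NONE
# Kill the query that currently uses the most total memory
query.low-memory-killer.policy=total-reservation
```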

task.low-memory-killer.policy#

  • Type: string

  • Default value: total-reservation-on-blocked-nodes

Configures the behavior to handle killing running tasks in the event of low memory availability. Supports the following values:

  • none - Do not kill any tasks in the event of low memory.

  • total-reservation-on-blocked-nodes - Kill the tasks that belong to queries with task retries enabled and that currently use the most memory on nodes that are out of memory.

  • least-waste - Kill the tasks that belong to queries with task retries enabled and that use a significant amount of memory on nodes that are out of memory. This policy avoids killing tasks that have already been running for a long time, so that a significant amount of work is not wasted.

Note

Only applies to queries with task-level retries enabled (retry-policy=TASK).
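The task-level counterpart could look like the sketch below. It assumes an exchange manager is already configured, because retry-policy=TASK requires one.

```properties
# Task-level retries are enabled, so task.low-memory-killer.policy applies
retry-policy=TASK
# Prefer killing tasks whose termination wastes the least completed work
task.low-memory-killer.policy=least-waste
```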

query.low-memory-killer.delay#

The amount of time a query is allowed to recover between running out of memory and being killed, if query.low-memory-killer.policy or task.low-memory-killer.policy is set to a value other than none.
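As an illustration, the delay is paired with a killer policy other than none. The 2m grace period is an arbitrary example value.

```properties
# The delay has an effect only when a killer policy other than none is active
query.low-memory-killer.policy=total-reservation-on-blocked-nodes
# Illustrative grace period between running out of memory and the kill
query.low-memory-killer.delay=2m
```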

query.max-execution-time#

  • Type: duration

  • Default value: 100d

  • Session property: query_max_execution_time

The maximum allowed time for a query to be actively executing on the cluster before it is terminated. Unlike the run time limited by query.max-run-time, execution time does not include analysis, query planning, or wait times in a queue.

query.max-length#

  • Type: integer

  • Default value: 1,000,000

  • Maximum value: 1,000,000,000

The maximum number of characters allowed for the SQL query text. Longer queries are not processed and are terminated with the error QUERY_TEXT_TOO_LARGE.
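A short sketch of raising the limit. The number is illustrative and is written without thousands separators, since the commas shown in the values above are for readability only.

```properties
# Allow query text of up to two million characters (illustrative value)
query.max-length=2000000
```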

query.max-planning-time#

  • Type: duration

  • Default value: 10m

  • Session property: query_max_planning_time

The maximum allowed time for a query to be actively planning the execution. After this period, the coordinator makes its best effort to stop the query. Note that some operations in the planning phase are not easily cancellable and may not terminate immediately.

query.max-run-time#

  • Type: duration

  • Default value: 100d

  • Session property: query_max_run_time

The maximum allowed time for a query to be processed on the cluster before it is terminated. This includes time for analysis and planning as well as time spent waiting in a queue, so it is essentially the time a query is allowed to exist since its creation.
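The three time limits can be combined. The sketch below uses illustrative values only. It keeps the run-time limit larger than the execution-time limit, because run time also covers queueing, analysis, and planning.

```properties
# Whole lifetime of a query: queueing + analysis + planning + execution
query.max-run-time=4h
# Planning phase only
query.max-planning-time=5m
# Active execution only (no queueing, analysis, or planning)
query.max-execution-time=2h
```

Each limit can also be set per session through query_max_run_time, query_max_planning_time, and query_max_execution_time.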

query.max-stage-count#

  • Type: integer

  • Default value: 150

  • Minimum value: 1

The maximum number of stages allowed to be generated per query. If a query generates more stages than this, it is killed with the error QUERY_HAS_TOO_MANY_STAGES.

Warning

Setting this to a high value can cause queries with a large number of stages to introduce instability in the cluster, causing unrelated queries to get killed with REMOTE_TASK_ERROR and the message Max requests queued per destination exceeded for HttpDestination ...
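For illustration, a conservative setting might lower the limit rather than raise it. The value 100 is an assumption chosen for the example, not a tuned recommendation.

```properties
# Reject queries that would generate more than 100 stages
query.max-stage-count=100
```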

query.max-history#

The maximum number of queries to keep in the query history to provide statistics and other information. If this amount is reached, queries are removed based on age.

cedrusdata.query-external-history.path#

  • Type: string

  • No default value.

CedrusData can display in the UI the execution history of queries that were run in another cluster.

First, save the JSON representation of the query to a file. To do this, open the query in the UI of the current cluster and click the "JSON" button, or use the system table runtime.cedrusdata_query_json (documentation).

To display the query in the history of the current cluster, move the JSON file to a directory on the local file system of the coordinator node and set the cedrusdata.query-external-history.path configuration property to the path of that directory. At startup, the coordinator scans all files in the specified directory and its subdirectories and displays the corresponding queries in the UI.

The coordinator scans the directory only once, at startup. If you have added new JSON files and want to display them in the UI without restarting the coordinator, run the procedure system.runtime.cedrusdata_refresh_query_external_history() (documentation).

Statistics of the history analyzer are available in the JMX table trino.execution:name=queryexternalhistorymanager.
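A minimal sketch of the coordinator setting. The directory name is hypothetical; any directory on the coordinator's local file system that holds the saved JSON files could be used.

```properties
# Hypothetical directory containing query JSON files saved from another cluster
cedrusdata.query-external-history.path=/var/trino/external-history
```

After adding files to a running coordinator, the refresh procedure named above can be invoked with CALL system.runtime.cedrusdata_refresh_query_external_history().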

cedrusdata.query-external-history.file-pattern#

  • Type: string

  • No default value.

A Java pattern applied to the full path of files in the cedrusdata.query-external-history.path directory. If the path to a query JSON file does not match the pattern, the query is not displayed in the UI.

Used only when cedrusdata.query-external-history.path is set to a non-empty value.
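The following sketch restricts the history to files whose path ends in .json. It assumes that the pattern uses java.util.regex syntax, which this section does not state explicitly, and the directory name is hypothetical.

```properties
cedrusdata.query-external-history.path=/var/trino/external-history
# Intended regex: .*\.json
# The backslash is doubled because .properties files treat \ as an escape character
cedrusdata.query-external-history.file-pattern=.*\\.json
```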

cedrusdata.query-external-history.deserialize#

  • Type: boolean

  • Default value: true

Whether to keep partial JSON representations of queries from the files in coordinator memory. This speeds up the display of historical queries from the cedrusdata.query-external-history.path directory in the UI, but consumes up to several tens of kilobytes of memory per query.

Used only when cedrusdata.query-external-history.path is set to a non-empty value.
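If coordinator memory matters more than UI responsiveness, the option could be turned off, as in this sketch. The directory name is hypothetical, and the exact behavior with the value false is inferred from the description of the trade-off.

```properties
cedrusdata.query-external-history.path=/var/trino/external-history
# Do not keep partial JSON representations in coordinator memory
cedrusdata.query-external-history.deserialize=false
```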

query.min-expire-age#

The minimum age of a query in the history before it expires. An expired query is removed from the query history buffer and is no longer available in the Web UI.
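History retention is governed by this property together with query.max-history. Both values in the sketch below are illustrative assumptions.

```properties
# Keep up to 500 queries in the history buffer
query.max-history=500
# Do not expire a query from the history until it is at least 30 minutes old
query.min-expire-age=30m
```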

query.remote-task.max-error-duration#

Timeout value for remote tasks that fail to communicate with the coordinator. If the coordinator is unable to receive updates from a remote task before this value is reached, the coordinator treats the task as failed.

retry-policy#

  • Type: string

  • Default value: NONE

The retry policy to use for fault-tolerant execution. Supports the following values:

  • NONE - Disable fault-tolerant execution.

  • TASK - Retry individual tasks within a query in the event of failure. Requires configuration of an exchange manager.

  • QUERY - Retry the whole query in the event of failure.
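A hedged sketch of enabling task-level retries. The exchange manager lines assume the filesystem exchange manager from upstream Trino, configured in a separate etc/exchange-manager.properties file. The directory is hypothetical, and a local path like this one is suitable only for single-node experiments.

```properties
# etc/config.properties
retry-policy=TASK

# etc/exchange-manager.properties (a separate file)
exchange-manager.name=filesystem
# Hypothetical location for spooled exchange data
exchange.base-directories=/tmp/trino-exchange
```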