String functions and operators#
Примечание
Ниже приведена оригинальная документация Trino. Скоро мы ее переведем на русский язык и дополним полезными примерами.
String operators#
The ||
operator performs concatenation.
The LIKE
statement can be used for pattern matching and is documented in
Pattern comparison: LIKE.
String functions#
Примечание
These functions assume that the input strings contain valid UTF-8 encoded
Unicode code points. There are no explicit checks for valid UTF-8 and
the functions may return incorrect results on invalid UTF-8.
Invalid UTF-8 data can be corrected with from_utf8()
.
Additionally, the functions operate on Unicode code points and not user visible characters (or grapheme clusters). Some languages combine multiple code points into a single user-perceived character, the basic unit of a writing system for a language, but the functions will treat each code point as a separate unit.
The lower()
and upper()
functions do not perform
locale-sensitive, context-sensitive, or one-to-many mappings required for
some languages. Specifically, this will return incorrect results for
Lithuanian, Turkish and Azeri.
- chr(n) varchar #
Returns the Unicode code point
n
as a single character string.
- codepoint(string) integer #
Returns the Unicode code point of the only character of
string
.
- concat(string1, ..., stringN) varchar #
Returns the concatenation of
string1
,string2
,...
,stringN
. This function provides the same functionality as the SQL-standard concatenation operator (||
).
- concat_ws(string0, string1, ..., stringN) varchar #
Returns the concatenation of
string1
,string2
,...
,stringN
usingstring0
as a separator. Ifstring0
is null, then the return value is null. Any null values provided in the arguments after the separator are skipped.
- concat_ws(string0, array(varchar)) varchar
Returns the concatenation of elements in the array using
string0
as a separator. Ifstring0
is null, then the return value is null. Any null values in the array are skipped.
- format(format, args...) varchar
See
format()
.
- hamming_distance(string1, string2) bigint #
Returns the Hamming distance of
string1
andstring2
, i.e. the number of positions at which the corresponding characters are different. Note that the two strings must have the same length.
- length(string) bigint #
Returns the length of
string
in characters.
- levenshtein_distance(string1, string2) bigint #
Returns the Levenshtein edit distance of
string1
andstring2
, i.e. the minimum number of single-character edits (insertions, deletions or substitutions) needed to changestring1
intostring2
.
- lower(string) varchar #
Converts
string
to lowercase.
- lpad(string, size, padstring) varchar #
Left pads
string
tosize
characters withpadstring
. Ifsize
is less than the length ofstring
, the result is truncated tosize
characters.size
must not be negative andpadstring
must be non-empty.
- ltrim(string) varchar #
Removes leading whitespace from
string
.
- luhn_check(string) boolean #
Tests whether a
string
of digits is valid according to the Luhn algorithm.This checksum function, also known as
modulo 10
ormod 10
, is widely applied on credit card numbers and government identification numbers to distinguish valid numbers from mistyped, incorrect numbers.Valid identification number:
select luhn_check('79927398713'); -- true
Invalid identification number:
select luhn_check('79927398714'); -- false
- position(substring IN string) bigint #
Returns the starting position of the first instance of
substring
instring
. Positions start with1
. If not found,0
is returned.Примечание
This SQL-standard function has special syntax and uses the
IN
keyword for the arguments. See alsostrpos()
.
- replace(string, search) varchar #
Removes all instances of
search
fromstring
.
- replace(string, search, replace) varchar
Replaces all instances of
search
withreplace
instring
.
- reverse(string) varchar #
Returns
string
with the characters in reverse order.
- rpad(string, size, padstring) varchar #
Right pads
string
tosize
characters withpadstring
. Ifsize
is less than the length ofstring
, the result is truncated tosize
characters.size
must not be negative andpadstring
must be non-empty.
- rtrim(string) varchar #
Removes trailing whitespace from
string
.
- soundex(char) string #
soundex
returns a character string containing the phonetic representation ofchar
.It is typically used to evaluate the similarity of two expressions phonetically, that is how the string sounds when spoken:
SELECT name FROM nation WHERE SOUNDEX(name) = SOUNDEX('CHYNA'); name | -------+---- CHINA | (1 row)
- split(string, delimiter)#
Splits
string
ondelimiter
and returns an array.
- split(string, delimiter, limit)
Splits
string
ondelimiter
and returns an array of size at mostlimit
. The last element in the array always contain everything left in thestring
.limit
must be a positive number.
- split_part(string, delimiter, index) varchar #
Splits
string
ondelimiter
and returns the fieldindex
. Field indexes start with1
. If the index is larger than the number of fields, then null is returned.
- split_to_map(string, entryDelimiter, keyValueDelimiter) map<varchar, varchar> #
Splits
string
byentryDelimiter
andkeyValueDelimiter
and returns a map.entryDelimiter
splitsstring
into key-value pairs.keyValueDelimiter
splits each pair into key and value.
- split_to_multimap(string, entryDelimiter, keyValueDelimiter)#
Splits
string
byentryDelimiter
andkeyValueDelimiter
and returns a map containing an array of values for each unique key.entryDelimiter
splitsstring
into key-value pairs.keyValueDelimiter
splits each pair into key and value. The values for each key will be in the same order as they appeared instring
.
- strpos(string, substring) bigint #
Returns the starting position of the first instance of
substring
instring
. Positions start with1
. If not found,0
is returned.
- strpos(string, substring, instance) bigint
Returns the position of the N-th
instance
ofsubstring
instring
. Wheninstance
is a negative number the search will start from the end ofstring
. Positions start with1
. If not found,0
is returned.
- starts_with(string, substring) boolean #
Tests whether
substring
is a prefix ofstring
.
- substr(string, start) varchar #
This is an alias for
substring()
.
- substring(string, start) varchar #
Returns the rest of
string
from the starting positionstart
. Positions start with1
. A negative starting position is interpreted as being relative to the end of the string.
- substr(string, start, length) varchar
This is an alias for
substring()
.
- substring(string, start, length) varchar
Returns a substring from
string
of lengthlength
from the starting positionstart
. Positions start with1
. A negative starting position is interpreted as being relative to the end of the string.
- translate(source, from, to) varchar #
Returns the
source
string translated by replacing characters found in thefrom
string with the corresponding characters in theto
string. If thefrom
string contains duplicates, only the first is used. If thesource
character does not exist in thefrom
string, thesource
character will be copied without translation. If the index of the matching character in thefrom
string is beyond the length of theto
string, thesource
character will be omitted from the resulting string.Here are some examples illustrating the translate function:
SELECT translate('abcd', '', ''); -- 'abcd' SELECT translate('abcd', 'a', 'z'); -- 'zbcd' SELECT translate('abcda', 'a', 'z'); -- 'zbcdz' SELECT translate('Palhoça', 'ç','c'); -- 'Palhoca' SELECT translate('abcd', 'b', U&'\+01F600'); -- a😀cd SELECT translate('abcd', 'a', ''); -- 'bcd' SELECT translate('abcd', 'a', 'zy'); -- 'zbcd' SELECT translate('abcd', 'ac', 'z'); -- 'zbd' SELECT translate('abcd', 'aac', 'zq'); -- 'zbd'
- trim(string) varchar
Removes leading and trailing whitespace from
string
.
- trim([ [ specification ] [ string ] FROM ] source ) varchar #
Removes any leading and/or trailing characters as specified up to and including
string
fromsource
:SELECT trim('!' FROM '!foo!'); -- 'foo' SELECT trim(LEADING FROM ' abcd'); -- 'abcd' SELECT trim(BOTH '$' FROM '$var$'); -- 'var' SELECT trim(TRAILING 'ER' FROM upper('worker')); -- 'WORK'
- upper(string) varchar #
Converts
string
to uppercase.
- word_stem(word) varchar #
Returns the stem of
word
in the English language.
- word_stem(word, lang) varchar
Returns the stem of
word
in thelang
language.
Unicode functions#
- normalize(string) varchar #
Transforms
string
with NFC normalization form.
- normalize(string, form) varchar
Transforms
string
with the specified normalization form.form
must be one of the following keywords:Form
Description
NFD
Canonical Decomposition
NFC
Canonical Decomposition, followed by Canonical Composition
NFKD
Compatibility Decomposition
NFKC
Compatibility Decomposition, followed by Canonical Composition
Примечание
This SQL-standard function has special syntax and requires specifying
form
as a keyword, not as a string.
- to_utf8(string) varbinary #
Encodes
string
into a UTF-8 varbinary representation.
- from_utf8(binary) varchar #
Decodes a UTF-8 encoded string from
binary
. Invalid UTF-8 sequences are replaced with the Unicode replacement characterU+FFFD
.
- from_utf8(binary, replace) varchar
Decodes a UTF-8 encoded string from
binary
. Invalid UTF-8 sequences are replaced withreplace
. The replacement stringreplace
must either be a single character or empty (in which case invalid characters are removed).