RE2 regular expression syntax reference
---------------------------------------

Single characters:
.             any character, possibly including newline (s=true)
[xyz]         character class
[^xyz]        negated character class
\d            Perl character class
\D            negated Perl character class
[[:alpha:]]   ASCII character class
[[:^alpha:]]  negated ASCII character class
\pN           Unicode character class (one-letter name)
\p{Greek}     Unicode character class
\PN           negated Unicode character class (one-letter name)
\P{Greek}     negated Unicode character class

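The single-character forms above can be tried out with a short program. This
is only a sketch: it assumes Go's standard regexp package as a stand-in, since
that package documents that it accepts RE2 syntax (except «\C»); it is not the
RE2 C++ library itself, and the helper name «matches» is made up for the
example.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
// It panics if the pattern does not compile.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	// «.» does not match a newline unless the s flag is set.
	fmt.Println(matches(`a.c`, "a\nc"))     // false
	fmt.Println(matches(`(?s)a.c`, "a\nc")) // true

	// Character classes and their negations.
	fmt.Println(matches(`^[xyz]$`, "y"))        // true
	fmt.Println(matches(`^[^xyz]$`, "y"))       // false
	fmt.Println(matches(`^[[:alpha:]]$`, "q"))  // true
	fmt.Println(matches(`^[[:^alpha:]]$`, "q")) // false

	// Unicode classes: one-letter name and braced name.
	fmt.Println(matches(`^\pN$`, "7"))       // true (N = number)
	fmt.Println(matches(`^\p{Greek}$`, "λ")) // true
	fmt.Println(matches(`^\P{Greek}$`, "λ")) // false
}
```
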
Composites:
xy    «x» followed by «y»
x|y   «x» or «y» (prefer «x»)

Repetitions:
x*        zero or more «x», prefer more
x+        one or more «x», prefer more
x?        zero or one «x», prefer one
x{n,m}    «n» or «n»+1 or ... or «m» «x», prefer more
x{n,}     «n» or more «x», prefer more
x{n}      exactly «n» «x»
x*?       zero or more «x», prefer fewer
x+?       one or more «x», prefer fewer
x??       zero or one «x», prefer zero
x{n,m}?   «n» or «n»+1 or ... or «m» «x», prefer fewer
x{n,}?    «n» or more «x», prefer fewer
x{n}?     exactly «n» «x»
x{}       (== x*) NOT SUPPORTED vim
x{-}      (== x*?) NOT SUPPORTED vim
x{-n}     (== x{n}?) NOT SUPPORTED vim
x=        (== x?) NOT SUPPORTED vim

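The difference between "prefer more" and "prefer fewer" is easiest to see on
a fixed input. The sketch below assumes Go's standard regexp package, which
accepts the syntax described here (except «\C»); the helper «first» is a
hypothetical name used only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// first returns the leftmost match of pattern in text.
// It returns "" both for "no match" and for an empty match.
func first(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	text := "aaaa"
	fmt.Printf("%q\n", first(`a*`, text))      // "aaaa" (prefer more)
	fmt.Printf("%q\n", first(`a*?`, text))     // ""     (prefer fewer)
	fmt.Printf("%q\n", first(`a+?`, text))     // "a"
	fmt.Printf("%q\n", first(`a??`, text))     // ""     (prefer zero)
	fmt.Printf("%q\n", first(`a{2,3}`, text))  // "aaa"
	fmt.Printf("%q\n", first(`a{2,3}?`, text)) // "aa"
	fmt.Printf("%q\n", first(`a{2,}?`, text))  // "aa"
}
```
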
Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}»
reject forms that create a minimum or maximum repetition count above 1000.
Unlimited repetitions are not subject to this restriction.

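The restriction above can be observed at compile time. This is a minimal
sketch, assuming Go's standard regexp package, which follows the same RE2
syntax and enforces the same limit of 1000; «compiles» is an illustrative
helper, not part of any library.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether pattern is accepted by the regexp compiler.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(compiles(`x{1000}`))   // true: exactly at the limit
	fmt.Println(compiles(`x{1001}`))   // false: count above 1000
	fmt.Println(compiles(`x{2,1001}`)) // false: maximum above 1000
	fmt.Println(compiles(`x{1001,}`))  // false: minimum above 1000
	fmt.Println(compiles(`x*`))        // true: unlimited repetition
}
```
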
Possessive repetitions:
x*+       zero or more «x», possessive NOT SUPPORTED
x++       one or more «x», possessive NOT SUPPORTED
x?+       zero or one «x», possessive NOT SUPPORTED
x{n,m}+   «n» or ... or «m» «x», possessive NOT SUPPORTED
x{n,}+    «n» or more «x», possessive NOT SUPPORTED
x{n}+     exactly «n» «x», possessive NOT SUPPORTED

Grouping:
(re)          numbered capturing group (submatch)
(?P<name>re)  named & numbered capturing group (submatch)
(?<name>re)   named & numbered capturing group (submatch) NOT SUPPORTED
(?'name're)   named & numbered capturing group (submatch) NOT SUPPORTED
(?:re)        non-capturing group
(?flags)      set flags within current group; non-capturing
(?flags:re)   set flags during re; non-capturing
(?#text)      comment NOT SUPPORTED
(?|x|y|z)     branch numbering reset NOT SUPPORTED
(?>re)        possessive match of «re» NOT SUPPORTED
re@>          possessive match of «re» NOT SUPPORTED vim
%(re)         non-capturing group NOT SUPPORTED vim

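As an illustration of the supported grouping forms, the sketch below combines
«(?P<name>re)», a plain «(re)», and «(?:re)» in one pattern. It assumes Go's
standard regexp package as the engine; the date pattern and the names
«dateRE» and «field» are invented for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// dateRE uses (?P<name>re) for named groups, (re) for a numbered group,
// and (?:re) for a group that should not be numbered at all.
var dateRE = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})(?:-(\d{2}))?`)

// field returns the text captured by the named group, or "" if the
// pattern does not match or the name is unknown.
func field(text, name string) string {
	m := dateRE.FindStringSubmatch(text)
	i := dateRE.SubexpIndex(name)
	if m == nil || i < 0 {
		return ""
	}
	return m[i]
}

func main() {
	// Named groups are numbered too; the (?:...) wrapper is not counted.
	fmt.Println(dateRE.NumSubexp())              // 3
	fmt.Println(field("on 2025-01-23", "year"))  // 2025
	fmt.Println(field("on 2025-01-23", "month")) // 01

	// The unnamed day group is reachable only by its number.
	fmt.Println(dateRE.FindStringSubmatch("2025-01-23")[3]) // 23
}
```
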
Flags:
i   case-insensitive (default false)
m   multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
s   let «.» match «\n» (default false)
U   ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).

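The flag forms can be exercised as follows. This is a sketch under the
assumption that Go's standard regexp package is used, which implements the
same four flags; «first» and «count» are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
)

// first returns the leftmost match of pattern in text, or "" if none.
func first(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// count returns the number of non-overlapping matches of pattern in text.
func count(pattern, text string) int {
	return len(regexp.MustCompile(pattern).FindAllString(text, -1))
}

func main() {
	// i: set for the rest of the group, or only inside (?i:...).
	fmt.Printf("%q\n", first(`(?i)firebird`, "FireBird"))  // "FireBird"
	fmt.Printf("%q\n", first(`(?i:fire)bird`, "FIREbird")) // "FIREbird"
	fmt.Printf("%q\n", first(`(?i:fire)bird`, "FIREBIRD")) // ""

	// «-xyz» clears a flag that was set earlier.
	fmt.Printf("%q\n", first(`(?i)a(?-i)b`, "AB Ab")) // "Ab"

	// «xy-z»: set i and s, clear m.
	fmt.Printf("%q\n", first(`(?is-m)A.B`, "a\nb")) // "a\nb"

	// m: «^» and «$» also match at line boundaries.
	fmt.Println(count(`^\w+$`, "one\ntwo"))     // 0
	fmt.Println(count(`(?m)^\w+$`, "one\ntwo")) // 2

	// U: greedy and non-greedy operators swap meaning.
	fmt.Printf("%q\n", first(`(?U)a+`, "aaa"))  // "a"
	fmt.Printf("%q\n", first(`(?U)a+?`, "aaa")) // "aaa"
}
```
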
Empty strings:
^        at beginning of text or line («m»=true)
$        at end of text (like «\z» not «\Z») or line («m»=true)
\A       at beginning of text
\b       at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
\B       not at ASCII word boundary
\G       at beginning of subtext being searched NOT SUPPORTED pcre
\G       at end of last match NOT SUPPORTED perl
\Z       at end of text, or before newline at end of text NOT SUPPORTED
\z       at end of text
(?=re)   before text matching «re» NOT SUPPORTED
(?!re)   before text not matching «re» NOT SUPPORTED
(?<=re)  after text matching «re» NOT SUPPORTED
(?<!re)  after text not matching «re» NOT SUPPORTED
re&      before text matching «re» NOT SUPPORTED vim
re@=     before text matching «re» NOT SUPPORTED vim
re@!     before text not matching «re» NOT SUPPORTED vim
re@<=    after text matching «re» NOT SUPPORTED vim
re@<!    after text not matching «re» NOT SUPPORTED vim
\zs      sets start of match (= \K) NOT SUPPORTED vim
\ze      sets end of match NOT SUPPORTED vim
\%^      beginning of file NOT SUPPORTED vim
\%$      end of file NOT SUPPORTED vim
\%V      on screen NOT SUPPORTED vim
\%#      cursor position NOT SUPPORTED vim
\%'m     mark «m» position NOT SUPPORTED vim
\%23l    in line 23 NOT SUPPORTED vim
\%23c    in column 23 NOT SUPPORTED vim
\%23v    in virtual column 23 NOT SUPPORTED vim

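Two points in the table above are easy to miss: «$» without the m flag does
not match before a trailing newline, and «\b» uses the ASCII definition of a
word character. The sketch below shows both, assuming Go's standard regexp
package as the engine; the helper names are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// loc returns the [start, end) byte offsets of the leftmost match, or nil.
func loc(pattern, text string) []int {
	return regexp.MustCompile(pattern).FindStringIndex(text)
}

// compiles reports whether the pattern is accepted at all.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	text := "one\ntwo\n"

	// Without m, «$» means end of text only (like «\z», not «\Z»).
	fmt.Println(matches(`two$`, text))     // false
	fmt.Println(matches(`(?m)two$`, text)) // true
	fmt.Println(matches(`two\n\z`, text))  // true
	fmt.Println(matches(`\Atwo`, text))    // false
	fmt.Println(matches(`(?m)^two`, text)) // true

	// «\b» and «\B» treat only [0-9A-Za-z_] as word characters.
	fmt.Println(loc(`\bcat\b`, "concat cat")) // [7 10]
	fmt.Println(loc(`\Bcat`, "concat cat"))   // [3 6]
	fmt.Println(matches(`\bcaf\b`, "café"))   // true: «é» is not in «\w»

	// Look-around assertions and «\Z» are rejected at compile time.
	fmt.Println(compiles(`a(?=b)`)) // false
	fmt.Println(compiles(`a\Z`))    // false
}
```
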
Escape sequences:
\a          bell (== \007)
\f          form feed (== \014)
\t          horizontal tab (== \011)
\n          newline (== \012)
\r          carriage return (== \015)
\v          vertical tab character (== \013)
\*          literal «*», for any punctuation character «*»
\123        octal character code (up to three digits)
\x7F        hex character code (exactly two digits)
\x{10FFFF}  hex character code
\C          match a single byte even in UTF-8 mode
\Q...\E     literal text «...» even if «...» has punctuation

\1          backreference NOT SUPPORTED
\b          backspace NOT SUPPORTED (use «\010»)
\cK         control char ^K NOT SUPPORTED (use «\001» etc)
\e          escape NOT SUPPORTED (use «\033»)
\g1         backreference NOT SUPPORTED
\g{1}       backreference NOT SUPPORTED
\g{+1}      backreference NOT SUPPORTED
\g{-1}      backreference NOT SUPPORTED
\g{name}    named backreference NOT SUPPORTED
\g<name>    subroutine call NOT SUPPORTED
\g'name'    subroutine call NOT SUPPORTED
\k<name>    named backreference NOT SUPPORTED
\k'name'    named backreference NOT SUPPORTED
\lX         lowercase «X» NOT SUPPORTED
\ux         uppercase «x» NOT SUPPORTED
\L...\E     lowercase text «...» NOT SUPPORTED
\K          reset beginning of «$0» NOT SUPPORTED
\N{name}    named Unicode character NOT SUPPORTED
\R          line break NOT SUPPORTED
\U...\E     upper case text «...» NOT SUPPORTED
\X          extended Unicode sequence NOT SUPPORTED

\%d123        decimal character 123 NOT SUPPORTED vim
\%xFF         hex character FF NOT SUPPORTED vim
\%o123        octal character 123 NOT SUPPORTED vim
\%u1234       Unicode character 0x1234 NOT SUPPORTED vim
\%U12345678   Unicode character 0x12345678 NOT SUPPORTED vim

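A few of the escape sequences above, together with two of the rejected ones,
can be checked with a short program. This is a sketch that assumes Go's
standard regexp package, which accepts this syntax except for «\C» (so «\C»
is deliberately left out); the helper names are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the pattern is accepted at all.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Two-digit hex, braced hex, and octal character codes.
	fmt.Println(matches(`^\x41\x{3BB}\101$`, "AλA")) // true

	// A backslash makes a punctuation character literal.
	fmt.Println(matches(`^1\+1$`, "1+1")) // true

	// «\Q...\E» quotes a whole run of text.
	fmt.Println(matches(`^\Qa.b*c\E$`, "a.b*c")) // true
	fmt.Println(matches(`^\Qa.b*c\E$`, "aXbbc")) // false

	// «\e» is not supported; the octal form «\033» is the substitute.
	fmt.Println(compiles(`\e`))            // false
	fmt.Println(matches(`\033`, "\x1b[0m")) // true

	// Backreferences are rejected at compile time.
	fmt.Println(compiles(`(a)\1`)) // false
}
```
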
Character class elements:
x        single character
A-Z      character range (inclusive)
\d       Perl character class
[:foo:]  ASCII character class «foo»
\p{Foo}  Unicode character class «Foo»
\pF      Unicode character class «F» (one-letter name)

Named character classes as character class elements:
[\d]         digits (== \d)
[^\d]        not digits (== \D)
[\D]         not digits (== \D)
[^\D]        not not digits (== \d)
[[:name:]]   named ASCII class inside character class (== [:name:])
[^[:name:]]  named ASCII class inside negated character class (== [:^name:])
[\p{Name}]   named Unicode property inside character class (== \p{Name})
[^\p{Name}]  named Unicode property inside negated character class (== \P{Name})

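Named classes become most useful when mixed with other elements inside one
bracket expression. The following sketch shows a few combinations; it assumes
Go's standard regexp package as the engine, and the identifier-style patterns
are examples chosen for illustration, not rules taken from this reference.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	// A Perl class combined with a single character.
	fmt.Println(matches(`^[\d.]+$`, "3.14")) // true

	// Ranges combined with a Perl class.
	fmt.Println(matches(`^[A-Fa-f\d]+$`, "DeadBeef42")) // true

	// Double negation: [^\D] is the same set as \d.
	fmt.Println(matches(`^[^\D]+$`, "2025")) // true
	fmt.Println(matches(`^[^\D]+$`, "20x5")) // false

	// ASCII classes plus an extra character.
	fmt.Println(matches(`^[[:alpha:]_][[:alnum:]_]*$`, "relation_id")) // true
	fmt.Println(matches(`^[[:alpha:]_][[:alnum:]_]*$`, "9lives"))      // false

	// A Unicode property inside a class, plain and negated.
	fmt.Println(matches(`^[\p{Greek}\d]+$`, "αβγ123")) // true
	fmt.Println(matches(`^[^\p{Greek}]+$`, "aβc"))     // false
}
```
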
Perl character classes (all ASCII-only):
\d   digits (== [0-9])
\D   not digits (== [^0-9])
\s   whitespace (== [\t\n\f\r ])
\S   not whitespace (== [^\t\n\f\r ])
\w   word characters (== [0-9A-Za-z_])
\W   not word characters (== [^0-9A-Za-z_])

\h   horizontal space NOT SUPPORTED
\H   not horizontal space NOT SUPPORTED
\v   vertical space NOT SUPPORTED
\V   not vertical space NOT SUPPORTED

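"ASCII-only" has practical consequences: «\d» and «\w» do not match digits or
letters outside ASCII, and «\s» does not include the vertical tab. The sketch
below demonstrates this, assuming Go's standard regexp package as the engine;
non-ASCII input is written with \u escapes so the code points are explicit.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the pattern is accepted at all.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	arabicIndic := "\u0662\u0660\u0662\u0665" // the digits 2025 in Arabic-Indic form
	naive := "na\u00efve"                     // contains U+00EF, i with diaeresis

	// «\d» is [0-9] only; the Unicode class Nd covers other decimal digits.
	fmt.Println(matches(`^\d+$`, "2025"))          // true
	fmt.Println(matches(`^\d+$`, arabicIndic))     // false
	fmt.Println(matches(`^\p{Nd}+$`, arabicIndic)) // true

	// «\w» is [0-9A-Za-z_] only.
	fmt.Println(matches(`^\w+$`, naive))       // false
	fmt.Println(matches(`^[\pL\d_]+$`, naive)) // true

	// «\s» omits vertical tab; [[:space:]] includes it.
	fmt.Println(matches(`^\s$`, "\v"))          // false
	fmt.Println(matches(`^[[:space:]]$`, "\v")) // true

	// «\h» is not supported.
	fmt.Println(compiles(`\h`)) // false
}
```
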
ASCII character classes:
[[:alnum:]]   alphanumeric (== [0-9A-Za-z])
[[:alpha:]]   alphabetic (== [A-Za-z])
[[:ascii:]]   ASCII (== [\x00-\x7F])
[[:blank:]]   blank (== [\t ])
[[:cntrl:]]   control (== [\x00-\x1F\x7F])
[[:digit:]]   digits (== [0-9])
[[:graph:]]   graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[[:lower:]]   lower case (== [a-z])
[[:print:]]   printable (== [ -~] == [ [:graph:]])
[[:punct:]]   punctuation (== [!-/:-@[-`{-~])
[[:space:]]   whitespace (== [\t\n\v\f\r ])
[[:upper:]]   upper case (== [A-Z])
[[:word:]]    word characters (== [0-9A-Za-z_])
[[:xdigit:]]  hex digit (== [0-9A-Fa-f])

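The ASCII classes are often used for small clean-up tasks. The following is a
sketch of a few such uses, assuming Go's standard regexp package as the
engine; the variable and function names and the sample strings are invented
for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	punct  = regexp.MustCompile(`[[:punct:]]+`)
	blanks = regexp.MustCompile(`[[:blank:]]+`)
	hexRGB = regexp.MustCompile(`#[[:xdigit:]]{6}`)
	proper = regexp.MustCompile(`^[[:upper:]][[:lower:]]+$`)
)

// stripPunct removes every run of ASCII punctuation from s.
func stripPunct(s string) string {
	return punct.ReplaceAllString(s, "")
}

// squeeze collapses each run of spaces and tabs in s to one space.
func squeeze(s string) string {
	return blanks.ReplaceAllString(s, " ")
}

func main() {
	fmt.Printf("%q\n", stripPunct("Hello, world! (v5.0)"))       // "Hello world v50"
	fmt.Printf("%q\n", squeeze("a \t b"))                        // "a b"
	fmt.Printf("%q\n", hexRGB.FindString("color: #1aF09c;"))     // "#1aF09c"
	fmt.Println(proper.MatchString("Firebird"))                  // true
	fmt.Println(proper.MatchString("firebird"))                  // false
}
```
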
Unicode character class names--general category:
C    other
Cc   control
Cf   format
Cn   unassigned code points NOT SUPPORTED
Co   private use
Cs   surrogate
L    letter
LC   cased letter NOT SUPPORTED
L&   cased letter NOT SUPPORTED
Ll   lowercase letter
Lm   modifier letter
Lo   other letter
Lt   titlecase letter
Lu   uppercase letter
M    mark
Mc   spacing mark
Me   enclosing mark
Mn   non-spacing mark
N    number
Nd   decimal number
Nl   letter number
No   other number
P    punctuation
Pc   connector punctuation
Pd   dash punctuation
Pe   close punctuation
Pf   final punctuation
Pi   initial punctuation
Po   other punctuation
Ps   open punctuation
S    symbol
Sc   currency symbol
Sk   modifier symbol
Sm   math symbol
So   other symbol
Z    separator
Zl   line separator
Zp   paragraph separator
Zs   space separator

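General category names are used with «\p{...}», or with «\pX» when the name
is a single letter. The sketch below tries a handful of them, assuming Go's
standard regexp package as the engine; the helper names are hypothetical and
the sample text is arbitrary.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// first returns the leftmost match of pattern in text, or "" if none.
func first(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// strip deletes every match of pattern from text.
func strip(pattern, text string) string {
	return regexp.MustCompile(pattern).ReplaceAllString(text, "")
}

func main() {
	// Lu then Ll: an uppercase letter followed by lowercase letters.
	fmt.Println(matches(`^\p{Lu}\p{Ll}+$`, "\u00c9mile")) // true
	fmt.Println(matches(`^\p{Lu}\p{Ll}+$`, "\u00e9mile")) // false

	// Sc: currency symbol.
	fmt.Printf("%q\n", first(`\p{Sc}\d+`, "cost: €42")) // "€42"

	// P covers Pi, Pf, Po and the other punctuation subcategories.
	fmt.Printf("%q\n", strip(`\pP`, "«x», ok")) // "x ok"

	// N includes letter numbers (Nl) such as U+216B; Nd does not.
	fmt.Printf("%q\n", first(`\pN+`, "chapter \u216b"))    // "Ⅻ"
	fmt.Printf("%q\n", first(`\p{Nd}+`, "chapter \u216b")) // ""

	// «\PL» is the negation of «\pL».
	fmt.Println(matches(`^\PL+$`, "1234")) // true
}
```
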
Unicode character class names--scripts:
Adlam
Ahom
Anatolian_Hieroglyphs
Arabic
Armenian
Avestan
Balinese
Bamum
Bassa_Vah
Batak
Bengali
Bhaiksuki
Bopomofo
Brahmi
Braille
Buginese
Buhid
Canadian_Aboriginal
Carian
Caucasian_Albanian
Chakma
Cham
Cherokee
Common
Coptic
Cuneiform
Cypriot
Cyrillic
Deseret
Devanagari
Dogra
Duployan
Egyptian_Hieroglyphs
Elbasan
Elymaic
Ethiopic
Georgian
Glagolitic
Gothic
Grantha
Greek
Gujarati
Gunjala_Gondi
Gurmukhi
Han
Hangul
Hanifi_Rohingya
Hanunoo
Hatran
Hebrew
Hiragana
Imperial_Aramaic
Inherited
Inscriptional_Pahlavi
Inscriptional_Parthian
Javanese
Kaithi
Kannada
Katakana
Kayah_Li
Kharoshthi
Khmer
Khojki
Khudawadi
Lao
Latin
Lepcha
Limbu
Linear_A
Linear_B
Lisu
Lycian
Lydian
Mahajani
Makasar
Malayalam
Mandaic
Manichaean
Marchen
Masaram_Gondi
Medefaidrin
Meetei_Mayek
Mende_Kikakui
Meroitic_Cursive
Meroitic_Hieroglyphs
Miao
Modi
Mongolian
Mro
Multani
Myanmar
Nabataean
Nandinagari
New_Tai_Lue
Newa
Nko
Nushu
Nyiakeng_Puachue_Hmong
Ogham
Ol_Chiki
Old_Hungarian
Old_Italic
Old_North_Arabian
Old_Permic
Old_Persian
Old_Sogdian
Old_South_Arabian
Old_Turkic
Oriya
Osage
Osmanya
Pahawh_Hmong
Palmyrene
Pau_Cin_Hau
Phags_Pa
Phoenician
Psalter_Pahlavi
Rejang
Runic
Samaritan
Saurashtra
Sharada
Shavian
Siddham
SignWriting
Sinhala
Sogdian
Sora_Sompeng
Soyombo
Sundanese
Syloti_Nagri
Syriac
Tagalog
Tagbanwa
Tai_Le
Tai_Tham
Tai_Viet
Takri
Tamil
Tangut
Telugu
Thaana
Thai
Tibetan
Tifinagh
Tirhuta
Ugaritic
Vai
Wancho
Warang_Citi
Yi
Zanabazar_Square

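Script names are used the same way as category names, inside «\p{...}». The
sketch below splits a mixed string into runs per script. It assumes Go's
standard regexp package as the engine; «runs» is a hypothetical helper, and
the sample text is arbitrary.

```go
package main

import (
	"fmt"
	"regexp"
)

// runs returns every maximal run of characters from the named script.
// The script name is inserted into the pattern as-is, so it must be one
// of the names listed above; an unknown name makes MustCompile panic.
func runs(script, text string) []string {
	re := regexp.MustCompile(`\p{` + script + `}+`)
	return re.FindAllString(text, -1)
}

func main() {
	text := "Firebird (Жар-птица) 火の鳥"

	fmt.Printf("%q\n", runs("Latin", text))    // ["Firebird"]
	fmt.Printf("%q\n", runs("Cyrillic", text)) // ["Жар" "птица"]
	fmt.Printf("%q\n", runs("Han", text))      // ["火" "鳥"]
	fmt.Printf("%q\n", runs("Hiragana", text)) // ["の"]

	// Spaces, parentheses, and the hyphen belong to the Common script.
	fmt.Printf("%q\n", runs("Common", text)) // [" (" "-" ") "]
}
```
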
Vim character classes:
\i   identifier character NOT SUPPORTED vim
\I   «\i» except digits NOT SUPPORTED vim
\k   keyword character NOT SUPPORTED vim
\K   «\k» except digits NOT SUPPORTED vim
\f   file name character NOT SUPPORTED vim
\F   «\f» except digits NOT SUPPORTED vim
\p   printable character NOT SUPPORTED vim
\P   «\p» except digits NOT SUPPORTED vim
\s   whitespace character (== [ \t]) NOT SUPPORTED vim
\S   non-white space character (== [^ \t]) NOT SUPPORTED vim
\d   digits (== [0-9]) vim
\D   not «\d» vim
\x   hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
\X   not «\x» NOT SUPPORTED vim
\o   octal digits (== [0-7]) NOT SUPPORTED vim
\O   not «\o» NOT SUPPORTED vim
\w   word character vim
\W   not «\w» vim
\h   head of word character NOT SUPPORTED vim
\H   not «\h» NOT SUPPORTED vim
\a   alphabetic NOT SUPPORTED vim
\A   not «\a» NOT SUPPORTED vim
\l   lowercase NOT SUPPORTED vim
\L   not lowercase NOT SUPPORTED vim
\u   uppercase NOT SUPPORTED vim
\U   not uppercase NOT SUPPORTED vim
\_x  «\x» plus newline, for any «x» NOT SUPPORTED vim

Vim flags:
\c   ignore case NOT SUPPORTED vim
\C   match case NOT SUPPORTED vim
\m   magic NOT SUPPORTED vim
\M   nomagic NOT SUPPORTED vim
\v   verymagic NOT SUPPORTED vim
\V   verynomagic NOT SUPPORTED vim
\Z   ignore differences in Unicode combining characters NOT SUPPORTED vim

Magic:
(?{code})            arbitrary Perl code NOT SUPPORTED perl
(??{code})           postponed arbitrary Perl code NOT SUPPORTED perl
(?n)                 recursive call to regexp capturing group «n» NOT SUPPORTED
(?+n)                recursive call to relative group «+n» NOT SUPPORTED
(?-n)                recursive call to relative group «-n» NOT SUPPORTED
(?C)                 PCRE callout NOT SUPPORTED pcre
(?R)                 recursive call to entire regexp (== (?0)) NOT SUPPORTED
(?&name)             recursive call to named group NOT SUPPORTED
(?P=name)            named backreference NOT SUPPORTED
(?P>name)            recursive call to named group NOT SUPPORTED
(?(cond)true|false)  conditional branch NOT SUPPORTED
(?(cond)true)        conditional branch NOT SUPPORTED
(*ACCEPT)            make regexps more like Prolog NOT SUPPORTED
(*COMMIT)            NOT SUPPORTED
(*F)                 NOT SUPPORTED
(*FAIL)              NOT SUPPORTED
(*MARK)              NOT SUPPORTED
(*PRUNE)             NOT SUPPORTED
(*SKIP)              NOT SUPPORTED
(*THEN)              NOT SUPPORTED
(*ANY)               set newline convention NOT SUPPORTED
(*ANYCRLF)           NOT SUPPORTED
(*CR)                NOT SUPPORTED
(*CRLF)              NOT SUPPORTED
(*LF)                NOT SUPPORTED
(*BSR_ANYCRLF)       set \R convention NOT SUPPORTED pcre
(*BSR_UNICODE)       NOT SUPPORTED pcre