Appendix D: Builtins
Functions
Tokens
The following tokens are built into Tokay and can be used immediately. Programs can override these constants on demand.
| Token | Token+ | Description |
|---|---|---|
| Alphabetic | Alphabetics | All Unicode characters having the Alphabetic property |
| Alphanumeric | Alphanumerics | The union of Alphabetic and Numeric |
| Ascii | Asciis | All characters within the ASCII range. |
| AsciiAlphabetic | AsciiAlphabetics | All ASCII alphabetic characters [A-Za-z] |
| AsciiAlphanumeric | AsciiAlphanumerics | ASCII alphanumeric characters [0-9A-Za-z] |
| AsciiControl | AsciiControls | All ASCII control characters [\x00-\x1F\x7f]. SPACE is not a control character. |
| AsciiDigit | AsciiDigits | ASCII decimal digits [0-9] |
| AsciiGraphic | AsciiGraphics | All ASCII graphic characters [!-~] |
| AsciiHexdigit | AsciiHexdigits | ASCII hex digits [0-9A-Fa-f] |
| AsciiLowercase | AsciiLowercases | All ASCII lowercase characters [a-z] |
| AsciiPunctuation | AsciiPunctuations | All ASCII punctuation characters [-!"#$%&'()*+,./:;<=>?@[\\\]^_`{\|}~] |
| AsciiUppercase | AsciiUppercases | All ASCII uppercase characters [A-Z] |
| AsciiWhitespace | AsciiWhitespaces | All characters defining ASCII whitespace [ \t\n\f\r] |
| Char | Chars | Any character, except EOF |
| Char<...> | Chars<...> | Any character of specified character-class, except EOF |
| Control | Controls | All Unicode characters in the controls category |
| Digit | Digits | ASCII decimal digits [0-9] |
| EOF | - | Matches End-Of-File. |
| Lowercase | Lowercases | All Unicode characters having the Lowercase property |
| Numeric | Numerics | All Unicode characters in the numbers category |
| Uppercase | Uppercases | All Unicode characters having the Uppercase property |
| Whitespace | Whitespaces | All Unicode characters having the White_Space property |
| Void | - | The empty token. It matches without reading any input, but it still counts as a consuming token. |
The respective properties of the built-in character classes are described in Chapter 4 (Character Properties) of the Unicode Standard and specified in the Unicode Character Database in DerivedCoreProperties.txt.
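
The difference between the Unicode classes and their `Ascii*` counterparts is easiest to see on concrete input. The following Rust sketch is an illustration only, not Tokay's implementation. It assumes that each class in the table behaves like the Rust standard library `char` method of the same name, and that the Token+ form matches one or more consecutive characters of its class. The names `in_class` and `take_plural` are hypothetical helpers introduced for this example.

```rust
/// Hypothetical helper: tests `c` against a class name from the table.
/// Assumption: each class behaves like the matching Rust `char` method.
/// Returns `None` for names this sketch does not model.
fn in_class(class: &str, c: char) -> Option<bool> {
    let hit = match class {
        "Alphabetic" => c.is_alphabetic(),
        "Alphanumeric" => c.is_alphanumeric(),
        "Ascii" => c.is_ascii(),
        "AsciiAlphabetic" => c.is_ascii_alphabetic(),
        "AsciiControl" => c.is_ascii_control(),
        "AsciiDigit" | "Digit" => c.is_ascii_digit(),
        "AsciiGraphic" => c.is_ascii_graphic(),
        "AsciiHexdigit" => c.is_ascii_hexdigit(),
        "AsciiPunctuation" => c.is_ascii_punctuation(),
        "AsciiWhitespace" => c.is_ascii_whitespace(),
        "Numeric" => c.is_numeric(),
        "Whitespace" => c.is_whitespace(),
        _ => return None,
    };
    Some(hit)
}

/// Hypothetical model of the Token+ column: returns the longest non-empty
/// prefix of `input` whose characters all belong to `class`, or `None`
/// if the first character does not match (or the class is not modelled).
fn take_plural<'a>(class: &str, input: &'a str) -> Option<&'a str> {
    let mut end = 0;
    for (i, c) in input.char_indices() {
        if in_class(class, c)? {
            end = i + c.len_utf8();
        } else {
            break;
        }
    }
    if end > 0 {
        Some(&input[..end])
    } else {
        None
    }
}

fn main() {
    // The plural form stops at the first character outside the class.
    println!("{:?}", take_plural("AsciiDigit", "2024-01")); // Some("2024")

    // Unicode classes accept non-ASCII letters; the Ascii* classes do not.
    println!("{:?}", take_plural("Alphabetic", "naïve café")); // Some("naïve")
    println!("{:?}", in_class("AsciiAlphabetic", 'ï')); // Some(false)

    // NO-BREAK SPACE has the White_Space property but is not ASCII whitespace.
    println!("{:?}", in_class("Whitespace", '\u{00A0}')); // Some(true)
    println!("{:?}", in_class("AsciiWhitespace", '\u{00A0}')); // Some(false)
}
```

Under these assumptions, `Alphabetic` and `Whitespace` follow the Unicode properties and so accept characters such as `ï` or NO-BREAK SPACE, while the `Ascii*` classes are limited to the ranges listed in the table.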