ECMA-334: 9.4.1 Unicode escape sequences
C# Language Specification
© 2006 ECMA International
9.4.1 Unicode escape sequences
A Unicode escape sequence represents a Unicode character. Unicode escape sequences are processed in identifiers (§9.4.2), regular string literals (§9.4.4.5), and character literals (§9.4.4.4). A Unicode character escape is not processed in any other location (for example, to form an operator, punctuator, or keyword).
- unicode-escape-sequence::
    \u hex-digit hex-digit hex-digit hex-digit
    \U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit
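As an informal illustration of the production above, the following sketch recognizes one escape of either form in raw source text. The class name UnicodeEscapeSketch and the method TryRead are hypothetical names chosen for this example, and the use of System.Uri.IsHexDigit and System.Uri.FromHex to classify and convert hex digits is simply a convenient assumption; the sketch checks only the shape of the escape, not the range of the resulting value.

```csharp
using System;

public static class UnicodeEscapeSketch
{
    // Hypothetical helper: tries to read one unicode-escape-sequence that
    // starts at text[index]. Returns the number of characters consumed
    // (6 for \u, 10 for \U), or 0 if no well-formed escape starts there.
    public static int TryRead(string text, int index, out uint value)
    {
        value = 0;
        if (index < 0 || index + 1 >= text.Length || text[index] != '\\')
            return 0;

        int digits;
        if (text[index + 1] == 'u') digits = 4;        // \u form: four hex digits
        else if (text[index + 1] == 'U') digits = 8;   // \U form: eight hex digits
        else return 0;

        if (index + 2 + digits > text.Length)
            return 0;

        for (int i = 0; i < digits; i++)
        {
            char c = text[index + 2 + i];
            if (!Uri.IsHexDigit(c))
            {
                value = 0;
                return 0;
            }
            value = value * 16 + (uint)Uri.FromHex(c);
        }
        return 2 + digits;
    }

    public static void Main()
    {
        uint value;
        // The literals below use \\ so that each string holds raw source text.
        Console.WriteLine(TryRead("\\u0066", 0, out value));     // 6
        Console.WriteLine(TryRead("\\U00010000", 0, out value)); // 10
        Console.WriteLine(TryRead("\\u00G6", 0, out value));     // 0
    }
}
```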
A Unicode escape sequence represents the single Unicode character formed by the hexadecimal number
following the "\u" or "\U" characters. Since C# uses a 16-bit encoding of Unicode characters in characters
and string values, a Unicode code point in the range U+10000 to U+10FFFF is represented using two
Unicode surrogate code units. Unicode code points above 0x10FFFF are invalid and are not supported.
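The effect of the 16-bit encoding can be observed informally with a short program. This is a minimal sketch, not part of the standard text; the class name SurrogatePairSketch and the method CodeUnits are hypothetical names for this example. It compares a code point in the basic range with U+10000, the first code point that requires two surrogate code units.

```csharp
using System;

public static class SurrogatePairSketch
{
    // Hypothetical helper: number of 16-bit code units in the string.
    public static int CodeUnits(string s)
    {
        return s.Length;
    }

    public static void Main()
    {
        string basic = "\u0066";             // U+0066: one code unit
        string supplementary = "\U00010000"; // U+10000: two surrogate code units

        Console.WriteLine(CodeUnits(basic));                          // 1
        Console.WriteLine(CodeUnits(supplementary));                  // 2
        Console.WriteLine(char.IsHighSurrogate(supplementary[0]));    // True
        Console.WriteLine(char.IsLowSurrogate(supplementary[1]));     // True
        // The pair decodes back to the original code point.
        Console.WriteLine(char.ConvertToUtf32(supplementary, 0) == 0x10000); // True
    }
}
```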
Multiple translations are not performed. For instance, the string literal "\u005Cu005C" is equivalent to
"\u005C" rather than "\". [Note: The Unicode value \u005C is the character "\". end note]
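The single-pass rule can be checked informally as follows. This is a minimal sketch with hypothetical names (SinglePassSketch, Sample): the escape \u005C yields one backslash, and the characters u005C that follow remain ordinary text, so the resulting string has six characters.

```csharp
using System;

public static class SinglePassSketch
{
    // Hypothetical helper returning the literal discussed in the text.
    public static string Sample()
    {
        // The escape is translated once; the trailing "u005C" is not re-scanned.
        return "\u005Cu005C";
    }

    public static void Main()
    {
        string s = Sample();
        Console.WriteLine(s.Length);        // 6
        Console.WriteLine(s == "\\u005C");  // True: a backslash followed by u005C
        Console.WriteLine(s == "\\");       // False: not a single backslash
    }
}
```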
[Example: The example
    class Class1
    {
        static void Test(bool \u0066) {
            char c = '\u0066';
            if (\u0066)
                System.Console.WriteLine(c.ToString());
        }
    }
shows several uses of \u0066, which is the escape sequence for the letter "f". The program is equivalent to
    class Class1
    {
        static void Test(bool f) {
            char c = 'f';
            if (f)
                System.Console.WriteLine(c.ToString());
        }
    }
end example]