Identifier

An identifier is an arbitrarily long sequence of digits, underscores, lowercase and uppercase Latin letters, and (since C99) Unicode characters specified using \u and \U escape notation. A valid identifier must begin with a non-digit character: a Latin letter, an underscore, or (since C99) a Unicode non-digit character. Identifiers are case-sensitive: lowercase and uppercase letters are distinct.

It is implementation-defined whether raw (not escaped) Unicode characters are allowed in identifiers:

char *\U0001f431 = "cat"; // supported
char *🐱 = "cat"; // implementation-defined (e.g. works with Clang, but not with GCC before version 10)
(since C99)

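Identifiers can denote the following types of entities:

  • objects
  • functions
  • tags (struct, union, or enumeration)
  • structure or union members
  • enumeration constants
  • typedef names
  • label names
  • macro names
  • macro parameter names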

Every identifier other than a macro name or macro parameter name has scope, belongs to a name space, and may have linkage. The same identifier can denote different entities at different points in the program, or different entities at the same point if those entities are in different name spaces.

Reserved identifiers

The following identifiers are reserved and may not be declared in a program (doing so invokes undefined behavior):

1) The identifiers that are keywords cannot be used for other purposes. In particular #define or #undef of an identifier that is identical to a keyword is not allowed.
2) All external identifiers that begin with an underscore.
3) All identifiers that begin with an underscore followed by a capital letter or by another underscore (these reserved identifiers allow the library to use numerous behind-the-scenes non-external macros and functions).
4) All external identifiers defined by the standard library (in hosted environment). This means that no user-supplied external names are allowed to match any library names, not even if declaring a function that is identical to a library function.
5) Identifiers declared as reserved for future use by the standard library, namely
  • function names
      • cerf, cerfc, cexp2, cexpm1, clog10, clog1p, clog2, clgamma, ctgamma and their -f and -l suffixed variants, in <complex.h>
      • beginning with is or to followed by a lowercase letter, in <ctype.h> and <wctype.h>
      • beginning with str followed by a lowercase letter, in <stdlib.h>
      • beginning with str, mem or wcs followed by a lowercase letter, in <string.h>
      • beginning with wcs followed by a lowercase letter, in <wchar.h>
      • beginning with atomic_ followed by a lowercase letter, in <stdatomic.h>
      • beginning with cnd_, mtx_, thrd_ or tss_ followed by a lowercase letter, in <threads.h>
  • typedef names
      • beginning with int or uint and ending with _t, in <stdint.h>
      • beginning with atomic_ or memory_ followed by a lowercase letter, in <stdatomic.h>
      • beginning with cnd_, mtx_, thrd_ or tss_ followed by a lowercase letter, in <threads.h>
  • macro names
      • beginning with E followed by a digit or an uppercase letter, in <errno.h>
      • beginning with FE_ followed by an uppercase letter, in <fenv.h>
      • beginning with INT or UINT and ending with _MAX, _MIN, or _C, in <stdint.h>
      • beginning with PRI or SCN followed by a lowercase letter or the letter X, in <inttypes.h>
      • beginning with LC_ followed by an uppercase letter, in <locale.h>
      • beginning with SIG or SIG_ followed by an uppercase letter, in <signal.h>
      • beginning with TIME_ followed by an uppercase letter, in <time.h>
      • beginning with ATOMIC_ followed by an uppercase letter, in <stdatomic.h>
  • enumeration constants
      • beginning with memory_order_ followed by a lowercase letter, in <stdatomic.h>
      • beginning with cnd_, mtx_, thrd_ or tss_ followed by a lowercase letter, in <threads.h>

All other identifiers are available, with no fear of unexpected collisions when moving programs from one compiler and library to another.

Note: in C++, identifiers with a double underscore anywhere are reserved everywhere; in C, only the ones that begin with a double underscore are reserved.

Translation limits

Even though there is no specific limit on the length of identifiers, early compilers limited the number of significant initial characters in identifiers, and linkers imposed stricter limits on names with external linkage. C requires every standard-compliant implementation to support at least the following limits:

Until C99:

  • 31 significant initial characters in an internal identifier or a macro name
  • 6 significant initial characters in an external identifier
  • 511 external identifiers in one translation unit
  • 127 identifiers with block scope declared in one block
  • 1024 macro identifiers simultaneously defined in one preprocessing translation unit

Since C99:

  • 63 significant initial characters in an internal identifier or a macro name
  • 31 significant initial characters in an external identifier
  • 4095 external identifiers in one translation unit
  • 511 identifiers with block scope declared in one block
  • 4095 macro identifiers simultaneously defined in one preprocessing translation unit

References

  • C11 standard (ISO/IEC 9899:2011):
      • 5.2.4.1 Translation limits (p: 25-26)
      • 6.4.2 Identifiers (p: 59-60)
      • 6.10.8 Predefined macro names (p: 175-176)
      • 6.11.9 Predefined macro names (p: 179)
      • 7.31 Future library directions (p: 455-457)
      • K.3.1.2 Reserved identifiers (p: 584)
  • C99 standard (ISO/IEC 9899:1999):
      • 5.2.4.1 Translation limits (p: 20-21)
      • 6.4.2 Identifiers (p: 51-52)
      • 6.10.8 Predefined macro names (p: 160-161)
      • 6.11.9 Predefined macro names (p: 163)
      • 7.26 Future library directions (p: 401-402)
  • C89/C90 standard (ISO/IEC 9899:1990):
      • 2.2.4.1 Translation limits
      • 3.1.2 Identifiers
      • 3.8.8 Predefined macro names
