
C Programming/Language Reference


Table of keywords


ANSI (American National Standards Institute) C (C89)/ISO C (C90)


Very old compilers may not recognize some or all of the C89 keywords const, enum, signed, void, and volatile, or any of the keywords introduced by later standards.

  • auto
  • break
  • case
  • char
  • const
  • continue
  • default
  • do
  • double
  • else
  • enum
  • extern
  • float
  • for
  • goto
  • if
  • int
  • long
  • register
  • return
  • short
  • signed
  • sizeof
  • static
  • struct
  • switch
  • typedef
  • union
  • unsigned
  • void
  • volatile
  • while

ISO C (C99)


These are supported in most new compilers.

  • _Bool
  • _Complex
  • _Imaginary
  • inline
  • restrict

ISO C (C11)


These are supported only in some newer compilers.

  • _Alignas
  • _Alignof
  • _Atomic
  • _Generic
  • _Noreturn
  • _Static_assert
  • _Thread_local

Although not technically a keyword, C99-capable preprocessors and compilers additionally recognize the preprocessor operator _Pragma, an alternate form of the #pragma directive that can be used from within macro expansions. For example, the following code will cause some compilers (including GCC and Clang) to emit a diagnostic message:

    #define EMIT_MESSAGE(str)    EMIT_PRAGMA(message(str))
    #define EMIT_PRAGMA(content) _Pragma(#content)
    EMIT_MESSAGE("Hello, world!")

Some compilers use a slight variant syntax; in particular, MSVC supports __pragma instead of _Pragma.

Specific compilers may also treat some other words as keywords, either in a non-standards-compliant mode or with additional syntactic markers like __extension__. These words include asm, cdecl, far, fortran, huge, interrupt, near, pascal, and typeof. However, when operating in standards-compliant modes, these compilers typically allow such keywords to be overridden by declarations (e.g., by defining a variable named typeof), in order to avoid introducing incompatibilities with existing programs. To ensure that extension features remain accessible, they usually provide an additional set of proper keywords beginning with two underscores (__). For example, GCC treats asm, __asm, and __asm__ almost identically, but the latter two are always guaranteed to have the expected meaning since they cannot be overridden.

Many of the newly introduced keywords—namely, those beginning with an underscore and capital letter like _Noreturn or _Imaginary—are intended to be used only indirectly in most situations. Instead, the programmer should prefer the use of standard headers such as <stdbool.h> or <stdalign.h>, which typically use the preprocessor to establish an all-lower-case variant of the keyword (e.g., complex or noreturn). These headers serve the purpose of enabling C and C++ code, as well as code targeting different compilers or language versions, to interoperate more cleanly. For example, by including <stdbool.h>, the tokens bool, true, and false can be used identically in either C99 or C++ without having to explicitly use _Bool in C99 or bool in C++.

See also the list of reserved identifiers [1].

Table of operators


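Operators in the same group of this table have the same precedence; among them, the associativity (left-to-right or right-to-left) decides how operands are grouped. Operators closer to the top of this table have higher precedence than those in a subsequent group. Precedence and associativity determine only how an expression is parsed, not the order in which its operands are evaluated.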

Operators Description Example Usage Associativity
Postfix operators Left to right
() function call operator swap (x, y)
[] array index operator arr [i]
. member access operator for an object of struct/union type obj.member
-> member access operator for a pointer to an object of struct/union type ptr->member
++ -- post-increment/decrement operators num++

Unary Operators Right to left
! logical not operator !eof_reached
~ bitwise not operator ~mask
+ -[2] unary plus/minus operators -num
++ -- pre-increment/decrement operators ++num
& address-of operator &data
* indirection operator *ptr
sizeof sizeof operator for expressions sizeof 123
sizeof() sizeof operator for types sizeof (int)
(type) cast operator (float)i

Multiplicative Operators Left to right
* / % multiplication, division and modulus operators celsius_diff * 9.0 / 5.0

Additive Operators Left to right
+ - addition and subtraction operators end - start + 1

Bitwise Shift Operators Left to right
<< left shift operator bits << shift_len
>> right shift operator bits >> shift_len

Relational Inequality Operators Left to right
< > <= >= less-than, greater-than, less-than-or-equal-to, greater-than-or-equal-to operators i < num_elements

Relational Equality Operators Left to right
== != equal-to, not-equal-to choice != 'n'

Bitwise And Operator Left to right
& bits & clear_mask_complement

Bitwise Xor Operator Left to right
^ bits ^ invert_mask

Bitwise Or Operator Left to right
| bits | set_mask

Logical And Operator Left to right
&& arr != 0 && arr->len != 0

Logical Or Operator Left to right
|| arr == 0 || arr->len == 0

Conditional Operator Right to left
?: size != 0 ? size : 0

Assignment Operators Right to left
= assignment operator i = 0
+= -= *= /= %= &= |= ^= <<= >>= shorthand assignment operators (foo op= bar represents foo = foo op bar) num /= 10

Comma Operator Left to right
, i = 0, j = i + 1, k = 0

Table of data types

Type Size in Bits Comments Alternative Names
Primitive Types in ANSI C (C89)/ISO C (C90)
char ≥ 8
  • sizeof gives the size in units of chars. These "C bytes" need not be 8-bit bytes (though commonly they are); the number of bits is given by the CHAR_BIT macro in the limits.h header.
  • Signedness is implementation-defined.
  • Any encoding of 8 bits or less (e.g. ASCII) can be used to store characters.
  • Integer operations can be performed portably only for the range 0 ~ 127.
  • All bits contribute to the value of the char, i.e. there are no "holes" or "padding" bits.
signed char same as char
  • Characters stored like for type char.
  • Can store integers in the range -127 ~ 127 portably[3].
unsigned char same as char
  • Characters stored like for type char.
  • Can store integers in the range 0 ~ 255 portably.
short ≥ 16, ≥ size of char
  • Can store integers in the range -32767 ~ 32767 portably[4].
  • Used to reduce memory usage (although the resulting executable may be larger and probably slower compared to using int).
  • Alternative names: short int, signed short, signed short int
unsigned short same as short
  • Can store integers in the range 0 ~ 65535 portably.
  • Used to reduce memory usage (although the resulting executable may be larger and probably slower compared to using int).
  • Alternative names: unsigned short int
int ≥ 16, ≥ size of short
  • Represents the "normal" size of data the processor deals with (the word-size); this is the integral data-type used normally.
  • Can store integers in the range -32767 ~ 32767 portably[4].
  • Alternative names: signed, signed int
unsigned int same as int
  • Can store integers in the range 0 ~ 65535 portably.
  • Alternative names: unsigned
long ≥ 32, ≥ size of int
  • Can store integers in the range -2147483647 ~ 2147483647 portably[5].
  • Alternative names: long int, signed long, signed long int
unsigned long same as long
  • Can store integers in the range 0 ~ 4294967295 portably.
  • Alternative names: unsigned long int
float ≥ size of char
  • Used to reduce memory usage when the values used do not vary widely.
  • The floating-point format used is implementation defined and need not be the IEEE single-precision format.
  • unsigned cannot be specified.
double ≥ size of float
  • Represents the "normal" size of data the processor deals with; this is the floating-point data-type used normally.
  • The floating-point format used is implementation defined and need not be the IEEE double-precision format.
  • unsigned cannot be specified.
long double ≥ size of double
  • unsigned cannot be specified.

Primitive Types added to ISO C (C99)
long long ≥ 64, ≥ size of long
  • Can store integers in the range -9223372036854775807 ~ 9223372036854775807 portably[6].
  • Alternative names: long long int, signed long long, signed long long int
unsigned long long same as long long
  • Can store integers in the range 0 ~ 18446744073709551615 portably.
  • Alternative names: unsigned long long int
intmax_t the maximum width supported by the platform
  • Declared as a typedef in the stdint.h header rather than as a keyword.
uintmax_t same as intmax_t
  • Can store integers in the range 0 ~ 2^n − 1, where n is the width of uintmax_t.

User Defined Types
struct ≥ sum of size of each member
  • Said to be an aggregate type.
union ≥ size of the largest member
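  • Not an aggregate type in the standard's terminology, which reserves that term for arrays and structures; a union holds only one of its members at a time.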
enum ≥ size of char
  • Enumerations are a separate type from ints, though they are mutually convertible.
typedef same as the type being given a name
  • typedef has syntax similar to a storage class like static, register or extern.

Derived Types[7]
type*

(pointer)
≥ size of char
  • The constant 0, used in a pointer context, always represents the null pointer (an address where no data can be placed), irrespective of what bit sequence represents the value of a null pointer.
  • Pointers to different types may have different representations, which means they could also be of different sizes. So they are not, in general, interchangeable without an explicit conversion.
  • Even in an implementation which guarantees all data pointers to be of the same size, function pointers and data pointers are in general incompatible with each other.
  • For functions taking variable number of arguments, the arguments passed must be of appropriate type, so even 0 must be cast to the appropriate type in such function-calls.
type [integer[8]]

(array)
integer × size of type
  • The brackets ([]) follow the identifier name in a declaration.
  • In a declaration which also initializes the array (including a function parameter declaration), the size of the array (the integer) can be omitted.
  • type [] is not the same as type*. An array is converted to a pointer to its first element in most expressions, but not, for example, when it is the operand of sizeof or of the unary & operator.
type (comma-delimited list of types/declarations)

(function)
  • Functions declared without any storage class are extern.
  • The parentheses (()) follow the identifier name in a declaration, e.g. a 2-arg function pointer: int (* fptr) (int arg1, int arg2).

Character sets


Programs written in C can read and write any character set, provided the libraries that support them are included/used.

The source code for C programs, however, is usually limited to the ASCII character set.

In a file containing source code, the end of a line may not be stored as a single newline character, depending on the operating system the file was created on, but compilers treat the end of each line as if it were a single newline character.

Virtually all compilers allow the $, @, and ` characters in string constants. Many compilers also allow literal multibyte Unicode characters, but they are not portable.

Certain characters must be written as backslash escape sequences in a string or character constant. These are:

  • \\ Literal backslash
  • \" Literal double quote
  • \' Literal single quote
  • \n Newline
  • \t Horizontal tab
  • \f Form feed
  • \v Vertical tab

The following escape sequences are also defined by the C89 standard, although very old compilers may not recognize all of them:

  • \r Carriage return
  • \a Alert (audible bell)
  • \b Backspace

\xhh, where the 'h' characters are hexadecimal digits, is used to represent arbitrary bytes (including \x00, the zero byte).

\uhhhh or \Uhhhhhhhh, where the 'h' characters are hexadecimal digits, is used to portably represent Unicode characters.

References

  1. http://publib.boulder.ibm.com/infocenter/comphelp/v7v91/topic/com.ibm.vacpp7a.doc/language/ref/clrc02reserved_identifiers.htm list of reserved identifiers
  2. Very old compilers may not recognize the unary + operator.
  3. -128 can be stored in two's-complement machines (i.e. most machines in existence). Very old compilers may not recognize the signed keyword.
  4. -32768 can be stored in two's-complement machines (i.e. most machines in existence). Very old compilers may not recognize the signed keyword.
  5. -2147483648 can be stored in two's-complement machines (i.e. most machines in existence). Very old compilers may not recognize the signed keyword.
  6. -9223372036854775808 can be stored in two's-complement machines (i.e. most machines in existence).
  7. The precedences in a declaration are:
    [], () (left associative) — Highest
    * (right associative) — Lowest
  8. The standards do not place any restriction on the size or type of the integer; it is implementation-dependent. The only mention in the standards is that an implementation may limit the maximum size of a memory block that can be allocated, in which case the limit on the integer is size_of_max_block/sizeof(type).