typedef int x;
int *a;
x *b;
int **p = &b;
This is a crucial point: typedef defines aliases, not types.
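One way to see the difference between an alias and a new type is to ask the compiler which type it thinks an object has. The sketch below assumes a C11 compiler (for _Generic); the struct tags celsius and fahrenheit and all three function names are invented for illustration.

```c
#include <assert.h>

typedef int x;                       /* x is a second name for int */

struct celsius    { int degrees; };  /* hypothetical tag */
struct fahrenheit { int degrees; };  /* same members, yet a different type */

/* Returns 1 if an object declared as x is treated as a plain int. */
int alias_is_int(void)
{
    x value = 0;
    return _Generic(value, int: 1, default: 0);
}

/* Returns 1 if struct fahrenheit is not mistaken for struct celsius. */
int structs_are_distinct(void)
{
    struct fahrenheit f = { 212 };
    return _Generic(f, struct celsius: 0, default: 1);
}

/* Mixes int and x freely, as in the fragment above; no casts are needed. */
int sum_through_alias(void)
{
    x n = 40;
    int *a = &n;     /* &n has type x *, which is the same type as int * */
    x *b = a;
    int **p = &b;
    assert(*p == a);
    return **p + 2;
}
```

The typedef name falls into the int branch, while two structures with identical members remain distinct types.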
All three of these type-defining keywords share some syntax. In particular, structures, unions, and enumerations use tags. These tags live in the ‘tag’ name space [link to namespaces]; all three keywords share a single tag name space, so if you have a struct blat, you cannot also have a union blat, nor an enum blat. While the tags are optional, there is rarely a good reason to omit them, and I prefer to think of the tag as the ‘true name’ of the type. In essence, struct blat { int x; }; really means ‘define a new type named blat’ (with one integer as its sole member).
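A small sketch of the shared tag name space follows. The function name blat_value is made up for the example, and the two commented-out lines show declarations that a conforming compiler should reject.

```c
#include <assert.h>

struct blat { int x; };   /* claims the tag blat for a structure */

/* union blat { float y; };   rejected: blat is already a struct tag */
/* enum blat { BLAT_ONE };    rejected for the same reason */

/* Tags do not collide with ordinary identifiers, so a variable
 * may reuse the spelling of the tag. */
int blat_value(void)
{
    struct blat blat = { 7 };
    assert(sizeof blat == sizeof(struct blat));
    return blat.x;
}
```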
We can refer back to these types by repeating the type-name at any point. A future struct blat this_blat; defines a variable using the type we just defined. Importantly, we can also refer forward to types we have not yet defined. This allows self-referential and mutually-referential types, including simple data structures like linked lists.
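As a minimal illustration of a self-referential type, here is a singly linked list. The names node, list_sum, and demo_list_sum are hypothetical, and the initializers assume C99 or later.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;   /* names struct node before its definition is complete */
};

/* Adds up every value in the list starting at head. */
int list_sum(const struct node *head)
{
    int total = 0;

    for (; head != NULL; head = head->next)
        total += head->value;
    return total;
}

/* Builds the three-element list 1 -> 2 -> 3 on the stack and sums it. */
int demo_list_sum(void)
{
    struct node c = { 3, NULL };
    struct node b = { 2, &c };
    struct node a = { 1, &b };

    assert(a.next->next == &c);
    return list_sum(&a);
}
```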
Type names (as declared or defined by the struct keyword and a tag) spring into being as soon as they are mentioned. This is often convenient, but carries a price: misspelled type names are not caught, as they simply create a new type. A C compiler is usually not able to distinguish between the case in which the programmer wanted to create a new type, and the one in which the programmer simply misspelled the type name. For this reason, some C programmers prefer to use typedef aliases.
Consider the following code fragment:
struct temperature;
void apply_heat(struct temperature *);
void remove_heat(struct tempure *);
Better compilers will often produce a warning here, because the type we created here was declared in an inner scope [insert link to scopes and linkage rules once I write the html page]. Specifically, anything inside the parentheses of a function prototype [insert link] has ‘prototype scope’. An incomplete type can only be completed in the same scope in which it was created, and can only be referred-to (as with pointers) in that scope or any more-deeply-nested scope. Since the prototype scope ends immediately, we can never refer to that particular type again, and there is no way to pass a valid argument to the remove_heat() function.[2] Thus, we are probably well-served by a compiler that complains that this type's scope is so limited as to be useless.
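For contrast, here is a sketch of the well-behaved case, in which the tag is declared at file scope before any prototype mentions it, so the prototype, the later definition, and the caller all refer to one and the same type. The member degrees and the function heated are assumptions made for the example.

```c
#include <assert.h>

struct temperature;                     /* declared at file scope */
void apply_heat(struct temperature *);  /* refers to the file-scope type */

struct temperature { int degrees; };    /* completes the type in the same scope */

void apply_heat(struct temperature *t)
{
    assert(t != 0);
    t->degrees += 10;
}

/* Creates a temperature, heats it once, and reports the result. */
int heated(int start)
{
    struct temperature t;

    t.degrees = start;
    apply_heat(&t);
    return t.degrees;
}
```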
One of the biggest drawbacks I find with C's typedef syntax is that it mucks up C's declarations (and C's ‘declaration mirrors use’ syntax is already one of C's most confusing features). For all of C's built-in types, we declare variables by naming the base type first, using a keyword. Since the set of keywords is fixed, we can always be sure that it is a type name. This rule even holds for user-defined types, as long as we start with struct (or union or enum): we just have a tag inserted before the rest of the declaration. With typedef names, however, we see only an ordinary identifier, indistinguishable from any other ordinary identifier, except that it has been declared as an alias, perhaps in some #include header that we cannot easily inspect. We need to be able to tell -- preferably at a glance -- that it is in fact a typedef-name. Otherwise we end up with the situation exemplified by this (C89-specific) code fragment:
void f(x);
For this reason, those who use typedefs almost invariably invent a typographic convention (or several conventions) to make them stand out. If we can tell at a glance that some identifier is a typedef-alias, the syntactic problem vanishes. In my experience, the three most common conventions are:
While I personally dislike typedef and am entirely willing to write out the struct keyword every time, we can actually use this syntactic quirk to our advantage. Suppose we rewrite the earlier code fragment to use a typedef-name to alias the incomplete structure type, and then use the alias in the function prototypes:
typedef struct temperature TEMPERATURE;
void apply_heat(TEMPERATURE *);
void remove_heat(TEMPURE *);
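Here the misspelling is a hard error rather than a silently created type, because TEMPURE was never declared as anything. A sketch of the corrected version, carried through to a complete type, might look like this; the member degrees and the function heat_cycle are invented for the example.

```c
#include <assert.h>

typedef struct temperature TEMPERATURE;  /* alias for a still-incomplete type */

void apply_heat(TEMPERATURE *);
void remove_heat(TEMPERATURE *);
/* void remove_heat(TEMPURE *);  would not compile: TEMPURE was never declared */

struct temperature { int degrees; };     /* completing the type completes the alias too */

void apply_heat(TEMPERATURE *t)  { t->degrees += 10; }
void remove_heat(TEMPERATURE *t) { t->degrees -= 10; }

/* Heats and then cools a temperature, returning where it ends up. */
int heat_cycle(int start)
{
    TEMPERATURE t;

    t.degrees = start;
    apply_heat(&t);
    assert(t.degrees == start + 10);
    remove_heat(&t);
    return t.degrees;
}
```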
The classic self-referential structure is a linked list. Let us consider a somewhat more complicated data structure, in which we have a list of items that themselves have sub-lists, and the sub-lists can refer back to the top-level lists. For instance, in an operating system kernel, we might have a list of files, in which each file consists of a list of cached file-blocks. At the same time, each cached file-block needs to point back to its containing file.
struct fileinfo;
struct cached_block;

struct fileinfo {
    struct fileinfo *fi_next;       /* list of all files */
    struct cached_block *fi_blks;   /* cached blocks for this file */
    /* ... */
};

struct cached_block {
    int cb_lbn;                     /* logical block number */
    struct cached_block *cb_next;   /* next cached block for this file */
    struct fileinfo *cb_file;       /* file containing this block */
    /* ... */
};
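To show the two types working together, the sketch below repeats the declarations (without the elided members) and adds a few helpers. The functions attach_block, count_blocks, and demo_count are hypothetical; a real kernel would do considerably more.

```c
#include <assert.h>
#include <stddef.h>

struct fileinfo;
struct cached_block;

struct fileinfo {
    struct fileinfo *fi_next;       /* list of all files */
    struct cached_block *fi_blks;   /* cached blocks for this file */
};

struct cached_block {
    int cb_lbn;                     /* logical block number */
    struct cached_block *cb_next;   /* next cached block for this file */
    struct fileinfo *cb_file;       /* file containing this block */
};

/* Pushes blk onto the front of file's block list and sets its back pointer. */
void attach_block(struct fileinfo *file, struct cached_block *blk, int lbn)
{
    blk->cb_lbn = lbn;
    blk->cb_file = file;
    blk->cb_next = file->fi_blks;
    file->fi_blks = blk;
}

/* Counts file's cached blocks; returns -1 if any back pointer is wrong. */
int count_blocks(const struct fileinfo *file)
{
    const struct cached_block *blk;
    int n = 0;

    for (blk = file->fi_blks; blk != NULL; blk = blk->cb_next) {
        if (blk->cb_file != file)
            return -1;
        n++;
    }
    return n;
}

/* Builds one file with two cached blocks and counts them. */
int demo_count(void)
{
    struct fileinfo file = { NULL, NULL };
    struct cached_block b0, b1;

    attach_block(&file, &b0, 0);
    attach_block(&file, &b1, 1);
    assert(file.fi_blks == &b1);    /* most recently attached block is first */
    return count_blocks(&file);
}
```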
If we wish to use typedefs, however, we have a problem. One of the typedefs must come first, but because typedef-names do not simply ‘spring into being’, we cannot use both typedef-names until both have been defined. This is not really a serious problem after all: all we have to do is declare the type-names and create the aliases, then complete the two types:
struct fileinfo;
struct cached_block;
typedef struct fileinfo FILEINFO;
typedef struct cached_block CACHED_BLOCK;

struct fileinfo {
    FILEINFO *fi_next;              /* list of all files */
    CACHED_BLOCK *fi_blks;          /* cached blocks for this file */
    /* ... */
};

struct cached_block {
    int cb_lbn;                     /* logical block number */
    CACHED_BLOCK *cb_next;          /* next cached block for this file */
    FILEINFO *cb_file;              /* file containing this block */
    /* ... */
};
/* Does not work: a tagless struct gets its name only once the typedef is
 * complete, so neither alias can be used before its own definition ends. */
typedef struct {
    FILEINFO *fi_next;              /* list of all files */
    CACHED_BLOCK *fi_blks;          /* cached blocks for this file */
} FILEINFO;

typedef struct {
    int cb_lbn;                     /* logical block number */
    CACHED_BLOCK *cb_next;          /* next cached block for this file */
    FILEINFO *cb_file;              /* file containing this block */
} CACHED_BLOCK;