MYWAY - C Programming Standards

Excellance [sic] can be achieved if you:
Care more than others think is wise
Risk more than others think is safe
Dream more than others think is practical
Expect more than others think is possible
- anonymous -

This document is an attempt to explain the spirit in which I (maintain) code. It does not necessarily explain the way the code is nor the way I would do it if I completely controlled the environment. The first, because attitudes and understanding change with time as well as the lack thereof. The second, because, to paraphrase Stallman, each vendor has their own perverse idea about what constitutes a rational machine.

There is a reason for this madness. It works on most boxes, most of the time, and when it doesn't, it provides a convenient base to work from. Note the difference between should and must.

I urge you to adopt a similar stance. The rules may be different based on your own perverse experiences and ideas, but the consistency engendered will pay off. Trust me.

File names

File names should relate in some way to the contents. They should be 10 characters or less (not including the .[choa]). A poor name is one of forty-two that look like p043.c. lckpg.c would have been a lot more useful. Modern systems allow for longer names and, within reason, this is a good thing. Keep in mind your potential ports!

Some people want a single file for each subroutine, some put everything in one file. BALANCE - JUDGEMENT. Helper functions should definitely be in the file with the function(s) being helped. Main(s) should be in files named the same as the intended result (simplifies makefiles).

I like to capitalize Directory names. I touch type and use wildcards a lot. */W*/E* is a lot easier to type (for me) than */w*/e*. It also groups directories away from the source and text.

ALL uppercase distinguishes files (and directories), SEE MKS.

I like to use directories to categorize things. But don't go too far. Too many. Little directories. Really messes. Things up.

Think of database design: is this file related to that file? How about other files? This is especially true of include files. Don't encode your current file structure into the contents of that structure any more than you have to. (AND DON'T HARDCODE PATHS in your code more than you have to either.)

As an aside, don't create more entries in your PATH variable than you have to.

Use an MKS system (standardized make files). [I really need to integrate my way of using make with autoconf - then we would really have something.]

C standards

WARNING. I have an unorthodox opinion of style, especially concerning efficiency and comments. Code SHOULD be efficient. I readily agree that DESIGN (oops the d word) is far more important, but good design rarely requires sacrificing overall efficiency in the coding phase. On multitasking/multiuser systems one cycle saved is one cycle available. Think about it.

Overall format

I use the following format. Put system things first where possible. Then organizational stuff (local string libraries, etc.), and follow with the application or project stuff. This provides a progression of control for the include files, and organizes everything else into neat categories.

some SCCS id
/* file prologue - what functions in file */
system include files          /* description */
organization include files    /* description */
project include files         /* description */

Externals   (system)          /* description */
Externals   (organization)    /* description */
Externals   (project)         /* description */

Publics                       /* description */

local typedefs                /* description */

Privates                      /* description */

/* function prologue - purpose of function */
Public type func(x)
   type x;                    /* argument description */
{
   type y;                    /* local variable description */

   ENTRY(func);

   body;

   RTNx(y);
}

...

ALWAYS return, return(e), or exit. Feel free to return when processing is complete. One entry is good. One exit is a pavlovian reaction in C.

NOTE: A good tool to have is one that can reformat syntactically correct code so that cosmetic things like location of braces, indentation levels, and such remain cosmetic. Determine a 'Standard' option setting for deliverable code and leave the programmer to his/her own twisted opinion.

NOTE: The existence of braces may or may not be cosmetic.

Comments

I get pretty sparse in my comments in code. I expect the person reading this to be conversant with C or, in the style of foreign language schools, willing to learn via the immersion technique. Comments are meant to clarify the code. The code is the only thing the computer will interpret. The programmer had better understand what the code is doing, not what the comments imply the code is doing.

I'm not saying don't document the design. We need a lot more CASE support for design (and ultimately no more coding) (right). I've kept fairly busy going into big firms with many programmers whose focus was writing comments versus putting out code.

I prefer to document the design of functions and programs in a separate document. A picture (DFD, CFD) is worth a thousand comments. A properly documented example is worth millions. (well lots)

Sometimes I document a function in a prologue, sometimes in an associated readme or man page. Sometimes that prologue is awfully short. Just remember that you can write a program to extract the inputs, and outputs and functions called. The force be with you if you are worrying about who calls you. (the force will zap you more likely). What is more difficult is the purpose of the function. Put in at least a one line description. (A program that does prologues and makes man pages, ummmm sounds like a nifty tool.)

I come from a database background and redundant information drives me nuts. (That's why I have so many copies of everything.)

This same concept applies throughout the code. COMMENTS are meant to CLARIFY. I know i++ means increment i, and I know that if (i != x) means if i is not equal to x. What I may not grasp off hand is that if i is not equal to x then y is in sad shape. One more time for the comment moguls, tell me WHY not WHAT.
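
A small sketch of the difference, assuming a made-up ring buffer (Queue, QSIZE, and queue_get are hypothetical names for illustration only). The first comment restates the code; the second tells the reader what goes wrong if the test is removed.

```c
#include <stddef.h>

#define QSIZE 8                     /* hypothetical queue capacity */

typedef struct queue_t {
   int q_items[QSIZE];              /* storage */
   size_t q_head;                   /* next slot to read */
   size_t q_tail;                   /* next slot to write */
} Queue;

/* remove the oldest item; returns 0 on success, -1 if queue is empty */
int queue_get(Queue *q, int *out)
{
   /* WHAT (useless): if head equals tail return -1 */
   /* WHY: head == tail means nothing is queued, so reading q_items[q_head]
      would hand back a stale item from an earlier lap around the ring */
   if (q->q_head == q->q_tail)
      return(-1);

   *out = q->q_items[q->q_head];
   q->q_head = (q->q_head + 1) % QSIZE;
   return(0);
}
```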

Most humans read meaning into structure. (Whether it is there or not!) Indentation provides a visible and useful way to indicate structure. Inconsistent indentation provides an easy way to confuse. Comments should follow the indentation of the code. Beginning all comments in column one is a guaranteed method of obfuscating your code.

I leave a space after reserved words and none after functions and macros.

I don't leave a space after open parens.

I try to leave a space around most operators, especially assignment.

If I must break an expression across lines, I try to break AFTER the operator. This lets me know that I need a continuation and is less likely to mislead in a quick read versus the GNU standard of breaking BEFORE. What matters most is simply to be consistent.

I try to leave a space after commas.

If I can fit things on one line by removing blanks (not newlines) I will.

I prefer one statement per line.

I generally specify variables one to a line without commas. This allows room for a description and lets me insert and delete on a line basis.

I don't line up variables.

I do line up right margin comments (or use a formatter).

I indent 3 spaces (or set tabs to 3, so be careful if you like 8). This gives me visual indentation and lots of room. [There is a code fragment floating around that indicates that GOD uses 3-space tabs, but that is a different story...]

I don't indent case labels after switches. I do indent case bodies.

I doubly indent continued lines,
      that is lines that are part of the same expression.
   this allows me to put my block braces at the end of the line
   and keep my structure neat.

I repeat, I put my braces like this {
   I'll accept
   {
      this
   }
   but I really would rather have the extra line on screen.
}
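
Pulling the layout rules above together in real C, as a sketch only (is_numchar and tally are made-up functions, not from any library): 3-space indent, brace at the end of the line, a continued expression broken AFTER the operator and doubly indented, case labels lined up with the switch, and right margin comments lined up.

```c
/* TRUE if c can appear in a simple decimal number */
int is_numchar(int c)
{
   if ((c >= '0' && c <= '9') ||    /* break AFTER the operator */
         c == '.' || c == '-') {    /* continuation is doubly indented */
      return(1);
   }
   return(0);
}

/* count the digits and the signs/points in s */
void tally(const char *s, int *digits, int *marks)
{
   *digits = 0;
   *marks = 0;

   for (; *s; s++) {
      if (!is_numchar(*s))
         continue;
      switch (*s) {
      case '.':                     /* case labels line up with switch */
      case '-':
         (*marks)++;                /* case bodies are indented */
         break;
      default:
         (*digits)++;
         break;
      }
   }
}
```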

A good editor (even vi) will match braces, and I match the control word.

I like braces but they are only required when syntax demands. Good style says the following case demands:

   if this
      for that
         if something
            while whatever
               you get my point.

Try to place comments in groups (blocks) relating to blocks of code. NOTE that it is bad to comment every line of code. It clutters up the right margin or hides the structure and is unnecessary. Either your C is too confusing, you are programming for idiots, or you get paid for the size and time you take. A good test of the quality of code is how well it stands on its own. The computer sure doesn't read comments (and neither do most maintenance people).

Remember every rule is made to be broken and some especially hairy code may need much more commenting than usual.

Comment format. I take several shortcuts.

   T=>x  means if this variable is TRUE then x
   # means number of
   ^ means pointer to
   max means maximum
   min generally means minimum, though it can be minutes
    */ means end of comment as opposed to                                */

   Saying maximum number of widgets leaves less room to explain widgets on
   one line, and I do like one liners.

/*
   code fragment;
*/
   means a temporary deletion of the code fragment.  TESTING only!!

   Code I want to comment out "permanently" should be ifdefed.  Sometimes
   I forget.  Better yet, use some form of SCCS and 'delete' commented out
   code.

Be thrifty in your use of the repeat key. Lots of ***** and ----- and ===== or what have you just uses up file space. Sometimes it's justified. I like blank lines better.

MACROS

MACROS should be all upper case, except when they are 'redirecting' a bona fide function.

I use "cplus.h" to define the macros I use all the time. Things like TRUE, FALSE, NUL, and NULL, MAX, MIN, and my printf debug stuff.

You'll notice that many routines have ENTRY(), RTNx(), and lots of DEC(), CMT(), and friends in between. No ifdef DEBUGs. "cplus.h" defines a nifty set of macros that provide a quick, consistent way to sprinkle printfs through the code without grossly affecting the structure.
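
To give a feel for the idea, here is a guess at what such macros could look like. This is NOT the real "cplus.h"; the TRACE switch, the output format, and the macro bodies are all assumptions. The point is that the ifdef lives in one place and the code stays readable.

```c
#include <stdio.h>

#ifdef TRACE                        /* hypothetical switch, set with -DTRACE */
#define ENTRY(f)   fprintf(stderr, "enter %s\n", #f)
#define DEC(v)     fprintf(stderr, "   %s = %d\n", #v, (int)(v))
#define CMT(s)     fprintf(stderr, "   %s\n", (s))
/* integer results only; note that v is evaluated twice when tracing */
#define RTNx(v)    do { fprintf(stderr, "return %ld\n", (long)(v)); return(v); } while (0)
#else
#define ENTRY(f)   ((void)0)
#define DEC(v)     ((void)0)
#define CMT(s)     ((void)0)
#define RTNx(v)    return(v)
#endif

/* sum of 1..n, instrumented */
int sum_to(int n)
{
   int total = 0;                   /* running sum */
   int i;                           /* loop counter */

   ENTRY(sum_to);
   for (i = 1; i <= n; i++)
      total += i;
   DEC(total);
   RTNx(total);
}
```

Compiled without -DTRACE the macros vanish and sum_to() is an ordinary function; compiled with it, the same source reports entry, the value of total, and the return on stderr.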

Notice that I haven't redefined the syntax of C, no 'hidden' braces, and no redefined control structures.

I do use multi line macros. If your preprocessor won't handle them, get cpp from DECUS. Don't screw around with faulty tools, at least not that faulty.

I've gone one step beyond with my own libraries for strings and some of the libc.c functions. Too many nights fixing index for strchr and recoding so strcpy didn't choke on NULL inputs. Besides, I prefer to have a pointer to the place scpy finished. I know where s1 starts and I don't think it's a good idea to put function calls in argument lists (at least not just to save a line of source code). [Note that with the relatively recent arrival of reasonable quality libc's from the various vendors, this clause is somewhat dated. I still have my library, but it doesn't support Unicode, so ...]

Types

First, don't use the C primitives. Some gutless wonder refused to say shorts mean this, longs mean this, and ints mean this. If you've ever moved to a machine that disagrees with your previous playground, you'll understand. "cplus.h" defines the ones I use. See Plum for an exhausting list of possibilities.
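
A sketch of the idea, with made-up names (Uint16, Uint32, and pack16 are not taken from "cplus.h"). One header says what each size means on this box, and the compiler refuses to build if the guess is wrong. C99 and later also provide fixed-width types in <stdint.h>, which cover the same ground where available.

```c
#include <limits.h>

typedef unsigned short Uint16;      /* hypothetical names for this sketch */
typedef unsigned int   Uint32;      /* change to unsigned long on a 16 bit int box */

/* refuse to compile if this port's guesses are wrong:
   a negative array size is a compile-time error */
typedef char Uint16_check[(sizeof(Uint16) * CHAR_BIT == 16) ? 1 : -1];
typedef char Uint32_check[(sizeof(Uint32) * CHAR_BIT == 32) ? 1 : -1];

/* pack two 16 bit halves into one 32 bit word */
Uint32 pack16(Uint16 hi, Uint16 lo)
{
   return(((Uint32)hi << 16) | lo);
}
```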

Second, always typedef new structures.  I like the following style:

   typedef struct new_t {
      struct new_t *n_next;         /* ^ to next one in list */
      Char *n_string;               /* contents of list */
      Int n_hash;                   /* hash value of contents */
   } New;

   New *news[MAXNEWS];              /* vector of new */

The new_t gives a nice unique structure name for self-referencing. The n_ is a way of separating different structure memberships. [Now that ANSI C is the C, the n_ is not really necessary and I don't use it as often.]

New is a nice way of visually separating New from new. And it's consistent with Char and Int and...
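
Here is the same structure in use, as a minimal sketch. Char and Int are stand-in typedefs (the real ones live in "cplus.h"), and new_push and new_find are hypothetical helpers written for this example.

```c
#include <stddef.h>

typedef char Char;                  /* stand-ins for the cplus.h types */
typedef int Int;

typedef struct new_t {
   struct new_t *n_next;            /* ^ to next one in list */
   Char *n_string;                  /* contents of list */
   Int n_hash;                      /* hash value of contents */
} New;

/* push item on the front of the list; returns the new head */
New *new_push(New *head, New *item)
{
   item->n_next = head;
   return(item);
}

/* find the first entry with the given hash, NULL if none */
New *new_find(New *head, Int hash)
{
   New *np;                         /* ^ to current entry */

   for (np = head; np; np = np->n_next) {
      if (np->n_hash == hash)
         return(np);
   }
   return(NULL);
}
```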

Function names

My preference is to name functions so that they relate to their purpose. The same for variables, though I like short names for temporary indices and pointers. (I hate typing long names and I hate reading them: theSMoperations.Opcode_Designator->modifier[Modifier_index]->nextPointer = youGet_myPoint.Ihope. Maintain some X code if you don't.)

Names are meant to be tags, not descriptions. Describe the thing the tag represents when you declare it (if you are a user) or define it (if you are a maker). The level of detail is dependent on the situation. I know i is a loop counter, and s is a string. If that is all they are, don't waste time on them. However, if i is a part of a complex variable, some warning would be helpful. [Hungarian notation in its 'pure' form is an abomination. You should be concentrating on the FUNCTION of the data, not its physical representation, which OFTEN CHANGES!]

There are two remaining camps on names, the separate_all_words_with_underscores crowd, and the capitalizeTheSecondaryWords. I actually use both, but lean towards the caps group. A favorite trick is to indicate library membership with a 3-letter prefix separated by an underscore and then cap the rest, tft_getMsg(). Play with this at the beginning of your projects and then INSIST on the standard you derive - This is ABSOLUTELY CRUCIAL for external interfaces.

Scope

Declare things Public if they are defined in this file and are meant to be public (global).

Declare things Private if they are defined in this file and are meant to be private. This is separate from Static which means it is a local variable meant to be STATIC STORAGE. (not file scope).

Declare things External if they are defined in some other file (hopefully, as Public).

Declare MAIN in the main.c file and declare things Global if they are declared in an include file that is used by both the defining file and declaring files. (look at cplus.h)

Declare everything you can to be Private. The name of the game is name space reduction. The fewer names you have floating around the better.
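
A sketch of how such scope words can be spelled. These definitions are my assumption of the obvious mapping, not a copy of "cplus.h", and hits, bump, and count_hit are made-up names. Only count_hit ever reaches the linker.

```c
#define Public                      /* visible to other files */
#define Private   static            /* file scope only */
#define External  extern            /* defined in some other file */

/* what the include file for users of this module would say */
External int count_hit(void);

Private int hits;                   /* # of calls so far, hidden from linker */

/* helper: nobody outside this file needs it */
Private int bump(int by)
{
   hits += by;
   return(hits);
}

/* the one name this file exports */
Public int count_hit(void)
{
   return(bump(1));
}
```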

All Public functions should be prototyped Extern in an include file appropriate to the scope of the function.

Contrary to popular opinion (and the actual standard) not all compilers properly utilize the ANSI prototypes. I personally think that the standard sucks: a proper prototype, if you are going to bother, should include not only the type but a formal parameter name. WHY? Besides working everywhere, it is a whole lot easier to maintain and use a subroutine whose calling sequence shows a logical variable rather than just a type, a la

badproto(int, int, int) vs goodproto(int x, int y, int z)

Naturally the names x, y, and z influence their utility, but it is always easier to debug the routine even with just x, y, and z.

[and the standard DOES specify int as the return type for main in a hosted environment, though I think it should be VOID since a proper main exits but hey...compiler writers sometimes forget that though they do interpret and implement the standard, they do not set it]

Globals

The fewer the better. Do you have a state machine? Define a state structure and pass it (or a pointer to it) around, not the n+1 things that describe it. Object-oriented programming isn't just for Java and C++.
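
For instance, a word counter as a tiny state machine. This is a sketch under my own naming (State, Wc, wc_init, and wc_feed are hypothetical). Everything the machine knows lives in one structure, so two counters can run side by side and there is not a global in sight.

```c
typedef enum { S_IDLE, S_INWORD } State;   /* hypothetical states */

typedef struct wc_t {
   State w_state;                   /* where we are */
   long w_words;                    /* # of words seen so far */
} Wc;

/* reset the machine */
void wc_init(Wc *wc)
{
   wc->w_state = S_IDLE;
   wc->w_words = 0;
}

/* feed one character to the machine */
void wc_feed(Wc *wc, int c)
{
   int blank = (c == ' ' || c == '\t' || c == '\n');

   switch (wc->w_state) {
   case S_IDLE:
      if (!blank) {
         wc->w_words++;             /* a word starts here */
         wc->w_state = S_INWORD;
      }
      break;
   case S_INWORD:
      if (blank)
         wc->w_state = S_IDLE;
      break;
   }
}
```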

Goto

gfmt uses it 9 times. gfmt is fast, small, and structured. Computers use gotos. Humans abuse them. Be careful. I use a define BREAK(label) (goto label); when I am using goto to exit from nested control structures. I get less reaction from "Structured Programmers" that way. A rose is a rose (just be careful of the thorns).

Do not replace a goto with a dummy variable. Dummy variables are aptly named.

Do NOT goto the body of conditional code.

Do NOT goto the body of loops.

DO goto the end of nested switches and loops.

Maybe goto common error handling.

Going forward is more easily justified than going back.

Use judgement, that's why we're paid the big bucks.
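
A sketch of the one case I would defend to anybody: getting out of nested loops, going forward. The macro body here is my guess at a form that compiles, and first_zero is a made-up function. The alternative is a dummy 'done' flag tested in both loop conditions.

```c
#define BREAK(label) goto label     /* leave nested control structures */

/* find the first zero cell in a rows x cols grid stored row by row;
   returns its index, or -1 if there is none */
int first_zero(const int *grid, int rows, int cols)
{
   int r;                           /* current row */
   int c;                           /* current column */
   int found = -1;                  /* index of the zero, -1 = none */

   for (r = 0; r < rows; r++) {
      for (c = 0; c < cols; c++) {
         if (grid[r * cols + c] == 0) {
            found = r * cols + c;
            BREAK(done);            /* forward, out of both loops */
         }
      }
   }
done:
   return(found);
}
```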

Also, consider using continue, break, and return. If you're finished do it, don't make me wade through the code looking for the end.

In line with this, if you're checking qualifying conditions, exit on the error condition, i.e.

if (error)
   return(error);

if (error2)
   return(error2);

/* ok, do it to it */
....

is infinitely better than

if (!error) {
   if (!error2) {
      /* ok, do it */
   } else
      return(error2);
} else
   return(error);

Add a few lines between checks and make it a real algorithm and the difference becomes more infinite.

Be a C programmer

Strings are NUL terminated. Generally if you see things inside of "" it is or soon will be a string.

Beware of strncpy, which doesn't NUL terminate the destination when the source is at least as long as the count (yet another reason for private string libraries).
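
A minimal sketch of the kind of replacement I mean. str_copyn is a hypothetical name and this is not the code from my library: it always terminates, it tolerates NULL inputs, and it returns a pointer to where it finished.

```c
#include <stddef.h>

/* copy src into dst (size bytes available), always NUL terminating;
   returns ^ to the terminating NUL in dst, or NULL on bad arguments */
char *str_copyn(char *dst, const char *src, size_t size)
{
   if (!dst || !src || size == 0)
      return(NULL);

   while (size > 1 && *src) {       /* leave room for the NUL */
      *dst++ = *src++;
      size--;
   }
   *dst = '\0';
   return(dst);
}
```

Unlike strncpy it truncates silently but never leaves an unterminated buffer; a caller who cares can compare the returned pointer with the end of the buffer.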

Use memxxx for non-string memory manipulation. Use strxxx for strings. Be careful when mixing them.

Be cognizant of the difference between a thing and a pointer to the thing.

Use ++ and --.

Pre decrement and post increment whenever possible. You would think that with today's compilers the optimizer could figure this out, but most don't and it really does make a difference (I'm not sure about RISC).

for (;;) is better than while (TRUE), otherwise use whichever makes the most sense.

Use register sparingly. Putting something in a register does not necessarily make the routine or program faster. This is one of those cases where if you think it might be important, then check it out. If you're not using an optimizer that puts everything it can in registers (and wastes time doing so), then you might be guilty of wasting time saving and assigning registers when it is not justified.

Use -= and +=.

My favorite (and I can hear the screams) use

   if (!(x = f(y))) rather than if ((x = f(y)) == (Type *)NULL)

it is easier, and in practice less prone to error when you forget to properly typecast that NULL.

If you insist on explicitly testing conditions, and there is some room for debate, well, I insist you properly typecast it.

Second pet peeve. If you have a function that normally returns a pointer, return a NULL for an error. There is a special place in Hell reserved for the jerk at Bell who returned a -1 for failed IPC pointers. If you need more information than failed, set an external or parameterized error result.
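
Both peeves in one sketch (slot, slot_value, table, and the E_ codes are made up for illustration): the pointer function returns NULL on failure, the reason comes back through a parameter, and the caller tests with a plain !.

```c
#include <stddef.h>

#define E_NONE   0                  /* hypothetical error codes */
#define E_RANGE  1

static int table[4] = { 10, 20, 30, 40 };

/* return ^ to slot i, or NULL with *err set if i is out of range */
int *slot(int i, int *err)
{
   if (i < 0 || i >= 4) {
      *err = E_RANGE;
      return(NULL);
   }
   *err = E_NONE;
   return(&table[i]);
}

/* read slot i, falling back to dflt on any failure */
int slot_value(int i, int dflt)
{
   int *p;                          /* ^ to the slot */
   int err;                         /* why slot() failed */

   if (!(p = slot(i, &err)))        /* NULL means failed, err says why */
      return(dflt);
   return(*p);
}
```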

if (x) ;
else ... deserves a kick in the pants.  Use it or lose it.

THIS goes double for if ((rc = f(x))) return; If you are going to use rc, then by all means use it. Otherwise don't clutter things. Notice the double parens: use the compiler to warn you of errors, but sometimes you need to help it.

I prefer switches to if else if else if else if else if else. Sometimes it's slower, sometimes not; if it's important check it out. Otherwise use the switch. It is a lot more readable.

If I use if else if ... as a switch, I line them up. Who needs all that indentation? Besides, that is not the logical structure.
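
Both forms side by side, as a sketch with made-up functions. grade() tests ranges, so a switch does not fit and the else ifs are lined up as one flat decision. month_days() picks on discrete values, so it gets the switch.

```c
/* letter grade for a score */
char grade(int score)
{
   char g;                          /* the letter */

   if (score >= 90)
      g = 'A';
   else if (score >= 80)
      g = 'B';
   else if (score >= 70)
      g = 'C';
   else
      g = 'F';
   return(g);
}

/* days in a month (1..12) of a non-leap year, 0 if month is bad */
int month_days(int month)
{
   switch (month) {
   case 4:
   case 6:
   case 9:
   case 11:
      return(30);
   case 2:
      return(28);
   default:
      return((month >= 1 && month <= 12) ? 31 : 0);
   }
}
```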

Misc
   Be careful of the scanf() family.
   Be wary of buffer overflow in your code and in libraries.
   Be wary of signals - Good candidate for ifdef code.
   Ditto for timers.
   Package file locking functions.  Another major source of bickering.

I think that sizeof(buf) is generally a bad idea when buf is an instance of a type and the type is known (a la Mm_que buf: prefer sizeof(Mm_que) to sizeof(buf)). This falls in the category of NYET vs 0: they are essentially the same, but I can grep for NYET and usages of Mm_que, while 0 and buf are not really good targets.

Lint

Lint should be run on all code. Note that you may get messages that prove that lint is not all it should be. Nevertheless, get rid of all those unused variables, declare those external functions even if they are int, and typecast those assignments or the lack thereof.

I don't use ifdef LINT.

gcc -Wall warning messages should generally be removed; in particular, the following messages should be resolved:

 warning: suggest parentheses around assignment used as truth value
 warning: suggest parentheses around + or - inside shift
 warning: unused variable `f'
 warning: implicit declaration of function `selmsg'
 warning: control reaches end of non-void function
 warning: empty body in an if-statement
 warning: assignment makes pointer from integer without a cast
 warning: comparison is always 0 due to limited range of data type
 warning: `smach' declared `static' but never defined

It is understood that some messages and even some instances of the above messages are the fault of screwed up system files and nothing can really be done - but most of the time it can.

Design

KISS and read The Psychology of Everyday Things, by Donald Norman.

A thing that does what it is designed to without extra wear and tear on the user (or environment) is a thing of beauty. Everything else is junk. Sometimes you can find a use for junk, and frankly, simple elegance is not easy (nor, in the short term at least, cost justifiable). Still an ideal.

One final philosophical observation. As programmers, what we do is used (if we're successful) by many people. Systems programmers, and compiler writers take SPECIAL NOTE. Every screwup we make, everything we don't do in the name of time we try to save, is time (and world resources) spent many times over by our users.

(and keep a happy face :) )


Also see The Ten Commandments for C Programmers (annotated)
bobh@optimizations.com (Bob Hampton)