How to write programs larger than 100 lines


By RedX
Introduction and Legible Coding

O.K., you wrote your first series of programs, and now you know how to use loops and variables. You wrote a few little ASCII games and perhaps even some graphical demos. But none of this comes close to what you find at your local software store. Those programs are enormous compared to what you have written. How do their authors manage to write programs composed of a few hundred files and several million lines of code without getting stuck? Today some of these secrets are revealed.

A. The real reason for the existence of high-level languages.

Everybody (at least those who read this) knows that assembly language programs can potentially be a lot faster than those written in other, higher-level languages. So why do the professionals use C or C++ if assembly can be so much faster and smaller? Because one can write better, more legible code in a higher-level language than in assembly, and in a much shorter time.

Pay special attention to the word 'legible' in the last sentence. You have to write your code as legibly as possible. Write as if someone will read it like a novel or a manual. It has to be clear what the program does and what the purpose of each variable is, without scrolling back to check what type a certain variable has. If every element of a program is clear, then there will be far fewer 'unsolvable' errors to deal with.

B. The achievement of legible coding.

The first elements we're going to clean up are the variable and the constant.

A variable (or a constant) has a name. The choice of this name is a *very* important moment in the life of a program. Pick an unclear name and you'll regret it for the rest of the program's life.

Choosing this name requires you to understand perfectly what its use is. Pick something that explains its reason for existence as clearly as possible, even if this results in a 25-letter name. It'll pay off when you're near the end of the project and want to get the last few bugs out of it.

It's also very handy to add a letter or a few letters to the beginning (or end) of a variable's name to indicate what type it is. It saves a lot of time looking back for it. (Look up 'Hungarian Naming Convention' or 'Hungarian notation'.)
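As a small sketch of type prefixes, written in Python purely for illustration: the variables, the prefixes (str, i, f, b) and the function format_status_line are all invented for this example, and the camel-case prefixed names follow this article's convention rather than Python's usual style.

```python
# Hypothetical game state; every name below is invented for illustration.
# Prefixes: str = string, i = integer, f = float, b = boolean.
strNameOfPlayer = "RedX"
iPlayerScore = 1200
fSecondsRemaining = 42.5
bGameOver = False


def format_status_line(strName, iScore, fSeconds):
    """Build a one-line status string; the prefixes show what each argument holds."""
    return f"{strName}: {iScore} points, {fSeconds:.1f} s left"


strStatusLine = format_status_line(strNameOfPlayer, iPlayerScore, fSecondsRemaining)
```

Even far away from the declarations, a reader can tell that iPlayerScore is a whole number and fSecondsRemaining is not.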

And while it's good to recycle your garbage, it's not a good thing to do with a variable. Every variable has to have one use and should not be used for anything else. (Disobeying this can result in bugs that are very difficult to remove.)

(This is also true for pointers. However, here it applies to the pointer itself and not to the memory location it points to.)
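The one-use rule might look like this in practice. This is only a sketch in Python; both functions and all their names are made up for the example. They compute the same result, but in the first one the meaning of 'temp' changes three times.

```python
def summarize_scores_recycled(scores):
    """Discouraged: 'temp' means three different things in a few lines."""
    temp = sum(scores)                 # temp is a total...
    temp = temp / len(scores)          # ...now it is an average...
    average = temp
    temp = max(scores) - min(scores)   # ...and now it is a spread
    return average, temp


def summarize_scores(scores):
    """Preferred: every variable has exactly one job."""
    total_score = sum(scores)
    average_score = total_score / len(scores)
    score_spread = max(scores) - min(scores)
    return average_score, score_spread
```

If a bug shows up in the second version, each name tells you which step to inspect. In the first, you have to trace every assignment to find out what 'temp' holds at that point.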

The only acceptable use of one-letter names such as 'i' or 'j' is as a loop counter in a simple, non-nested loop, where the name has no meaning other than counting the repeats.

Constants: Never use magic numbers. If the maximum number of enemies is 100, then create a constant with a good name (e.g. maximumEnemies) and assign 100 to it. This is because one day you'll find a way to allow more enemies and 100 will have to be replaced with 200. Now would you rather spend five minutes making this adjustment and spend the rest of the day playing with 200 enemies, or would you rather spend a whole day changing 100's into 200's?

And then there is PI. While this will remain 3.141592... for the rest of our existence, this doesn't mean it's O.K. to use 3.141592 instead of a constant. Today 3.14 might be accurate enough, but when you need some extra accuracy later...

The one exception to this is the '1's and '0's used in loops or counting.

E.g.

FOR dotsOnScreen = 0 TO MAXDOTS

and

intCounter = intCounter + 1
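To tie the advice on constants together, here is a minimal sketch in Python. MAXIMUM_ENEMIES, can_spawn_enemy and circle_area are hypothetical names chosen for this example; math.pi is the constant that Python's standard library already provides, so there is no need to type the digits yourself.

```python
import math

# The limit lives in exactly one place. Allowing 200 enemies later
# means changing this single line.
MAXIMUM_ENEMIES = 100


def can_spawn_enemy(enemy_count):
    """Return True while there is still room for one more enemy."""
    return enemy_count < MAXIMUM_ENEMIES


def circle_area(radius):
    """Use the library constant instead of a hand-typed 3.14."""
    return math.pi * radius * radius
```

No part of the program outside the constant definition mentions the number 100, so the five-minute adjustment described above really is a one-line change.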

The second element is the function/procedure.

The whole discussion of the variable naming can be repeated for this part. A function's name must be clear. It must be clear what it does, what its input is, and what it returns. The name should clearly state its purpose in terms of the problem (e.g. GetPlayerName) and not in terms of the solution or implementation (e.g. InputStringThroughConsole). This makes reading the code easier.

Also, don't let functions alter too much global data. This makes it harder to find bugs (any function can be the one responsible for that off-by-one bug) and makes it more difficult to reuse functions from previous programs (why write the same thing twice?). The ideal function receives some variables as input and returns a value without disturbing the rest of the data in the program, has a clear name, and can be reused.

A good example:
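The difference between a function that alters global data and one that only uses its inputs can be sketched as follows. This is Python, and player_score, add_bonus_global and score_with_bonus are names invented for the illustration.

```python
player_score = 0  # global state, touched only by the discouraged version


def add_bonus_global(bonus):
    """Discouraged: silently changes data that lives outside the function."""
    global player_score
    player_score += bonus


def score_with_bonus(score, bonus):
    """Preferred: takes its input as arguments and returns the result."""
    return score + bonus
```

The second function can be copied into another program unchanged, and it can never be the hidden cause of a wrong value in player_score, because the caller decides what to do with the returned number.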

STRING strNameOfPlayer

CONSTANT HIGHEST_SCORE = 1000

INTEGER iPlayerScore

IF iPlayerScore > HIGHEST_SCORE THEN strNameOfPlayer = GetPlayerName

An ugly example:

STRING name

INTEGER score

IF score > 1000 THEN name = InputStringThroughConsole
