Handmade Opinions

This is a collection of comments on C programming sourced from Casey Muratori's Handmade Hero series. Casey has many asides where he shares strong opinions and, perhaps, wisdom. He is careful to say that the comments are, in fact, only his opinions. He emphasizes that if something different works for you, then do what works for you.

Furthermore, the opinions here are not Casey's–they're my own. It's my interpretation of what he says and I'm just some dude on the Internet. Please take that into consideration.

Always know what is actually happening with the CPU and memory

Being a good programmer means knowing what's going on in the machine. It's the only way to control what you're doing.

It is essential to read assembly and understand what your code is doing. It gives a sense of connection to your code when you know exactly what the processor is doing.

Despite claims otherwise, it's not a slow way of programming to understand the assembly. Not much assembly is actually generated. This makes sense because if the code you're writing is fast and efficient, then it will use a minimal number of instructions.
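As a small illustration (not taken from the series), a function like the hypothetical SumArray below is a reasonable place to start reading compiler output. With GCC, compiling a file with "gcc -O2 -S file.c" writes the generated assembly to file.s. The exact instructions depend on the compiler, the flags, and the target, so treat this as a sketch of the workflow rather than a claim about what any particular compiler emits.

```c
/* Hypothetical example; the name and signature are illustrative only. */
int
SumArray (int *Values, int Count)
{
  int Sum = 0;

  /* One load and one add per element; compare this loop against the
     instructions the compiler actually produces for it. */
  for (int Index = 0; Index < Count; ++Index)
    {
      Sum += Values[Index];
    }

  return Sum;
}
```

Comparing the output at -O0 and -O2 for a function this small is usually enough to see how the source maps to loads, adds, and branches.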

Fundamentally, everything is a CPU instruction that changes memory. This is what needs to be focused on because otherwise, you optimize for the abstraction.

Program in C

It's easy to understand what the CPU does when you write C code. This makes it possible to write simple, efficient, good code that can be easily debugged.

Don't use "Hungarian" notation

"Hungarian" notation is a prefix of characters that tell you what the thing is. Casey believes this is a bad idea.

Joel on Software makes a case for prefixing variables with "kind", not "type". He distinguishes between "Apps Hungarian" and "Systems Hungarian".

Apps Hungarian

Prefix names with the "kind". For example, use "colMax" and "rwMax" to distinguish between ints that represent incompatible "kinds" like columns and rows. In this way, related information is co-located which allows you to more easily spot errors.

Systems Hungarian

Prefix names with the "type". For example, use "intMax" to let you know that the maximum is stored as an integer, which the compiler already knows.

Casey and Joel appear to agree that Systems Hungarian is not helpful.
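To make the distinction concrete, here is a minimal sketch of both styles applied to the same calculation. The function and parameter names are made up for illustration and do not come from Casey or Joel.

```c
/* Apps Hungarian: the prefix names the "kind" (rw = row, col = column). */
int
CellIndex (int rwCur, int colCur, int colMax)
{
  /* Row-major index. Mixing kinds, e.g. rwCur * rwMax + colCur, would
     look suspicious at a glance because the prefixes don't line up. */
  return rwCur * colMax + colCur;
}

/* Systems Hungarian: the prefix repeats the type, which says nothing
   about whether the rows and columns are used correctly. */
int
CellIndexSystems (int intRow, int intCol, int intWidth)
{
  return intRow * intWidth + intCol;
}
```

Both functions compute the same value; only the first one carries information in its names that the compiler does not already have.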

Memory is best acquired and released in aggregate

Think of things as a group that gets handled together, not as individuals. Create and release them in aggregate.

In C++, RAII is the idea that data should be wrapped in classes with constructors which allocate memory and destructors that release it. RAII says that these should happen symmetrically, for individual resources.

In Casey's view, it's better to think of resource management in waves. For example, the game doesn't make sense without the main window: if there is no main window, there is no game. Bothering to clean up the window at exit makes no sense since the OS will reclaim all those resources anyway. He points to applications that take forever to close: they are "cleaning up" and wasting people's time doing work the OS would do anyway.
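One common way to handle memory in aggregate is an arena: acquire one block up front, hand out pieces of it, and release everything in a single step. The sketch below is an assumption about what that can look like. The type memory_arena and the functions ArenaInit, ArenaPush, ArenaReset, and ArenaRelease are hypothetical names, not code from the series.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical arena: one block acquired up front, handed out in pieces. */
typedef struct
{
  unsigned char *Base;
  size_t Size;
  size_t Used;
} memory_arena;

/* Acquire the whole block once. Returns 1 on success, 0 on failure. */
int
ArenaInit (memory_arena *Arena, size_t Size)
{
  Arena->Base = (unsigned char *) malloc (Size);
  Arena->Size = Arena->Base ? Size : 0;
  Arena->Used = 0;
  return Arena->Base != 0;
}

/* Hand out the next Size bytes, or 0 if they don't fit. Sizes are
   rounded up to a multiple of 16 so every offset stays 16-byte aligned
   relative to the base pointer. */
void *
ArenaPush (memory_arena *Arena, size_t Size)
{
  size_t Aligned = (Size + 15) & ~(size_t) 15;

  if (Aligned < Size || Aligned > Arena->Size - Arena->Used)
    {
      return 0;
    }

  void *Result = Arena->Base + Arena->Used;
  Arena->Used += Aligned;
  return Result;
}

/* Discard everything pushed so far in one step; no per-object frees. */
void
ArenaReset (memory_arena *Arena)
{
  Arena->Used = 0;
}

/* Give the whole block back at once. */
void
ArenaRelease (memory_arena *Arena)
{
  free (Arena->Base);
  Arena->Base = 0;
  Arena->Size = 0;
  Arena->Used = 0;
}
```

A per-level or per-frame arena would call ArenaReset at the end of each wave, so individual objects never need their own cleanup code.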

Make static explicit

K & R gives two meanings to static:

  1. External (to a function): Limits the scope of an object to the current translation unit, e.g. file
  2. Internal (to a function): The variable persists across function calls

Casey differentiates external statics by variable and function. He calls static external variables "global" and static functions "internal". It's a little confusing because "internal" sounds like the opposite of "external". However, functions are considered "external" by default, meaning they can be called by any part of the program (from any translation unit). So, by "internal", he means the function is scoped to the translation unit, similar to how an internal variable is scoped to the function [1]. It's also confusing because he uses "global" to mean the variable is scoped to the translation unit whereas without static, the variable is scoped across the program (truly global). Since he knows his application will only have one translation unit, the global variable will be global in this sense.

Of course, a truly global variable is a problem waiting to happen because it's not clear who owns it, is modifying it, and when.

The rationale for defining new names for these different uses is to make searching for them easier.

#define internal static
#define local_persist static
#define global_variable static
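A short sketch of how the three names might read in use. The identifiers GlobalCallCount, NextId, and AllocateId are hypothetical and exist only to show each meaning of static side by side.

```c
#define internal static
#define local_persist static
#define global_variable static

/* File-scope variable, visible only inside this translation unit. */
global_variable int GlobalCallCount;

/* Function visible only inside this translation unit. */
internal int
NextId (void)
{
  /* Initialized once; keeps its value between calls. */
  local_persist int LastId = 0;

  ++GlobalCallCount;
  return ++LastId;
}

/* No keyword: external linkage, callable from other translation units. */
int
AllocateId (void)
{
  return NextId ();
}
```

Searching for "local_persist" now finds only persistent locals, which a search for plain "static" could not do.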

Pay attention to whether a passed pointer is being read from or written to

Any time you pass pointers, the compiler might not be able to do optimizations because of pointer aliasing.

Pointer aliasing is when two pointers COULD point to the same memory and the compiler doesn't know if a write to one of those pointers might affect a read from the other pointer.

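If you only read through the pointers, then it's not a problem. Writing through one of them, however, is what makes aliasing matter: the compiler can no longer assume that a value it already loaded through another pointer is still current.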

For example, when both B and D point to Y, the compiler must load from B each time it assigns (to A and C, respectively). If instead B is passed as a value, then the compiler can know to use that value directly and not worry about its value changing sometime during execution.

int X = 1;
int Y = 2;
int Z = 3;
int W = 4;

void
MoveBad (int *A, int *B, int *C, int *D)
{
  *A = *B;
  *D = 5;
  *C = *B;
}

/* Can't optimize: B and D point to the same memory; must read from B
   twice, once for A, once for C */
MoveBad (&X, &Y, &Z, &Y);

/* Can optimize: B and D point to different memory; write to C the
   same value you wrote to A*/
MoveBad (&X, &Y, &Z, &W);

void
MoveGood (int *A, int B, int *C, int *D)
{
  *A = B;
  *D = 5;
  *C = B;  // write same value as with A
}

NOTE:

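Under the strict aliasing rules, the compiler is allowed to assume that two pointers to incompatible types never alias unless one of them is a character type (char, signed char, or unsigned char). Compilers do rely on this; for example, GCC enables -fstrict-aliasing at -O2 and above.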

CASEY'S OPINION:

The compiler has various optimization settings which assume things about aliasing. Assuming things about aliasing is just bugs waiting to happen. Assume-no-aliasing in compilers is bad. It's instead much better to mark your code to indicate that there is no aliasing (for example, the restrict and assume keywords).
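As a sketch of what marking the code can look like, here is the MoveBad function from earlier with C99's restrict qualifier added. MoveRestrict is a hypothetical name. In C++, which Handmade Hero is compiled as, compilers typically spell the qualifier __restrict instead.

```c
/* Same operations as MoveBad, but restrict promises the compiler that
   the four pointers refer to distinct objects, so *B may be loaded once
   and reused for the write to *C. Calling this with aliased pointers,
   e.g. B == D, is undefined behavior: the promise is the caller's job. */
void
MoveRestrict (int *restrict A, int *restrict B,
              int *restrict C, int *restrict D)
{
  *A = *B;
  *D = 5;
  *C = *B;
}
```

The difference from a compiler-wide assumption is that the promise is visible at the one function where it applies.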

Use 0 instead of NULL

Casey likes fewer defines, and all the platforms he writes for define NULL as 0. So, he reasons, just use 0.

Some errors aren't a big deal

There are two kinds of errors: insidious errors and errors which manifest every time you run the program. You don't need to worry much about the kinds of errors that happen every time you run the application. You'll see those.

const is useless

Casey finds it doesn't help him catch bugs. He also says that the compiler doesn't really pay attention to it (in the optimizer). The reason is that in C++, apparently, you can cast away const.

Because of pointer aliasing, const is meaningless:

static void
DoTheStuff (int const *A, int *B)
{
  /* B could point to A */
}
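A small sketch of that point, using a hypothetical function ReadWriteRead. The const on A only prevents writes through A; it does not stop the value from changing when B points at the same object.

```c
/* Reads *A, writes through B, then reads *A again. Returns how much the
   value seen through the const pointer changed. */
int
ReadWriteRead (int const *A, int *B)
{
  int Before = *A;

  *B = Before + 1;   /* changes *A too when B aliases A */

  int After = *A;    /* so the compiler must load *A again */

  return After - Before;
}
```

Called as ReadWriteRead (&V, &V), the function returns 1 even though A is a pointer to const. Called with two different objects, it returns 0.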

Dynamically loading library functions (monkey patching)

If you don't want to include an entire library for just one or two functions, you can dynamically load the library and monkey patch the functions.

Monkey patching is not a term Casey uses. It's a term which means replacing the definition of a function while retaining the symbol name during run-time. That's exactly what Casey does in Episode 006.

To monkey patch in C:

  1. Define a stub for the function you want to replace
  2. Define a new type corresponding to the function's signature
  3. Create a function pointer to the stub using the type
  4. Replace all instances of the original function call with the function pointer

#include <stdio.h>

void
function_to_replace (char * message)
{
  printf ("%s\n", message);
}

/* 1. Define a stub for the function you want to replace */
void function_to_replace_stub(char * message)
{
  printf ("the function was replaced!\n");
}

/* 2. Define a new type corresponding to the function's signature */
typedef void function_to_replace_t(char * message);

/* 3. Create a function pointer to the stub using the type*/
function_to_replace_t * stub_ptr = function_to_replace_stub;

/* 4. Replace all instances of the original function call with the function pointer */
#define function_to_replace stub_ptr

int
main (void)
{
  char * message = "hello, world!";

  function_to_replace(message);

  return (0);
}

Output:

the function was replaced!

Casey uses a macro to provide a single place to change the signature for the stub and the typedef. This adds an additional step to the process.

  0. Define a macro to provide the type
  1. Define a stub for the function you want to replace
  2. Define a new type corresponding to the function's signature
  3. Create a function pointer to the stub using the type
  4. Replace all instances of the original function call with the function pointer

#include <stdio.h>

void
library_function (char * message)
{
  printf ("%s\n", message);
}

/* 0. Define a macro to provide the type */
#define TYPE_SIGNATURE(name) void name(char * message)

/* 1. Define a stub for the function you want to replace */
TYPE_SIGNATURE(library_function_stub)
{
  printf ("your function was replaced!\n");
}

/* 2. Define a new type corresponding to the function's signature */
typedef TYPE_SIGNATURE(library_function_t);

/* 3. Create a function pointer to the stub using the type*/
library_function_t * stub_ptr = library_function_stub;

/* 4. Replace all instances of the original function call with the function pointer */
#define library_function stub_ptr

int
main (void)
{
  char * message = "hello, world!";

  library_function(message);

  return (0);
}

Output:

your function was replaced!

You can use gcc -E to see the macro expansion. The expansion (and possibly lots of other stuff) will be printed to stdout:

gcc -E /home/ahab/Projects/scratch/function_points.c

void
library_function (char * message)
{
  printf ("%s\n", message);
}

void library_function_stub(char * message)
{
  printf ("your function was replaced!\n");
}

typedef void library_function_t(char * message);
library_function_t * stub_ptr = library_function_stub;

int
main (void)
{
  char * message = "hello, world!";

  stub_ptr(message);

  return (0);
}
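The listings above bind the pointer to the stub and never change it. When a library is loaded dynamically, the pointer would start at the stub and be reassigned at run-time if the load succeeds. The sketch below shows only that reassignment. All names are hypothetical, and GetStateReal is a local stand-in for the address a platform loader (such as GetProcAddress on Windows or dlsym on POSIX) would return, which keeps the example self-contained.

```c
/* One place to change the signature for the stub and the typedef. */
#define GET_STATE(name) int name (int Index)
typedef GET_STATE (get_state_t);

/* Stub used while the library is unavailable: reports "nothing there". */
GET_STATE (GetStateStub)
{
  (void) Index;
  return -1;
}

/* Hypothetical stand-in for the function found in the loaded library. */
GET_STATE (GetStateReal)
{
  return Index * 2;
}

/* All calls go through this pointer; it starts out at the stub. */
get_state_t *GetState_ = GetStateStub;
#define GetState GetState_

/* Hypothetical load step: swap in the real function only if the library
   was found, otherwise leave the stub in place. */
void
LoadGetState (int LibraryAvailable)
{
  if (LibraryAvailable)
    {
      GetState = GetStateReal;
    }
}
```

Callers write GetState (Index) either way, so the program keeps running with the stub's behavior when the library is missing.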

Align API documentation with the actual API

Don't tell programmers to use a member before that member is defined. It should go without saying, but maybe saying it will help people not design poor API documentation.

Always initialize variables to zero

Casey likes to always initialize variables to zero for code that's not performance critical. This looks something like:

int XOffset = 0;
int YOffset = 0;

win32_sound_output SoundOutput = {};

He feels it catches a lot of bugs and that you generally want to assume that things are initialized to zero. It would be nice, he says, for the language to assume that everything is initialized (to zero) and to require the programmer to explicitly request that a variable not be initialized, rather than the other way around.
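One note for plain C: the empty braces in "SoundOutput = {}" are valid in C++ (which Handmade Hero is compiled as) and in C23, but earlier C standards require at least one initializer. A minimal sketch of the equivalent in older C follows. The struct sound_output is a hypothetical stand-in for win32_sound_output, whose real members are not shown here.

```c
#include <string.h>

/* Hypothetical stand-in for win32_sound_output. */
typedef struct
{
  int SamplesPerSecond;
  int BytesPerSample;
  float Volume;
} sound_output;

/* Returns a zeroed struct using an initializer. */
sound_output
MakeSoundOutput (void)
{
  /* The first member is set to 0 explicitly; the remaining members are
     zero-initialized implicitly. */
  sound_output Result = {0};
  return Result;
}

/* Clears an existing struct in place by setting every byte to zero. */
void
ClearSoundOutput (sound_output *Output)
{
  memset (Output, 0, sizeof (*Output));
}
```

The initializer form is the closer match to what Casey writes; memset is useful when a struct has to be cleared again after it has been used.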

Footnotes:

[1] He mentions later that this is only true in name. A pointer to the function could still be passed to a different translation unit and that function called.

2023-02-23

©2024 Excalamus.com