Variables

Saying hello to your computer shows you how to run gcc and print stuff to the terminal, but how about actually handling some user input or working on larger amounts of data? You know, not useless stuff.

So what are variables in C?

As you might know already, a variable is kind of like a “name” or “a box” that represents a value that might change over time, hence the name variable.

C isn’t an object-oriented language, and it is fairly low-level compared to other languages. In a C context, a variable represents a place in memory and the value stored at that particular place. This is true for most other programming languages as well, but once you work with pointers and memory it becomes much more visible in C.

It’s also worth noting that variables in C aren’t objects the same way they are in Java, Python or JavaScript, but simply binary numbers with a particular way of being interpreted. Because of this, variables don’t come prepackaged with a set of methods. The built-in operators only work on the primitive types, and much of what you might be used to calling as methods in higher-level languages (string handling, for example) is done with ordinary functions from the standard library rather than by the language itself.

But enough theorising, how about some real world examples eh?

We’ll start with a simple calculator.

This simple calculator uses the argument vector to add two numbers:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
  if(argc < 3) {
    printf("Usage ./adder [addend] [addend]\n");
    return -1;
  }
  // Accesses second element of argument vector and parses to int
  int a = atoi(argv[1]); 
  // Accesses third element of argument vector and parses to int
  int b = atoi(argv[2]); 
  int c = a + b;
  printf("%d + %d = %d\n", a, b, c);
  return 0;
}

Compile and run it with:

gcc adder.c -o adder

./adder 2 2

This sums up the arguments and should print something like:

2 + 2 = 4

If you try to run it by just typing and executing ./adder you should get a warning message:

Usage ./adder [addend] [addend]

In this case the if (argc < 3) statement checks whether we have too few command-line arguments for our program. argc holds the number of elements in the char *argv[] array, and since argv[0] is the program name, we need at least 3. If there are fewer, we print the usage message and return from main, which is equivalent to exiting the program.

Now let’s look at the variables in this program.

As you might have noticed, I’ve already introduced three variables in the “Hello, World!” program: int argc, char *argv[] and char *envp[].

The first is an integer, while the second and third are arrays of strings. These variables are parameters of the main function: when you run the program, the shell executes the compiled program, and main is called with the list of arguments written in the terminal.

The other three variables, int a, int b and int c, are part of the function body and are local to the main function. If we wanted to, we could declare these variables outside the main function. Their scope would then be global, making it possible to access them from any function in the program. The downside is that a name can only be defined once per scope, so no other global variable can have the same name, and in any larger program global variables become a pain to keep track of.

The integer data type is normally a 32 bit signed integer with maximum and minimum values that you can print out using this program:

#include <stdio.h>
#include <limits.h>
int main(int argc, char *argv[]) {
  printf("Int32 min: %d \nInt32 max: %d \n", INT_MIN, INT_MAX);
}

Which prints:

Int32 min: -2147483648 
Int32 max: 2147483647 

So we have signed and unsigned variables. What do we mean by this?

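Well, a signed variable has either an implicit plus or an explicit minus in front of the number. To get this, one bit is sacrificed: if the most significant bit is 0 the number is positive, while a 1 means it is negative. An unsigned variable uses that bit for the value instead, so it can’t hold negative numbers but reaches roughly twice as high.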

The primitive data types in C are:

  • char
    a single character, usually 8 bits, can be signed or unsigned
  • short
    a short integer, often 16 bits, can be signed or unsigned
  • int
    a regular integer, often 32 bits, can be signed or unsigned
  • long
    a large integer, often 64 bits (but 32 bits on some platforms, such as 64-bit Windows), can be signed or unsigned
  • float
    a single precision floating point number, usually 32 bits
  • double
    a double precision floating point number, usually 64 bits
  • pointer
    like the char, this is an integer that is interpreted in a special way. A char can be seen as the “address” of a character in a table, while a pointer is the address of a place in memory.

If not otherwise specified, short, int and long default to signed, meaning that you need to say so when you want them to be unsigned, while you don’t have to write signed. The exception is char: whether a plain char is signed or unsigned depends on the compiler and platform.

If you work across platforms, you should not trust these defaults and sizes. Then again, if you work across platforms, you shouldn’t be relying on the primitive types either, but rather on types such as uint32_t, defined in the <stdint.h> header file, or in some other standardised library.

A variable is declared this way:

datatype variable_name;

Example:

int a;

And a variable is declared and given an initial value using the assignment operator =

datatype variable_name = expression_value;

Note that since we use the equals sign as an assignment operator, a single equals sign in a statement such as “a = b” should be read as an assignment and never as “a equals b”. For that type of statement we use the equality operator ==, as in “a == b”.

Example:

int a = 42;

Or:

double b = (my_function_name() + 42) / 37;

As with any programming language, there are conventions to variable names.

To keep things simple, always start with a letter, stick to mostly alphanumeric characters and underscores, use UPPER_CASE for constant names, and unless you’re planning on writing the C library, don’t start names with an underscore such as _foo or _bar.

I use the term expression value since the right-hand side of the assignment can itself be an expression, but the expression must result in a value compatible with the declared type. If the expression results in a similar but still different datatype than the one declared, you can use a cast. For example, a float may be converted to a double precision floating point value by putting the (double) cast in front of the expression, and any integral number may be converted to int using the (int) cast.

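You can also convert floating point numbers into integers, and get the address held by a pointer as a number by casting it to an integer type. Casting a pointer to int or unsigned int only works where a pointer fits in that type, which it doesn’t on most 64-bit systems, so the example uses uintptr_t from <stdint.h> instead.

The following code:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(int argc, char *argv[]) {
  double a_dbl = 3.5;
  // The cast drops everything after the decimal point
  int a_int = (int) a_dbl;
  // uintptr_t is an unsigned integer type wide enough to hold a pointer
  uintptr_t a_dbl_addr = (uintptr_t) (void *) &a_dbl;
  printf("a_dbl: %f \na_int: %d\na_dbl_addr: %" PRIuPTR "\n", a_dbl, a_int, a_dbl_addr);
  return 0;
}

Will print something like this to the terminal (the address will differ from run to run and from machine to machine):

a_dbl: 3.500000 
a_int: 3
a_dbl_addr: 3912067768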

If all this talk about pointers and addresses confuses you, don’t think too much about it. This is one of those low-level quirks that make C stand apart from other programming languages, and in nearly every tutorial on C, at least one chapter is dedicated to the topic of pointers and memory management.

Notice how the floating point number loses its fraction when it is cast to an integer. This is because the cast doesn’t consider anything beyond the decimal point and simply cuts it away, so positive numbers are rounded down and negative numbers are rounded up, towards zero. If you want your program to round in a mathematically consistent manner, you can use one of the many useful functions found in the <math.h> header file.

For example, to round up, we can use the ceil() function in math.h like this (with gcc you may need to add -lm when compiling, to link the math library):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char *argv[]) {
  if(argc < 2) {
    fprintf(stderr, "Usage ./round [float_number]\n");
    return -1;
  }
  double i = atof(argv[1]);
  int q = (int) ceil(i);
  printf("%f ≈ %d\n", i, q);
  return 0;
}

Which will print:

5.500000 ≈ 6

When supplied with the value 5.5 as the first argument after the program name.

The variables mentioned thus far either live in static memory or are automatic variables, such as the parameters and local variables of main. Programs that only use these aren’t very flexible: the amount of memory is fixed when the program is written, so the user can’t hand the program an arbitrary amount of new data to store at runtime.

Before we can use dynamic memory in C we need to understand how pointers, arrays and memory allocation work. We’ll look at that in later chapters.