Introduction to C

Why learn C?

A survey of popular programming languages and how they are implemented.

Every language needs a compiler or interpreter, and that program itself has to be written in some language. (So how was the first compiler written?)

One telling measure of how "pure" a language is: whether its own inventors consider it good enough to implement the language in.

Language  Version               Implemented in  Link
Java      Sun JDK               C               http://hg.openjdk.java.net/
C         GCC                   C               http://gcc.gnu.org/viewcvs
C++       GCC                   C/C++           http://gcc.gnu.org/viewcvs
Python    Python Foundation     C               http://svn.python.org/view/
Perl      Perl Foundation       C               http://perl5.git.perl.org/
PHP       The PHP Group         C               http://svn.php.net/viewvc/
Ruby      http://ruby-lang.org  C               http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/trunk/

In all of these cases, we are talking about the core compiler or interpreter for the base language, not any "standard libraries" that may come with these languages.

The main reasons these languages are all implemented in C are:

Comparing "Hello World"s

The first program that most people write in any language prints the phrase "Hello World!" on the screen. So let's compare what this takes in both Java and C.

public class HelloWorldApp {

    public static void main(String[] args) {
        System.out.println("Hello world!");
    }

}

#include <stdio.h>

int main(int argc, char** argv) {
    printf("Hello world!\n");
    return 0;
}

Of course, there are other ways to accomplish the same thing in each language, but in an aesthetic sense, these are the most "Java" and "C" ways of doing it.

Similarities

Differences

Peeking Beneath the Hood

Although Java likes to pretend we live in a magical world of pure objects that cuddle together in a heap, it doesn't take a lot of digging to get a peek at the unencoded matrix: the default hashCode() for Java objects has traditionally been derived from their location in memory (the exact value depends on the JVM). The default toString() implementation uses this value as well, appending it in hexadecimal to the class name.

class Peek {

    public static void main(String[] args) {
        Peek peeker = new Peek();
        System.out.println(peeker.hashCode());
        System.out.println(peeker.toString());
    }

}

How to Break Your Computer

Don't worry, this won't void your warranty.

In C, it is ridiculously easy to get the location of something in memory (its "address") and do something devious with it. The address of a value has a type of its own, called a "pointer". Pointers and references are mostly interchangeable concepts; their only difference is in notation, which we'll discuss later.

The pointer to a type is denoted with an asterisk (*), or simply "star". You can have pointers to ints (int*), pointers to doubles (double*), or pointers to anything (void*).

int* integer_address;

There can even be pointers to pointers (ad infinitum).

int** integer_address_address;

Taking the address of something is done with a preceding ampersand operator:

int a;
int* integer_address = &a;

We say that the pointer integer_address references the integer a. Conversely, the contents of a are pointed to, or referenced by, integer_address.

You can access the contents referenced by the pointer by using indirection, or "dereferencing", again with the poor, overused star operator.

int a = 2;
int* integer_address = &a;
printf("%d\n", *integer_address);

You can even write over the contents of memory referenced by a pointer:

int a = 2;
int* integer_address = &a;
*integer_address = 3;
printf("%d\n", *integer_address); /* a now equals 3 */

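The program below prints the address of the main function itself, and then tries to write to it. On most modern systems a program's code sits in read-only memory, so the write typically crashes the program with an error such as a segmentation fault.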

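#include <stdio.h>

int main(int argc, char* argv[]) {
    /* Treat the address of main as if it pointed to an int. */
    int* blah = (int*)&main;
    /* %p expects a void*; passing a pointer to %x is undefined behavior. */
    printf("%p\n", (void*)blah);
    /* Try to overwrite main's machine code; on most systems this crashes. */
    *blah = 0x3141592;
    return 0;
}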

A Tentative Cheat Sheet

Java: Naming convention: camelCaseInJava
C:    Naming convention: under_score_in_c

Java: Garbage collection; no explicit deallocation.
C:    Explicit memory allocation and deallocation (malloc and free).

Java: All variables are references to objects (or primitive types like ints).
C:    All variables are primitive types (and they are all basically the same primitive type).

Java: SomeClass.java -> SomeClass.class
C:    somefunctions.c -> somefunctions.o -> somefunctions.exe

Java: Java has classes with access control for members (private, protected, public, package-visible).
C:    C has structs, but everything is public.

Java: Java produces bytecode, which must then be interpreted in a virtual machine (java.exe).
C:    C produces an actual binary, which can be run directly by the operating system.

Java: import somePackage.someClass;
C:    #include <some_header.h>

Java: Java is class-based, and uses packages/imports to reference externals.
C:    C is module-based, and uses preprocessor includes to reference externals.