CSE 451 Autumn 2000 - Project Assignment 1

Simple System Call

Out: 27 September, 2000
Due: 5 October, 2000

Assignment Goals

Background

The Linux kernel allocates memory using a "buddy system."  We'll discuss buddy system allocation later in the course.  For now, it's enough to know that each request for memory (within the kernel) must be for a power-of-two number of pages.

The routine 

struct page * __alloc_pages(zonelist_t *zonelist, unsigned long order);

implements the allocation side of this memory management.  The parameter order is the base-2 logarithm of the number of pages requested.  So, if order is zero, the requester wants a single page returned, whereas if order is 4, the requester wants 16 pages (contiguous in real memory).
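To make the order arithmetic concrete, here is a minimal sketch.  The helper name pages_for_order is made up for illustration; it is not a kernel routine.

```c
/* Hypothetical helper (not part of the kernel): the number of pages
 * covered by a request of the given order, i.e. 2^order.
 * Assumes order is smaller than the bit width of unsigned long. */
unsigned long pages_for_order(unsigned long order)
{
    return 1UL << order;
}
```

So pages_for_order(0) is 1 and pages_for_order(4) is 16, matching the two cases above.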

The Assignment

To motivate a later assignment, we want to instrument the kernel so that we can write a user-level program that will print histograms of the actual request sizes handled by __alloc_pages();  that is, we want to write a garden-variety C program that prints out (a) the total number of requests to __alloc_pages() that have been made since the system was booted, and (b) the fraction of that total that were requests for one page, for two pages, for four pages, etc.

To do this requires three things:

  1. Modify __alloc_pages() to keep track of this information.

  2. Design and implement a new system call that will get this data back to the user application.

  3. Write the user application.
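
The reporting half of step 3 is ordinary C once the counts are in hand.  Here is one possible sketch, assuming (and this is only an assumption, since the interface is yours to design) that the data arrives as an array with one counter per order.  The names hist_total, hist_fraction and hist_print are made up for illustration.

```c
#include <stdio.h>

/* Sum the per-order counters to get the total number of requests. */
unsigned long hist_total(const unsigned long counts[], int n)
{
    unsigned long total = 0;
    for (int i = 0; i < n; i++)
        total += counts[i];
    return total;
}

/* Fraction of all requests that had the given order.
 * Returns 0.0 if there were no requests or the order is out of range. */
double hist_fraction(const unsigned long counts[], int n, int order)
{
    unsigned long total = hist_total(counts, n);
    if (total == 0 || order < 0 || order >= n)
        return 0.0;
    return (double)counts[order] / (double)total;
}

/* Print the total, then one line per order:
 * pages per request, raw count, and fraction of the total. */
void hist_print(const unsigned long counts[], int n)
{
    printf("total requests: %lu\n", hist_total(counts, n));
    for (int i = 0; i < n; i++)
        printf("%6lu page(s): %10lu  (%.3f)\n",
               1UL << i, counts[i], hist_fraction(counts, n, i));
}
```

With made-up counts of 6, 2, 1 and 1 for orders 0 through 3, the total is 10 and the fraction for single-page requests is 0.6.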

Warning 1: Remember that the kernel and the calling application do not share one view of memory: the application runs in its own virtual address space, so any addresses it gives the kernel are addresses in that address space, not something the kernel can blindly use.

Warning 2: Remember that it's inconceivable that this problem has never before been confronted in the existing kernel.

Warning 3: Remember that the kernel must never, ever trust the application to know what it's talking about when it makes a request.

Warning 4: Remember that you must be sure not to create security holes in the kernel with your code.
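
To illustrate Warnings 3 and 4, here is a user-space mock (it is not kernel code, and not the required design) of a handler that refuses to trust its caller.  Everything named here is hypothetical: HIST_ORDERS, hist_counts, mock_orderhist, and the choice of a buffer-plus-count interface.  In a real kernel the copy would go through the kernel's own user-copy routines, such as copy_to_user(), rather than memcpy.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define HIST_ORDERS 10  /* assumed number of tracked orders; check the kernel source */

/* Stand-in for counters the kernel would maintain. */
static unsigned long hist_counts[HIST_ORDERS];

/* Stand-in for the kernel's user-copy primitive.  The real routine
 * validates the destination address; this mock just copies.
 * Returns the number of bytes NOT copied (0 on success). */
static unsigned long mock_copy_to_user(void *to, const void *from, unsigned long n)
{
    memcpy(to, from, n);
    return 0;
}

/* Mock handler: copy at most nentries counters into the caller's buffer.
 * Returns the number of entries copied, or a negative errno value. */
long mock_orderhist(unsigned long *user_buf, long nentries)
{
    /* Reject obviously bad arguments.  A NULL check alone is not enough
     * in a real kernel; the user-copy routine must do the real checking. */
    if (user_buf == NULL || nentries <= 0)
        return -EINVAL;

    /* Never copy more than we have, whatever the caller claims. */
    if (nentries > HIST_ORDERS)
        nentries = HIST_ORDERS;

    if (mock_copy_to_user(user_buf, hist_counts,
                          (unsigned long)nentries * sizeof hist_counts[0]) != 0)
        return -EFAULT;

    return nentries;
}
```

The point of the sketch is the shape of the checks: validate the arguments, clamp the size to what the kernel actually has, and let the copy routine report failure instead of assuming success.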

Unusually Pointed Directions

The part of the user-level application you didn't learn in CSE 142 is this:

#include <sys/syscall.h>
#define __NR_orderhist something
#include <unistd.h>
....

int ret = syscall(__NR_orderhist, ...);

(One last detail: If we were really implementing a new system call, we'd put the #define above in  <sys/syscall.h>.  But, we're better off not monkeying with that file, as it's shared among all of us.)
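
If you want to see the syscall() mechanism work before your own call exists, you can try it on a system call that is already there.  This sketch uses SYS_getpid, which <sys/syscall.h> defines on Linux; the function name pid_via_syscall is made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* so <unistd.h> declares syscall() under strict modes */
#endif
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* syscall(), getpid() */

/* Invoke an existing system call by number, the same way the
 * new call would be invoked once it has a number of its own. */
long pid_via_syscall(void)
{
    return syscall(SYS_getpid);
}
```

The value returned should match what the ordinary getpid() library routine gives you.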

Recommended Procedure

I suggest you wade, rather than dive, into this.  In particular, here's a suggested set of incremental steps:

  1. Don't change any Linux code.  Figure out how to do a make of a new boot image, what file to move where so that you can boot the image you just created, how to tell the loader (LILO) that your image exists, and then how to boot your image.

  2. Now put a "printk()" somewhere in the code, and figure out how to find its output.  (Hints: /var/log and "man dmesg").

  3. Now implement a parameterless system call, whose body is just a printk() call.  Write a user-level routine that invokes it.  Check to make sure it was invoked.

  4. Now write the full implementation.

What to Hand In

You should hand in a write-up of what you did.  Generally, these write-ups will include, at most, snippets of code.  That's the case here.  Tell us:

This is all due in section on 10/5/00.


Details

The source is on the two gateway machines, greer and baughm, in /scratch/linux-2.2.4-test6IKD.tgz.  This is a gzip'ed, tar'ed file.  You'll need to extract the source to make a private copy.  After you've done that, you'll work from your private copy.

Accounts

You'll have personal accounts on both the gateway machines and the test boxes.

Your gateway (greer/baugh) account is your normal instructional account. 

You'll get your test box personal accounts, and passwords, in section on Thursday.

On Thursday you'll also get the root password (the password for account "root", also known as "superuser", which is the equivalent of the "administrator" account in Windows) for the test boxes.  With the root password you can do anything - create or delete files anywhere in the file system, create accounts, reboot the machine of the guy next to you, you name it.

Why have personal accounts on the test machines?  If you get to the point of doing significant work on those machines, e.g., debugging, you might want to have some personalized settings for things like the editor and the window system ("X Windows").  (On the other hand, for some kinds of debugging you might end up wanting to be logged in as root.)  Personalized settings are kept in configuration files, usually with names prefixed with a period (so that they aren't shown when you issue a standard ls (Windows "dir") command), and usually stored in your home directory.  If everyone logs in as root, that's only one user as far as Unix is concerned, and so you'll end up getting the customizations of whoever set them last.  (This is like all of us sharing the same Windows profile.)  So, you might want to be able to login and be you.  Of course, if the machine is blown away by someone and rebuilt, well, your "dot files" may not live through that.

Etiquette

As root, you can change "system files."  This will cause a long and vehement stream of "bloody hell"s to be emitted from the next user of that machine.  For example, you can change the files in /usr/include, which is where things like <stdio.h> are kept.  Don't.

Where to Keep Files

You should keep your private copy of the full source on greer or baughm.  Do not keep the source in the home directory you'll have when you login to those machines.  That directory is your standard (Unix) instructional home directory.  If you attempt to extract that source into that directory you will immediately exceed your storage quota.

Instead, god willing, you'll find a directory set up for your use in /scratch on greer and/or baugh.  That is where you should do your work.  Note, though, that these files are not backed up.  The systems are stable, but it would be a good idea to copy the source files you have changed (and only those) to your instructional account file space from time to time.  (Instructional account file space, e.g., your home directory on those machines, is backed up.)

Useful Unix Commands

There are a lot of them.  Here are some places to begin.

Unix Shells

The Unix shell is the program that reads the commands you type and figures out what they mean.  In Unix, it's easy to create new shells, and there are lots of them out there.

When you're running as root, you'll be using the bash shell.  When you're running as you, you'll be using some shell, maybe bash, maybe tcsh, maybe something else.  This means you might be talking to different command interpreters at different times.  For simple interactions, you won't be able to tell the difference among the shells, and it won't matter.  As you get more sophisticated, either you'll just automatically adjust to whichever shell you happen to have (your fingers will do the thinking), or else you'll figure out how to change your shell.

Unix Files

It's just like Windows (folders are called directories; files are called files).  Unix does not rely on the filename (and in particular the file extension) to tell it what the type of file is, for the most part.  But, you should know that .c files are C source files, .h files are header files (just like in MSVC), and .o files are (essentially) machine instructions - the result of compiling something.

For what it's worth, moving around in the Linux file system is just like moving around in a Windows file system using a command line (DOS) window, more or less.  cd <dir> changes the working directory to the named directory.  ".." is the name of the parent of the current directory, so cd .. moves up one level in the file tree.  "." is the name of the current directory.  pwd prints the (fully qualified) name of the current directory.  ls lists the names of the files in the current directory.

Editors

Vi and emacs.  There's some information on these in your Running Linux book; there's also information online (man, for instance), and undoubtedly on the web.  Plus, there's information about them sitting in front of screens in the various instructional labs.

Getting Along in Unix

The biggest distinction between Unix and Windows is attitude.  The Unix attitude is "I won't give you a fork, because some people want 3 tine forks and some want 4, so I'll instead give you a handle and a box of  tines and a way to connect them, and you build what you want." ("And, oh, by the way, maybe what I should really give you is some iron ore and a smelter and you can make a lot more cutlery than forks.")  The Windows attitude is "You say food and the next thing you know a complete menu Microsoft has chosen for you based on your past preferences appears magically in your mouth.  What could be more convenient?"

To tell you the truth, these days I prefer Windows when I actually want to get something done.  But, there's a lot (a huge lot) to be said for the Linux way of doing things.  It will seem crude at first, but once you get the hang of it, you'll find that there is amazing power in providing a (reasonably thought-out) set of basic building blocks that know how to talk to each other, and trusting the user to find ways to make new things out of them.