CSE 326 Data Structures Project 3

CSE 326 Data Structures
Project 3
N-Body Simulation

Code due: Thursday, February 25, 1999, 11:30pm
Documentation due: Friday, February 26, 1999, in class

Objectives

Design geometric data structures for representing large sets of points.
Design algorithms for N-body simulation when N is large and approximations are needed.

Overview

The problem of N-body simulation is the following. Suppose we know the position, velocity, and masses of a large collection of bodies (planets). Given our understanding of physics and gravity, how can we determine the position and velocity of those bodies far into the future?

In this assignment we will examine a technique for solving this problem. It should first be noted that it is not ever possible to solve this problem exactly. However, our solution will allow us to achieve an arbitrary degree of accuracy, provided that we are willing to sacrifice time to wait for the answer.

We have provided you with a program called nbody for performing N-body simulation. You will extend this program and greatly speed-up its calculations by incorporating the Quadtree data structure.

Measuring Accuracy

Suppose we come up with a new technique for solving this problem. How can we determine the accuracy of the solutions it produces? Luckily for us, this question is much easier to answer than the original problem.

Two basic axioms of physics are that the total energy and total angular momentum of any closed system never change. This suggests a method for measuring the accuracy of our solution: compute the total energy and total angular momentum in the original state and in the solution, then compare those numbers. The more accurate our solution, the closer those numbers will be.

In the nbody program that you will be modifying, pressing the 'e' key in the main simulation window will have the program print out the total energy of the system. Likewise, pressing the 'l' will print out the total angular momentum of the system.

Accurate Simulation

The first issue in creating an N-body simulation program is to come up with a technique for solving this problem accurately.

Suppose that we were trying to simulate the motion of a single body in one dimension. We are given the current position (x), velocity, and mass of this body at some time t, and we are asked to determine the position of this body at some later time t + s (the number s is called the "time step"). A simple technique for producing an approximate answer to this problem is shown in the following figure.

Here we simply assume that the body is going to continue moving in the same direction. Then, the position at time t + s is simply x + s v, where v is the velocity. As you can see, this method will be more accurate if the time step s is smaller. In particular, if s = 0, then the error would be 0.

If we are given the position and velocity at time t₀ and we want to know the position and velocity at a much later time t₁, then we can achieve better accuracy by extending the method outlined above as follows. We start by picking some time step, s, such that the error in calculating the position at t + s, given the position at time t, is sufficiently small. Then, we use the method from above to compute the position at time t₀ + s from the position at t₀. Next, we use the method from above to compute the position at time t₀ + 2s from the position at t₀ + s. We repeat this to compute the position at time t₀ + 3s, t₀ + 4s, and so on, until eventually we have the time at t₀ + Ns = t₁. This technique, called the "Euler Method", is shown pictorially in the following figure.

It can be shown that the error in the first method is proportional to s while the error in the latter method goes to zero like s² (remember: s is small, so s² is even smaller).

The Euler Method uses the value of the velocity at only a single point in order to determine the position of the body at time t + s. What we are actually given in this problem, is the ability to compute the value of the velocity at any point. One thing we could try is to compute the value of the velocity at several points and then combine those numbers somehow to produce a better estimate. One method that does this is the classic "Fourth-Order Runge-Kutta" method. It computes the value of the velocity at 4 cafefully chosen points to make its estimate. The important question is whether or not this extra work pays off.

It should be clear that computing the velocity at 4 points instead of just 1 will make the simulation take approximately 4 times as long to compute the positions at the next time step. Suppose that this method decreases the error by a factor of r (i.e., the new error is the old error divided by r). Then this means that we can achieve the same accuracy by using a time step which is r times longer. This would achieve the same accuracy and use r^-1 times less intermediate points. Therefore, the Runge-Kutta method is an improvement over the Euler method if r is greater than 4. What did the value of r turn out to be in our nbody program? It was approximately 10,000,000 = 10⁷. The Runge-Kutta method was quite an improvement! (In general, it can be shown that the error in the Fourth-Order Runge-Kutta method goes to zero like s⁵.)

In our N-body simulation program, the situation is a bit more complicated than described above. First of all, our points are 2-dimensional instead of 1-dimensional. Secondly, our universe is described by 2 equations instead of 1. Let me elaborate on this second point.

We said above that we were given the ability to compute the velocity at a given point. We used the velocity, then, to compute the value of the position of each point at the next time step. In our simulation, we have the ability to compute the acceleration of each point. This is not so much of a problem. What we need to do is keep track of the position and velocity of each body. We can use those values to compute the acceleration of each body. We use the acceleration to compute the velocity at the next time step (t + s), and we use the velocity and acceleration to compute the position at the next time step. Thus, the techniques described above are still applicable.

Fast Simultion

The second order of business in creating a good N-body simulation program is make it fast.

The computation time of our N-body simulation program is greatly dominated by the 4 acceleration calculations used by the Runge-Kutta method to determine the positions and velocities at the next time step. Thus, in order to speed up our program, we should concentrate on speeding up these acceleration calculations.

The obvious, brute-force method for finding the acceleration of each body given their current positions is to compute the force of gravity between every pair of bodies. Specifically, for each body b_i, we compute the total force on b_i by summing up the force between b_i and every other planet b_j. Once we have the total force, we can compute the acceleration via Newton's Third Law: F = ma (or a = m^-1F).

If we have N points, the brute-force method will require O(N²) time to compute the acceleration of each body. This is unacceptable for large N. The key to speeding up these distance calculations is to figure out where this method is doing unnecessary work. Such a situation is shown in the following figure.

Here we are trying to compute the force on the body on the left due to the gravity of the bodies on the right. In this situation, the bodies on the right are well-approximated by their center of mass. Instead of computing the force between the body on the left and each of the bodies on the right, we should simply make one force calculation: the force between the body on the left and a single body, situated at the center of mass of the bodies on the right (shown in grey), with mass equal to the sum of the masses of the bodies on the right. Since the body on the left is far away from the bodies on the right and since the bodies on the right are clustered together, there will not be much error in this approximation.

Our technique will be the following. We will build at Quadtree to hold the bodies in our simulation. At each internal node of the Quadtree, in addition to the normal fields of a Quadtree node, we will store the center of mass of all the bodies in the tree rooted at that node (along with the sum of their masses). Then we will compute the total force on a some body b_i due to all of the other bodies recursively as follows. If we are at a leaf node, we compute the force between b_i and the body stored in this node and add that force into the running total. If we are at an internal node, first we determine the amount of error that we would endure by approximating the bodies in the subtree rooted at this node by their center of mass. If this amount of error is less than some threshold, compute the force between b_i and a body situated at the center of mass, with mass equal to the sum of the masses of the bodies in this subtree. Otherwise, recursively add in the force contributed by the bodies in each subtree of this node. Hopefully, this technique will have us search far fewer bodies than the brute-force method.

In order to implement this technique, we will need to implement a dynamic Quadtree data structure which implements the algorithm just described. This is your assignment.

Assignment

For this assignment, you are required to extend the nbody program to use Quadtrees for performing its accelerations calculations. First, your Quadtree implementation will need to support the basic operations of insert, delete, and range-query. Second, it will require an implementation of the algorithm described above to compute accelerations. Finally, it will need to support an operation for drawing the splitting lines of Quadtree.

What we've provided

We have provided you with a fully functional version of the nbody program. Its implementation consists of more than 7500 lines from more than 50 files. Fortunately, you will only need to look a couple of these: Quadtree.h and Quadtree.C (and possibly Simulation.h). Quadtree.h contains the declaration of the Quadtree data structure that you must implement. Quadtree.C is the file in which you will implement the Quadtree data structure. The full details of which functions you must implement can be found in Quadtree.h.

Handing in the code

You are required to hand in your project directory with your implementation of the Quadtree data structure. Included in this directory should be at least the Quadtree.h, Quadtree.C, and any other files that you added or modified.

Turn your files in electronically using turnin:

% cd ~/326stuff/myproj3dir % make clean % cd ../ % turnin -c cse326=AB -p project3 myproj3dir

Handing in the documentation

Your documentation should be one or two typewritten pages. Your figures can be hand drawn. It should contain the following:

A description of your Quadtree data structure and a brief explanation of why you chose this particular structure.
A high level description of your algorithms.

Also answer the following questions:

What are some good criteria for initializing a set of bodies with velocities? For example, how would you initialize an asteroid belt or a pair of rotating stars of different masses?
How do you find a good setting for the error bound in the simulation? If the error bound is too large then the simulation is very inaccurate and if it is too small then the simulation takes a long time for large N.
When the error bound is at the good setting you described above, what fraction of the nodes the tree are visited, on average, in performing an acceleration calculation?

Additional utilities

Along with the nbody program, project3 comes with a program called generate which generates random simulations that can be run with nbody. Use the following command to see its options.

generate -h

Its output is printed to the console. To generate a random simulation of 30 planets and save this simulation as a file called foo.sim you would type the following.

generate -n 30 > foo.sim