# Project 1 hints

Here is one way to perform the experiments. In the text below, n refers to the number of variables in the CNF and m the number of clauses.

## 1. Finding hard problems

In the Mitchell/Cook paper, they did an experiment (p.8) on DPLL, showing that at a certain point the problems become very hard, and then drops off. It is related to the structure of the CNF. The second graph shows the relationship between the % of problems solvable in the range. You can carry out a similar experiment with a couple of different values for n to see if you get similar results.

## 2. Tuning Walksat

Initially Walksat seems to perform much worse than DPLL. However, when given good parameter values, Walksat can do very well. In the "Evidence for Invariant ..." paper, they describe such phenonmenon. You can implement sophisicated adjustment described in the paper. A more straightfoward approach is to take a number of harder problems (which you know how to generate from part 1) and run different values of Nflips, Nrestarts and p against it. To test the values of one parameter, you need to fix the other parameters. For example, to test for values of p, you need to fix Nflips and Nrestarts.

## 3. Comparing DPLL & Walksat

From part 1 above, you should now be able to tell what are easy and hard problems, and generate them accordingly. Try to run Walksat and DPLL on these problems, and see what happens. One possible way to evaluate the algorithms is to cut off the search at a fixed time, say a few minutes. Find some n so that DPLL fails within the time threshold, and test how Walksat performs given the timethreshold. You can make plots similar to figure 1 of Mitchell/Cook paper, but with data for both DPLL and Walksat.

You can vary the length of clauses in part 1 to get a sense of how length of clause affects problem hardness, .e.g instead of using 3-CNF in Figure 1 of Mitchell/Cook paper, you can try the same experiments with 5-CNF or 7-CNF to see how the results differs from 3-CNF. In part 2 and 3, you shouldn't worry about the length, instead focus on choosing a range of m and n that would perform reasonably on your solvers. In the Mitchell/Cook paper, they have n=50 and m ranging from 2 to 8, but depending on your implementation of your solvers and the time you have, you may want to change these numbers. You'd have trouble finishing the project if the problems take an hour to run, but would probably be ok if they run for 10 or 15 minutes or less. The important thing about this project is to find the critical region of change shown in Figure 1 of the Mitchell/Cook paper and see how Walksat and DPLL perform before, during and after this critical region. For each configuration of CNF you test with different n and m, remember to generate a number of different CNFs with that configuration and get an average; otherwise, your data may not be well-behaved. If you want to plot the graph in the bottom half of Figure 1 in the Mitchell paper, you also need a meaningful sample size. How big a sample would vary for different generators/solvers. You can start with some arbitrary size say 20 or 30, and add to the dataset if the plots doesn't look quite like Figure 1.

If you are running out of time and can't read all the papers, you should at least try to understand section 3 of the Mitchell/Cook paper.

Also remember to run each problem instance with Walksat for a few times to get an average number, since its performance varies from run to run, you need to get an average across a few runs.

There are alternative versions of walksat, one flipping v from all available variables, and one from a random unsatisfied clause. You can choose to implement either, the Weld version should give a better hill climbing heuristic, even though it's slower on each flip. In the end my guess is the improvement in speed per flip you get from the faster approach may be offset by the more number of flips required to reach the goal, but the choice is up to you, just remember to state what you're implementing in your report. It would also be interesting to compare both versions of Walksat empirically.