handout #3

CSE341—Programming Languages

Assignment #2 (ML)

due 11 pm, Thursday, 4/15/10

In this assignment you will continue to practice with the basic building blocks of ML but we’ll include patterns and the let construct.  In other words, you are allowed to use any constructs from chapters 1 through 3 of the Ullman text.  You are not yet allowed to use higher order functions (functions that take other functions as arguments, like the map function) and you should not use the built-in function called rev even though it is briefly mentioned in chapter 3.

You are once again allowed to define “helper” functions to solve these problems and you can include any testing code you develop (although it should be labeled as such).  You are expected to use good programming style and to comment your functions.  All helper functions should be defined with a let construct so that they are not part of the top-level bindings.

In this assignment we will explore what is known as RSA encryption.  You don’t have to understand the details of the algorithm to do the assignment, but you might want to read about it online first.  For example, you can read about it at wikipedia:

http://en.wikipedia.org/wiki/RSA

or at Wolfram’s MathWorld:

http://mathworld.wolfram.com/RSAEncryption.html

RSA encryption involves computing with very large integers.  This requires special code that will perform the required math in a more efficient manner.  All of the encryption will involve integers, so we will need functions that convert between strings and sequences of numbers.  We will turn each character into its ordinal equivalent using the built-in ord function and will convert it back using the built-in chr function.  We will be using the built-in implode and explode functions to convert between string values and character lists.  To keep the math fairly easy, we’ll restrict ourselves to ASCII characters that have a range of 0 to 127 and we’ll even assume that we have no characters with ordinal value 0.

We don’t want to encrypt these simple ordinal values by themselves because there are only 127 of them.  That would mean that there are only 127 codes to guess, which would be fairly simple.  We need to combine several characters together to form a larger integer that we’ll encrypt.  I refer to this in the functions I’m asking you to write as packing and unpacking.

For an example of packing and unpacking, suppose you were dealing with numbers between 1 and 99 and you wanted to “glue” several of them together.  If you wanted to pack together the numbers [38, 3, 97], you could turn this into the integer 380397 by computing:

38 * 1002 + 3 * 1001 + 97 * 1000

Given the integer 380397 and knowing that we used the integer 100 for the packing operation, we could undo this to generate the original sequence.

You are going to write this in a general way as a function called pack that takes a list of integers to pack along with the number of integers to pack together and the base to use.

For example, suppose that you have defined the following list:

val lst = [18,3,95,48,22,39,47,12,73,15];

Below are a few examples of different calls on pack all using base 100 but packing together a different number of numbers on each call:

Notice that there might be extra values at the end of the list that are only partially packed.  For example, when packing 4 at a time with this list of 10 values there are 2 stray values at the end ([73, 15]) that are converted into 7315 rather than 73150000.

By packing the lists in this way and by guaranteeing that the value 0 does not appear in the list to be packed, we can unpack the list using just the original base (100 in the examples above) without knowing how many values were packed into each integer.

You should include the following variable bindings at the beginning of your file:

val p = 131;   (* first prime *)

val q = 239;   (* second prime *)

val n = p * q; (* modulus *)

val e = 1363;  (* public key *)

val d = 227;   (* private key *)

Given the functions you are developing, a person could encrypt a message to you knowing just the values of n and e (they can pick any value for per).  You then use the value d, which you keep private, to decode the message.  It is believed that others can’t break the coding scheme unless they can factor n.  In this example, it wouldn’t be difficult to break the encryption because n is so small.  In true RSA encryption, we would use much larger primes so that it would be impractical to factor n.  The bonus problem gives you a flavor of this by working with two 20-digit primes.

Once you have completed problems 1 through 7, you should include the following variable bindings at the end of your file:

val msg = "Twas brillig and the slithy toves did gyre and gimble";

val code = encrypt(msg, e, n, 2, 128);

val decode = decrypt(code, d, n, 128);

If your functions work, msg and decode should be equal.  Below is a list of the function prototypes for problems 1 through 7:

val stringToInts = fn : string -> int list

val intsToString = fn : int list -> string

val pack = fn : int list * int * int -> int list

val unpack = fn : int list * int -> int list

val modPow = fn : int * int * int -> int

val encrypt = fn : string * int * int * int * int -> int list

val decrypt = fn : int list * int * int * int -> string

You can use the built-in length function for lists, but you can’t use other list operations unless they are described in the first three chapters of the Ullman text.  You should collect all of your functions and variables into a single file called hw2.sml that you submit through the catalyst CollectIt link on the class web page.  Those doing the bonus should also submit a second file called hw2big.sml.

1.       (5 points) Define a function called stringToInts(s) that takes a string s as an argument and that returns a list of integers that represent the ordinal values (ASCII codes) of the sequence of characters in s.

2.      (5 points) Define a function called intsToString(lst) that is the inverse of the function you wrote in question #1, taking a list of ordinal values lst as an argument and returning the corresponding string.

3.      (30 points) Define a function called pack(lst, m, b) that takes as arguments a list of ints lst, a number m indicating the maximum number of ints to pack per int, and a base b to use for the packing and that returns the list obtained by combining integers in the given base.  In particular, consecutive sequences of m values from lst (a1, a2, …, am) are replaced with the single integer:

a1* bm-1+ a2* bm-2+ a3* bm-3+…+ am* b0

If the list ends with a sequence of fewer than m values, those values should be combined using the same formula above as if m were the length of the list instead.  Your function should return an empty list if passed an empty list.  You may assume that m and b are both greater than 0 and that the numbers in the list are all greater than 0 and less than b.  In testing your program, be careful about which cases you test because you might get integer overflow.

4.      (20 points) Define a function called unpack(lst, b) that is the inverse of the function you wrote in question #3, taking a list and a base as arguments and returning the list of ints obtained by expanding each integer using the given base.  Keep in mind that the original list did not have any occurrences of 0, so your function should pull apart each integer as many times as it can until it becomes 0.  You will want to use the div and mod operators to pull apart each integer into its constituent parts.  You may assume that the list/base combination you are asked to unpack was generated by a legal call on pack with the same base.

5.      (20 points) Define a function called modPow(x, y, n) that takes integers x, y and n as parameters and that efficiently computes (xy mod n).  You may assume that y is greater than or equal to 0.  You can use your pow function from assignment 1 as a starting point.  You need to make two modifications.  First, you should include a special case for even powers that takes advantage of the fact that for even values of y, xy = (xy/2)2 =(x2)y/2 (this equality actually indicates two different ways to make the computation efficient—you can use either).  If you correctly implement this optimization, your function will make at most O(log n) calls where n is the exponent you are computing.  Second, you should make sure that your function computes the mod often.  For example, it would be highly inefficient to compute xy in its entirety first and then to mod it by n only at the end of the computation.  Instead, we want to break up the computation into smaller computations that we mod by n.  The principle we will apply is that:

(x * y mod n) = ((x mod n) * (y mod n)) mod n

In writing your function, make sure that every time you multiply two values together that you compute the result in mod n before using the result in any other multiplication.  If we assume x is less than n, this will ensure that you never compute a value larger than (n - 1)2, which will significantly improve the efficiency of the computation.

6.      (10 points) Write a function encrypt(s, e, n, per, base) that takes as arguments a string s, an integer key e, a modulus n, an integer “per” (how many ordinals to pack into each encrypted integer), and an integer base and that returns a list of encrypted integers. The function should convert the string into a list of ordinal values, pack those ordinal values with the given base and packing the given number of ordinals per encrypted int, and it should then encrypt each int x in the resulting list by replacing it with the value obtained by computing:

xe mod n

7.      (10 points) Write a function decrypt(lst, d, n, base) that takes as arguments a list of encrypted ints, an integer key d, an integer modulus n, and an integer base, and that returns a decrypted string.  It should decrypt each integer x of the list by replacing it with:

xd mod n

and should then unpack the resulting list using the given base and should then convert the resulting list of ordinal values into a string.

Extra Credit

8.      (1 point) Once you have verified that you have a working solution to problems 1 through 7, save a new copy of the file as hw2big.sml.  In this second version we will use the IntInf structure that can handle large integers.  Replace the variable bindings at the beginning of the file with the following (including the command to open the IntInf structure):

open IntInf;

 

val p = toLarge(30000000000000000041);               (* first prime *)

val q = toLarge(40000000000000000019);               (* second prime *)

val n = p * q;                                       (* modulus *)

val e = toLarge(1255949);                            (* public key *)

val d = toLarge(6688169662940135319969202571123509); (* private key *)

Before you can proceed, you will have to modify your definitions for stringToInts and intsToString.  The ord and chr functions deal with ordinary int values (what will be listed as ?.int after you open the IntInf structure).  These ordinary ints can be converted back and forth with big ints by calling the functions fromInt and toInt.  The fromInt function converts an ordinary int into a big int.  The toInt function converts a big int into an ordinary int.  Once you have made the appropriate modifications, the function prototypes should look like this:

val stringToInts = fn : string -> int list

val intsToString = fn : int list -> string

Then replace the final two variable bindings with the following:

val code = encrypt(msg, e, n, 16, 256);

val decode = decrypt(code, d, n, 256);

The rest of the code should work without modification and should quickly encrypt the message, this time packing 16 characters into each encoded integer and allowing characters to have ordinal values as high as 255.  The call on decrypt should correctly decrypt the message.