Data Sets


So far, most of the data sets that I have are randomly generated. I would love to get more real data sets: a complete schedule for an airline or a city bus network would be great. To see the best reported tours, check the Speedy Tourist Page.

Railway Schedules

This data is from the Indian Bradshaw Railway guide (January 1996). My current effort has been to include the full set of routes between "major" cities in South India. A city was considered major if I had heard of it previously, or if it was a railway junction. The data was entered by hand, so it probably has a few bugs. The schedule is presented as a daily schedule, with the incorrect assumption that trains run every day. (A weekly schedule would be more accurate.) Another minor nit is that I have not distinguished between different railway stations in the same city, so you are allowed to depart from Madras Central one minute after your arrival at Madras Egmore.

Data from real schedules has different properties from the artificial data. The vertex degree is generally small (except for a few junctions), but the number of parallel edges can be very high. Arrival and departure times can be strongly correlated, so for good connections the waiting time in a city can be very small.
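As a rough illustration of waiting time on a daily schedule, here is a minimal sketch. It assumes times are integers in minutes since midnight and that the schedule repeats every 24 hours, as described above. The function names are hypothetical and are not part of the distributed code. Whether a zero-minute connection is legal is not stated here, so the sketch simply allows it.

```python
MINUTES_PER_DAY = 24 * 60  # the schedule is assumed to repeat daily


def waiting_time(arrival: int, departure: int) -> int:
    """Minutes between arriving in a city and the next occurrence of a
    departure, on a schedule that wraps around at midnight."""
    return (departure - arrival) % MINUTES_PER_DAY


def earliest_departure(arrival: int, departures: list[int]) -> int:
    """Pick the departure time (hypothetical helper) with the shortest wait."""
    return min(departures, key=lambda d: waiting_time(arrival, d))


# Arrive at 23:50, candidate departures at 10:00, 00:20 and 23:20.
print(waiting_time(1430, 20))                    # wait across midnight
print(earliest_departure(1430, [600, 20, 1400]))
```

With many parallel edges between the same pair of cities, a search would apply this kind of selection to every outgoing route rather than to a single list.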

This data set is now complete, with the 75-city example covering essentially all the rail routes in southern India.

Random Instance Generators

Several different data sets have been generated. For some of the data sets, I have used real cities, and then randomly generated bus routes between pairs of cities. The source code for the generators is available and a sketch of the generation methods is given below. The graphs are checked for strong connectivity before being included in the official data set.
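The strong connectivity check can be sketched as follows. This is not the code used by the generators; it is one standard way to do the test, assuming cities are numbered 0 to n-1 and that each route is reduced to a (from_city, to_city) pair. A directed graph is strongly connected exactly when every city is reachable from city 0 and city 0 is reachable from every city.

```python
from collections import deque


def reachable(start, adjacency):
    """Return the set of cities reachable from `start` by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def is_strongly_connected(num_cities, routes):
    """Check strong connectivity of a route graph.

    `routes` is an iterable of (from_city, to_city) pairs; times are
    ignored because only the existence of a connection matters here.
    """
    if num_cities == 0:
        return True
    forward, backward = {}, {}
    for u, v in routes:
        forward.setdefault(u, []).append(v)
        backward.setdefault(v, []).append(u)
    everyone = set(range(num_cities))
    # Search the graph and its reverse from the same city.
    return (reachable(0, forward) == everyone
            and reachable(0, backward) == everyone)


print(is_strongly_connected(3, [(0, 1), (1, 2), (2, 0)]))
print(is_strongly_connected(3, [(0, 1), (1, 2)]))
```

Parallel edges are harmless in this check, since repeated pairs only add duplicate neighbours that the search skips.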

Data Sets

Each data set consists of a pair of files: a schedule file and a city file. The schedule file has all of the information about the routes. The city file gives the names of the cities and their display coordinates. (The city files aren't really necessary for the problem.)

There are several families of data sets. Within each family, a common generation algorithm was used. Each problem instance specifies a starting time and a starting city. (The times and cities are given as integers: the minutes since midnight and the city number.)

Each instance is specified as: schedule_file, city_file, start_city, start_time
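To make the integer encoding concrete, here is a small sketch that holds the four fields of an instance and turns a start time into a clock label. The `Instance` type and `clock_label` function are hypothetical names for illustration. The on-disk layout of the files is not described here, so nothing is parsed. The example also assumes that the "Schedule" and "Map" columns of the tables below correspond to the schedule file and the city file.

```python
from typing import NamedTuple


class Instance(NamedTuple):
    """The four fields of a problem instance (hypothetical container)."""
    schedule_file: str
    city_file: str
    start_city: int   # city number
    start_time: int   # minutes since midnight


def clock_label(minutes: int) -> str:
    """Format minutes since midnight as a 12-hour clock time."""
    hours, mins = divmod(minutes % (24 * 60), 60)
    suffix = "am" if hours < 12 else "pm"
    hour12 = hours % 12 or 12  # 0 and 12 both display as 12
    return f"{hour12}:{mins:02d} {suffix}"


# Values taken from the first Southern Railway row below.
inst = Instance("south-12.tts", "south-12.ttm", 1, 480)
print(inst.start_city, clock_label(inst.start_time))
```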

Southern Railway

Problem       Schedule      Map           Start City  Start time
south-12.ttp  south-12.tts  south-12.ttm  1           480 (Bangalore, 8 am)
south-25.ttp  south-25.tts  south-25.ttm  2           480 (Bangalore, 8 am)
south-36.ttp  south-36.tts  south-36.ttm  2           480 (Bangalore, 8 am)
south-49.ttp  south-49.tts  south-49.ttm  3           480 (Bangalore, 8 am)
south-66.ttp  south-66.tts  south-66.ttm  3           480 (Bangalore, 8 am)
south-75.ttp  south-75.tts  south-75.ttm  3           480 (Bangalore, 8 am)

USA Complete

Problem     Schedule    Map         Start City  Start time
usa-10.ttp  usa-10.tts  usa-10.ttm  9           60 (Yakima, 1 am)
usa-12.ttp  usa-12.tts  usa-12.ttm  3           600 (Seattle, 10 am)
usa-15.ttp  usa-15.tts  usa-15.ttm  3           0 (Walla Walla, Midnight)
usa-20.ttp  usa-20.tts  usa-20.ttm  18          720 (Weed, Noon)

USA Sparse

Problem      Schedule     Map          Start City  Start time
usa-25.ttp   usa-25.tts   usa-25.ttm   1           510 (Reno, 8:30 am)
usa-30.ttp   usa-30.tts   usa-30.ttm   11          720 (Roswell, Noon)
usa-35.ttp   usa-35.tts   usa-35.ttm   27          720 (San Francisco, Noon)
usa-40.ttp   usa-40.tts   usa-40.ttm   38          1200 (Waycross, 8:00 pm)
usa-45.ttp   usa-45.tts   usa-45.ttm   9           540 (Traverse City, 9:00 am)
usa-50.ttp   usa-50.tts   usa-50.ttm   38          60 (Weed, 1:00 am)
usa-60.ttp   usa-60.tts   usa-60.ttm   59          180 (Toronto, 3:00 am)
usa-80.ttp   usa-80.tts   usa-80.ttm   31          1200 (Vicksburg, 8:00 pm)
usa-100.ttp  usa-100.tts  usa-100.ttm  56          360 (Regina, 6:00 am)
usa-128.ttp  usa-128.tts  usa-128.ttm  98          1080 (Waco, 6:00 pm)

UK Sparse

Problem     Schedule    Map         Start City  Start time
uk-25.ttp   uk-25.tts   uk-25.ttm   17          510 (St. Bees Head, 8:30 am)
uk-50.ttp   uk-50.tts   uk-50.ttm   4           720 (Middle Wallop, Noon)
uk-100.ttp  uk-100.tts  uk-100.ttm  99          600 (Glenlivet, 10:00 am)
uk-150.ttp  uk-150.tts  uk-150.ttm  100         500 (Lizard Lighthouse, 8:20 am)
uk-200.ttp  uk-200.tts  uk-200.ttm  193         1000 (London, 4:40 pm)
uk-284.ttp  uk-284.tts  uk-284.ttm  268         900 (Edinburgh, 3:00 pm)

Meshes

Problem      Schedule     Map          Start City  Start time
mesh-3.ttp   mesh-3.tts   mesh-3.ttm   0           100 (Calf of Man, 1:40 am)
mesh-4.ttp   mesh-4.tts   mesh-4.ttm   15          200 (Newton, 3:20 am)
mesh-5.ttp   mesh-5.tts   mesh-5.ttm   21          300 (Jubilee Corner, 5:00 am)
mesh-6.ttp   mesh-6.tts   mesh-6.ttm   3           400 (Leeds, 6:40 am)
mesh-7.ttp   mesh-7.tts   mesh-7.ttm   48          500 (Anvil Green, 8:20 am)
mesh-8.ttp   mesh-8.tts   mesh-8.ttm   3           600 (Bentwaters, 10:00 am)
mesh-9.ttp   mesh-9.tts   mesh-9.ttm   7           700 (Fair Isle, 11:40 am)
mesh-10.ttp  mesh-10.tts  mesh-10.ttm  68          800 (Guernsey, 1:20 pm)

Triangles

Problem     Schedule    Map         Start City  Start time
tri-3.ttp   tri-3.tts   tri-3.ttm   0           100 (North Rona Island, 1:40 am)
tri-4.ttp   tri-4.tts   tri-4.ttm   0           200 (Lydd, 3:20 am)
tri-5.ttp   tri-5.tts   tri-5.ttm   0           300 (Hawarden, 5:00 am)
tri-6.ttp   tri-6.tts   tri-6.ttm   1           400 (Lundy Island, 6:40 am)
tri-7.ttp   tri-7.tts   tri-7.ttm   13          500 (Greenham Common, 8:20 am)
tri-8.ttp   tri-8.tts   tri-8.ttm   35          600 (Sculthorpe, 10:00 am)
tri-9.ttp   tri-9.tts   tri-9.ttm   44          700 (Oban, 11:40 am)
tri-10.ttp  tri-10.tts  tri-10.ttm  21          800 (Inverness, 1:20 pm)
tri-11.ttp  tri-11.tts  tri-11.ttm  59          900 (St. Bees Head, 3:00 pm)
tri-12.ttp  tri-12.tts  tri-12.ttm  63          1000 (Muckle Flugga, 4:40 pm)