CSE 444 Homework 3

Objectives:
To be able to manipulate XML: query it with XQuery and export it from relational databases.
Reading Assignments:
Lecture notes on XML and XQuery, SQL Server documentation
Number of points:
100
Due date:
PART 1 November 2 10:30 AM (by email - no paper turn-in)
          PART 2 November 9 10:30 AM (by email - no paper turn-in)
Assignment Tools:
SQL Server, XQuery (Galax)
Here is a list of answers to some frequently asked questions.

PART 1

  1. [60 points] Consider the XML data instance Mondial, avaialable here. Write XQueries to answer the following questions. In formulating your questions, you need to understand how various elements are nested: e.g. what is under a country, under which element is a city etc. For that it helps if you inspect the DTD (ignore the warning that the data is not valid), or inspect the data directly. Deliverable: For each question below turn in the XQuery, and the result of running the query on the XML data (which should be an XML document).

    1. Retrieve all the names of all cities located in Peru, sorted alphabetically.
    2. Find all countries with more than 20 provinces.
    3. Find all ethnic groups that live in more than more than 10 countries.
    4. Find the countries adjacent to the 'Pacific Ocean' sea.
    5. Find the names of all countries that have at least 3 mountains over 2000m high, and list the names and heights of all mountains in these countries (regardless of their height). Note: the height attribute is in meters, so you don' have to do any conversions.
    6. One user is interested in long rivers. Produce the following view of the data, containing only rivers longer than 2000 (all units are in km), in the format described below:
      • The root element is user and contains several river elements
      • Each rivercontains a name element with the river's name, and several country elements, one for each country through which it flows. (Note: some rivers may not have any country, due to noise in the data. It is OK to include these rivers, even if they look as they flow through no country at all.)
      • Each country element contains only the name of the country (a string).

    For this question you need to run galax. For a brief introduction to getting started with galax, click on one of the following links: Click here for using Galax on Linux OR Click here for using Galax on Windows


  2. PART 2

  3. [15 points] The IISQLServer has a database named Mondial with the following table

                Mondial_Version3(DOC xml)

    The table has one row that contains the Mondial version 3.0 xml data. Your task is to write SQL statements using the nodes() and query() functions provided by the SQL Server to generate the following 2 tables.

  4.             Country(CountryName, Religion)
                City(CityName, CountryName, CityPopulation)

     
    Note:
    (1) If a country has say 3 religions, then your query should output three rows with one religion on each row. (2) If a city has many population sub-elements, then your query should output the population from the last population element only.

    Deliverable: Turn in all queries/programs you used to populate your database, as well as the name of your database.
     
    Hint:
    Read the SQL Server help on how to use the nodes() function. Click here to see an example of nodes() being used.

  5. [25 points] Consider the movies database we used in Homework 1. Generate an XML file with the following structure:

    Your XML file should include all the relevant data in the relational database, i.e. all actors, and for each actor all their movies. In this problem you can use a combination of the FOR XML feature of SQL Server and Galax queries to generate this XML file. Deliverable: Turn in all programs that you used to generate the XML file

  6.