HW1 Solutions ------------- 1. List how many spams were sent from the Cayman Islands? SELECT msgcount FROM StatCountries WHERE countryname = 'Cayman Islands' 2. List how many spams were sent from countries whose names begin with 'D'? SELECT SUM(MsgCount) FROM StatCountries WHERE CountryName LIKE 'D%' 3. List the top 10 domains (and the corresponding countries) from which spams were sent. List them in decreasing order of number of spams sent. SELECT TOP 10 DomainName, CountryName FROM StatCountries ORDER BY MsgCount DESC 4. List the top 10 spammers in this database. List them in decreasing order of the number of spams they sent. The table returned should have two columns: one named "FromAddr" and the other named "NumSpams". SELECT TOP 10 FromAddr, COUNT(FromAddr) AS NumSpams FROM UnicodeHeaderV2 GROUP BY FromAddr ORDER BY NumSpams desc 5. List all countries from which spam was sent with both the word 'free' and 'game' in the subject. SELECT DISTINCT d.country AS Country FROM UnicodeHeaderV2 AS h, Domains AS d WHERE Subject LIKE '%free%' AND Subject LIKE '%game%' AND d.name = h.DomainName 6. List the number of spams sent in each (non-null) language. The table you return should have two columns: one named 'Language' and the other named 'NumSpams'. SELECT l.name AS Language, COUNT(b.langid) AS NumSpams FROM UnicodeBody AS b JOIN Languages AS l ON b.langid = l.id WHERE langid IS NOT NULL GROUP BY l.name 7. List the number of spams sent in each (non-null) language by the from address of 'Mail'. The table returned should have two columns: one named "Language" (this should be the name of the language) and the other named "NumSpams". SELECT l.name AS Language, COUNT(b.LangId) AS NumSpams FROM UnicodeHeaderV2 AS h, UnicodeBody AS b, Languages AS l WHERE h.FromAddr = 'Mail' AND h.id = b.id AND b.LangId IS NOT NULL AND l.id = b.LangId GROUP BY l.name 8. List the set of all domains from which the database has no spams. SELECT DISTINCT name as Domains.name FROM Domains LEFT JOIN UnicodeHeaderV2 ON domains.name = UnicodeHeaderV2.DomainName WHERE UnicodeHeaderV2.DomainName IS NULL or SELECT DISTINCT name as DomainName FROM Domains as d WHERE NOT EXISTS (SELECT * FROM UnicodeHeaderV2 as h WHERE h.DomainName = d.name) or SELECT DISTINCT name FROM Domains WHERE name NOT IN (SELECT DISTINCT DomainName FROM UnicodeHeaderV2 WHERE DomainName IS NOT NULL) 9. List the set of spammers (their from address only) that sent spams in both English and French. SELECT DISTINCT FromAddr FROM UnicodeHeaderV2 as h, UnicodeBody as b, Languages as l WHERE h.id = b.id AND b.langid = l.id AND l.name = 'English' AND EXISTS (SELECT DISTINCT * FROM UnicodeHeaderV2 as h2, UnicodeBody as b, Languages as l WHERE h2.id = b.id AND b.langid = l.id AND l.name = 'French' and h.fromAddr = h2.fromAddr) SELECT DISTINCT h.FromAddr FROM UnicodeHeaderV2 as h join UnicodeBody as b1 on h.id = b1.id join UnicodeBody as b2 on h.id = b2.id join Languages as l1 on l1.id = b1.langid join Languages as l2 on l2.id = b2.langid where l1.name = 'French' and l2.name = 'English' 10. Write your own unique query. Try to make it at least as interesting/difficult as problems 5 and 6.