Does Any Of That Stuff Mean Anything?

Before we begin, stick your nose into some 20th century, Post-Modern, Post-Structuralist, French literary criticism...

Jacques Derrida, 1930- French philosopher. He argues that language only refers to other language, therefore negating the idea of a single, valid “meaning” of a text as intended by the author. Rather, the author’s intentions are subverted by the free play of language, giving rise to many meanings the author never intended.

What Are We Trying To Do Here? Ask a Librarian!

What we are trying to do is name things (and thereby give them a meaning). You do that when you construct XML documents, when you build databases, etc., and you name the XML element or the attribute of a relation, etc. It's what librarians are famous for. They've been doing that for about 100 years with the Dewey Decimal system and the Library of Congress Subject Heading list.

New Library of Congress Subject Headings and their corresponding Dewey Decimal Numbers. Date: July 18, 2003
So Let Allyce Do Her Thing And Give It A Name!

See, the problem is: which name? Many words can mean the same thing, and a single word can mean a lot of things.
Predictably, Allyce And Her Librarian Friends Will Probably Disagree About The Meaning

"About fifteen years ago, some evidence was brought to the attention of the field which indicated that, if several different indexers are all asked to index the same document, a great deal of inconsistency is likely to be apparent in the results. That is to say, different indexers are apt to assign quite different sets of index terms (i.e., descriptors, subject headings) to the same document. This evidence must have been received with considerable skepticism by those who believed that there is only one 'right' way to index a document and that any properly trained indexer has a pretty good idea of what that 'right' way is. Since then, however, the issue has received a great deal of attention, and the original findings have been amply corroborated by a number of independent tests. It seems that a substantial amount of interindexer inconsistency, as the phenomenon of conflicting indexer decisions has come to be called, is the rule rather than the exception."

William S. Cooper, "Is Interindexer Consistency A Hobgoblin?" American Documentation, July 1969, 268-278
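The inconsistency Cooper describes can be given a number. The sketch below is only an illustration: it uses a plain set-overlap (Jaccard) score, which is one common way to compare two term sets and is not taken from the passage quoted above. The function name and the two term lists are made up for the example.

```python
def indexer_consistency(terms_a, terms_b):
    """Share of index terms that two indexers agree on (Jaccard overlap)."""
    a = {t.strip().lower() for t in terms_a}
    b = {t.strip().lower() for t in terms_b}
    if not a and not b:
        return 1.0  # two empty term sets agree trivially
    return len(a & b) / len(a | b)


# Hypothetical terms two indexers might assign to the same document.
indexer_1 = ["Metadata", "Cataloging", "Subject headings"]
indexer_2 = ["Metadata", "Indexing", "Subject headings", "Information retrieval"]

score = indexer_consistency(indexer_1, indexer_2)
print(f"consistency = {score:.2f}")  # 2 shared terms out of 5 distinct ones
```

A score of 1.0 would mean both indexers chose exactly the same terms; the point of the quotation is that real scores tend to sit well below that.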
Even though the basic problems of language and meaning observed in paper technologies and the work of librarians were never solved, information technology sped forward. If you suppose, then, that we shall see these basic problems re-appear, but now manifested as "IT problems" or "Web problems" and maybe given new, whiz-bang names like "metadata," you win the prize for being prescient, far-sighted, forward-looking, insightful. (Wait! Those are synonyms! Synonymy is one of the problems!)

Feeling better after our historical interlude?
What Is Metadata?

Jargon? A subject heading used on the web.
Jargon? Information about information?
Jargon? The key building block of the Semantic Web.
Jargon? The foundation to interoperability.
Jargon? How a web author indicates the contents of a web page.
Jargon? What a librarian calls a subject heading, a webster calls metadata.
Jargon? What a relational databaser calls attributes, a webster calls metadata.
Jargon? A name for something.
Jargon? A description of something.
Jargon? Just about anything you want.
Jargon? A piece of inflated rhetoric to intimidate the uninitiated. Oh, that's about right!
Where Does Metadata Live?

The HTML 4.01 Specification (W3C Recommendation 24 December 1999) gives this example of a <HEAD> section with properties:

<HEAD profile="http://www.acme.com/profiles/core">
  <TITLE>How to complete Memorandum cover sheets</TITLE>
  <META name="author" content="John Doe">
  <META name="copyright" content="© 1997 Acme Corp.">
  <META name="keywords" content="corporate,guidelines,cataloging">
  <META name="date" content="1994-11-06T08:49:37+00:00">
</HEAD>

The specification notes:

"A common use for meta is to specify keywords that a search engine may use to improve the quality of search results. When several meta elements provide language-dependent information about a document, search engines may filter on the lang attribute to display search results using the language preferences of the user. For example:"

<!-- For speakers of US English -->
<META name="keywords" lang="en-us" content="vacation, Greece, sunshine">
<!-- For speakers of British English -->
<META name="keywords" lang="en" content="holiday, Greece, sunshine">
<!-- For speakers of French -->
<META name="keywords" lang="fr" content="vacances, Grèce, soleil">

Ouch! Metadata on the Web

"The tygers of wrath are wiser than the horses of instruction." William Blake

It appears to be a matter of belief. You are a person who believes that people all over the world will play nicely together, or you're a *&%#!@ cynic. You believe that people all over the world will cooperate in using metadata wisely or you're some kind of a +&^$#@~ nihilist.

A Statement of Belief: "Metadata is a key part of the information infrastructure necessary to help create order in the chaos of the Web, infusing description, classification, and organization to help create more useful stores of information." Metadata Principles and Practicalities by Erik Duval et al.

The Tygers of Wrath Teach Us

"Experience with the tag [meta keywords tag] has showed it to be a spam magnet. Some web site owners would insert misleading words about their pages or use excessive repetition of words in hopes of tricking the crawlers about relevancy." Danny Sullivan

Something to read: Death of a meta tag by Danny Sullivan, editor of SearchEngineWatch
Something to read: Metacrap by Cory Doctorow
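To make the lang-filtering idea from the HTML 4.01 excerpt concrete, here is a minimal Python sketch using the standard library's html.parser. It is an illustration only, not how any real search engine works; the class name MetaCollector and the function keywords_for are invented for this example.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects (name, lang, content) for every <meta> element with a name."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        a = dict(attrs)
        name = (a.get("name") or "").lower()
        if name:
            self.records.append((name, a.get("lang"), a.get("content") or ""))


def keywords_for(html, lang):
    """Return the keyword list whose lang attribute matches exactly."""
    parser = MetaCollector()
    parser.feed(html)
    for name, meta_lang, content in parser.records:
        if name == "keywords" and meta_lang == lang:
            return [k.strip() for k in content.split(",")]
    return []


page = """
<HEAD>
<!-- For speakers of US English -->
<META name="keywords" lang="en-us" content="vacation, Greece, sunshine">
<!-- For speakers of British English -->
<META name="keywords" lang="en" content="holiday, Greece, sunshine">
<!-- For speakers of French -->
<META name="keywords" lang="fr" content="vacances, Grèce, soleil">
</HEAD>
"""

print(keywords_for(page, "fr"))  # ['vacances', 'Grèce', 'soleil']
```

Notice that nothing in this code checks whether the keywords are true. The page author types whatever they like into content, which is exactly the opening the tygers of wrath walked through.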
Here's metadata on the web ... What is Dublin Core?

"The Dublin Core metadata standard is a simple yet effective element set for describing a wide range of networked resources. The Dublin Core standard comprises fifteen elements, the semantics of which have been established through consensus by an international, cross-disciplinary group of professionals from librarianship, computer science, text encoding, the museum community, and other related fields of scholarship." Diane Hillmann

An example:

<meta name = "DC.Subject" scheme = "MESH" content = "Myocardial Infarction; Pericardial Effusion">
<meta name = "DC.Creator" content = "Gogh, Vincent van">

The tiger bites...

"A discouraging aspect of metadata usage trends on the public Web over the last five years is the seeming reluctance of content creators to adopt formal metadata schemes with which to describe their documents. For example, Dublin Core metadata appeared on only 0.5 percent of public Web site home pages in 1998; that figure increased almost imperceptibly to 0.7 percent in 2002. The vast majority of metadata provided on the public Web is ad hoc in its creation, unstructured by any formal metadata scheme." Trends in the Evolution of the Public Web, 1998 - 2002 by Edward T. O'Neill, et al.
More metadata on the web ... What is RDF?

"The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource."
The 'Eric Miller' example.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>

The tiger bites again...

"Since the initial experiments indicate that RDF data is hard to find, a more targeted search was conducted.
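What does a machine actually get out of the 'Eric Miller' markup? RDF boils down to statements of the form subject, predicate, object. The Python sketch below pulls those statements out of the example with the standard library's XML parser. It is a toy that handles only this one shape of RDF/XML (a typed node with simple properties); the helper names expand and to_triples are made up here, and a real project would more likely reach for a dedicated RDF library such as rdflib.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

rdf_xml = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' form into a plain URI string."""
    return tag[1:].replace("}", "", 1) if tag.startswith("{") else tag


def to_triples(xml_text):
    """Flatten typed-node RDF/XML into (subject, predicate, object) tuples."""
    triples = []
    for node in ET.fromstring(xml_text):
        subject = node.get(f"{{{RDF_NS}}}about")
        # The element name itself states what kind of thing the subject is.
        triples.append((subject, RDF_NS + "type", expand(node.tag)))
        for prop in node:
            # A property points at another resource or carries literal text.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, expand(prop.tag), obj))
    return triples


for triple in to_triples(rdf_xml):
    print(triple)
```

Four statements come out: the resource is a Person, has the full name Eric Miller, has a mailbox, and has the personal title Dr. That is all the markup says, and a machine knows nothing more than that.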
Lots more metadata on the web ... What is OIL + DAML?
"The use of ontologies provides a very powerful way to describe objects and their relationships to other objects. The DAML language is being developed as an extension to XML and the Resource Description Framework (RDF). The latest release of the language (DAML+OIL) provides a rich set of constructs with which to create ontologies and to markup information so that it is machine readable and understandable. " About the DAML Language
Something to notice:
Did you see the description above? "machine readable" ... that means we've moved away from categorizing things for human beings and are designing metadata for machines to (pardon the jargon, but this is what they say) harvest. Think about the economics of time and energy. This stuff is time- and labor-expensive. Too expensive for just any HTML page. If you're going to mark stuff up with DAML+OIL, it would only make economic sense if the stuff was going to hang around for a long time. Can you say digital library? What about web service?

There are two types of animals, Male and Female.

<daml:Class rdf:ID="Male">
  <rdfs:subClassOf rdf:resource="#Animal"/>
</daml:Class>

It is perfectly admissible for a class to have multiple superclasses: A Man is a Male Person.

<daml:Class rdf:ID="Man">
  <rdfs:subClassOf rdf:resource="#Person"/>
  <rdfs:subClassOf rdf:resource="#Male"/>
</daml:Class>

Is there anything for the tiger to bite?

Terry Brooks says DAML + OIL is so leading edge and so complex a technology that it will probably never penetrate widely into the open, common web. There may be some large demonstration projects that are created to show proof of concept, but somebody will have to hold Terry's hand when he DAMLizes his web site.

The DAML community is happy just to get some attention: DAML.ORG has had over ten million hits as of Friday, 28 March, 2003. "The very large amount of activity for this web site reflects the significant interest around the world in DAML technology as it supports the emerging Semantic Web" HotDAML newsletter
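What can a machine do with the Male and Man class definitions? One small thing: follow the subClassOf links upward and work out everything a Man also is. The Python sketch below shows that inference on a hand-built dictionary holding only the subClassOf statements from the two definitions. It is an illustration of the idea, not a DAML+OIL reasoner, and the names SUBCLASS_OF and superclasses are invented for it.

```python
# Direct rdfs:subClassOf statements, copied by hand from the two class
# definitions in the text (Male, and Man with its two superclasses).
SUBCLASS_OF = {
    "Male": ["Animal"],
    "Man": ["Person", "Male"],
}


def superclasses(cls, graph=SUBCLASS_OF):
    """Return every class reachable by following subClassOf links upward."""
    found = set()
    stack = [cls]
    while stack:
        for parent in graph.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(superclasses("Man")))  # ['Animal', 'Male', 'Person']
```

Nobody wrote down that a Man is an Animal. The program worked it out from Man being a Male and Male being an Animal, which is the kind of payoff that is supposed to justify all that expensive markup.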
Lots and lots more metadata on the web ... What is OWL?

"The OWL Web Ontology Language is designed for use by applications that need to process the content of information instead of just presenting information to humans. OWL facilitates greater machine readability of Web content than that supported by XML, RDF, and RDF Schema by providing additional vocabulary along with a formal semantics."
Somebody call Redmond, Washington! Now semantics on the web has shifted away from presenting information to humans and toward presenting it for machine consumption. It's a revolution that has already happened. It's called web services. You can use Microsoft's VS.NET to write a web service so that your computer application can talk to my computer application.