Sharing Lots of Stuff: XML and Content Management

 

What is XML? (Extensible Markup Language)

Answer: A markup format. It's just Unicode text in a plain text file.

What Makes It Important?

Answer: It's SEMANTIC markup, that is, markup that self-describes (to some degree). For example, you might indicate what you ate for lunch inside <Lunch> tags, and what you ate for dinner inside <Dinner> tags, etc. If you send that stuff to your collaborators in Transylvania, they could understand your cuisine, presumably.

Significance check for the inattentive:   <Lunch> and <Dinner> are not defined as part of HTML, XHTML, etc. You've extended these markup languages with your own tags. Hence "Extensible" Markup Language ... "extended" ... "extensible" ... get it?

So What's The Big Deal?

Answer: Everybody can do their own thing. You can label your data any way you like. We finally separate content from presentation. XML is often hailed as the foundation of the Second-Generation Web.

Something to read:   XML and the Second-Generation Web, by Jon Bosak and Tim Bray

Significance check for the inattentive:   XML is designed for sharing across geography, time and application. It becomes a fundamental form of data. For example, Microsoft's .NET initiative is premised on information shared among Windows forms, Web forms, Web Services, etc., as XML.

Show Me An Example of XML

<?xml version="1.0"?>
<animal>
	<dog>Lassie</dog>
	<dog>Rin Tin Tin</dog>
</animal>
	
Document Object Model
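
Something to try:   A minimal sketch of how a program sees the example above. It assumes Python and its standard-library xml.etree.ElementTree module, neither of which appears in these notes. The parser turns the text into a tree: one animal node with two dog children. The describe() helper is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The first example from these notes, as a plain string.
DOC = """<?xml version="1.0"?>
<animal>
    <dog>Lassie</dog>
    <dog>Rin Tin Tin</dog>
</animal>
"""

root = ET.fromstring(DOC)


def describe(element, depth=0):
    """Return one indented line per element: tag name plus any text."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines


tree_lines = describe(root)
print("\n".join(tree_lines))
```

The indentation of the printed lines mirrors the nesting of the tags, which is the tree a Document Object Model exposes.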

Show Me Another Example of XML

<?xml version="1.0"?>
<dog>
	<animal breed="Collie">Lassie</animal>
	<animal breed="German Shepherd">Rin Tin Tin</animal>
	<animal breed="none">Mutt</animal>
</dog>
Document Object Model
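
Something to try:   The second example adds attributes. A minimal sketch, again assuming Python's standard-library xml.etree.ElementTree (an assumption, not something these notes use), of reading both the element content and the breed attribute:

```python
import xml.etree.ElementTree as ET

# The second example from these notes, as a plain string.
DOC = """<?xml version="1.0"?>
<dog>
    <animal breed="Collie">Lassie</animal>
    <animal breed="German Shepherd">Rin Tin Tin</animal>
    <animal breed="none">Mutt</animal>
</dog>
"""

root = ET.fromstring(DOC)

# Element content comes from .text; attribute values come from .get().
breeds = {animal.text: animal.get("breed") for animal in root.findall("animal")}

for name, breed in breeds.items():
    print(f"{name}: breed={breed}")
```

Notice that the program does not care that the tag names are the other way around from the first example. The labels are whatever the author chose.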

How about a more impressive example of XML?   An organization chart


Something to read: XML in 10 points

Something to read: Understanding XML by Dare Obasanjo, July 2003

"XML is preferable to previous data formats because XML can easily represent both tabular data (such as relational data from a database or spreadsheets) and semi-structured data (such as a Web page or business document). Popular pre-existing formats such as comma separated value (CSV) files either work well for tabular data and handle semi-structured data poorly, or like RTF are too specialized for semi-structured text documents. This has led to the widespread adoption of XML as the lingua franca of information interchange."

Dare Obasanjo
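
Something to try:   Both kinds of data from the quote can be seen in a few lines. This is a sketch under assumptions: the CSV rows, the <table>/<row> element names, and the <note> sentence are all invented for illustration, and the code uses Python's standard-library csv and xml.etree.ElementTree modules.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical tabular data, as a database or spreadsheet might export it.
CSV_TEXT = "name,breed\nLassie,Collie\nRin Tin Tin,German Shepherd\n"


def csv_to_xml(csv_text):
    """Turn each CSV row into a <row> element with one child per column."""
    table = ET.Element("table")
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(table, "row")
        for column, value in record.items():
            ET.SubElement(row, column).text = value
    return table


table = csv_to_xml(CSV_TEXT)
xml_text = ET.tostring(table, encoding="unicode")
print(xml_text)

# Semi-structured data: markup embedded in running text.
# CSV has no natural way to express this; XML does.
note = ET.fromstring("<note>Feed <dog>Lassie</dog> at noon.</note>")
sentence = "".join(note.itertext())
print(sentence)
```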

Pit Stop To Fill Up On Jargon And Acronyms

  • W3C - World Wide Web Consortium
  • XML - Extensible Markup Language
  • An XML "document" has XML elements and XML attributes
  • DOM - Document Object Model
  • XSLT - Extensible Stylesheet Language Transformations
  • CSV - Comma-Separated Values files (i.e., the kinds of files that databases export)
  • RTF - Rich Text Format (i.e., the kinds of files that word processors export)

Will XML Replace Databases?

No, they are complementary, not competing, technologies. XML is good for sharing data. Databases are good for serving data.

Something to think about:

What about storage? I thought databases were good for storing data. Hmmm. Terry is being really cagey here.

Can A Database Serve XML?
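
Something to try:   One way the answer can be "yes". This is a sketch, not a recommendation of any product: an in-memory SQLite database (via Python's standard-library sqlite3 module) stands in for a real database server, and the dogs table and dogs_as_xml() function are invented for illustration. Some database products can also emit XML directly from a query; here the conversion happens in application code.

```python
import sqlite3
import xml.etree.ElementTree as ET

# An in-memory SQLite database stands in for a real database server.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE dogs (name TEXT, breed TEXT)")
connection.executemany(
    "INSERT INTO dogs (name, breed) VALUES (?, ?)",
    [("Lassie", "Collie"), ("Rin Tin Tin", "German Shepherd")],
)


def dogs_as_xml(conn):
    """Query the dogs table and wrap each row in a <dog> element."""
    root = ET.Element("dogs")
    for name, breed in conn.execute("SELECT name, breed FROM dogs ORDER BY name"):
        ET.SubElement(root, "dog", breed=breed).text = name
    return ET.tostring(root, encoding="unicode")


served = dogs_as_xml(connection)
print(served)
```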

Can XML Fill A Database?
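
Something to try:   The reverse direction, sketched under the same kind of assumptions: the incoming document, the dogs table, and the load_dogs() function are hypothetical, and an in-memory SQLite database (Python's standard-library sqlite3 module) stands in for whatever database you really use.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A hypothetical XML document received from a collaborator.
INCOMING = """<dogs>
    <dog breed="Collie">Lassie</dog>
    <dog breed="German Shepherd">Rin Tin Tin</dog>
</dogs>"""

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE dogs (name TEXT, breed TEXT)")


def load_dogs(conn, xml_text):
    """Insert one row per <dog> element and return the number of rows added."""
    rows = [(dog.text, dog.get("breed")) for dog in ET.fromstring(xml_text).iter("dog")]
    conn.executemany("INSERT INTO dogs (name, breed) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


added = load_dogs(connection, INCOMING)
stored = connection.execute("SELECT name, breed FROM dogs ORDER BY name").fetchall()
print(added, stored)
```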

 

Can I Mix XML And HTML Together?

 

XML Data Islands
Editorial Note:  XML Data Islands are interesting and you should be aware that they exist, but they probably are not in everybody's standard toolbox.

 

First create an XML island and put it in the <head> of your HTML document:
	<xml id="person">
	<week>
		<Monday>Monday's child is full of grace</Monday>
		<Tuesday>Tuesday's child is lost in space</Tuesday>
		<Wednesday>Wednesday's child is going to waste</Wednesday>
	</week>
	</xml>
Next use the <span> tag and the datasrc (data source) and datafld (data field) attributes.
You can use the <span> tag anywhere you like in your HTML page:
	
	<span datasrc="#person" datafld="Monday"></span>
	
For example, here is the <span> tag in a table:	
	
	<table datasrc="#person">
	 <tr><td><span datafld="Monday"></span></td></tr>
	 <tr><td><span datafld="Tuesday"></span></td></tr>
	</table>
	
Or, to get fancy, you could put it inside some JavaScript:	 
	
	<script language="javascript">
	   for (var i = 0; i < 5; i++) {
		document.write("<span datasrc='#person' datafld='Wednesday'></span><br>");
	   }
	</script>

See an example HTML page with an XML data island   The source code
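
Something to try:   The lookup that datasrc and datafld describe can be imitated outside the browser. The sketch below is not how any browser implements data binding. It is a Python illustration (standard-library xml.etree.ElementTree) with an invented resolve() function and an invented ISLANDS table: find the island by its id, then find the element named by datafld.

```python
import xml.etree.ElementTree as ET

# The island from the example above, keyed by its id attribute.
ISLANDS = {
    "person": """<week>
        <Monday>Monday's child is full of grace</Monday>
        <Tuesday>Tuesday's child is lost in space</Tuesday>
        <Wednesday>Wednesday's child is going to waste</Wednesday>
    </week>""",
}


def resolve(datasrc, datafld):
    """Mimic a span with datasrc="#id" and datafld="Field": return the field's text."""
    island = ET.fromstring(ISLANDS[datasrc.lstrip("#")])
    field = island.find(datafld)
    # A field that is not in the island yields an empty string here.
    return field.text if field is not None else ""


print(resolve("#person", "Monday"))
```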

 

Content Management

Concern for managing business information has a long and distinguished history; methods have evolved as new information technologies have appeared.

"Ultimately, vertical filing, first presented to the business community at the 1893 Chicago World's Fair (where it won a gold medal), became the accepted solution to the problem of storage and retrieval of paper documents. From the beginning, vertical files were presented as a way of combining all documents (incoming, outgoing, or internal) in a single system organized by subject, location, or any categorization appropriate to the business. Thus the technology included not just file drawers and folders to hold papers; it also included the bureaucratic technique of organizing the papers into folders and the folders into drawers by an appropriate and accessible filing system."

"Indeed, articles on filing appeared in management periodicals, and textbooks focused wholly or partially on filing proliferated, as filing systems became closely associated with systematic methods."

JoAnne Yates, "Business use of information and technology during the industrial age," pp. 107-36 in A Nation Transformed by Information: How Information has Shaped the United States from Colonial Times to the Present. Edited by A.D. Chandler & J.W. Cortada. 2000.


Something to think about:

You can easily imagine that if an organization has its information marked up as XML, somebody is going to come along and point out that $ can be saved if information is REUSED; that is, there is a SINGLE SOURCE for your name and address, and many applications use that single information listing. In this way, your many applications become INTEROPERABLE, saving you even more $.

All you have to do to save $ is buy our product that interoperably reuses a single source, etc.


You don't believe me?

  • Greater Interoperability "Support for XML Web services and the Microsoft .NET common language runtime enables your Content Management Server 2002 applications to achieve seamless interoperability with the rest of your business systems regardless of operating environment and development language barriers."
  • XML for Data: Reuse it or lose it  "In enterprise-level solutions, one of the most challenging problems facing XML designers is how to design structures that can be reused."
  • XML & Single Source Content: Maximizing Content Maneuverability  "XML helps you solve that problem by making it possible to create a single XML-based source of media independent, reusable content for automatic, customizable formatting and publishing to all types of media."
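
Something to try:   The single-source idea can be sketched without buying anything. Everything here is hypothetical: the contact record, its element names, and the two little "applications" are invented for illustration, using Python's standard-library xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# The single source: one hypothetical contact record.
CONTACT = ET.fromstring(
    "<contact><name>Ann Example</name>"
    "<street>1 Main Street</street><city>Springfield</city></contact>"
)


def mailing_label(contact):
    """First 'application': a postal label built from the shared record."""
    return "\n".join(contact.findtext(tag) for tag in ("name", "street", "city"))


def greeting(contact):
    """Second 'application': a letter salutation built from the same record."""
    return "Dear " + contact.findtext("name") + ","


# Correct the name once, in the single source; both outputs pick it up.
CONTACT.find("name").text = "Ann B. Example"
label = mailing_label(CONTACT)
salutation = greeting(CONTACT)
print(label)
print(salutation)
```

The name is stored once and corrected once. Neither application keeps its own copy, which is the whole point of reuse.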

Right Around the Corner: InfoPath

Office 2003 will feature a new Microsoft application, InfoPath. InfoPath represents a customizable, end-user interface for creating structured data (translation: InfoPath is to XML as MS Word is to word processing).

InfoPath Usage Scenarios

Something to think about:

InfoPath will be a screaming success because it fills the need of collecting structured information at the periphery of an organization, e.g., a salesperson out in the field fills out a "smart" InfoPath form and sends it back to the organization, where it is "shredded" (i.e., its XML elements are sent to various databases), thereby accomplishing instant, intelligent update of an organization's information base.
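
Something to try:   A rough sketch of "shredding". InfoPath's real file format is not shown in these notes, so the form layout below is invented, as are the table names and the shred() function. Two in-memory SQLite databases (Python's standard-library sqlite3 module) stand in for the organization's separate systems.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A hypothetical form as a salesperson might submit it.
FORM = """<salesReport>
    <customer><name>Acme</name><city>Chicago</city></customer>
    <order><item>Widget</item><quantity>12</quantity></order>
</salesReport>"""

# Two separate in-memory databases stand in for the organization's systems.
customers_db = sqlite3.connect(":memory:")
customers_db.execute("CREATE TABLE customers (name TEXT, city TEXT)")
orders_db = sqlite3.connect(":memory:")
orders_db.execute("CREATE TABLE orders (customer TEXT, item TEXT, quantity INTEGER)")


def shred(form_xml):
    """Send each part of the form to the database that owns that data."""
    form = ET.fromstring(form_xml)
    name = form.findtext("customer/name")
    customers_db.execute(
        "INSERT INTO customers VALUES (?, ?)",
        (name, form.findtext("customer/city")),
    )
    orders_db.execute(
        "INSERT INTO orders VALUES (?, ?, ?)",
        (name, form.findtext("order/item"), int(form.findtext("order/quantity"))),
    )


shred(FORM)
customers = customers_db.execute("SELECT * FROM customers").fetchall()
orders = orders_db.execute("SELECT * FROM orders").fetchall()
print(customers, orders)
```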