Office XML File Formats Overview
Historically, productivity applications like word processors and spreadsheets have followed one of two approaches to storing their documents.
One approach is to use an industry-standard file format like plain text, RTF or HTML. With this approach, you have transportable files that can be viewed or edited with many different programs, but the functionality of the documents is limited by the specific format. For example, you can’t embed graphics in a text file, or apply IRM (Information Rights Management) to an RTF file.
The other approach is to use a proprietary format. In that case, the developer can decide what every byte means, and any conceivable functionality could be supported by the document. But since the format is proprietary, it may be very complex, and therefore it may not be feasible to try to edit the file without using the software that created it.
Office 12 combines the benefits of both of these approaches, through the new Office Open XML file formats. These formats are based on simple industry standards — ZIP compression and XML syntax — so that they can be opened and edited by a wide variety of software. But they also include support for the types of complex structures that Office users have come to expect, such as embedded graphics and IRM. Those types of capabilities are designed into the schemas that Microsoft has developed, and those schemas have been submitted to ECMA as an industry standard. (If you’re interested in the details of that process, Brian Jones has some good information on his blog.)
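Because the container is an ordinary ZIP archive, any language with ZIP and XML libraries can look inside one of these documents. Here is a minimal sketch in Python, assuming you have a file saved in one of the new formats. The function name list_xml_parts is made up for illustration, and nothing here is an Office API; it uses only Python's standard zipfile and ElementTree modules.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_xml_parts(package):
    """Return {part name: root element tag} for each well-formed XML part.

    `package` is a path or a binary file-like object for a ZIP-based
    Office document. Hypothetical helper, for illustration only.
    """
    roots = {}
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            # Only look at parts that are expected to hold XML;
            # images and other binary parts are skipped.
            if not name.endswith((".xml", ".rels")):
                continue
            try:
                roots[name] = ET.fromstring(zf.read(name)).tag
            except ET.ParseError:
                # Leave out anything that is not well-formed XML.
                continue
    return roots
```

Calling list_xml_parts("report.docx") (the file name is just an example) would give you a quick map of what is in the package, with each root tag shown in ElementTree's {namespace}name form.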
The XML file formats give third-party Office developers lots of new creative options. For example, you can generate Word or Excel documents from a line-of-business database application, without using Word or Excel: just write the appropriate XML and store it in a ZIP file. (The new WinFX packaging API makes this job very simple — more on that in a future post.) Or you can define your own extensions to the Office XML formats, and store additional XML tags in the files that can be used by your application. Microsoft Office applications will leave your XML intact, even after the user edits the document.
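To make the "write the appropriate XML and store it in a ZIP file" idea concrete, here is a rough sketch in Python. It uses Python's standard zipfile module rather than the WinFX packaging API mentioned above, and the function name build_docx is hypothetical. One loud assumption: the namespaces and content types below are the ones from the Office Open XML standard as later published (ECMA-376), so pre-release builds of Office 12 may expect different values.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# ASSUMPTION: namespace and content-type strings follow the published
# ECMA-376 standard; beta builds of Office 12 may have used others.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

# Package-level relationship that points at the main document part.
ROOT_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

DOC_HEAD = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body>"
)
DOC_TAIL = "</w:body></w:document>"


def build_docx(paragraphs):
    """Return the bytes of a minimal word-processing package.

    Each string in `paragraphs` becomes one plain paragraph.
    Hypothetical helper, for illustration only.
    """
    body = "".join(
        # escape() protects &, < and > in text pulled from a database.
        '<w:p><w:r><w:t xml:space="preserve">' + escape(text) + "</w:t></w:r></w:p>"
        for text in paragraphs
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", ROOT_RELS)
        zf.writestr("word/document.xml", DOC_HEAD + body + DOC_TAIL)
    return buf.getvalue()
```

A line-of-business application could write the returned bytes to a file with a .docx extension, for example from rows fetched out of its database, without Word being installed on the machine that generates the file.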
That’s the overview. Next, we’ll get into the details and begin exploring the inner workings of the XML file formats.
This entry was posted on Monday, January 9th, 2006 at 8:47 am.