Office 2003 XML for Power Users
Reviews from Bill Coan and Cindy Meister
- by Matthew MacDonald
- Published by Apress
- ISBN: 1-59059-264-6; 323 pp.
- Published: Jan 2004; Price: US$39.99
Review from Bill Coan:
Matthew MacDonald’s latest book quickly helps you understand why the new XML features in Office 2003 are “nothing short of revolutionary.” Then, almost as quickly, the book helps you understand how to take advantage of the new XML features.
The title implies that the book is aimed only at power users, but it would be more accurate to say that it is aimed at serious users of Office. The book doesn’t presume a deep understanding of Office so much as it presumes a serious commitment to solving business problems through the use of Office applications.
Chapter 1 provides an excellent (and mercifully concise) introduction to the XML language. Chapter 2 explains how XML schemas can be used to govern the structure and content of your Office 2003 documents.
Then it’s off to the races with several chapters showing how to work with XML data in Excel, Word, and Access. Word users can easily skip the chapters on Excel and Access, if desired, but there’s much to be gained by reading them, since one of the benefits of XML technology is that it allows data to be shared more easily among all applications.
Chapter 7 explains how specialized XML documents known as XSL transforms can be used to extract XML data from Office documents and automatically sort, select, and modify the data to meet different business needs. Chapter 8 shows how XML data can flow from ordinary Office documents through a custom workflow and end up on the web, completely automatically. Chapter 9 provides an introduction to InfoPath, Microsoft’s new XML data-entry application.
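As a rough illustration of the sort-and-select step described above, here is a minimal Python sketch. It uses the standard library's ElementTree rather than an actual XSL transform, so it shows the kind of result a transform produces, not the XSLT syntax the book teaches. The element names (`orders`, `order`, `customer`, `total`) and the sample data are hypothetical and do not come from the book.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; not taken from the book.
SAMPLE = """
<orders>
  <order><customer>Contoso</customer><total>120.50</total></order>
  <order><customer>Fabrikam</customer><total>80.00</total></order>
  <order><customer>Northwind</customer><total>310.25</total></order>
</orders>
"""


def select_and_sort(xml_text, minimum_total):
    """Return a new <orders> element holding only the orders whose total is
    at least minimum_total, sorted by total in descending order."""
    source = ET.fromstring(xml_text)
    # Select: keep only the orders that meet the threshold.
    selected = [
        order for order in source.findall("order")
        if float(order.findtext("total")) >= minimum_total
    ]
    # Sort: largest total first.
    selected.sort(key=lambda order: float(order.findtext("total")), reverse=True)
    result = ET.Element("orders")
    result.extend(selected)
    return result


if __name__ == "__main__":
    report = select_and_sort(SAMPLE, 100)
    for order in report.findall("order"):
        print(order.findtext("customer"), order.findtext("total"))
```

An XSL transform would express the same selection and ordering declaratively in a stylesheet, which is what lets Office apply it without any program code.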
At 300 pages, this is a very manageable book to hold and to read. MacDonald presents his complex subject in short, highly readable chapters. He never argues that XML is valuable because it is “cool.” Instead, he rigorously shows that XML is valuable because it addresses critical problems in business.
No single book can tell you everything there is to know about XML technology or even everything there is to know about XML features in Office 2003, but this book will get you started and carry you a surprising distance along that road.
Review from Cindy Meister:
Office 2003 XML for Power Users is an introduction to the XML capabilities in Office 2003. The author says that he doesn't get into programming, but that's not entirely the case. While the XML-related features in the Office object model are unfortunately not covered, the book and the related sample files contain:
- Tools for viewing XML in documents, written for the .NET Framework. All the code is open source, so the interested reader can examine it.
- VB code for opening XML documents and extracting information. Although the author says this is for VB6, it can also be used in Office VBA environments.
- .NET code for building a Web Service.
With these foundations, the interested reader has the basic building blocks for going into more depth on using XML files programmatically.
The book starts out with a presentation of XML and XML schemas in the first two chapters. The author provides the necessary background on XML in an understandable manner. Unlike pure XML books, this one concentrates on what you need to know to work with XML in Office 2003 and doesn't confuse the issue with information that is irrelevant from an Office point of view.
The reader is then taken right into XML in the Office applications. XSLT (stylesheet transformations) is introduced later, when the reader has had a chance to recover from all the new concepts, and has developed a better conceptual framework for understanding this new material.
XML in Excel, Word, Access and InfoPath is covered (there is no XML support in PowerPoint). The chapters on XML in Office deal primarily with Word and Excel, where the XML capabilities are more extensive and more useful in everyday end-user work than those in Access. The book first introduces the XML capabilities in each application's user interface, and goes on to explain how these relate to the business world.
After dealing with XML in the application interface, the author looks at how SpreadsheetML and WordProcessingML (the XML "dialects" for Office that retain the data as well as the Office application information) are constructed. He discusses the problems involved in converting an Excel or Word file into pure XML, and the purposes that the XML approach Microsoft decided on is intended to serve.
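To give a feel for how SpreadsheetML keeps both the data and the application information, here is a small Python sketch that reads cell values from a workbook saved in the Excel 2003 XML format. The sample workbook is hypothetical and heavily trimmed, and the helper name `read_rows` is my own. The sketch assumes the `urn:schemas-microsoft-com:office:spreadsheet` namespace and the Workbook/Worksheet/Table/Row/Cell/Data nesting that Excel 2003 writes.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for Excel 2003 SpreadsheetML elements and ss: attributes.
SS = "urn:schemas-microsoft-com:office:spreadsheet"

# Hypothetical, heavily trimmed workbook; real files carry much more markup.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Sales">
    <Table>
      <Row>
        <Cell><Data ss:Type="String">Region</Data></Cell>
        <Cell><Data ss:Type="String">Units</Data></Cell>
      </Row>
      <Row>
        <Cell><Data ss:Type="String">North</Data></Cell>
        <Cell><Data ss:Type="Number">42</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>
"""


def read_rows(xml_text):
    """Return {worksheet name: list of rows}, converting Number cells to float."""
    ns = {"ss": SS}
    workbook = ET.fromstring(xml_text)
    sheets = {}
    for sheet in workbook.findall("ss:Worksheet", ns):
        # Namespaced attributes are keyed as "{namespace}local-name".
        name = sheet.get(f"{{{SS}}}Name")
        rows = []
        for row in sheet.findall("ss:Table/ss:Row", ns):
            values = []
            for data in row.findall("ss:Cell/ss:Data", ns):
                text = data.text or ""
                is_number = data.get(f"{{{SS}}}Type") == "Number"
                values.append(float(text) if is_number else text)
            rows.append(values)
        sheets[name] = rows
    return sheets


if __name__ == "__main__":
    for sheet_name, rows in read_rows(SAMPLE).items():
        print(sheet_name, rows)
```

The sketch deliberately ignores formatting, formulas, and sparse rows (cells that carry an `ss:Index` attribute to skip columns), so it should be read as a starting point rather than a complete reader.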
Chapter 8 presents a "real-world scenario", showing how XML in Excel can be linked to a Web Service in order to place the user input into a database. The entire process, including setting up the Web Service in the .NET Framework, is described. The last chapter gives a brief introduction to a new member of the Office family: InfoPath.
The book was obviously written based on a beta version of Office 2003. The author neglects to mention this, but the reader intending to work with the material in depth needs to be aware of it. For example, Microsoft opted to use "WordProcessingML" rather than "WordML" before releasing to market. Nor will anyone be able to find the eight-ball icon mentioned in the text and shown in a screen shot. Even more important, some of the namespace paths for the internal Microsoft schemas were changed going from Beta 2 into the final release. It is advisable, therefore, to download the Content Development Kit from the MSDN website and read up on the information there.
In the introduction, the author says that a list of errata will be posted on his website; as of this writing (June 2004), the book isn't even listed among his publications. Unfortunately, there are a number of errors that a thorough technical proofreading ought to have caught. The reader unfamiliar with XML should therefore keep this in mind and, if something doesn’t make sense, realize it may well be a mistake in the text.
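One quick way to see whether a sample file uses the released namespace is to inspect its root element. The Python sketch below does that; it assumes the released Word 2003 namespace is `http://schemas.microsoft.com/office/word/2003/wordml`, which should be verified against Microsoft's own reference material. The helper names and the placeholder namespace `urn:example:pre-release-wordml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the wordDocument root written by the released Word 2003.
WORD_2003_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Minimal documents for illustration; the second namespace is a made-up
# placeholder, not an actual beta namespace.
RELEASE_DOC = (
    '<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">'
    "<w:body/></w:wordDocument>"
)
OTHER_DOC = (
    '<w:wordDocument xmlns:w="urn:example:pre-release-wordml">'
    "<w:body/></w:wordDocument>"
)


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or "" if it has none."""
    tag = ET.fromstring(xml_text).tag
    # ElementTree writes qualified names as "{namespace}local-name".
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def uses_release_namespace(xml_text):
    """True when the root element is in the assumed release namespace."""
    return root_namespace(xml_text) == WORD_2003_NS


if __name__ == "__main__":
    for label, doc in (("release", RELEASE_DOC), ("other", OTHER_DOC)):
        print(label, root_namespace(doc), uses_release_namespace(doc))
```

If a sample from the book reports a different namespace, updating the namespace declaration in the sample is likely the first thing to try before assuming the markup itself is wrong.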
As with so many "Office" books on the new technologies, the author is not, himself, completely familiar with the Office applications. This is another source of errors. Here are some important corrections:
- The author completely missed the implementation of XML in Word's IncludeText fields, which provides some of the capabilities he laments as not being present.
- When discussing why XML is so important for Excel, the author states: "Currently, the only way you can read Excel data in another application is either to force the users to export their entire spreadsheets to a simple text-based format, or use the Office automation objects." This is not true; the data in Excel files can be extracted using DAO, ODBC or OLE DB without Excel even having to be present on the machine.
- In the chapter on WordML, the author maintains that sections are used only when a Word document is saved as HTML. This is patently incorrect and could have been checked with any Word professional; sections are a very important and integral part of every Word document. All header and footer information is stored in them. They are also responsible for newspaper column formatting and page formatting in documents containing more than one section.
On a scale from 1 (poor) to 9 (excellent), I give the book a 7 due to the less than perfect technical editing and proofreading. Also, I personally would have preferred a discussion of XML in the Office object models to an entire chapter devoted to creating a Web Service in .NET.
I recommend it highly for those looking for an understandable introduction to XML and how it relates to Microsoft Office. Even if you're interested only in the Excel or Word interface, you'll find this book useful.
For those who, after reading this book, need a reference to quickly look up XML syntax or learn more about the possibilities, I further recommend the XML Pocket Consultant by William Stanek from MS Press.