Criticism of ECMA-376 (Microsoft Office Open XML)

Rob Weir has posted a criticism of ECMA-376, the official standard name for Microsoft Office Open XML, the new XML standard based on the Microsoft Office file format.

The core of his criticism is that parts of the standard are defined in terms of 12- or 16-year-old pieces of software which are no longer supported, let alone sold, by their vendors. Any third-party implementation of ECMA-376 requires reverse engineering these pieces of software and duplicating their behaviour, a technically challenging process which appears to be of dubious legality (I’m not a lawyer, but see “Contract case could hurt reverse engineering”).

Looking at the standard myself, I see another flaw: from a software engineering point of view the standard is a nightmare to implement and may be impossible to test against.

At 6039 pages the standard is huge. Staying as close as possible in presentation style to the standards on which one builds is a common and very useful trait among the IETF, W3C and ISO sets of standards, yet ECMA-376 bears little or no relation to the structure of the standards it builds upon (the W3C standards for XML).

Certain classes of testing against ECMA-376 should be easy, particularly XML conformance and validation, but testing for the correct behaviour of applications is going to be next to impossible, particularly when the behaviour is specified like this (example lifted from Rob Weir):

suppressTopSpacingWP (Emulate WordPerfect 5.x Line Spacing)

This element specifies that applications shall emulate the behavior of a previously existing word processing application (WordPerfect 5.x) when determining the resulting spacing between lines in a paragraph using the spacing element (§2.3.1.33). This emulation typically results in line spacing which is reduced from its normal size.

[Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]

To be frank, I don’t see how even Microsoft can reliably test against such a standard, let alone third parties. That inability to test will inevitably undermine efforts at reducing bugs.
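To make the contrast concrete, here is a minimal Python sketch of the "easy" kind of test. It is only an illustration under stated assumptions: the settings fragment is hand-written for this example rather than taken from a real document, and the function names are my own. It checks that the XML is well formed and that the suppressTopSpacingWP element is present, using nothing beyond the standard library.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by Office Open XML documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written, illustrative settings fragment (an assumption for this
# sketch, not copied from a real file).
SAMPLE_SETTINGS = f"""<w:settings xmlns:w="{W_NS}">
  <w:compat>
    <w:suppressTopSpacingWP/>
  </w:compat>
</w:settings>"""


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def uses_wordperfect_spacing(xml_text):
    """Return True if the suppressTopSpacingWP element appears anywhere."""
    root = ET.fromstring(xml_text)
    return root.find(f".//{{{W_NS}}}suppressTopSpacingWP") is not None


if __name__ == "__main__":
    print(is_well_formed(SAMPLE_SETTINGS))
    print(uses_wordperfect_spacing(SAMPLE_SETTINGS))
```

Both checks are mechanical because the expected answer can be derived from the document alone. No comparable function can be written for "is the resulting line spacing correct?", since the only oracle is the output of WordPerfect 5.x itself.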