
All Annotations of [Preview]

saved by 2 people, first by Gary Edwards on 2008-06-25, last by Jesper Lund Stocholm on 2008-06-25

  • on 2008-06-25 Garyedwards
    Hi Jesper,

    OpenOffice.org has an internal layout engine with an implementation model very different from that of the MSOffice editors. The differences in the internal representation of basic document structures are very problematic for round-trip conversion processes. ODF is an XML encoding of the OpenOffice internal representation, sometimes called the "binary dump" or in-memory binary representation. OOXML is similarly an XML encoding of the MSOffice internal representation.

    The layout engine differences (internal representation) result in serious incompatibilities at the level of the file format "presentation" layers. While it's easy to exchange "content", "presentation" is impossibly application-specific.

    Given enough time, one would hope that the entire presentation layer of OpenOffice-ODF could be fully described in terms of both syntax and semantics. After five and a half years of work, ODF has not made much progress in this area. Once the presentation semantics were fully documented, the plan was to then "neutralize" the application-specific aspects with more generic representations.

    If you go back to January 2003, you'll find in the eMail archives of the OASIS ODF TC some remarkable discussions concerning proposals to throw out the OpenOffice XML submission and start from scratch with an entirely application-independent specification. This was suggested by Phil Boutros (Stellent), and supported by Corel, Boeing, ArborText, and SpeedLegal (Jason Harrop). In the end, however, Sun prevailed with the promise that the obviously application-specific aspects of ODF would be fully documented and neutralized. Sadly, this has yet to happen. Worse, the OASIS routine of hijacking W3C namespaces further confuses things, since these application-specific implementations are similarly undocumented.

    IMHO there is a solution to this mess. But it's not going to be what most expect: the harmonizing of ODF and OOXML. That isn't going to work - ever! For harmonization to work, one would have to harmonize OpenOffice and MSOffice by rebuilding the aging layout engines. And that isn't going to happen.

    A more reasonable approach would be to target an application-independent format that is entirely neutral. The truth is that you cannot ask an established application (layout engine and feature set) to implement another application's internal representation model. What you can do, though, is target a neutral format that is flexible and generic at the document structure level.

    My favorite "neutral" format is the WebKit document model. The primary reason is that it's "web-ready". Who in their right mind would today target any document format that is not "web-ready"? Yet that's what we did with ODF.

    I fully believe it is possible to set up round-trip conversions from both OpenOffice and MSOffice to the advanced but highly interoperable WebKit flow document model. Don't even try to reinvent the layout engines of legacy applications. Set as your target a neutral, application-independent format that is "web-ready" and useful across the widest span of devices, desktop, and web application and service systems. Then never look back.

    I hope you will consider joining Marbux and me over at the Diigo "Future of the Web" group. Your contributions would be much appreciated. Besides, all my blogging is there :)

    One last point: I did get your eMail concerning RelaxNG. But it took some time. I had lost my openstack.us domain for some 90 days. When I finally got it reinstated, a ton of eMails showed up. My apologies. But you are absolutely right in your assessment. I have asked Florian and Marbux to confirm this and perhaps explain how this came to be. The issue is once again that of Sun hijacking namespaces and committing two application-specific crimes in the process. The first fault is that the implementation of the digital signature standard is limited to exactly what OpenOffice chooses to implement. The second fault is that the limited OpenOffice implementation is nowhere documented!

    The hijacking of namespaces really came to a head in the metadata subcommittee, where the OpenDocument Foundation had successfully inserted into the requirements document a model for using RDF to describe, in a very neutral and generic way, all of the problematic presentation aspects. Meaning, we believed that format and styling information is simply object metadata. We really thought we could neutralize the application-specific problems of ODF using a metadata approach. Sun, however, had different ideas. They ended up convincing IBM and Patrick Durusau to limit the use of XML ID to only those elements approved and implemented in OpenOffice. Once this monstrous example of application-specific crap was approved by the OASIS TC, I quit ODF.

    Hope all is well,
    ~ge~