OpenDocument
The OpenDocument format (ODF), short for the OASIS Open Document Format for Office Applications, is an open document file format for saving and exchanging editable office documents such as text documents (including memos, reports, and books), spreadsheets, charts, and presentations. The standard was developed by the OASIS industry consortium, based upon the XML-based file format originally created by OpenOffice.org.
The standard was publicly developed by a variety of organizations, is publicly accessible, and can be implemented by anyone without restriction. The OpenDocument format is intended to provide an open alternative to proprietary document formats, including the popular Microsoft Word, Microsoft Excel, and Microsoft PowerPoint formats used by Microsoft Office, as well as Microsoft's Office Open XML format (this latter format has various licensing requirements that prevent some competitors from using it). Organizations and individuals that store their data in an open format such as OpenDocument avoid being locked in to a single software vendor, leaving them free to switch software if their current vendor goes out of business, raises its prices, changes its software, or changes its software license terms to something less favorable.
OpenDocument is the only standard for editable office documents that has been vetted by an independent recognized standards body, has been implemented by multiple vendors, and can be implemented by any supplier (including proprietary software vendors and developers using the GNU General Public License).
=Public policy implications=
Since one objective of open formats like OpenDocument is to guarantee long-term access to data without legal or technical barriers, governments have become increasingly aware of open formats as a public policy issue. For example, in 2002, Dr. Edgar David Villanueva Nuñes, a lawyer and Congressman of the Republic of Perú, [http://www.gnu.org.pe/resmseng.html wrote a letter] to Microsoft Peru raising questions about free and permanent document access with proprietary formats. Europe and Massachusetts in particular have been examining the ramifications of selecting a document format.
== Europe ==
European governments have, since at least 2003, been investigating various options for storing documents in an XML-based format, commissioning technical studies such as the Valoris Report (Valoris). In March 2004, European governments asked an OpenOffice team and a Microsoft team to present on the relative merits of their XML-based office document formats (Bray, September 29, 2004).
In May 2004, the Telematics between Administrations Committee (TAC) issued a set of recommendations, in particular noting that, "Because of its specific role in society, the public sector must avoid [a situation where] a specific product is forced on anyone interacting with it electronically. Conversely, any document format that does not discriminate against market actors and that can be implemented across platforms should be encouraged. Likewise, the public sector should avoid any format that does not safeguard equal opportunities to market actors to implement format-processing applications, especially where this might impose product selection on the side of citizens or businesses. In this respect standardisation initiatives will ensure not only a fair and competitive market but will also help safeguard the interoperability of implementing solutions whilst preserving competition and innovation." It then issued recommendations, including:
OpenDocument is already a standard of a recognized independent standards body (OASIS), and is being submitted to the International Organization for Standardization (ISO) for standardization, while there is no evidence that the Microsoft XML formats or the older DOC/PPT/XLS formats will go through such a process. Many expect that ISO will accept and approve OpenDocument using its fast-track process, and that once ISO ratifies the standard, the European Union will require OpenDocument as its office suite standard (Marson, October 18, 2005).
== Massachusetts ==
Massachusetts has also been examining its options for implementing XML-based document processing. In early 2005, Eric Kriss, Secretary of Administration and Finance in Massachusetts, was the first government official in the United States to publicly connect open formats to a public policy purpose: "It is an overriding imperative of the American democratic system that we cannot have our public documents locked up in some kind of proprietary format, perhaps unreadable in the future, or subject to a proprietary system license that restricts access." [http://www.mass.gov/eoaf/open_formats_comments.html]
At a September 16, 2005 meeting with the [http://www.masoftware.org Mass Technology Leadership Council], Kriss stated that he believes this is fundamentally an issue of sovereignty. [http://danbricklin.com/log/2005_09_07.htm#meetingphotos] While supporting the principle of private intellectual property rights, he said sovereignty trumped any private company's attempt to control the state's public records through claims of intellectual property. [http://www.softwaregarden.com/cgi-bin/oss-sig/wiki.plOpenFormatMeetingSept2005]
Subsequently, in September 2005, Massachusetts became the first state to formally endorse OpenDocument formats for its public records and, at the same time, reject Microsoft's proprietary XML format, now named Microsoft Office Open XML format (see WordprocessingML). This decision was made after a two-year examination of file formats, including many discussions with Microsoft, other vendors, and various experts. Microsoft Office, which has a nearly 100% market share among the state's employees, does not currently support OpenDocument formats. Microsoft has indicated that OpenDocument formats will not be supported in new versions of Office, even though Office supports many other formats (including ASCII, RTF, and WordPerfect), and analysts believe it would be easy for Microsoft to implement the standard. If Microsoft chooses not to implement OpenDocument, it will disqualify itself from future consideration. Several analysts (such as Ovum) believe that Microsoft will eventually support OpenDocument.
After this announcement by Massachusetts supporting OpenDocument, a large number of people and organizations spoke up about the policy, both pro and con (see the references section). Adobe, Corel, IBM, and Sun all sent letters to Massachusetts supporting the measure. In contrast, Microsoft sent in a letter highly critical of the measure. A group named Citizens Against Government Waste (CAGW) also opposed the decision. The group claimed that Massachusetts' policy established an arbitrary preference for open source, though both open source software and proprietary software can implement the specification, and both kinds of developers were involved in creating the standard (CAGW, 2005). Many considered this group's statement as simply a paid statement by Microsoft; InternetNews and Linux Weekly News noted that CAGW has received funding from Microsoft, and that in 2001 CAGW was caught running an astroturfing campaign on behalf of Microsoft when two letters it submitted supporting Microsoft in Microsoft's antitrust case were found to have the signatures of deceased persons (Linux Weekly News). James Prendergast, executive director of a coalition named Americans for Technology Leadership (ATL), also criticized the state's decision in a Fox News article ([http://www.foxnews.com/story/0,2933,170724,00.html Prendergast 2005]). In the article, Prendergast failed to disclose that Microsoft is a founding member of ATL. Fox News later published a follow-up article disclosing that fact ([http://www.foxnews.com/story/0,2933,172063,00.html FOX News, 2005]; Jones, September 29, 2005).
== Other countries ==
According to the OASIS OpenDocument datasheet, Singapore's Ministry of Defense, France's Ministry of Finance and its Ministry of Economy, Finance, and Industry, Brazil's Ministry of Health, the City of Munich, Germany, the UK's Bristol City Council, and the City of Vienna in Austria are all adopting applications that support OpenDocument.
=Standardization=
== Process ==
Version 1.0 of the OpenDocument specification was produced after lengthy development and discussion by multiple organizations. The first official OASIS meeting to discuss the standard was held on December 16, 2002; OASIS approved OpenDocument as an OASIS standard on May 1, 2005. The group decided to build on an earlier version of the OpenOffice.org format, since this was already an XML format with most of the desired properties, and had been in use since 2000 as the program's primary storage format (demonstrating its utility). Note, however, that OpenDocument is not the same as the older OpenOffice.org format; many changes and lessons learned were incorporated based on the feedback from many different individuals and companies.
According to Gary Edwards, a member of the OpenDocument TC, the specification was developed in two phases. Phase one (which lasted from November 2002 through March 2004) had the goal of ensuring that the OpenDocument format could capture all the data from a vast array of older legacy systems. Edwards expressed this goal as perfecting the Open Document XML as a transformation layer (a universal intermediate format) where interoperability with legacy information systems was our primary concern. This considered at least 30 years of legacy information systems that cross an incredible spectrum of information and file format types, including various versions of Microsoft Office and many other products and formats as well. Phase two focused on Open Internet based collaboration (Einfeldt, 2005).
== Participants ==
The standardization process included the developers of many office suites or related document systems, including (in alphabetical order):
Notably absent from the group of active participants was Microsoft, especially since Microsoft is a member of OASIS and is the dominant vendor of office suite software. This absence was in spite of the 2004 request by the European Union's TAC (Telematics between Administrations Committee) for all industry actors to consider participating in the OASIS Open Document Format work (TAC, 2004). Instead, Microsoft decided to develop only its own incompatible format, without external input or review. Due to this lack of widespread independent and public review of Microsoft's format, many are concerned that Microsoft's format will be harder for others to implement or that it lacks important capabilities compared to OpenDocument. For example, the European Union commissioned a report (Valoris, 2004) which noted that, "It is quite trivial to add elements to an XML document that place processing requirements and restrictions on the document, thus preventing cross-platform processing capability... While properly developed XML should in theory be platform-neutral, experience has shown that vendors who wish to maintain and protect their platform's market will go to extents to encode elements that are capable of being processed only by their own application suites. The only counter-balance to this natural force is the development of open, cross-industry, widely adopted standards that serve to block the inclusion of application or platform specific encoding." Microsoft also imposes additional license conditions on users of its format; many believe these additional license conditions inhibit competition, as discussed below.
The OpenDocument standardization process also included many document users, especially those with the need to handle complex documents or to be able to retrieve documents for long periods of time after their development. Document-using organizations who initiated or were involved in the standardization process included (alphabetically):
As well as having many formal members, draft versions of the specification were released to the public and subject to worldwide review. Many others, who were not formal members of the standardization committee, submitted comments to the committee. These external comments were then adjudicated publicly by the committee.
== Next Steps ==
OASIS has submitted the OpenDocument standard to a joint technical committee of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) for approval as an international ISO/IEC standard. ISO spokesman Roger Frost stated that the committee will send the specification out to its members, probably at the end of this month, and they will have five months to study and vote on it (Sayer, 2005). Many expect that OpenDocument's broad support and demonstrated open development process will result in quick passage of OpenDocument as an ISO/IEC standard. OASIS is one of the few organizations which has been granted the right to propose standards directly to ISO as a proposed publicly available specification (PAS). This process is specifically designed to fast-track public specifications into becoming ISO standards when they have already been developed in an open manner. OpenDocument advocates note that, in contrast, there is no evidence that the competing Microsoft XML formats or the older DOC/PPT/XLS formats will go through an independent standardization process. The older DOC/PPT/XLS formats are not even publicly specified, which is one reason why documents written in these formats sometimes cannot be read by later versions of the same office suite.
Gary Edwards, a member of the OpenDocument TC, says that after ISO standardization, "there is no doubt in my mind that OpenDocument is heading to the W3C for ratification as the successor to HTML and XHTML" (Einfeldt, 2005). The W3C has not made any public statements supporting or denying this statement, however.
=Licensing=
The OpenDocument specification is available for free download and use [http://www.oasis-open.org/committees/tc_home.phpwg_abbrev=office]. An irrevocable intellectual property covenant made by key contributor . In short, anyone can implement OpenDocument, without restraint, and as shown below both proprietary and open source software programs implement the format.
All of this is in contrast with the competing Microsoft Office Open XML format developed by Microsoft. Microsoft has released its format royalty-free, but with additional conditions not imposed by OpenDocument. Independent analysts have stated that Microsoft's licensing requirements will prevent many competitors from ever implementing Microsoft's format. The extent of this incompatibility is the source of significant controversy between Microsoft and other parties. The text below attempts to capture these differences, since they are often one of the reasons people consider using OpenDocument.
Microsoft states, in its FAQ, that it believes that some open source software licenses are compatible with its license, and that if a developer believes that some license is in conflict, they must choose other forms of open source licenses. Microsoft has not publicly issued its opinion about the compatibility of any particular open source software license. However, several independent analysts have determined that the legal obligations for the Microsoft format are such that it cannot be used by competing programs licensed under the GNU General Public License (GPL), and possibly many other open source software / Free-libre software licenses as well. This is important because the GPL is the most popular license by far for open source software. In particular, the GPL is used by many competing office applications such as the entire KOffice office suite, the Gnumeric spreadsheet program, and the AbiWord word processor. Microsoft is well aware of widespread use of the GPL by many of its competitors; at one time Microsoft CEO Ballmer referred to Linux as a cancer because of the effects of the GPL (the license the Linux kernel uses) (Greene, 2001). Thus, many independent analysts believe that Microsoft's license terms are designed to inhibit competition, in spite of Microsoft's claims otherwise. Some of these concerns are described as follows:
Microsoft has stated that it has been granted a number of patents related to its format, and that it may have more pending. Microsoft states that it offers royalty-free rights both to its issued patents and patents that may be issued in the future as an outcome of the patent process in order to implement the Microsoft Office XML Reference Schemas. However, these patents can be used to force anyone to strictly adhere to their license, and as noted above, many people have analyzed the license in detail and concluded that the license inhibits competition. The most common open source software license (the GPL) forbids these kinds of limitations; if software is included, it must be usable for any purpose. There is also concern by some that Microsoft could change its licensing terms at any time; no contract actually binds Microsoft to these terms. Microsoft did restate in a clarification that their terms were offered in perpetuity, but since no enforceable contract was signed, there appears to still be some suspicion. These concerns about patents were raised in part because formerly secret Microsoft documents (known as Halloween documents I and II), which were developed in collaboration with key people in Microsoft, recommended that Microsoft suppress competition by de-commoditizing protocols (creating proprietary formats that could not be used by others) and by attacking competitors through patent lawsuits.
Dan Ravicher argues that Microsoft's licenses may not be valid, saying, "we should not presume Microsoft has any valid rights here." For example, one of the relevant patents was a patent Microsoft was granted covering the conversion of programming objects into XML files, based on a filing by Microsoft in June 2001. However, only a week after the announcement of the patent, independent analysts found that SXP, an open source software library for converting C++ programming objects into XML files, was made available on SourceForge in February 2000. Since SXP's release predates Microsoft's filing, many believe Microsoft's patent could be invalidated in court due to the existence of prior art. Ravicher and others speculate this may be true for all the patents; patent offices have no database for examining software patent claims, and spend very little time examining patent claims, so there is general consensus that many invalid software patents are granted (Galli, 2005). However, since software patent litigation typically costs millions of dollars, patents that could be invalidated can still be used to intimidate and inhibit competition if the patent-holder chooses to do so.
After discussions with the European Union and Massachusetts, Microsoft issued a clarification. In particular, in the clarification Microsoft stated that, "We are acknowledging that end users who merely open and read government documents that are saved as Office XML files within software programs will not violate the license." However, observers quickly noted that this exception only applied to government documents (not other documents) and only for opening and reading them (not for writing them, and possibly not for printing them or translating them to another format). Neither governments nor software developers want formats that are limited for use only by governments; it is much better to have a single format for any such data. This exemption would not by itself permit open source software implementations, since the Open Source Definition forbids discrimination against persons (including non-government personnel), groups, or fields of endeavor; this exemption also contradicts the Free Software definition, which requires as "freedom 0" the freedom to run the program for any purpose. Also, the whole point of these formats is to permit editing, not just reading them; for read-only documents, other formats such as PDF tend to be used instead. If the term "reading" is interpreted as applying only to humans, then this grant is even more limited (prohibiting printing and transforming), but even a broad interpretation is limiting since it does not grant the privilege to write the format. Thus, independent analysts reported that none of these clarifications addressed the concern that Microsoft's XML format cannot be used by many of Microsoft's competitors, while OpenDocument can be used by anyone -- both Microsoft and its competitors.
= Promotion =
OASIS promotes OpenDocument (since it is their work). In October 2005 the Open Document Fellowship was founded with the aim of "[supporting] the work of community volunteers in promoting, improving and providing user assistance for the OASIS Open Document Format for Office Applications (OpenDocument) and software designed to operate on data in this format." It was founded by Friends of OpenDocument Inc., an incorporated association in the State of Queensland, Australia. [http://opendocumentfellowship.org/Fellowship/AboutUs] Some early reports incorrectly stated that it was founded by OASIS [http://www.oreillynet.com/pub/wlg/8042]. Other promotional websites include friendsofopendocument.org and spreadopendocument.org.
=Applications supporting OpenDocument=
== Current support ==
A number of applications currently support OpenDocument; listed alphabetically they include:
Microsoft's letter to Massachusetts claimed that all current OpenDocument implementations were based on OpenOffice.org and its derivatives. However, this turns out to be untrue. For example, KOffice is a completely independent implementation of OpenDocument not based on OpenOffice.org -- its main functions have been implemented independently, and even its code for reading and writing the OpenDocument format was developed independently (Wallin, 2005). This is important, because independent implementations from the same specification are generally considered the best way to find and fix any problems in a specification. For example, the IETF even requires two independent implementations for its final stage of standardization.
The first application to implement OpenDocument was KOffice. OpenDocument was developed starting from an XML format developed for OpenOffice.org; OpenOffice.org has since been updated so that it also supports OpenDocument.
== Corel WordPerfect status ==
Corel's WordPerfect office suite may gain support for OpenDocument, even though Corel has not yet made a formal announcement. Corel is an original member of the OASIS Technical Committee on the Open Document Format, and Paul Langille, a senior Corel developer, is one of the original four authors of the OpenDocument specification. Also, Corel sent a letter to Massachusetts supporting its selection of OpenDocument, saying, "Corel strongly supports the broad adoption of the open standards Massachusetts has outlined, including XML, the OASIS Open Document Format and PDF.... Corel remains committed to working alongside OASIS and other technology vendors to ensure the continued evolution of the ODF standard and the adoption of open standards industry-wide." [http://www.mass.gov/Aitd/docs/policies_standards/etrm3dot5/responses/corel.pdf] Many find it improbable that Corel would invest so much effort, and say that it will work to ensure adoption, without implementing the format itself.
At the September 16, 2005 Town Meeting, an IBM representative said that IBM was implementing OpenDocument and that Corel was also actively implementing OpenDocument. Steven J. Vaughan-Nichols's eWeek article of September 26, 2005, states without caveats that Corel is actively implementing OpenDocument in its WordPerfect suite. On September 28, 2005, he clarified further that "Corel's WordPerfect will soon be supporting the OpenDocument format", noting that while Corel won't commit to a date for adding OpenDocument to WordPerfect, the company made it clear that it is working towards that goal.
A month later, on October 18, 2005, a Corel representative described a different position in an interview for BetaNews [http://www.betanews.com/article/Interview_Corel_on_Sun_Open_Standards/1129672161]: the company does not see OpenDocument format support as a priority just now, and cannot even estimate the time it would need to support it, if ever. This report was immediately questioned; Berlind later reported that "Corel confirms OpenDocument commitment" (Berlind, October 25, 2005).
== Programmatic Support ==
An OpenDocument file is an ordinary JAR (Java Archive) file containing standard XML files. JAR files are simply a set of files compressed together using the ZIP file format. Thus, any of the vast number of tools for handling ZIP/JAR files and XML data can be used to handle OpenDocument. Nearly all programming languages have libraries (built-in or available) for processing XML files and ZIP files.
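The point about generic ZIP and XML tooling can be illustrated with a minimal Python sketch that uses only the standard library. It builds a tiny stand-in package in memory rather than reading a real document. The member names `mimetype` and `content.xml`, the namespace URIs, and the helper names (`build_sample_package`, `list_members`, `read_content_root`) are not given in the text above; they are assumptions made for illustration and should be checked against the specification.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs for OpenDocument content (assumed here; not spelled out in the article).
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# A deliberately tiny, hypothetical content.xml: just enough to have something to parse.
SAMPLE_CONTENT = (
    f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    "<text:p>Hello, OpenDocument.</text:p>"
    "</office:text></office:body>"
    "</office:document-content>"
)


def build_sample_package() -> bytes:
    """Build an in-memory ZIP archive laid out like a (very incomplete) text document."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", SAMPLE_CONTENT)
    return buffer.getvalue()


def list_members(package_bytes: bytes) -> list[str]:
    """Return the names of the files stored inside the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.namelist()


def read_content_root(package_bytes: bytes) -> ET.Element:
    """Parse content.xml from the package with an ordinary XML parser."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return ET.fromstring(package.read("content.xml"))


if __name__ == "__main__":
    data = build_sample_package()
    print(list_members(data))
    print(read_content_root(data).tag)
```

To inspect a real file instead, the same two `zipfile` calls can be pointed at a path on disk; nothing in the sketch is specific to any one office suite.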
J. David Eisenberg has developed and released the Java class [http://books.evc-cit.info/odf_utils.zip com.catcode.odf.OpenDocumentTextInputStream], which extracts the text information from an OpenDocument text file. It extracts only the text within <text:p> and <text:h>, unless they are in <text:tracked-changes> (i.e., it automatically handles tracked changes). The lists of capture and omit elements are user-selectable.
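The capture-and-omit idea described above can be sketched in a few lines of Python. This is not Eisenberg's Java class and does not reproduce its API; it is a simplified illustration of the same filtering rule, and the names `extract_text`, `DEFAULT_CAPTURE`, and `DEFAULT_OMIT`, the namespace URIs, and the sample XML are assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for illustration; check them against the specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Elements whose text is collected, and elements whose whole subtree is skipped.
DEFAULT_CAPTURE = frozenset({f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"})
DEFAULT_OMIT = frozenset({f"{{{TEXT_NS}}}tracked-changes"})


def extract_text(root, capture=DEFAULT_CAPTURE, omit=DEFAULT_OMIT):
    """Return one string per captured element, ignoring omitted subtrees."""
    blocks = []

    def walk(element):
        if element.tag in omit:
            return  # skip this element and everything below it
        if element.tag in capture:
            # Join the element's own text with the text of nested inline elements.
            blocks.append("".join(element.itertext()))
            return
        for child in element:
            walk(child)

    walk(root)
    return blocks


# Hypothetical fragment: one tracked deletion, one heading, one paragraph.
SAMPLE = (
    f'<office:text xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
    "<text:tracked-changes><text:changed-region><text:deletion>"
    "<text:p>Deleted sentence.</text:p>"
    "</text:deletion></text:changed-region></text:tracked-changes>"
    "<text:h>Title</text:h>"
    "<text:p>First <text:span>styled</text:span> paragraph.</text:p>"
    "</office:text>"
)

if __name__ == "__main__":
    for block in extract_text(ET.fromstring(SAMPLE)):
        print(block)
```

Passing different `capture` and `omit` sets mirrors the user-selectable lists mentioned above. One simplification to be aware of: an omitted element nested inside a captured element is not filtered out, because the whole captured subtree is read at once.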
Some free Perl extensions for OpenDocument file processing are available at CPAN, such as [http://search.cpan.org/dist/OpenOffice-OODoc OpenOffice::OODoc], [http://search.cpan.org/dist/OpenOffice-OOBuilder/OOCBuilder.pm OpenOffice::OOCBuilder], [http://search.cpan.org/dist/OpenOffice-OOSheets OpenOffice::OOSheets], [http://search.cpan.org/~ptandler/Bundle-PBib-2.08.01/lib/PBib/Document/OpenOffice.pm PBib::Document::OpenOffice], and others.
== Microsoft ==
For most of 2005 Microsoft publicly stated that it did not plan to support OpenDocument. Its stated rationale was that OpenDocument is missing some important functionality, though it has not identified any particular missing functionality (making this claim difficult to prove or refute). Many are very sceptical of this claim; ZDNet asked, "Does OpenDocument, which is the result of a lot of hard work from people fully versed in contemporary corporate computing, really fail at the very things it was designed to provide?", and closed by urging Microsoft to add support for OpenDocument (ZDNet UK, September 2, 2005). InfoWorld's Neil McAllister noted that even if OpenDocument were missing important functionality, this statement is inconsistent; Microsoft Office already supports formats with far less functionality than OpenDocument (such as HTML and ASCII text). Instead, he believes that the real reason Microsoft will not support OpenDocument (so far) is that "An open document standard won't help Microsoft lock in its loyal addicts -- excuse me, customers -- so an open standard isn't in Microsoft's business interests. Microsoft refuses to support OpenDocument; it doesn't get more bald-faced than that" (McAllister 2005).
A Boston Globe article quoted Peter Quinn of Massachusetts saying that the state could implement OpenDocument without abandoning Microsoft Office: "We are not asking anybody to take anything off their desktop." Instead, the state plans to modify an estimated 50,000 computers with software that would let Office users store their files in the OpenDocument format, instead of Microsoft's proprietary format, if Microsoft continues to refuse to support the format (Bray, September 23, 2005).
Recent reports suggest that Microsoft is considering supporting OpenDocument in the future; at this time it has not committed itself either way. Nick Tsilas, a Senior Attorney at Microsoft, said that "features are dictated by customer demand and, until the Massachusetts-related activity occurred, Open Document was not even on our radar screens." This is a surprising revelation, because in 2004 the European Union directed all parties (including Microsoft) to get involved with the OpenDocument standard. Microsoft General Manager of Information Worker Business Strategy Alan Yates confirmed that this was the company position: "For us this has been, and will continue to be a matter of evaluating the flow of customer requirements, and this is a new issue." (Updegrove, 2005)
On Sep. 25, 2005, Alan Joch of Federal Computer Week reported that Microsoft has changed its stance and that its next Office release will support OpenDocument, though not natively. This means users would have to select that format option every time they save a file. (Joch, 2005) As of this time this report has not been independently confirmed, however, and other reports suggest this is still merely being considered.
On October 25, 2005, Dan Farber reported on his conversation with Microsoft CTO Ray Ozzie: "Ozzie told me that supporting ODF in Office isn't a matter of principle. Microsoft isn't opposed to supporting other formats. ... Ozzie attributed the tentativeness on ODF support in Office to resource allocation issues... Microsoft is working with a French company on translators to determine the scope of the problem in exporting Office documents to ODF." Farber then speculated, "It sounds to me that support for Save As ODF in Office is a when, not an if" (Farber, October 25, 2005).
Groklaw readers believe they have traced this unnamed French company as Clever Age, which is developing a translator named ooo-word-filter. This project translates from an OpenOffice format into WordML. It is currently very incomplete (only a few constructs are translated). It is unclear whether the OpenOffice format it reads is OpenDocument or the old .sxi format, and it appears to generate WordML (the Office 2003 XML format) instead of the incompatible Open XML format to be used by Office 12. Note that it only reads the OpenOffice.org format (it cannot generate it), and it does not cover the OpenDocument features outside of word processing (Jones, October 27, 2005).
Note that there are many other mechanisms for using Microsoft Office to support OpenDocument. Any office suite that can read and write both Microsoft Office binary formats and OpenDocument can be used as a translator. docvert translates to OpenDocument, and ooo-word-filter is a plug-in for Microsoft Word for the 2003 XML format. Those who want to use Microsoft Office without exiting the suite, yet use all of OpenDocument, are likely to consider using OpenOpenOffice -- discussed next.
== Phase-n OpenOpenOffice Plug-in for Microsoft Office ==
Phase-n is developing [http://www.phase-n.com/openopenoffice/ OpenOpenOffice] (O3), an open source software plug-in for Microsoft Office. With this free plug-in, Microsoft Office will be able to read and write OpenDocument documents (and any other formats supported by OpenOffice.org). Instead of installing a complete office application or even a large plug-in, O3 will install a tiny plug-in to the Microsoft Office system. This tiny plug-in would automatically send the file to some server, which would then do conversions and send it back. The server could be local to an organization (so private information won't go over the Internet) or accessed via the Internet (for those who do not want to set up a server).
The plug-in is expected to be available by the end of November 2005.
Phase-n argues that the main advantage of their approach is simplicity. Their website announces that O3 requires no new concepts to be explored, no significant development, and leverages the huge existing body of work already created by the OpenOffice.org developers, the CPAN module authors, and the Microsoft .NET and Office teams. Initial ballpark estimates are for less than 2,000 lines of code and only a few hundred hours of development time to get to an initial stable release of the O3 client and server. They also argue that this approach significantly simplifies maintenance; when a new version of OpenOffice.org is released, only the server needs to be upgraded.
The OpenOpenOffice project is a partnership between the software industry group Open Source Victoria, the technology company Phase N Australia, and the wider Open Source community. Open Source Victoria was convened by Con Zymaris and includes more than 100 Victorian firms and developers (Varghese, 2005).
== Other Planned Support ==
The general manager of [http://www.software602.com/ Software602] reports that they plan to release a new version of their [http://www.software602.com/products/pcs/ commercial office suite], currently named 602PC Suite, as 602Office 2. The product 602Office 2 will be based on OpenOffice.org 2, so it will include native support for OpenDocument. The release date is expected to be October 31, 2005.
=File types=
The recommended file extensions and MIME types are included in the official standard (OASIS, May 1, 2005).
==Documents==
The most common file extensions used for OpenDocument documents are .odt for text documents, .ods for spreadsheets, .odp for presentations, .odg for graphics, and .odb for databases. These are easily remembered by considering ".od" as being short for "OpenDocument", and then noting that the last letter indicates the more specific type (such as "t" for text). Here is the complete list of document types, showing the type of file, the recommended file extension, and the MIME type:
==Templates==
OpenDocument also supports a set of template types. Templates represent formatting information (including styles) for documents, without the content itself. The recommended filename extension begins with ".ot" (which can be viewed as short for "OpenDocument template"), with the last letter indicating the kind of template (such as "t" for text). The supported set is:
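The naming pattern described above (".od" plus a type letter for documents, ".ot" plus a type letter for templates) can be sketched as a small lookup. This is a minimal illustration, not the complete list from the standard: it covers only text, spreadsheet, presentation, and graphics types, and the names `ODF_MIME_TYPES` and `describe` are hypothetical.

```python
import os

# Partial mapping of recommended extensions to MIME types.
# The dictionary name and the helper below are illustrative only.
ODF_MIME_TYPES = {
    ".odt": "application/vnd.oasis.opendocument.text",
    ".ods": "application/vnd.oasis.opendocument.spreadsheet",
    ".odp": "application/vnd.oasis.opendocument.presentation",
    ".odg": "application/vnd.oasis.opendocument.graphics",
    ".ott": "application/vnd.oasis.opendocument.text-template",
    ".ots": "application/vnd.oasis.opendocument.spreadsheet-template",
    ".otp": "application/vnd.oasis.opendocument.presentation-template",
    ".otg": "application/vnd.oasis.opendocument.graphics-template",
}


def describe(filename):
    """Return (mime_type, is_template) for a known extension, else None."""
    ext = os.path.splitext(filename)[1].lower()
    mime = ODF_MIME_TYPES.get(ext)
    if mime is None:
        return None
    # Template extensions start with ".ot"; document extensions with ".od".
    return mime, ext.startswith(".ot")
```

For example, `describe("report.odt")` yields the text MIME type with `is_template` set to `False`, while `describe("letter.ott")` reports a template.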
=Capabilities=
As noted above, the OpenDocument format can describe text documents (e.g., those typically edited by a word processor), spreadsheets, presentations, drawings/graphics, images, charts, mathematical formulas, databases, and master documents (which can combine them). It can also represent templates for many of them.
The official OpenDocument standard (OASIS, May 1, 2005) defines OpenDocument's capabilities. Haumacher (2005) provides a hyperlinked formal specification derived from the official standard. Eisenberg's (2005) book describes the format in more detail. The text below provides a brief summary of the format's capabilities.
==Metadata==
The OpenDocument format supports storing Metadata (data about the data) by having a set of pre-defined metadata elements, as well as allowing user-defined and custom metadata. The predefined metadata are: Generator, Title, Description, Subject, Keywords, Initial Creator, Creator, Printed By, Creation Date and Time, Modification Date and Time, Print Date and Time, Document Template, Automatic Reload, Hyperlink Behavior, Language, Editing Cycles, Editing Duration, and Document Statistics.
==Content==
OpenDocument's text content format supports both typical and advanced capabilities. Headings of various levels, lists of various kinds (numbered and not), numbered paragraphs, and change tracking are all supported. Page sequences and section attributes can be used to control how the text is displayed. Hyperlinks, ruby text (which provides annotations and is especially critical for some languages), bookmarks, and references are supported as well. Text fields (for autogenerated content) and mechanisms for automatically generating tables such as tables of contents, indexes, and bibliographies are also included.
In the OpenDocument format, spreadsheets are an example of a set of tables. Thus, there are extensive capabilities for formatting the display of tables and spreadsheets. Database ranges, filters, and data pilots (known to Excel users as "pivot tables") are also supported. Change tracking is available for spreadsheets as well.
The graphics format supports a vector graphic representation, in which a set of layers and the contents of each layer are defined. Available drawing shapes include Rectangle, Line, Polyline, Polygon, Regular Polygon, Path, Circle, Ellipse, and Connector. 3D shapes are also available; the format includes information about the Scene, Light, Cube, Sphere, Extrude, and Rotate (the format is intended for office data exchange, however, and is not sufficient to represent movies or other extensive 3D scenes). Custom shapes can also be defined.
Presentations are supported. Animations can be included in presentations, with control over sound, showing or hiding a shape or text, and dimming an element; these effects can be grouped. In OpenDocument, many of the format's capabilities are reused from the text format, which simplifies implementations.
Charts define how to create graphical displays from numerical data. They support titles, subtitles, a footer, and a legend to explain the chart. The format defines the series of data that is to be used for the graphical display, and a number of different kinds of graphical displays (such as line charts, pie charts, and so on).
Forms are specially supported, building on the existing XForms standard.
==Formatting==
The format provides numerous style and formatting controls that determine how information is displayed.
Page layout is controlled by a variety of attributes. These include page size, number format, paper tray, print orientation, margins, border (and its line width), padding, shadow, background, columns, print page order, first page number, scale, table centering, maximum footnote height and separator, and many layout grid properties.
Headers and footers can have defined fixed and minimum heights, margins, border, border line width, padding, background, shadow, and dynamic spacing.
There are many attributes for specific text, paragraphs, ruby text, sections, tables, columns, lists, and fills. Specific characters can have their fonts, sizes, and other properties set. Paragraphs can have their vertical space controlled through attributes on keep together, widow, and orphan, and have other attributes such as drop caps to provide special formatting. The list is extremely extensive; see the references (in particular the actual standard) for details.
==Spreadsheet formulas issue==
OpenDocument is fully capable of describing mathematical formulas that are displayed on the screen. It is also fully capable of exchanging spreadsheet data, formats, pivot tables, and other information typically included in a spreadsheet. OpenDocument can exchange spreadsheet formulas (formulas that are recalculated in the spreadsheet); formulas are exchanged as values of the attribute table:formula.
However, some believe that the allowed syntax of table:formula is not defined in sufficient detail. The OpenDocument version 1.0 specification defines spreadsheet formulas using a set of simple examples which show, for example, how to specify ranges and the SUM() function. Some critics argue that a more detailed, precise specification for spreadsheet functions, including syntax and semantics, should be created to augment these examples. The OpenDocument committee argued that this was outside their scope, since the syntax of such formulas is not in XML. Others have argued that, while the specification is less specific than one might like, the intent is fairly clear (especially since formulas tend to follow decades-long traditions), and that the vast majority of spreadsheets use only a small set of functions (such as SUM) which are supported by all spreadsheet implementations anyway. In practice, many developers look to OpenOffice.org as a "canonical implementation"; since its code is public for anyone to review, and its XML output can be trivially inspected, this can resolve many questions. There is draft work proposing a more detailed specification for spreadsheet formulas (e.g. OpenFormula). Such work is expected simply to clarify in more detail what is acceptable in a spreadsheet formula; no one expects it to invalidate any of the current OpenDocument standard. For more information, see the OpenFormula article.
Note that this is not a disadvantage compared to Microsoft Office Open XML, which also does not specify formulas in detail. Nor is it a disadvantage compared to the Microsoft Excel binary format, whose format and semantics have never been completely defined in public.
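To make the discussion of table:formula concrete, the following sketch extracts formula attributes from a spreadsheet fragment. The fragment is hand-written for illustration, and the formula text `=SUM([.A1:.B1])` is an assumed example of the bracketed range style rather than a quotation from the standard; the helper name `extract_formulas` is hypothetical.

```python
import xml.etree.ElementTree as ET

TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Hand-written sample row: two numeric cells and one formula cell.
# The formula string is an illustrative assumption.
SAMPLE_TABLE = f"""\
<table:table xmlns:table="{TABLE_NS}" xmlns:office="{OFFICE_NS}"
             table:name="Sheet1">
  <table:table-row>
    <table:table-cell office:value-type="float" office:value="2"/>
    <table:table-cell office:value-type="float" office:value="3"/>
    <table:table-cell table:formula="=SUM([.A1:.B1])"
                      office:value-type="float" office:value="5"/>
  </table:table-row>
</table:table>
"""


def extract_formulas(xml_text):
    """Return the table:formula value of every cell that carries one."""
    root = ET.fromstring(xml_text)
    attr = f"{{{TABLE_NS}}}formula"
    cells = root.iter(f"{{{TABLE_NS}}}table-cell")
    return [c.get(attr) for c in cells if c.get(attr) is not None]
```

Note that the formula is an opaque string as far as the XML is concerned, which is why its syntax falls outside what an XML schema can check.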
=Format internals=
An OpenDocument file can be either a single XML file holding the whole document under one root element, or a JAR (ZIP-based) compressed archive containing a number of files and directories. Because the single-file XML format does not allow for embedding binary content or thumbnails, the JAR-based format is used almost exclusively. Applications that use OpenDocument might not support saving and loading the plain XML file, but all should support the JAR-based format. Because the archive is compressed, OpenDocument files are normally significantly smaller than equivalent Microsoft .doc or .ppt files. This smaller size is important for organizations that store a vast number of documents for long periods of time, and for those that must exchange documents over low-bandwidth connections. Once uncompressed, most data is contained in simple text-based XML files, so the contents are as easy to modify and process as any other XML. Directories can be included to store non-SVG (Scalable Vector Graphics) images, non-SMIL (Synchronized Multimedia Integration Language) animations, and other files that are used by the document but cannot be expressed directly in the XML.
The zipped set of files and directories includes the following:
The OpenDocument format provides a strong separation between content, layout and metadata. The most notable components of the format are described in the subsections below. The files in XML format are further defined using the RELAX NG language for defining XML schemas. RELAX NG is itself defined by an OASIS specification, as well as by part two of the international standard ISO/IEC 19757: Document Schema Definition Languages (DSDL).
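Because the package is an ordinary ZIP-compatible archive, its members can be listed with standard tools. The sketch below builds a toy archive in memory and lists it; the member contents are placeholders, so the result is not a valid OpenDocument file, and the function names are hypothetical.

```python
import io
import zipfile

MEMBER_NAMES = ("content.xml", "styles.xml", "meta.xml", "settings.xml")


def build_toy_package():
    """Create an in-memory archive laid out like an OpenDocument package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # A bare ZipInfo defaults to ZIP_STORED, so mimetype is written
        # first and uncompressed.
        zf.writestr(zipfile.ZipInfo("mimetype"),
                    "application/vnd.oasis.opendocument.text")
        for name in MEMBER_NAMES:
            # Placeholder XML only; real files follow the RELAX NG schema.
            zf.writestr(name, "<placeholder/>",
                        compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_members(package_bytes):
    """Return member names in the order they appear in the archive."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return zf.namelist()
```

Calling `list_members(build_toy_package())` returns the mimetype entry followed by the four XML members described in the subsections below.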
==content.xml==
content.xml is the most important file. It carries the actual content of the document (except for binary data, like images). The base format is inspired by HTML, and though far more complex, it should be reasonably legible to humans:
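 <text:h text:style-name="Heading_2">This is a title</text:h>
 <text:p text:style-name="Text_body"/>
 <text:p text:style-name="Text_body">
   This is a paragraph. The formatting information is
   in the Text_body style. The empty text:p tag above
   is a blank paragraph (an empty line).
 </text:p>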
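As a rough illustration of how legible this markup is to a program as well as to a human, the sketch below walks a small hand-written body fragment and reports each heading and paragraph with its style name. The heading style name `Heading_2` and the function name `outline` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hand-written fragment; "Heading_2" is an assumed style name.
SAMPLE_BODY = f"""\
<office:text xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <text:h text:style-name="Heading_2">This is a title</text:h>
  <text:p text:style-name="Text_body"/>
  <text:p text:style-name="Text_body">This is a paragraph.</text:p>
</office:text>
"""


def outline(xml_text):
    """Return (element, style name, text) for each top-level child."""
    root = ET.fromstring(xml_text)
    rows = []
    for el in root:
        local_name = el.tag.split("}")[1]  # strip the "{namespace}" prefix
        style = el.get(f"{{{TEXT_NS}}}style-name")
        text = "".join(el.itertext()).strip()
        rows.append((local_name, style, text))
    return rows
```

The empty `text:p` element shows up as a paragraph with no text, which matches its role as a blank line.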
==styles.xml==
styles.xml contains style information. OpenDocument makes heavy use of styles for formatting and layout. Most of the style information is here (though some is in content.xml). Style types include:
The OpenDocument format is somewhat unusual in that you cannot avoid using styles for formatting. Even manual formatting is implemented through styles (the application dynamically makes new styles as needed).
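The point that even manual formatting goes through styles can be sketched as follows. The fragment imitates what an application might write when a user makes one word bold: an automatically generated style (here called `T1`, an assumed name) carries the property, and the text only refers to it. The helper `bold_spans` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}

# Hand-written sample; the style name "T1" is an assumption.
SAMPLE_DOC = """\
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0">
  <office:automatic-styles>
    <style:style style:name="T1" style:family="text">
      <style:text-properties fo:font-weight="bold"/>
    </style:style>
  </office:automatic-styles>
  <office:body>
    <office:text>
      <text:p text:style-name="Text_body">Some <text:span
        text:style-name="T1">bold</text:span> text.</text:p>
    </office:text>
  </office:body>
</office:document-content>
"""


def qname(prefix, local):
    """Build an ElementTree-style '{namespace}local' name."""
    return f"{{{NS[prefix]}}}{local}"


def bold_spans(xml_text):
    """Return the text of spans whose referenced style sets bold weight."""
    root = ET.fromstring(xml_text)
    bold_styles = set()
    for st in root.iter(qname("style", "style")):
        props = st.find(qname("style", "text-properties"))
        if props is not None and props.get(qname("fo", "font-weight")) == "bold":
            bold_styles.add(st.get(qname("style", "name")))
    return ["".join(span.itertext())
            for span in root.iter(qname("text", "span"))
            if span.get(qname("text", "style-name")) in bold_styles]
```

The text itself contains no "bold" flag; the formatting is found only by following the style reference.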
==meta.xml==
meta.xml contains the file metadata, for example "Author", "Last modified by", and the date of last modification. The contents look somewhat like this:
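 <meta:creation-date>2003-09-10T15:31:11</meta:creation-date>
 <dc:creator>Daniel Carrera</dc:creator>
 <dc:date>2005-06-29T22:02:06</dc:date>
 <dc:language>es-ES</dc:language>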
The names of the tags are taken from the Dublin Core XML standard.
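A minimal sketch of reading such metadata follows. The sample wraps the values shown above in a hand-written meta.xml; the assignment of each value to a particular element (`meta:creation-date`, `dc:creator`, `dc:date`, `dc:language`) is an assumption, as is the function name `read_meta`.

```python
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample; which element holds which value is assumed.
SAMPLE_META = """\
<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <office:meta>
    <meta:creation-date>2003-09-10T15:31:11</meta:creation-date>
    <dc:creator>Daniel Carrera</dc:creator>
    <dc:date>2005-06-29T22:02:06</dc:date>
    <dc:language>es-ES</dc:language>
  </office:meta>
</office:document-meta>
"""

FIELDS = {
    "created": "meta:creation-date",
    "creator": "dc:creator",
    "modified": "dc:date",
    "language": "dc:language",
}


def read_meta(xml_text):
    """Return a dict of selected metadata fields (None when absent)."""
    meta = ET.fromstring(xml_text).find("office:meta", NS)
    return {key: meta.findtext(path, default=None, namespaces=NS)
            for key, path in FIELDS.items()}
```

The `dc:` prefix is bound to the Dublin Core element namespace, which is where the reuse mentioned above becomes visible in the file.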
==settings.xml==
settings.xml includes settings such as the zoom factor or the cursor position. These are properties that are not content or layout.
==mimetype (file)==
mimetype is just a one-line file with the mimetype of the document. One implication of this is that the file extension is actually immaterial to the format. The file extension is only there for the benefit of the user.
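The following sketch shows why the extension is immaterial: the type is read from the mimetype member of the archive, never from the file name. It builds toy packages in memory with placeholder content, so they are not valid documents, and the function names are hypothetical.

```python
import io
import zipfile


def make_toy_package(mimetype):
    """Build an in-memory archive containing a mimetype member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Written first and uncompressed (ZipInfo defaults to ZIP_STORED).
        zf.writestr(zipfile.ZipInfo("mimetype"), mimetype)
        zf.writestr("content.xml", "<placeholder/>")
    return buf.getvalue()


def sniff_mimetype(package_bytes):
    """Return the document type declared inside the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return zf.read("mimetype").decode("ascii").strip()
```

A spreadsheet saved under a misleading name would still be identified correctly, because `sniff_mimetype` never sees the name at all.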
==Reuse of existing formats==
OpenDocument is designed to reuse existing open XML standards whenever they are available, and it creates new tags only where no existing standard can provide the needed functionality. Thus, OpenDocument uses Dublin Core for metadata, MathML for displayed formulas, Scalable Vector Graphics (SVG) for vector graphics, Synchronized Multimedia Integration Language (SMIL) for multimedia, and so on.
= References =
These references were used to justify the article text above, but not all of them are specifically cited. Please help us modify the text above to identify which statements are supported by which references.
General:
Official Information from the Commonwealth of Massachusetts:
Formal comments to Massachusetts on their decision for [http://www.mass.gov/portal/index.jsp?pageID=itdsubtopic&L=4&L0=Home&L1=Policies%2c+Standards+%26+Legal&L2=Open+Standards&L3=Open+Formats&sid=Aitd Open Formats] and posted by Massachusetts (alphabetical order):
*[http://www.mass.gov/portal/index.jsp?pageID=itdterminal&L=4&L0=Home&L1=Policies%2c+Standards+%26+Legal&L2=Open+Standards&L3=Open+Formats&sid=Aitd&b=terminalcontent&f=policies_standards_etrm_35_responses_adobe_response&csid=Aitd Adobe Systems, Inc.]
*[http://www.mass.gov/portal/index.jsp?pageID=itdterminal&L=4&L0=Home&L1=Policies%2c+Standards+%26+Legal&L2=Open+Standards&L3=Open+Formats&sid=Aitd&b=terminalcontent&f=policies_standards_etrm_35_responses_corel&csid=Aitd Corel Corporation]
*[http://www.mass.gov/portal/index.jsp?pageID=itdterminal&L=4&L0=Home&L1=Policies%2c+Standards+%26+Legal&L2=Open+Standards&L3=Open+Formats&sid=Aitd&b=terminalcontent&f=policies_standards_etrm_35_responses_ibm&csid=Aitd IBM Corporation]
*[http://www.mass.gov/portal/index.jsp?pageID=itdterminal&L=4&L0=Home&L1=Policies%2c+Standards+%26+Legal&L2=Open+Standards&L3=Open+Formats&sid=Aitd&b=terminalcontent&f=policies_standards_etrm_35_responses_microsoft&csid=Aitd Microsoft Corporation]
*[http://www.mass.gov/portal/index.jsp?pageID=itdterminal&L=4&L0=Home&L1=Policies%2c+Standards+%26+Legal&L2=Open+Standards&L3=Open+Formats&sid=Aitd&b=terminalcontent&f=policies_standards_etrm_35_responses_sun&csid=Aitd Sun Microsystems, Inc.]
*[http://www.mass.gov/portal/index.jsp?pageID=itdterminal&L=4&L0=Home&L1=Policies%2c+Standards+%26+Legal&L2=Open+Standards&L3=Open+Formats&sid=Aitd&b=terminalcontent&f=policies_standards_etrm_35_responses_sam&csid=Aitd Sam Hiser] (Managing Director of Hiser + Adelstein)
*[http://www.mass.gov/portal/index.jsp?pageID=itdterminal&L=4&L0=Home&L1=Policies%2c+Standards+%26+Legal&L2=Open+Standards&L3=Open+Formats&sid=Aitd&b=terminalcontent&f=policies_standards_ETRMVersion3.0PublicReviewandOpenFormats&csid=Aitd Statement from Peter Quinn on ETRM v.3.5 Public Review and Data Formats]
*[http://www.mass.gov/portal/index.jsp?pageID=itdterminal&L=4&L0=Home&L1=Policies%2c+Standards+%26+Legal&L2=Open+Standards&L3=Open+Formats&sid=Aitd&b=terminalcontent&f=_policies_standards_open_formats_summit_notes060925&csid=Aitd Open Formats Summit Notes - June 9, 2005]
Other commentary specifically about Massachusetts' decision to use OpenDocument, besides those posted by Massachusetts (note that the length of this list justifies the claim in the main text that many people and organizations discussed the Massachusetts decision):
=External links=
; Organizations
*[http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office OASIS Open Document Format Technical Committee] coordinates the OpenDocument development and is the official source for specifications, schemas, etc.
*[http://opendocumentfellowship.org OpenDocument Fellowship] is an industry coalition that provides information about OpenDocument and advocates its deployment.
*[http://friendsofopendocument.org/ friendsofopendocument.org] advocates OpenDocument.
*[http://www.spreadopendocument.org/ spreadopendocument.org] advocates OpenDocument.
; Deployment in Europe
; Debate
*[http://forum.redlers.com/viewtopic.php?t=14 Forum Debate], an informative debate over whether or not a product should adopt the OpenDocument format
=See also=
*List of applications supporting OpenDocument
*Comparison of applications supporting OpenDocument
*WordprocessingML
*List of document markup languages
*Comparison of document markup languages
*Open Document Architecture - An older standard file format that failed to gain acceptance.
*Open format
*OpenFormula
