Sharing or collaborating with government documents: proposal

Short description: 

Update: The time period for commenting on this proposal has been extended to 17:00 GMT on Friday 28 February.

Citizens, businesses and delivery partners, such as charities and voluntary groups, need to be able to interact with government officials, sharing and editing documents. Officials within government departments also need to work efficiently, sharing and collaborating with documents. Users must not have costs imposed upon them due to the format in which editable government information is shared or requested.

User need approach: 

Users in the context of this proposal include citizens, businesses and delivery partners who need to share information with government in editable formats. Users are also officials within government departments who need to share and work on information together.

As technology progresses, government’s production of editable information in formats traditionally associated with documents will become less important for users. 

Government services are being redesigned to make them more straightforward and easier to use by making them digital by default. This will diminish the use of traditional government document formatting even further as information is published or collected directly on the web.

This proposal recognises that changes in technology and service delivery will therefore mean that document formats become less important as collaborative editing and transactions increasingly become an online experience. However, documents formatted in office software are still prevalent amongst users of government information and the formats used by government should meet user needs.

Users need to:

  • Open, edit and save information online and offline
  • Submit information in response to a request, to perform a transaction or to access a service
  • Share information with specific people
  • Publish information online so that a wide audience can access and work with it
  • Edit information and be confident that it remains usable and editable when saved and shared with other users
  • Create a new document with the same style as documents previously created
  • Export the documents they create in a non-editable format so that they can share a document as they intend it to be presented
  • Export the documents they create in a format compatible with other software so that other people can use the information
  • Share information so that they can gather feedback
  • Share information so that they can respond to a request for information
  • View/edit the information shared with them so that they can read/act upon the content
  • Provide input on information created by someone else
  • Copy and paste content from one source to another so that they can quickly collate pieces of information in one place
  • Edit information created by an integrated system they work with so that they can add additional information
  • Gather feedback on information they have drafted so that they can apply other people’s recommendations to the content
  • See version updates so that they can be sure they’re working on the latest version of a document
  • Access information from any appropriate place so that they can get on with their work
  • Keep their devices from being clogged up with downloads
  • Ensure integrity of specific documents, e.g. audit trail for editing, versioning
  • Use the information on the device and platform of their choice, for example laptop, tablet or smartphone
  • Be able to use accessibility tools with information in online and offline formats

Achieving the expected benefits: 

  • Users are able to efficiently share and work on editable government information
  • Users are not required to buy new software to submit or work with government information
  • Users are able to re-use data and text, where licences permit

Functional needs: 

The format should support:

  • Characters associated with Unicode 6.2 for text based file formats (in accordance with the standards profile for cross-platform character encoding)
  • Digital continuity - having implementations that enable support for import of older formats
  • Use of metadata
  • Imports and exports to/from other applications
  • Fonts and graphics that are reusable in other formats
  • Creation of templates

Citizens, businesses and delivery partners must be able to interact with government officials and services, or those working on behalf of government, sharing appropriately formatted, editable information.

Users should be able to work on their device of choice and must not have costs imposed upon them due to the document format in which government information is provided or requested.

Documents should be editable on different devices without loss of integrity - the information should not become spoiled. Documents in this context include:

  • Word processed text
  • Spreadsheets
  • Presentations

When dealing with citizens, information should be digital by default and therefore should be published online. Browser-based editing is the preferred option for collaborating on published government information. HTML (4.01 or higher e.g. HTML5) is therefore the default format for browser-based editable text. Other document formats specified in this proposal - ODF 1.1 (or higher e.g. ODF 1.2), plain text (TXT) or comma separated values (CSV) - should be provided in addition. ODF includes filename extensions such as .odt for text, .ods for spreadsheets and .odp for presentations.
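As a rough illustration of the formats listed above, the following Python sketch checks whether a filename uses one of them. The `PROPOSAL_FORMATS` table and the `proposed_format` helper are hypothetical names introduced here. The .odt, .ods and .odp extensions come from the proposal; .html, .txt and .csv are assumed to be the usual extensions for the other formats.

```python
from pathlib import PurePath
from typing import Optional

# Extensions for the formats named in the proposal.
# .odt/.ods/.odp are quoted from the text; the rest are assumptions.
PROPOSAL_FORMATS = {
    ".html": "HTML",
    ".odt": "ODF text",
    ".ods": "ODF spreadsheet",
    ".odp": "ODF presentation",
    ".txt": "plain text",
    ".csv": "comma separated values",
}


def proposed_format(filename: str) -> Optional[str]:
    """Return the proposal format label for a filename, or None if it is not listed."""
    suffix = PurePath(filename).suffix.lower()
    return PROPOSAL_FORMATS.get(suffix)
```

A check like this only looks at the filename, so it says nothing about whether the file contents actually conform to the format.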

For statistical or numerical information, CSV is the required format, preferably with a preview provided in HTML (4.01 or higher e.g. HTML5).
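One way to picture a CSV file with an HTML preview is the minimal Python sketch below. The function name `csv_preview_html` and the `max_rows` limit are assumptions made for illustration; the proposal does not say how a preview should be produced.

```python
import csv
import html
import io


def csv_preview_html(csv_text: str, max_rows: int = 10) -> str:
    """Render the header and the first max_rows data rows of a CSV as an HTML table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:max_rows + 1]
    parts = ["<table>"]
    # Escape every cell so that CSV content cannot inject markup into the page.
    parts.append("<tr>" + "".join(f"<th>{html.escape(c)}</th>" for c in header) + "</tr>")
    for row in body:
        parts.append("<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in row) + "</tr>")
    parts.append("</table>")
    return "\n".join(parts)
```

The CSV file remains the authoritative download; the table is only a convenience for readers who want to look at the data in a browser first.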

Forms and information exchanges should be digital by default where this is enabled, therefore use of office formats should not be encouraged for the completion of forms.

For information being collaborated on between departments, browser-based editing is preferable but often not currently available. Therefore, information should be shared in ODF (version 1.1 or higher e.g. ODF 1.2). The default format for saving government documents must be one of the formats described in this proposal.

To avoid lock-in to a particular provider, it must be possible for documents being created or worked on in a cloud environment to be exported in at least one of the editable document formats proposed.

Information that is newly created or edited should be saved in one of the formats described in this proposal. There is no requirement to transfer existing information, unless it is newly requested by a user and shared.

Other steps to achieving interoperability: 

  • A government body must not refuse to accept or supply a document in at least one of the open formats described in this proposal 
  • Documents may be shared in other formats but only in response to a specific request from a user
  • Existing documents should be migrated to the formats specified in this proposal if they are re-opened for editing or are requested by a user
  • Government bodies should avoid bespoke implementations which may limit their ability to migrate information or to share it with other users
  • Macros should be avoided wherever possible, particularly when sharing documents
  • Government officials should engage with interoperability testing initiatives for document formats
  • Government officials should engage with standards bodies associated with the maintenance of standards that are agreed for document formats for use in government

This proposal, if agreed, would apply to information produced by or on behalf of central government departments, their agencies, non-departmental public bodies (NDPBs) and any other bodies for which they are responsible. These government bodies would need implementation advice to give clarity about when to use particular formats, the user needs they meet and the interoperability that can be expected.

A document metadata profile is outside the scope of this proposal, although this may be the subject of other challenges taken through the Standards Hub process.

Assessment of tools that can be used for providing multiple formats from a single standardised format is also outside the scope of this standards challenge.

Phase: 

Proposal

Comments

Sadly there is no such thing

Sadly there is no such thing as a registered open standard. In my view OOXML (any variant really) fails the UK gov's chosen definition on the issue of its initial creation; its maintenance procedures do not guarantee the required IPR and future revisions are unknown, despite the fact that guarantees on the issue could be produced by vested vendors, e.g. no future variant implementations without going through ISO/IEC JTC1 SC34.

On the other points here it confuses what is basically a choice of the sort made when implementing any system, albeit of significant scale, with a market intervention on competition grounds. The UK Gov doesn't buy and implement Google's systems, Google does that.

Hi Alun,

Hi Alun,

Do you mean allowing "ISO/IEC 29500 Strict" or "ISO/IEC 29500 Transitional" or both?

I doubt that the majority would object to the former, but the latter is a problem due to its proprietary nature.

Bear in mind that the only version of MS Office that can write files in the strict format is 2013, but even then it's not the default save format...

Having worked for a Microsoft Gold Partner myself in the past, I'm conscious of how communications tend to be fairly selective...

 

hi alun, thanks for taking

hi alun, thanks for taking the time to make your views known. they're good points that need addressing.

"However, if OOXML is a genuine registered Open Standard"

it's not. it was rail-roaded through on a fast-track without due consideration and in direct violation of the review process which brought up severe violations that were then completely ignored. its adoption brought ISO's good name into disrepute.

"It would be discriminatory to exclude it based on who was behind creating it"

it's not being. it's being excluded because of severe interoperability incompatibilities (even within microsoft office itself), vendor lock-in, and total cost of ownership *to all citizens of the united kingdom* (not just "to one government department").

there are many stories on here about people who have tried - very hard - to save documents in "strict" OOXML. they have to purchase the latest ultra-expensive version (only from last year) of that proprietary software. which means an upgrade of their hardware as well, because it's so bloated it won't run on older computers.

by contrast as you're no doubt aware it's possible to run the latest LibreOffice and GNU/Linux on much older systems, often available at much lower cost, 2nd-hand or dumped precisely because they wouldn't run the latest-and-greatest microsoft offerings!

As key supplier of Records

As a key supplier of Records Management technology to the UK government, we are very concerned about the very limited discussion and scope of the dialogue on limiting the UK government to a single standard, ODF, for complex word processing documents.

RecordPoint believes that the most cost effective way for the government to operate is to embrace all standards that exist in the market, particularly when the competing format, OOXML, is already a recognized standard (ISO/IEC 29500:2008). What’s more, not only is it a standard, it is the de facto standard in which all of our UK customers currently have their digital records stored.

We are very concerned about the additional costs this proposal would impose on UK government agencies. From a records and archive perspective the costs could be exponential, as the change may lead to the need to transform the existing records corpus in the UK government to ODF, which would seem unnecessary when the records are already stored in a standards-compliant format.

The reality is that when you are working with records material (we think the same applies to collaboration) then whatever you're doing to archive and store this data needs to be visible now and in the future. The great thing about both ODF and OOXML, is that they both serve this purpose. Hence the logical conclusion is that both formats should be supported now and in the future to ensure we can cost effectively maintain the past.

The standards proposal is a

The standards proposal is a welcome return to Openness as originally envisaged when the Government Secure intranet was built, a platform which was based on RFC standards and has endured as a consequence.

Open, unencumbered standards do not only promote accessibility, they also prevent single source monopolies and ensure correct long term stability and archive retrieval ability.  They reduce cost by fostering honest competition and permitting cross-platform interoperability - in effect, they reduce the need for Government to prescribe IT standards which could prove a cost barrier to integration and the citizen's open and easy access of government. 

The best promotion for Open standards is that they tend to be created as part of a thorough, needs-driven consensus process. They take years to mature, and, as a consequence, tend to be precisely defined, pragmatic and easy to implement. An open and collaborative process has no tolerance for hidden features or restrictions which encumber innovation.

Last but not least, Open standards also enable the use of cost effective Open Source solutions, which may be more interesting now in a post-Snowden era as they can be publicly examined for "creativity", a feature never possible with proprietary solutions. It goes without saying that this also introduces significant cost savings as licence costs and licence management overheads disappear, although I would caution against the assumption that those products are truly "free": implementation and training will cost the same as for proprietary solutions.

I congratulate you on this position. No doubt there will be a fight with established vendors, but this is truly the best position to start from.

OOXML is effectively NOT an

OOXML is effectively NOT an open standard; not even Microsoft have been able to implement it properly. It was clearly intended to circumvent the use of ODF (why would anyone need two?!) and should never have been accepted as a standard.

 

ODF has to be the sensible way to go in terms of "open" formats. Lock-in to one company helps nobody except that company. Ever.

Microsoft have had little

Microsoft have had little positive incentive to properly support ODF and if the UK adopted ODF, this would change overnight.  They clearly can't charge more for Office to offset the required development work, as this would push even more people to switch to Open/Libreoffice or equivalents.

Standardising on ODF fundamentally benefits everyone in the UK.  Microsoft will have to improve compatibility or face losing customers to open source alternatives, hence they will improve compatibility - regardless of how difficult they say it is to achieve now!

As mentioned in another reply

As mentioned in another reply, there are two variants of the standard: ISO/IEC 29500 Strict and ISO/IEC 29500 Transitional.

The problem with supporting the latter format (the only one Office 2007 & 2010 can write in) is that it supports proprietary extensions, which negates the whole point of defining open standards for data transfer.

An additional problem is that of licensing.  Microsoft made a promise that they would not make any claims action against organisations using the standard; however some groups have questioned the legal standing.

My own preference is standardising on ODF only; however also allowing use of ISO/IEC 29500 Strict (not transitional) might be a pragmatic step for the UK government to take.

The simple reason is that

The simple reason is that having two standards is less good than one because it ends up forcing everyone to do the work to support both, so having just one ends up better even for the people that need to initially do a little extra work to comply with the standard.

The complicated reason is that OOXML is a horrible quasi-standard that is not supported by anything that does not also support ODF.

MS Office 2013 was the first version that supported the strict version -- the transitional version of the 'standard' is even less of a standard, so should be considered just as proprietary as any of Microsoft's many other (often mutually incompatible) file formats.

MS Office 2013 also supports ODF 1.2.

i think the standards

I think the standards suggested (ODF/HTML/CSV/TXT) are excellent choices as they do not depend on specific software to read and edit documents. Would ODF also displace PDF?

 

My understanding is that ODF

My understanding is that ODF is not currently compatible with a number of office products on the market, including Google Docs. I'm aware of a number of individuals and businesses who use Google Docs as their primary source for documents, spreadsheets, etc. I believe a number of universities also use Google Docs for both students and staff. If this proposal is adopted, will it still be possible for those individuals/organisations to interact with local / national government agencies?

Google are one of the many

Google are one of the many supporters of ODF and are involved in developing LibreOffice.  Every year they run a "Google summer of code" and invariably at least 1 thing gets developed for LibreOffice (and then goes through the normal Quality Assurance checking that all code goes through before getting into the main branch). 

Google Docs certainly used to use ODF. If it doesn't at the moment then it's something that they could easily add, as could other vendors who are currently locked into proprietary formats.

Regards from
Tom Davies

Although the principal is

Although the principle is honourable, in reality, how easy will it be for end users across the various government agencies to actually use & execute this new approach / format for creating & sharing documents?

Those who work in, for & with government agencies (specifically local government) find user adoption of any new 'everyday' core technology challenging at best. Most end users in government are most likely already using Office in both the workplace and at home and are used to it. Suddenly they now have to adopt, adapt and quickly respond to multiple formats? History tells us this won't work ... so with that my thoughts are to retain what people already know and understand, build on that and avoid any further risk of corruption from using multiple formats.

Any major 'revolutionary' changes to the norm are likely to increase costs, cause dissatisfaction amongst all end users, add significant complexity to the process of dealing with government, inhibit multi-agency collaboration and negatively impact suppliers to government.

There is really no issue with

There is really no issue with adoption of the new standards. Existing software that is ODF compliant (such as LibreOffice) is very similar to Microsoft Office in function. LibreOffice is reasonably good at opening proprietary file formats (such as doc, docx, xls etc.) with only slight formatting problems, so it is easy to transfer legacy data. There would admittedly be some work needed on modification of macros, but where macros were complex that would usually be performed by IT specialists. It is even possible that Microsoft could be persuaded to properly implement ODF support within Microsoft Office in order to retain some market share, and the change would then be wholly transparent to end users. In reality the move from Microsoft proprietary formats to ODF would be less painful than the change from old Microsoft formats such as doc, xls etc. to OOXML formats such as docx, xlsx.

Sadly that just is not an

Sadly that just is not an option! 

Moving to a different version of MS Office involves retraining and the User Interface is often radically different from what many people still use as many are alarmingly still on MS Office 2003 or earlier. 

Weirdly, moving to LibreOffice, OpenOffice or any of many others presents them with a much more familiar User Interface than the one that recent versions of MS Office use. At worst it feels like going back to something much more familiar.

Regards from
Tom :) 

This is an excellent proposal

This is an excellent proposal. The standards suggested are true open standards and address the problems of supplier lock-in.

OOXML should definitely not be added to the list. It provides nothing that ODF does not and is heavily biased towards Microsoft. As other commenters have pointed out, the OOXML "standard" is so Byzantine and poorly-specified there are no current tools that properly implement it, not even Microsoft's tools. In my opinion OOXML should be cancelled as an ISO standard on the basis of its severe technical problems (which effectively render it unimplementable) and on the basis of the disgraceful Microsoft-sponsored corruption of the ISO that led to it being adopted in the first place. It really is that bad. The only reason it was created was as a (very effective) anti-competitive measure to perpetuate the Microsoft stranglehold on the word processor market.

I strongly support this

I strongly support this proposal as written, and without amendment.  

I am against the addition of Microsoft's partially implemented standard. I have difficulty accepting that the suggestion to add OOXML is in the best interests of the general public or the Government, rather than the best interests of a single corporation. As that corporation is a convicted monopolist, I believe any lobbying by it, or on its behalf, is suspect and should be held up to closer scrutiny than lobbying by other companies that don't attempt to force, or advocate, a monoculture for their own benefit.

My name is Eduardo Romero and

My name is Eduardo Romero and I am the CTO of a small IT technical migration department (AZLinux) inside the Technology area at a Spanish PA, Zaragoza City Council (Spain's 5th largest city).

Our job consists of migrating our PC infrastructure to open formats and free/open software tools. We consider ourselves an authorised voice in this field: in the last 6 years working on this process, we have migrated more than 3000 computers from different departments of the City Council.

Nowadays all of them work with ODF, and more than 1000 use only the Linux operating system for all their daily operations.

Therefore, and given our experience, I express my strong support for this initiative, which I summarise in five main conclusions:

1. Public administrations should serve the general interest, and in this case should ensure data exchange with citizens and with other public administrations. There is also a need to ensure that data can be accessed in the future.

2. Only standard and open formats can ensure the previous point. In my humble opinion, ODF is a standard and open format and OOXML is not. It can be argued that OOXML is an ISO standard, and indeed it may be considered so (ISO 29500-1:2012). On the other hand, it has clear disadvantages: it is confusing and difficult to implement (MS can't do it completely), and it was created and is managed by a single company (vendor lock-in).

Further reading about these issues can be found here:
 
https://blogs.technet.com/b/mpn_uk/archive/2014/02/19/government-open-standards-consultation-will-likely-impact-all-of-us-make-sure-your-voice-is-heard-by-26th-february.aspx
http://en.wikipedia.org/wiki/Standardization_of_Office_Open_XML

3. Adopting new file formats requires some effort, this is undeniable. Nowadays, format exchange is clearly a "jungle" as well.

4. ODF is implemented in a large number of applications. Some of them are free/open source (LibreOffice, OpenOffice, ...) and others are not (Microsoft Office, Google, ...). Having the opportunity to choose a free/open tool is a key point. Free/open tools don't have license costs, and the code can be studied, improved and audited. IMHO public administrations should promote this open knowledge model.

On the other hand, OOXML is only partially implemented, even in Microsoft tools. I want to stress that MS tools can only run on MS operating systems, leaving any free/open operating systems out of the equation.

5. Our experience is that the migration process has some difficulties, and most of them are due to the closed nature of the DOC format. But, in general, the problems that we have experienced are of medium to low complexity.

Furthermore, an extra effort is required to educate end users about file format exchange, stressing topics like:

  • Document goal: reading or editing
  • Data are more important than formats
  • Please, no more PDF as an edit format ;-)

We therefore believe the UK Government is heading in the right direction by adopting the ODF standard.

Congratulations on your brave decision.

As someone involved both in

As someone involved both in the standards process, and implementation of the spreadsheet side of OOXML, I would like to clarify that OOXML is an international standard (IS29500) and is certainly implementable.

If people have issues with OOXML, they can certainly participate in the process as I have done.  I was also a co-editor on an amendment to the specification.  Although we are Microsoft partners (as are many thousands of companies who develop for the Windows platform and Office applications), the fact that they cede editorial control to any third party is a very positive sign. 

Microsoft cannot be held accountable for a lack of participation from those who have an interest and strong opinion in document standards and consider OOXML as somehow not a "real" standard.

The software we create, using both libraries from Microsoft and the specification is designed to meet customer demands.  All our customers use Excel (XLS/XLSX/XLSM) extremely extensively in their business.  We have customers who create and write new data into thousands of Excel files on a daily basis.  These spreadsheets are essential to the running of the business and many use complex functionality, interdependent spreadsheets and macros.  Although there are those that say this is not sensible, the fact is that it has evolved organically over many years into something that is extremely difficult to migrate.

We have many customers in the UK public sector in the same position.  In order for them to migrate, the effort would be immense, starting with gap analysis through migration, testing and deployment. Any diktat on file format usage must be clear on a transition plan from current assets, as well as performing a detailed evaluation of switching costs from current infrastructure.

My expertise is on the spreadsheet side, rather than word processing, so I would also be interested in how the UK gov't aims to help businesses pressure suppliers of Line-Of-Business systems to work with ODF spreadsheets. These systems have evolved over many years to output and ingest Excel files with accuracy; replacing this functionality with an ODF version is going to be a long term project, even if the vendors consider there is enough business value in doing so. As far as I know, none of the ERP vendors support ODF (SAP, Oracle, Sage, Epicor etc), nor do the Business Intelligence vendors (SAP, Oracle, IBM, SAS, QlikView, Tableau etc). In fact, SAS already have their own .ODS file, which is not an ODF Spreadsheet but their Output Delivery System format.

Search for "ODS Import" / "opendocument spreadsheet import" and see how many major applications you can find that support importing ODF Spreadsheets.

Let me also clear up some misconceptions about what can read and write OOXML and ODF.

MS-Office is perfectly capable of reading and writing ODF and their support is improving as ODF versions are released.  Interoperability issues are often when a feature exists in one format and not another - or the level of functionality of a feature is higher in one or other version.  For example, there may be a claim that a conversion from DOCX to ODT is lossy, but that is understandable if there is change tracking in the DOCX file, since ODF change tracking is not as sophisticated.

LibreOffice is particularly good in terms of reading and writing OOXML, at least two of the code contributors have been members of the OOXML working group, but there are still functionality gaps, given the limited, highly-skilled resources working on the project.

As a company, we would prefer that both formats were allowed. It would mean lower costs for our customers with existing assets in OOXML, and we could still benefit from the considerable time and resources we have spent over the years investing in the technology and standards process. If such a well-established format, now an ISO standard, is simply legislated away, it would be a severe disincentive for companies to invest in participating in the standards process at all. If ODF becomes more popular, then demand from customers will mean that we also implement it in our products, but currently, that demand is not there.

If organizations with simplistic or small document environments choose to use ODF instead, or newly formed organizations where all their system vendors may already support ODF, then that may well be an appropriate business decision, however, punishing existing organizations for their infrastructure is unfair.

There needs to be a choice of formats; prescribing a format with very limited market traction runs counter to the essence of a free market.

Gareth Horton

Datawatch

 

Open XML as a standard

Open XML as a standard process was certainly a scandal of transatlantic business intervention. As a format it may be of decent quality. Yet the patent licensing conditions were unclear and valid technical objections ignored when ISV partners flooded the national committees.

Without wanting to offend,

Without wanting to offend, this is textbook FUD.  

* It's too hard to change because people are locked in.

* It's too expensive to change because some tools only output XLS.

* It isn't a free market unless you include own format.

Do all of the tools you mention ingesting, and exporting Excel data do so as OOXML or as old-proprietary xls files?  I'm guessing the latter as you don't explicitly mention OOXML.

 

 

 

Gareth: it was interesting to

Gareth: it was interesting to read your input here.

> As someone involved both in the standards process, and implementation
> of the spreadsheet side of OOXML, I would like to clarify that OOXML
> is an international standard (IS29500) and is certainly implementable.

    That the standard is implementable is a given; that there is
one high fidelity implementation from a single vendor is a consequence
of having standardized that vendor's file-format. The real question is
what degree of fidelity is possible for any other office implementations
vs. such a large specification, and whether it is reasonable to
require all vendors to have to interoperate on these terms.

> If people have issues with OOXML, they can certainly participate
> in the process as I have done.  I was also a co-editor on an
> amendment to the specification.  Although we are Microsoft partners

    Participation in the process in my experience was limited to
minor tweaking of the description of Microsoft's file format, without
scope to actually improve the underlying XML representation. That
doesn't suggest a very significant concession to openness or
collaboration with other Office Suite vendors. Having said that, it is
clearly extremely useful to have a good description of Microsoft's
Office file format, and I'm glad you helped out.

> Microsoft cannot be held accountable for a lack of participation
> from those who have an interest and strong opinion in document
> standards and consider OOXML as somehow not a "real" standard.

    It's an interesting point; of course - I participated in the
ECMA process, and Jody (initially working for me) did an excellent
job, personally submitting a staggering proportion of all amendments /
improvements to the spec. However, no matter how accurately the
standard describes Microsoft's implementation, I do not believe that
makes it a suitable basis for a mandated document standard, unless you
want to specify the status-quo.

> the fact is that it has evolved organically over many years into
> something that is extremely difficult to migrate.

    It is easy to over-play the problems here by focusing on a
minority of extreme use-cases. Many organisations have migrated to use
ODF nearly exclusively; some have responded to this
consultation. Indeed, the need to immediately migrate all existing
documents is wisely and explicitly ruled out: "There is no requirement to
transfer existing information, unless it is newly requested by a user
and shared."; thus a migration can be done incrementally, reducing
transition costs.

> LibreOffice is particularly good in terms of reading and writing
> OOXML, at least two of the code contributors have been members of the
> OOXML working group, but there are still functionality gaps, given the
> limited, highly-skilled resources working on the project.

    Jody, who did by far the lion's share of the ECMA TC45 work, no
longer works on the LibreOffice code-base, but Jody and I worked for
Novell in that TC initially to support Microsoft in what I still
believe was a useful and helpful endeavour: to provide the world with
an extremely detailed specification of Microsoft Office's file format.

    Such a specification in itself is significantly disjoint from
the question of whether the UK Government should standardize on an
extremely detailed description of a single vendor's feature-set (as
somehow codified in an ISO standard).

    As the ECMA TC45 work went on, Novell entered a Collaboration
Agreement with Microsoft [ http://www.sec.gov/Archives/edgar/data/758004/000075800406000109/novl-8k... ] that eventually included work to implement
OOXML support in OpenOffice then LibreOffice. That gives rather a useful
perspective on the difficulty of producing a high-fidelity implementation.

> There needs to be a choice of formats, prescribing a format with very
> limited market traction runs counter to the essence of a free market.

    ODF support has been implemented by Microsoft, giving it
significant market traction; there are also multiple high-fidelity
implementations of ODF, providing the basic framework necessary for
choice and meaningful competition in the free market.

Thank you for the thoughtful and detailed comment Michael.

I do have to disagree about edge cases: I have nearly 20 years' experience in dealing with complex spreadsheet environments - nearly all of the Fortune 500 use our products, generally to create and update data within spreadsheet assets. We have only had a single request to implement ODS support, from a small organization in Europe.

Current financial, budgeting and forecasting systems rely heavily on Excel and macros.  Irrespective of the appropriateness of the state of affairs, it is a fact.  Even companies with very advanced IT functions who still have the people that develop the spreadsheets have some difficulty maintaining them - the thought of a wholesale migration to a format which cannot even encapsulate the existing functionality is nightmarish. 

As you can imagine, we have actually managed to implement the spreadsheet side with enough fidelity to stay in business, given that our software touches many hundreds of thousands of spreadsheets worldwide on a daily basis.

Other companies have done the same, as I mentioned in my previous comment, plus there are hundreds of third party development component vendors that sell spreadsheet libraries which support Open XML.

As a UK taxpayer and given the government's appalling record with complex IT projects, I would prefer they did not attempt to migrate from their current systems with any alacrity. 

Given the large functionality gaps between ODF and OOXML, I am not sure where choice comes in.  If I have a current functionality set which I am happy with and which is productive, and I am then mandated to use a less functional set and suffer, what part of that is choice? What part of that is useful? I think that applies more to government's own operations than the issue of disseminating and receiving information from the citizenry, but I don't understand the business case.

Choice and competition would be the ability to choose between Office suites, based on their functionality and performance, irrespective of format.  If a certain Office suite does not have the necessary resources and technical talent to compete with another, then it will fail. That is competition in the free market.  It is unfortunately a fact of life that it is difficult to compete with incumbents. Try breaking into the pharma industry.

If governments wish to improve competition in such areas, they should give incentives to enable those upstart competitors to compete, maybe tax breaks, grants etc. to nurture innovation, rather than punishing successful organizations through legislation.

I am also rather baffled by the phrase "significant market traction". I admit my experience is on the business side, rather than the consumer side - but I have not seen any traction at all.  Even countries/organizations that claim to mandate ODF and use our products, which don't support ODF, have not asked for support to be implemented. They may be using CSV to import into Calc, of course.

Perhaps if LibreOffice / Apache OpenOffice / whoever wants to push ODF offered simple and function-rich libraries for developers to consume and create ODF, that market traction might accelerate.

I think we can also gauge the calibre of this exercise by the fact that CSV is even mentioned as a "standard".  I have avoided writing anything about that for my own mental health.

Novell were also a customer of ours by the way, exporting to Microsoft Office ;-) Even your Finance organization didn't want to dogfood OpenOffice.

Best regards

Gareth

> Thank you for the thoughtful and detailed comment Michael.

   Hey - likewise, it's good for both of us to go deeper on this. I
suspect that the root of our disagreement lies in what "supporting
OOXML" really means. I think we are talking at cross-purposes
here: my concern is high-fidelity support, meaning loading, calculating,
rendering, editing, collaborating, exporting etc., i.e. what is required
to have a genuine choice of Office Productivity Software in the
marketplace. I believe that your talk of support is primarily based
around generating spreadsheet data by using a small subset of the
OOXML format, i.e. the equivalent of a stream of 'print' statements to
a zip container.  If that is the case, I agree: if 'support' means
"generating some OOXML" then support is nearly trivial. However, if the
goal is a genuine choice of Office Productivity Software across the
full gamut of OOXML, then a yawning gap appears.
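
   To make the "stream of 'print' statements to a zip container" point
concrete, here is a minimal sketch in Python. It is an illustration
only, not anyone's actual product code: the function name
write_minimal_xlsx is hypothetical, it handles numbers only and at most
26 columns, and the output has not been checked against any particular
office suite. The part names and namespace URIs are the usual ones for
a Transitional SpreadsheetML package.

```python
import io
import zipfile

NS_MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
NS_PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"
NS_TYPES = "http://schemas.openxmlformats.org/package/2006/content-types"
CT = "application/vnd.openxmlformats-officedocument.spreadsheetml"
CT_RELS = "application/vnd.openxmlformats-package.relationships+xml"
XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'


def write_minimal_xlsx(rows):
    """Return the bytes of a tiny SpreadsheetML package holding numeric rows.

    Hypothetical helper for illustration: numbers only, at most 26 columns.
    """
    sheet_rows = []
    for r, row in enumerate(rows, start=1):
        # Cell references are column letter + row number, e.g. A1, B1.
        cells = "".join(
            f'<c r="{chr(ord("A") + c)}{r}"><v>{value}</v></c>'
            for c, value in enumerate(row)
        )
        sheet_rows.append(f'<row r="{r}">{cells}</row>')

    # Five small XML parts: content types, two relationship parts,
    # the workbook and a single worksheet.
    parts = {
        "[Content_Types].xml": (
            f'<Types xmlns="{NS_TYPES}">'
            f'<Default Extension="rels" ContentType="{CT_RELS}"/>'
            '<Default Extension="xml" ContentType="application/xml"/>'
            f'<Override PartName="/xl/workbook.xml" ContentType="{CT}.sheet.main+xml"/>'
            f'<Override PartName="/xl/worksheets/sheet1.xml" ContentType="{CT}.worksheet+xml"/>'
            "</Types>"
        ),
        "_rels/.rels": (
            f'<Relationships xmlns="{NS_PKG_REL}">'
            f'<Relationship Id="rId1" Type="{NS_REL}/officeDocument" Target="xl/workbook.xml"/>'
            "</Relationships>"
        ),
        "xl/workbook.xml": (
            f'<workbook xmlns="{NS_MAIN}" xmlns:r="{NS_REL}">'
            '<sheets><sheet name="Sheet1" sheetId="1" r:id="rId1"/></sheets>'
            "</workbook>"
        ),
        "xl/_rels/workbook.xml.rels": (
            f'<Relationships xmlns="{NS_PKG_REL}">'
            f'<Relationship Id="rId1" Type="{NS_REL}/worksheet" Target="worksheets/sheet1.xml"/>'
            "</Relationships>"
        ),
        "xl/worksheets/sheet1.xml": (
            f'<worksheet xmlns="{NS_MAIN}">'
            f'<sheetData>{"".join(sheet_rows)}</sheetData>'
            "</worksheet>"
        ),
    }

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name, body in parts.items():
            package.writestr(name, XML_DECL + body)
    return buffer.getvalue()
```

   Calling write_minimal_xlsx([[1, 2], [3, 4]]) yields a zip archive
with five XML parts. Nothing here touches formulas, styles, charts,
pivot tables or macros, which is where the fidelity gap discussed
above lies.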

   I tried to search for details on Datawatch's implementation of
OOXML; interestingly, a Google search for "Datawatch OOXML Microsoft"
took me to a dead link to a press release via the blog of Brian Jones
(what a guy!) and then to here:

http://www.businesswire.com/news/home/20070724005072/en/Datawatch-Select...

    which I excerpt; my comments are bracketed:

"Microsoft High Potential Managed ISV partners are chosen based on strict
selection criteria including revenue growth and an established track record
of success working closely with Microsoft. ...
Another example of this close collaboration is the recent support for Office
OOXML file formats introduced in Monarch V9. Datawatch customers now
benefit from the ability to easily extract data from a variety of different
systems and formats and consume the information in ... [ OOXML ] ...
"Datawatch has always been tightly integrated with Microsoft, as more than
95 percent of Monarch customers use Monarch to export data into Office Excel,"
said John Kitchen, chief marketing officer for Datawatch. "... What we
accomplished with Office OOXML file formats is a perfect example of how
access to development resources and high-level technical support from
Microsoft can benefit Datawatch developers and users. We look forward to
a strong, long-lasting relationship."

    I assume that that strong relationship continues to this day;
but that at least highlights the simple "export data as OOXML"
use-case. Incidentally, on that topic I see nothing in the Cabinet
Office proposal that would stop Datawatch from using your existing
OOXML export as an uninteresting implementation detail of a system
that ultimately converted that for storage / transfer to ODF. It would
even be possible to do that using Microsoft Office.

> Current financial, budgeting and forecasting systems rely heavily on
> Excel and macros.  Irrespective of the appropriateness of the state
> of affairs, it is a fact.

    Some financial systems do so. I agree spreadsheets are very
heavily used in these sectors, but I see no analysis that ODF is not a
suitable container for those - assuming a good quality ODF filter that
is. Indeed, it is great to see Microsoft at OASIS in the ODF TC, and I
see no barrier to improving the spreadsheet formats and
implementations collaboratively there to meet any missing needs that
can be identified.

    As for macros; despite explicitly asking during the ECMA / TC45
process for macro and form streams to be standardized, that did not
happen for various reasons. Indeed, at the time there were noises
about dropping macro support from MS Office completely (though that
didn't happen), and the topic certainly presents a significant
challenge.

    For interest, you can read the gory details of the documentation
that Microsoft (kindly) released for its binary formats (which, as one
who in the past struggled for hours reverse engineering their
compression format, I greatly appreciate) here:
http://msdn.microsoft.com/en-us/library/cc313118.aspx particularly
[MS-OVBA].pdf, which contains a number of wonders - search for
"PerformanceCache" for example.

    Macros though are far worse, as they tend to introduce very
significant cross-platform issues, as the Windows / COM APIs are
prolifically used to work-around missing features in the VBA / Office
bindings. I believe it is normal for VBA macros not to function
correctly in Microsoft Office on Macs, VBA support being omitted
in some versions there.

> Even companies with very advanced IT functions who still have the
> people that develop the spreadsheets have some difficulty maintaining
> them - the thought of a wholesale migration to a format which cannot
> even encapsulate the existing functionality is nightmarish.

    I agree that the situation sounds nightmarish, and that some
people often mis-use the tools they are given to create over-complex,
under-documented, fragile, expensive to maintain, error prone tooling.
I'm not convinced that creating an opportunity to identify and migrate
that to more appropriate and robust tooling is a worse nightmare.

> As you can imagine, we have actually managed to implement the
> spreadsheet side with enough fidelity to stay in business

    As I said, I think the fundamental disconnect here is around the
scope and meaning of 'implement'.

> Other companies have done the same, as I mentioned in my previous
> comment, plus there are hundreds of third party development
> component vendors that sell spreadsheet libraries which support
> Open XML.

    There are excellent third party tools all around the place that
parse and generate ODF. The ASF provide an 'ODF Toolkit' for exactly
this purpose for example. Writing trivial parsers and generators for a
series of XML files inside a .zip file is not the difficult piece of
creating a high-fidelity Office Suite implementation. It is rather
unclear to me though that such libraries provide a magic bullet for
providing Office Suite choice in the market.

> Choice and competition would be the ability to choose between Office
> suites, based on their functionality and performance, irrespective
> of format.

    Actually, I agree. I want people to love and choose LibreOffice
because it is better. However, back to the CO proposal: by requiring
interoperability with a standard that is a precise silhouette of
Microsoft Office: a suite with a -very- complicated surface – the
real-world effect is to reduce the choice to a single one. Mandating
ODF levels this playing field while being vendor and implementation
neutral – that exclusive choice is thus what drives choice, and is
to be applauded.

>  If a certain Office suite does not have the necessary resources
> and technical talent to compete with another

    On a level playing field, LibreOffice competes excellently. If the
playing field is drawn to effectively require perfect-fidelity
interoperability with Microsoft Office by mandating their standard the
situation is less clear cut.

> Novell were also a customer of ours by the way, exporting to
> Microsoft Office ;-) Even your Finance organization didn't want
> to dogfood OpenOffice.

    I can't speak for Novell, but it is clear that LibreOffice has
come a very long way in the last decade, particularly in the area of
spreadsheets and particularly while using a version with enterprise
support to solve customer problems.

    I can assure you that Collabora's Finance and Engineering personnel
use LibreOffice Calc spreadsheets (on Linux incidentally) without
significant problems (beyond the normal, well-known problems of mis-use
familiar to all spreadsheet users).

    Clearly an overnight transition to using ODF is not feasible for a
small minority of highly demanding use-cases, at least without
improved support for ODF in Microsoft Office. However it is
certainly possible to migrate to ODF (and LibreOffice), as many
existing large, enterprise deployments show.

    Warmest regards,

        Michael Meeks.

"As someone involved both in the standards process, and implementation of the spreadsheet side of OOXML, I would like to clarify that OOXML is an international standard (IS29500) and is certainly implementable."

Gareth: then you are in a position to offer an explanation as to why there exist half a dozen different representations of timestamps within *each section* of the OOXML standard.

i have mentioned in my comments here that my understanding of what OOXML is is that it is an automated "memory dump" of the internal data structures from the various office suite applications, and, as such, it has neither been designed nor reviewed.

in your understanding, would this be an accurate reflection of the reality of how OOXML was put forward as a standard?

additionally, could you please advise us of, in your professional opinion as an engineer, the amount of man-years it would take to implement a fully compliant converter for any office suite - fully compliant with all 450+ pages (or however long the current OOXML standard now is).

"LibreOffice is particularly good in terms of reading and writing OOXML, at least two of the code contributors have been members of the OOXML working group, but there are still functionality gaps, given the limited, highly-skilled resources working on the project."

exactly. you made precisely the right points. that to attempt to create a proper interoperable implementation of a non-designed standard (because it's a memory-dump of decades of internal data structures) is a completely unreasonable expectation.

and because of that, OOXML is completely unacceptable as a standard.

"If such a well-established format, now an ISO standard, is simply legislated away,"

several people have already made it clear in the comments here that the ISO organisation was brought into disrepute for being party to the process by which OOXML became a standard. UKUUG was one of the few groups that endeavoured to stand up to that, but the judge threw out the request for a Judicial Review due to the fact that the ISO group uses majority voting not unanimous voting.

Hi lkcl,

I am not sure which timestamp representations you are referring to.  SpreadsheetML uses a serial date representation in Transitional, and ISO 8601 in Strict, for storing data in cell values.  Do you mean areas such as pivot tables, where timestamp information is simply an indicator of the SQL data type of the data source? Some examples would be most useful.

Other date/times are ISO 8601 / XSD DateTime.

Take a look at Part 1, 18.17.4

If you want to read more on serial dates, I did a blog post some time ago which explains why they exist and why they are also used in ODF:

http://aristippus303.wordpress.com/2009/10/22/why-do-we-need-serial-dates-in-the-transitional-form-of-is-29500/ 
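
For readers unfamiliar with the two cell-value representations mentioned above, the relationship between a serial date and an ISO 8601 date can be sketched in a few lines of Python. This is an illustration under stated assumptions, not text from the standard: it assumes the 1900 date system as commonly implemented, ignores times of day, and only accepts serials of 61 or more (1 March 1900 onwards) so as to sidestep the well-known treatment of 1900 as a leap year. The function names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: 1900 date system. With this base, serial 61 maps to 1900-03-01;
# earlier serials are affected by the fictitious 29 February 1900 and are
# rejected rather than guessed at.
_BASE = date(1899, 12, 30)
_FIRST_SAFE_SERIAL = 61


def serial_to_iso(serial: int) -> str:
    """Convert a whole-day serial date to an ISO 8601 calendar date string."""
    if serial < _FIRST_SAFE_SERIAL:
        raise ValueError("serials before 61 are not handled in this sketch")
    return (_BASE + timedelta(days=serial)).isoformat()


def iso_to_serial(text: str) -> int:
    """Convert an ISO 8601 calendar date (YYYY-MM-DD) to a serial date."""
    serial = (date.fromisoformat(text) - _BASE).days
    if serial < _FIRST_SAFE_SERIAL:
        raise ValueError("dates before 1900-03-01 are not handled in this sketch")
    return serial


print(serial_to_iso(36526))         # 2000-01-01
print(iso_to_serial("2014-02-28"))  # 41698
```

The same day is therefore stored either as a bare number (serial form) or as a self-describing string (ISO 8601 form); a reader of the serial form has to know which date system is in use.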

I am not highly familiar with WordprocessingML/PresentationML, but if you have particular examples, please post them and I can certainly look at them.

Of course, most standards trickle down from an implementation, and so does ODF. You are not under the impression that someone sat down and wrote ODF in a vacuum, then wrote the support into OpenOffice, are you? If one could do some automated memory dump into a complex set of XML structures, generating its own schemas into a workable file format, then my hat would be off, but no, this is an oversimplification. It has been reviewed as part of both the ECMA and the ISO ratification processes.  Of course, everything would benefit from more reviewing, and ODF is no exception either.

In order to produce an estimate, it would require far, far more time than I have to spend on a comment. Is it just a specious/mischievous question, or would you seriously expect anyone to invest that amount of work in a response to a comment on the web?

No, it is not unreasonable, just hard.  Just think of it as a function of how many man-years have been invested by Microsoft, or on the ODF side StarOffice, Sun, Novell, IBM, generous independents etc etc. It's a big area - no sane person would relish the prospect of implementing the entire functionality set of a feature-rich office suite, whether it be Microsoft Office or OpenOffice.

If several people commenting here have said that ISO is a disreputable organization, then of course it must be true and no further discussion of the matter is required.

Gareth

 

SHARING AND COLLABORATING WITH GOVERNMENT DOCUMENTS
Microsoft Response to the government’s proposal

SUMMARY
Microsoft believes that the government will only meet its objectives of reaching the most people at the least cost for all if it includes the most popular and most widely used open standard document format – Open XML (ISO/IEC 29500) – in its short list of standard formats for sharing and collaborating with government documents.

The inclusion of HTML in the government’s proposal (alongside ODF) means the government recognises the value of avoiding a single standard for documents, but using HTML for discrete documents (or for more complex content like that often found in a spreadsheet) which can be edited “off line” is not practical, making the sole choice in the government’s proposal effectively ODF.

Mandating one open standard for discrete document formats over another completely ignores benefits enabled by a choice of modern formats and is therefore likely to increase (not decrease) costs (link to original open standards consultation), risk widespread citizen dissatisfaction (which government is attempting to avoid) and add (not remove) complexity to the process of dealing with government.

User productivity is gained by focusing on the delivery of modern applications for creating documents, spreadsheets and presentations. Document formats are simply a means of storing and transporting the content created as a reflection of the features enabled in those applications. While it’s possible to use a plain text format, for example, where the format is simple and light-weight, it is unable to reflect any of the advanced features that modern apps such as Microsoft Office or Apache OpenOffice can represent. Support for multiple formats by such applications is a recognition of the fact that there are many reasons why people choose the features and functions in differing ways to generate content. Placing an artificial limitation on formats effectively says people must choose one tool over another. A mandate for ODF says that no one should open content in applications that don’t support ODF, like Google Docs for example or Pages on an Apple iPad.

Microsoft believes that the least cost and most effective way forward for any organisation seeking to ensure the maximum range of interoperability, the richest range of functionality and the widest use of common formats should be to embrace multiple open standard document formats e.g. both Open XML (ECMA-376, ISO/IEC 29500-1:2012) and ODF (OASIS ODF v1.1, ISO/IEC 26300-1:2006).

In this response we set out to address each element of the proposal and to present evidence that shows the government risks increasing costs and reducing interoperability by ignoring the fact that the vast majority of citizens and businesses already use Open XML as their preferred document format. While including ODF is a choice that Microsoft supports, ignoring and omitting Open XML will only ensure that the very things the government is trying to avoid are actually more likely to happen.

IN THIS RESPONSE

We will show that Open XML enjoys a popularity of use across the major domains of interest to the government, including across the major domains of the UK Public Sector (gov.uk and nhs.uk) and of UK business (co.uk) and of the 3rd sector (.org.uk).

We will also show that Open XML and ODF are both open standards, that both are recognised by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), and that both are supported by a range of tools (applications, apps, programs or services) widely available to the government, to business and to citizens in a range of circumstances to suit user needs.

And we will also show that it is the nature of standards to develop and occasionally supersede one another, and that for a time, multiple standards that appear to do the same thing can and do exist and thrive in parallel for many years to the benefit of all.
 
On this basis, we urge the government to avoid a costly and unnecessary focus on too narrow a selection of standards. Having already recognised in its proposal that no single standard will be adequate for its needs, we now urge the government to include Open XML alongside ODF and HTML in its list of standards for sharing and collaborating with government documents.

ADDRESSING THE PROPOSAL

In the sections below we have used the convention of repeating statements from the government’s proposal in bracketed text, e.g. [Text in brackets] and presenting our response in the subsequent paragraphs.

[Citizens, businesses and delivery partners, such as charities and voluntary groups, need to be able to interact with government officials, sharing and editing documents.]

Citizens, businesses and delivery partners, such as charities and voluntary groups, are able to interact best with government officials, sharing and editing documents, if the latter embrace a reasonable range of international open standards for document formats, a choice that should include both Open XML (OOXML) and Open Document Format (ODF). Evidence of use (given in the table below in the section entitled “Facts about standards”) already shows that most documents that can be found on the internet in domains most closely associated with these groups use Open XML over one hundred times more frequently than they use ODF. In fact, Internet searches show there are over 180 times as many Open XML (.docx) documents as ODF (.odt) documents found in the gov.uk domains, over 120 times as many in the org.uk domains, over 130 times as many in the co.uk domains and over 200 times as many across the ac.uk domains.
 
[Officials within government departments also need to work efficiently, sharing and collaborating with documents.]

Officials within government can work more efficiently, sharing and collaborating with documents, if they embrace any XML-based open standard format other than Microsoft's legacy binary formats. Both Open XML and ODF are supported by a wide range of productivity tools, including all versions of Microsoft Office since 2007. Open XML and ODF both offer dramatic reductions in file size, being based on data compression technologies. Officials already enjoy considerable familiarity with Microsoft Office and the main benefits of efficiency come from having access to the most effective and flexible tools, not from the format that any output is eventually saved in.
 
[Users must not have costs imposed upon them due to the format in which editable government information is shared or requested.]

By embracing and publicising that the government can create and consume content in a range of common open standard formats, the government will avoid imposing undue costs on any segment of society. Since the open standards considered here – Open XML, ODF and HTML – themselves present no cost to users of digital objects described by the standard nor for implementers to gain access to the specification of the standards (http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html), what we are considering here is the range and choice of tools (applications, apps, programs or services) users may access in order to create, edit or read documents saved in these formats. Making too narrow a choice (e.g. only ODF) risks alienating those citizens and businesses that either have no capability to read and edit ODF or who have already chosen to use Open XML as their default. That capability is found lacking in tools included with products like Apple's iPad (which needs additional software to read ODF) or in services like Google Docs (which has recently deprecated its ODF support). By embracing both Open XML and ODF, the government can guarantee to reach the widest audience across both citizens and businesses without obliging either to expend money just to communicate with government.
 
[As technology progresses, government’s production of editable information in formats traditionally associated with documents will become less important for users.]

It is, perhaps, too soon to call an end to information shared in discrete digital objects (we can call these "files") in one format or another. On the contrary, easily found evidence shows that the creation and consumption of discrete digital objects (files) is increasing and that the increase is faster than at any time in the past. What is important is that the format for a file is based on an open standard that allows the content to be easily consumed, with or without original layout "metadata", either by a person or - increasingly important - a service. It is true that much more information published by any creator will default to one of a number of formats common on the Internet, e.g. HTML, which is an important standard to use in this context, although this should not be the only standard used. It is also true that information sought by one party (e.g. the government) from another (e.g. a citizen or business) will be provided through some sort of web site or web service. Increasingly, this may become a communication from one service to another (and back again). In that context it will be important that the format of the data is identifiable between the services and that the format can cope with the richness of the content in an efficient way. But on either side of the transaction there will be, for a long time yet, a need to "take a copy" of the content in a format that can be stored, viewed and managed as a discrete digital object - a file. People will choose the format that most suits them. Government should seek to embrace the most popular, the most stable and the most developed standards to reach the most citizens and businesses.
 
[Government services are being redesigned to make them more straightforward and easier to use by making them digital by default. This will diminish the use of traditional government document formatting even further as information is published or collected directly on the web.]

We do not dispute this; in fact, we address it in the observations above. This is more a statement of direction than a requirement or evidence of a need supporting the proposal made.
 
[Users need to open, edit and save information online and offline]

This is possible in Open XML as well as ODF and HTML and is an inherent feature of the tool (application, app, program or service) chosen by the user to create the content they need to communicate, not an inherent function of the format chosen to store that content.

[Users need to submit information in response to a request, to perform a transaction or to access a service]

This is possible in Open XML as well as ODF and HTML and is an inherent feature of the tool (application, app, program or service) chosen by the user to create the content they need to communicate, not an inherent function of the format chosen to store that content.

[Users need to share information with specific people]

This is possible in Open XML as well as ODF and HTML and is an inherent feature of the tool (application, app, program or service) chosen by the user to create the content they need to communicate, not an inherent function of the format chosen to store that content.

[Users need to publish information online so that a wide audience can access and work with it]

Although a document saved as HTML may appear to have an advantage for online publication, in practice an element of re-formatting will always be needed because there are significant layout differences between a "web page" and an "A4 page". Collaboration during the creation phase of information destined for publication can rely on one format, while the tools used will then readily convert the "collaboration" format into the "presentation" format (of, perhaps, HTML). Since these are widely supported open standards, there exists a wide range of tools able to do this, hence, this is more a requirement on the capability of the tools chosen in any given context than of the formats used at each stage in the document production. This user need also touches on the difference between “sharing and collaborating” and “viewing” documents, since formats like PDF can allow people to access the content and (through copy and paste functions) to work with it. This requirement can also be met using the proposed standard for “viewing government documents” and is not exclusive to the formats in this proposal.

[Users need to edit information and be confident that it remains usable and editable when saved and shared with other users]

Two elements are essential to meeting this requirement. The first is to use a mature, feature-rich and widely supported format; Open XML and ODF are both excellent examples. The second is the proven interoperability of the tools chosen to transact the documents. Interoperability between tools is a function of two things: [1] the capabilities of the developers of the tools and [2] the ease of interpretation of the detailed features of the relevant standards and how easy or difficult they are to implement. It is evident from the widespread support of both Open XML and ODF that each standard is described in sufficient detail, with sufficient richness of features, to be successfully implemented by more than one party. LibreOffice, Apache OpenOffice, Google Docs and Microsoft Office offer rich support for Open XML. Thereafter, what we see is that the subtle nuances of implementation, and how those implementations handle different content captured within different formats, become the subject of extensive interoperability testing. A "reference implementation" may be chosen; different data (content) is encoded in the standard format by that implementation, and experts compare other implementations of the standard against it. This is a dynamic process and, to some extent, an on-going process for as long as the standard is in development. All the standards considered here are "in development" and, as such, there will always be minor differences of implementation to be resolved through interoperability testing. The point we are making is that this is not an absolute science and there is no automatic or guaranteed "perfect" implementation. Confidence is therefore built on the basis of experience and commodity.

The available evidence suggests that the most common "modern" open format used is Open XML and on that basis it is reasonable to expect that experience with that format is more likely to have resolved interoperability differences than with other formats. That said, it is important to recognise that there are many events organised to test the interoperability of products that use the Open XML and ODF formats and both communities organise interoperability workshops (sometimes also called “plug fests”). It would be an academic exercise to differentiate between instances of the implementation of either Open XML or ODF and, for all practical purposes, their implementations should be considered equivalent (wherever they have been implemented and tested against their "reference").

[Users need to create a new document with the same style as documents previously]

This is possible in Open XML as well as ODF and HTML, but again, it is perhaps more a function of the capability of the editor or tool used to create the document than the format itself. While the standard for the format may allow certain features to be expressed in the layout ("style"), it is entirely in the hands of the developer of the tool as to whether their tool (application, app, program or service) will implement all the capability of the standard. Hence, a user with access to the same tool will be able to create multiple documents in the same style (but with different content), whereas the government cannot guarantee that a document sent to another user in a given "style" (which that user may display, print or even edit in situ) can be used as a template for other documents with different content. That is a capability of the tools used and not of the format of the document.

[Users need to export the documents created in a non editable format so that they can share a document as they intend it to be presented]

This is a requirement of the tools used to create and manage documents, not of the format in which the document exists in its editable form. It therefore sets a procurement requirement, which then relies on the standard selected for viewing government documents, where the PDF format can be used. It is worth observing, however, that the ability of the recipient to edit the content is often conferred or denied by the choice of the tool (application, app, program or service) used to view the content. Free (no cost) viewers for PDF exist for most, if not all, platforms and allow the user simply to view (and in some cases annotate) the content, but not to edit it. In other cases, subject to the rights set within the format, users may be able to copy the content or import the content (and its layout) into an editor, from which it may be saved into another format or re-exported into the PDF format. All this raises further questions about what the user is trying to achieve through this requirement: is it simply for most people to read the document, or is it an absolute prohibition on being able to re-use or re-purpose any of the content? PDF is a good standard to use for the former, but other technologies may better support the latter.

[Users need to export the documents they create in a format compatible with other software so that other people can use the information]

This could mean that the tool can save in several other formats, or that users are trying to select the most widely implemented format. It could also refer to an interoperability requirement, which we have already addressed above.

[Users need to share information so that they can gather feedback]

This is possible in Open XML as well as ODF and HTML.

[Users need to share information so that they can respond to a request for information]

This is possible in Open XML as well as ODF and HTML.

[Users need to view/edit the information shared with them so that they can read/act upon the content]

This is possible in Open XML as well as ODF and HTML.

[Users need to provide input on information created by someone else]

Open XML provides unique capabilities in this area that go beyond what is currently possible in either ODF or HTML. Although we accept there are a number of ways to achieve this user need, perhaps the most efficient means depends on capabilities existing in both a product’s features and in the chosen document format. Change tracking is a product feature that relies on robust support in the chosen document format. As if to illustrate the developing nature of open standards in general, while Open XML supports a comprehensive change tracking feature set, supported by a number of products, including LibreOffice and Microsoft Office, ODF support for change tracking is still somewhat a work in progress (https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office-collab) and (as if also to illustrate the collaborative development of open standards) there are calls and offers of help to address these deficiencies (http://blogs.msdn.com/b/dmahugh/archive/2011/03/16/change-tracking-in-od...).

Users could, of course, iterate to the lowest common denominator within a set of features to achieve their objectives, but many organisations cannot tolerate such inefficiency and, instead, choose products and standards that enable the most effective and efficient way of collaborating on a shared document that modern technology can enable. This is a good example of where a tolerant and flexible choice of open standards can assist organisations who seek the lowest total cost of ownership from their own IT and to avoid imposing unnecessary costs on others.

[Users need to copy and paste content from one source to another so that they can quickly collate pieces of information in one place]

This is possible in Open XML as well as ODF and HTML.

[Users need to edit information created by an integrated system they work with so that they can add additional information]

This is possible in Open XML as well as ODF and HTML.

[Users need to gather feedback on information they have drafted so that they can apply other people’s recommendations to the content]

This is possible in Open XML as well as ODF, as revision marking is possible in both formats, although its availability depends on the choices made by the developer of the tool. In HTML it is possible only if the web developer builds such capability into their web site; there is no standard mechanism in the HTML standard to capture such feedback, so it becomes a unique feature of the web site. We would also reiterate here the comments above on differences in support for change tracking features between the Open XML and ODF standards, where the latter could be said to still be a work in progress, even in the latest version (OASIS ODF v1.2).
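By way of illustration only, the following Python sketch shows how revision marks stored in either format could be counted by any tool, since both formats are ZIP packages of XML. It assumes the conventional part names word/document.xml (Open XML, transitional namespace) and content.xml (ODF); the function name count_tracked_changes is hypothetical and not part of either standard or of any product.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: transitional WordprocessingML namespace; a "strict" file uses a different one.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# ODF text namespace, which holds the tracked-changes elements.
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def count_tracked_changes(source):
    """Count revision marks in an Open XML or ODF text document.

    `source` is a file path or a binary file-like object (hypothetical helper).
    Returns None if neither conventional part name is present.
    """
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        if "word/document.xml" in names:
            root = ET.fromstring(package.read("word/document.xml"))
            # Tracked insertions and deletions are recorded as w:ins and w:del.
            insertions = sum(1 for _ in root.iter(f"{{{W}}}ins"))
            deletions = sum(1 for _ in root.iter(f"{{{W}}}del"))
            return {"format": "Open XML", "insertions": insertions, "deletions": deletions}
        if "content.xml" in names:
            root = ET.fromstring(package.read("content.xml"))
            # ODF lists each tracked change as a text:changed-region element.
            regions = sum(1 for _ in root.iter(f"{{{TEXT}}}changed-region"))
            return {"format": "ODF", "changed_regions": regions}
    return None
```

Whether a given editor writes, displays or preserves these marks remains a choice of the tool's developer; the sketch only shows that the information is recorded in the file in a documented way.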

[Users need to see version updates so that they can be sure they’re working on the latest version of a document]

This is possible in Open XML as well as ODF where each standard contains the ability to record such data in the metadata of the file. In HTML this would be a unique feature of the relevant web site, since the HTML standard does not provide a prescribed way to record this.
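As a minimal sketch of how such metadata can be read, the Python below opens either kind of package and returns the revision counter and last-modified date. It assumes the conventional part name docProps/core.xml for Open XML core properties and the meta.xml stream for ODF; the function name read_version_info and the shape of the returned dictionary are hypothetical, chosen only for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces of the metadata vocabularies used by each format.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
}


def _text(root, path):
    """Return the text of the first element matching `path`, or None."""
    node = root.find(path, NS)
    return node.text if node is not None else None


def read_version_info(source):
    """Read revision metadata from an Open XML or ODF package.

    `source` is a file path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        if "docProps/core.xml" in names:  # assumed conventional location
            root = ET.fromstring(package.read("docProps/core.xml"))
            return {
                "format": "Open XML",
                "revision": _text(root, "cp:revision"),
                "modified": _text(root, "dcterms:modified"),
            }
        if "meta.xml" in names:
            root = ET.fromstring(package.read("meta.xml"))
            return {
                "format": "ODF",
                "revision": _text(root, "office:meta/meta:editing-cycles"),
                "modified": _text(root, "office:meta/dc:date"),
            }
    return None
```

A tool is free to leave these fields empty or to strip them, so a missing value returns None rather than raising an error.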

[Users need to access information from any appropriate place so that they can get on with their work.]

Success against this requirement hinges on the definition of "appropriate place". Even for HTML there may be places where it is impossible to render the content to be worked on appropriately to the task at hand. For example, just because a smart phone has a web browser compliant with the HTML standard does not mean that it is practical to use any and all web services that may allow for the editing of content necessary to complete any given task. A web browser does not specify the screen resolution or aspect ratio used (and even on a tablet or laptop, the user can re-size the browser window) and some browsers may attempt to "interpret" the content best for the screen size and resolution of the device it runs on. These are choices of the browser developer and beyond the knowledge of the creator or intended editor of the content. So what is an "appropriate place" in the context of the work needing to be done? One might ask whether this is more about an "appropriate device" that can be used in an "appropriate place" in the context of the work needing to be done. Again, there are no absolutes when satisfying this requirement. Some devices may be appropriate, but only in certain places and some devices may be used anywhere, but may only be appropriate for certain tasks (they can do some things, but not everything). What is appropriate will depend heavily on the choice of the user, the place, the time and the range of acceptable solutions to the task in hand. Both Open XML and ODF find implementations on a broad range of devices, although the evidence suggests that Open XML perhaps enjoys a marginally broader range of device support at this time (more web services support this and more phones, tablets and other devices support Open XML), making the choice of ODF in this context a little less certain.

[Users need their devices not to be clogged up with downloads]

This is a function of the tool (application, app or service), not of the format in which the data is then stored. There are apps that don't download the file at all, but allow the user to work on the parts they can display. There are apps that download the file temporarily, permitting "cached" editing which is then saved to an on-line (cloud) storage service when the editing is complete (and the cache is deleted) and others allow the user to choose where to store the file at any stage in the process. This is not determined by the format, but by developer choices for the tool.

[Users need to ensure integrity of specific documents, e.g. audit trail for editing, versioning]

Again, this is more a function of the tool than the format. A user may be able to edit a document on their smart phone and save it in any of the standards considered here, but they may not have the ability to view or change the meta-data associated with the file. Equally, they may edit the same file on their tablet and the tool they use there can view and edit the meta-data. The tool may even be able to remove all or most of the meta-data. The ability to control the right to edit the meta-data may also depend on the capability of the original author's tools (and not on which format it is chosen to write the data in).

[Users need to use the information on the device and platform of my choice, for example laptop, tablet or smartphone]

As discussed above, this is dependent on what "use" of the information the author of that information (content) may be expecting a recipient to make. It depends on the definition of "appropriate" for the place, the time and the device of the user's choosing. This does not imply that a user will be obliged to incur additional cost to use the information, but that the available uses are a function of the user's choice of tool, rather than the author's expectations, irrespective of the format of the content. In the most extreme terms, an author who requests recipients to inspect and edit data in a 200 column by 10,000 row spreadsheet is not, in any practical sense, allowing the recipient (the user) to perform that action (easily) on (say) a smart phone. Some choices depend on the scale and layout and complexity of the content and the user may therefore sometimes not have a "free" choice - but whatever choice they get, it is largely independent (or much less dependent) on the document format than on the content of the document and on the action that needs to be performed.

[Users need to be able to use accessibility tools with information in online and offline formats]

Accessibility is a rich and complex subject, dependent on the user's choice of device and, perhaps, the inherent tools (applications, apps, programs, services) that they have available on that device. For example, HTML does not inherently include any accessibility features in the standard, but the standard allows the information to be displayed to be tagged so that applications enabling greater accessibility can recognise and react to those tags. The same is true for other applications used to create content in other formats: accessibility is largely a function of the tool and of the effort made by the content author to include appropriate structure and tagging, and depends much less on the document format.
 
[Users are able to efficiently share and work on editable government information]

This is possible in Open XML as well as ODF and HTML.

[Users are not required to buy new software to submit or work with government information]

There are many tools (applications, apps, programs or services) available at a range of costs to meet this requirement. An Internet search will reveal many capable of handling the major formats recommended here (Open XML, ODF, HTML and PDF). Many devices now include software (included in the cost of the device) that is capable of handling these formats. Many citizens and businesses will already have software that is capable of handling these formats. Viewers which permit “copy & paste” of the content are available at no cost for all the formats proposed here, permitting (almost as a last resort) the user to import the content into a tool of their own choosing. Several tools (applications, apps, programs or services) are available at no cost to the user to work with the Open XML format, including LibreOffice, Apache OpenOffice and consumer services like Word Online. Companies and organisations with more complex information technology needs should consider total cost of ownership when selecting tools to process documents, since while some tools may not have a licence fee, they must still be deployed, supported and managed throughout their lifetime of use by the organisation.

[Users are able to re-use data and text, where licences permit]

This is not a function of the document format. It is the responsibility of the data owner to set the intellectual property rights on the content within a document. None of the formats recommended here – Open XML, ODF, HTML or PDF – assert or constrain rights over any content in any document created in the format. Some commentators have asserted there are limitations caused by different approaches to intellectual property rights between different standards organisations. We believe these assertions are false and cause unnecessary concerns. Widespread implementation of all the standards above, including by open source software, is evidence enough to reassure users that the content any user creates and saves in any one of these formats is and remains the user’s content and is not limited in any way by the intellectual property rights approach taken by any standard setting body. Further, since both Open XML and ODF are standards ratified by more than one international standards organisation, users should take comfort that such support further confirms the openness of these standards.

Additional requirements (functional needs) given are that the format should support:

[Characters associated with Unicode 6.2 for text based file formats (in accordance with the standards profile for cross-platform character encoding)]

This is true of both Open XML and ODF.

[Digital continuity - having implementations that enable support for import of older formats]

This is a function of the tools used, not a feature of the format selected (see above).

[Use of metadata]

Use of the metadata is a function of the tools used, not a feature of the format selected (see above).

[Imports and exports to/from other applications]

This is a function of implementation and interoperability between different implementations. Both Open XML and ODF have many implementations. Many implementations are now an inherent (i.e. inclusive) feature of a device or service and one can easily argue that Open XML has popular support over a broader range of popular devices and services. While this is true, we would still encourage the government to select both Open XML and ODF for sharing and collaborating with government documents.

[Fonts and graphics that are reusable in other formats]

Graphics are another subject for a discussion of standardisation. For all practical purposes, Open XML, ODF and HTML can all "embed" a range of common and popular standardised graphics formats within their documents.

[Creation of templates]

Both Open XML and ODF standards permit the creation of templates for documents. HTML is more flexible and less prescribed in this context, having coding standards in which web templates may be stored, but with more freedom for the developer to define the "template" structure. In this context, that flexibility may not offer the least cost or most productive mechanism for sharing and collaborating with government documents (unless all recipients just happen to be web developers).
 
[Citizens, businesses and delivery partners must be able to interact with government officials and services, or those working on behalf of government, sharing appropriately formatted, editable information.]

This is more likely to be possible, and at lower cost to all parties, if the government includes Open XML along with ODF and HTML.

[Users should be able to work on their device of choice and must not have costs imposed upon them due to the document format in which government information is provided or requested.]

There is no better argument to support a wider range of document standards than this. Evidence shows that the most popular modern format in widespread use by citizens, businesses, charities, in education and by government itself is Open XML, making the choice of ODF alone quite perplexing. We urge the government to include Open XML in its list of document format standards to ensure the broadest range of people have no costs imposed upon them. Indeed, we observe that a wide range of tools which support Open XML is available across a variety of platforms, including applications like LibreOffice and Apache OpenOffice and services like Google Docs and Microsoft Office Online, with some available at no cost and others available under commercial terms which include licensing and support. Even users who currently enjoy free services will incur a cost to change, even if that is only to find and install some other application, if the service they use does not support the government’s proposed standard. Including Open XML in the proposal will eliminate most, if not all, of the potential need for change because it permits the continued use of a wider choice of device, application or service. A narrower choice, as in the current proposal, will inevitably force some to change.

[Documents should be editable on different devices without loss of integrity - the information should not become spoiled.]

This is not a function of the format chosen, but of the quality of the implementation of the format. No format is inherently easier to implement than any other in this space. Implementation extent is a choice of the developer, not the author of the document format standard or the author of any document in a particular format. Developers can and do choose to implement more or (often) less of the features made possible by the standard depending on the platform they are targeting with their tool (application, app, program or service). Implementation quality is measured by collaborative interoperability testing - interoperability is not an inherent feature of any standard, it is a goal achieved through excellence in implementation.
 
[When dealing with citizens, information should be digital by default and therefore should be published online. Browser-based editing is the preferred option for collaborating on published government information. HTML (4.01 or higher e.g. HTML5) is therefore the default format for browser-based editable text. Other document formats specified in this proposal - ODF 1.1 (or higher e.g. ODF 1.2), plain text (TXT) or comma separated values (CSV) - should be provided in addition. ODF includes filename extensions such as .odt for text, .ods for spreadsheets and .odp for presentations.
 
For statistical or numerical information, CSV is the required format, preferably with a preview provided in HTML (4.01 or higher e.g. HTML5).
 
Forms and information exchanges should be digital by default where this is enabled, therefore use of office formats should not be encouraged for the completion of forms.
 
For information being collaborated on between departments, browser-based editing is preferable but often not currently available. Therefore, information should be shared in ODF (version 1.1 or higher e.g. ODF 1.2). The default format for saving government documents must be one of the formats described in this proposal.]

There is no evidence in the proposal to support these recommendations. When evidence of use and implementation is gathered and examined, it suggests that the use of Open XML is more prevalent than the use of ODF as we will show below. This justifies the claims made in this response and the assertion that the government will achieve its goals quicker, at lower cost and with a lower impact on citizens and businesses if it includes Open XML and ODF and HTML in its list of standards for sharing and collaborating with government documents.

[To avoid lock-in to a particular provider, it must be possible for documents being created or worked on in a cloud environment to be exported in at least one of the editable document formats proposed.]

This is again a requirement on the web services (cloud environment) chosen. It would be safer to embrace a slightly broader range of standards and enjoy a wider choice of web services.
 
[Information that is newly created or edited should be saved in one of the formats described in this proposal. There is no requirement to transfer existing information, unless it is newly requested by a user and shared.]

This is a matter of policy, but it may be unavoidable that some documents have to be converted. Absolute prohibition of this conversion will be difficult and impractical to manage, monitor or control. Any new standard (mandated or simply recommended, since, following RFC 2119, the use of "should" above implies a recommendation, not a mandate) will drive behaviours that are unnecessary or undesirable. The simple inclusion of Open XML in this standard will obviate the need for much of this unnecessary conversion.

[A government body must not refuse to accept or supply a document in at least one of the open formats described in this proposal]

This is a statement which seems to imply that a government body MAY refuse to accept or supply a document in any open format that is NOT described in this proposal, which seems to be counter to the earlier principle that the government should not oblige any citizen or business to incur additional costs in order to share and collaborate with government documents. Accepting Open XML into this list will enable the government to follow this mandate with the largest proportion of citizens and businesses (whether they use Microsoft Office or not), because it will include anyone who can deal with Open XML, ODF and HTML - a cohort of the constituency which must be close to including everyone.

[Documents may be shared in other formats but only in response to a specific request from a user]

This should be an action reserved for only the least popular or esoteric of formats. Open XML is neither of these. It is, in fact, the most popular of the modern open (XML-based and compressed) document formats.

[Existing documents should be migrated to the formats specified in this proposal if they are re-opened for editing or are requested by a user]

If the government follows our recommendation to include Open XML in its list of standards, then either much of this work would be unnecessary or the cost of converting from older formats (e.g. from Microsoft Office legacy binary formats) would be greatly reduced or even completely eliminated.

[Government bodies should avoid bespoke implementations which may limit their ability to migrate information or to share it with other users]

Such a principle may seem attractive in general, but clearly begs the question “a bespoke implementation of what?” There may be good reasons for avoiding bespoke implementations of anything for which there is a commodity, off-the-shelf alternative (“why build, not buy?”), but sometimes a unique piece of business process logic may need to be written in order to address a unique business requirement (e.g. the UK’s income tax and benefits systems are unique to the UK, existing, as they do, in single systems operated by and on behalf of a single customer). So long as the implementation is proven to adhere to the specification of any standard chosen for use within it, there should be no migration problems for data written to that standard. For example, most government web sites are bespoke to one extent or another (they certainly appear nowhere else on the Internet without a degree of modification), but they are written to comply with standards (like HTML) which are chosen to ensure interoperability (largely, in the case of web sites, with as many web browsers as it is practical to support). If, for example, the tax system produced information to be consumed (viewed or edited) by a taxpayer (citizen), the system could be written to produce Open XML (or ODF or PDF) directly by following the standard, without the need to produce the content in an intermediate format and convert it into an open standard format before sending it to the citizen. This is the benefit of open standards – there is no limitation on what sort of tool (application, app, program or service) can use them. The fact that a standard describes a document format should not constrain the software that produces the document to being a word processor.
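To make the last point concrete, here is a hedged sketch of a back-office system writing a small Open XML word-processing document directly, with no word processor involved. It uses only the Python standard library and emits the three parts a minimal package needs (content types, package relationships and the main document part). The function name build_docx is hypothetical, the output uses the "transitional" namespaces, and a production system would add styles, metadata and validation against the standard.

```python
import io
import zipfile
from xml.sax.saxutils import escape

XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'

# Declares the content type of each part in the package.
CONTENT_TYPES = (
    XML_DECL
    + '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Points a consuming application at the main document part.
PACKAGE_RELS = (
    XML_DECL
    + '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)


def build_docx(paragraphs):
    """Return the bytes of a minimal .docx containing one paragraph per string."""
    body = "".join(
        f'<w:p><w:r><w:t xml:space="preserve">{escape(text)}</w:t></w:r></w:p>'
        for text in paragraphs
    )
    document = (
        XML_DECL
        + '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main">'
        f'<w:body>{body}</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", PACKAGE_RELS)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()
```

For example, a (hypothetical) notice generator could call build_docx(["Dear taxpayer,", "Your statement is enclosed."]) and write the returned bytes to a file with a .docx extension. The same approach applies to ODF, which is likewise a ZIP package of XML parts.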

[Macros should be avoided wherever possible, particularly when sharing documents.]

Macros are incredibly useful, but also a subject of some debate. In broad terms, they are simply a means to an end and there are a number of circumstances where their use is either invisible to the user or benign to all the recipients of a particular document. Again, the capability to process a macro depends more on the capabilities of the tool (application, app, program, service) used than on the format chosen. If an author depends on the widest reception of their document - and use of the macro therein - then it is clear that, with a range of tools available to users (not to mention devices and their inherent processing capabilities chosen by users), the author cannot guarantee that the macro can be processed as intended. However, to set a rule which says "macros should be avoided" (even with the "wherever possible" rider) risks inadvertently disabling their use in situations where all collaborators are known to have access to a tool which will process the macro or where it is explicitly declared in the document that a macro needs to be run, guiding the recipient in their choice of tool (and, perhaps, device), either of which may be the most effective and least cost means of getting the job done.

We would also observe that what is and what is not a “macro” is difficult to define and police. “Avoid macros in spreadsheets” may seem a limiting, but actionable instruction, whereas “avoid JavaScript in a web page”, while equally limiting, may seem to many a ridiculous thing to discourage. Yet there is arguably little difference between the functions and outcomes of JavaScript in a web page and a macro in a document (in the form of a spreadsheet or otherwise). And when the web page in question is not hosted on a web server, but is, instead, a discrete digital object (a document) containing HTML and JavaScript which may be processed automatically by a recipient, the distinction between the two narrows and the ease with which one can be avoided, but not the other calls this user need into question.

[Government officials should engage with interoperability testing initiatives for document formats]

We would wholeheartedly encourage this activity across all the leading open standard formats and specifically for Open XML, ODF, PDF and HTML for one very good reason - that no standard will wholly survive the ingenuity, creativity or innovation of the users of any tools that implement the standard when they create content to share with anyone else. Interoperability testing is more effective when tools and standards are exposed to real world users and uses.

[Government officials should engage with standards bodies associated with the maintenance of standards that are agreed for document formats for use in government]

We would wholeheartedly encourage this activity too and encourage the government to make an appropriate investment in skills and time in order to add and extract value from these processes, which can last for several years. More participation in these processes is to be welcomed. To our knowledge, only Microsoft has been consistently involved directly in the maintenance of all the standards under consideration here, contributing significant technical expertise to improve the interoperability and capability of these standards. Microsoft has been consistently involved with every stage of each of the specifications and has submitted meaningful contributions and implemented the standards. It has been our privilege to work alongside experts from many companies and organisations, including the UK’s own BSI, as well as experts from more than 30 countries. In particular we note that the British Standards Institution (www.bsigroup.com) has had three of its experts directly involved in the maintenance of the ISO/IEC 29500 (Open XML) standard for many years, making significant and valuable contributions to the standard through their work on ISO/IEC JTC1/SC34/WG4, and that the British Library has been an influential participant in developments of the same standard too. While these organisations are not part of the UK government, their involvement illustrates the breadth of contributions made to this standard, especially by experts from the UK.
 
[This proposal, if agreed, would apply to information produced by or on behalf of central government departments, their agencies, non-departmental public bodies (NDPBs) and any other bodies for which they are responsible. These government bodies would need implementation advice to give clarity about when to use particular formats, the user needs they meet and the interoperability that can be expected.]

We would observe that whatever choices are made by central government, the effect of those choices, particularly in this subject area, will be felt much wider than the organisations strictly within this scope. The impact of this decision will extend to all parties - citizens, businesses and other parts of the public sector - whether that is the government intention or otherwise. Such widespread impact should be considered explicitly.
 
[A document metadata profile is outside the scope of this proposal, although this may be the subject of other challenges taken through the Standards Hub process.]

Microsoft welcomes the opportunity to consider any future challenge on metadata.
 
[Assessment of tools that can be used for providing multiple formats from a single standardised format are also outside the scope of this standards challenge.]

We would again recommend that the government does not use a single standardised format, but embraces and supports a small range of formats which it may then choose from appropriate to the user need. When selecting tools that can be used for conversion between formats, the government should clearly assure itself of any tool’s claimed compliance with a standard, but the government should also consider carefully the total cost of ownership for tools that would be widely used by officials. To cut to the chase, license costs alone are not a reliable indicator of the Total Cost of Ownership (of any IT system) and a thorough and robust TCO analysis should be as much a part of any procurement decision as an assessment of the functional capability of the tool itself against the functional requirements or user needs.

The remaining sections of our response cover “FACTS ABOUT STANDARDS”, “DATA ON THE USE OF POPULAR FORMATS”, “APPLICATION SUPPORT FOR DIFFERENT FORMATS”, “THE DYNAMIC NATURE OF STANDARDS”, “IMPACT ON THE WIDER ECONOMY” and “CONCLUSIONS”.

FACTS ABOUT STANDARDS
 
ODF is an open international standard for document formats, published by at least two Standards Development Organisations (SDOs): by OASIS as ODF v1.2 and by ISO/IEC as ISO/IEC 26300:2006 (also known as ODF v1.1).

ODF standards are governed by ISO/IEC Joint Technical Committee 1, Sub-Committee 34, Working Group 6 (or ISO/IEC JTC1/SC34/WG6)

ODF v1.2 is not yet an ISO/IEC standard, but will be considered by the ISO/IEC JTC1 in due course.

Open XML is an open international standard for document formats, published by at least two Standards Development Organisations (SDOs): as ECMA-376 and as ISO/IEC 29500:2012. (Suggestions that Open XML is not an open standard devalue the work and efforts of a number of individuals, organisations and companies (other than Microsoft) who have worked hard contributing to and improving this open standard over many years.)

ISO/IEC 29500 has four parts (ISO/IEC 29500-1, -2, -3, -4:2012) and allows for two states of implementation commonly known as Open XML "transitional" and Open XML "strict". These are not separate standards and are fully disclosed in the ISO/IEC 29500 documentation.

Open XML standards are governed by ISO/IEC Joint Technical Committee 1, Sub-Committee 34, Working Group 4 (or ISO/IEC JTC1/SC34/WG4)
 
Microsoft Office 2007 supports ODF v1.1 and Open XML "transitional"
Microsoft Office 2010 supports ODF v1.1 and Open XML "transitional"
Microsoft Office 2013 (and Office 365) supports ODF v1.2 and Open XML "transitional" and Open XML "strict"
 
The current versions of LibreOffice (4.2) and Apache OpenOffice (4.0.1) support ODF v1.2 and Open XML "transitional", a fact which would seem to address suggestions that this (or any) part of the Open XML standard is not “open” or that this part is only implemented by Microsoft.
 
Older versions of OpenOffice and LibreOffice support ODF v1.1 and Open XML “transitional”.

PDF v1.7 is a superset of PDF standards described collectively in ISO 32000-1:2008, which includes related standards like ISO 19005-1:2005 (or PDF/A). PDF v2.0 is in development by ISO Technical Committee 171, Sub-Committee 2 (or ISO/TC 171/SC 2), and is currently at the Committee Draft stage (ISO CD 32000-2:2013).

Microsoft Office support for PDF is explained in the following URL: http://blogs.msdn.com/b/officeinteroperability/archive/2013/04/04/micros....
 
Microsoft has published its legacy binary document formats under its Open Specification Promise, making them available free of charge to anyone who chooses to implement them. This means that while some may still do so, nobody actually needs to "reverse engineer" these formats, as their specifications are freely available.
 
Microsoft Office is the most popular office productivity suite both globally (with over one billion users) and in the UK.

DATA ON THE USE OF POPULAR FORMATS

Searching on http://www.google.co.uk for the frequency of different document formats in different Internet domains revealed (on 5th February 2014) the following:
 
In all domains there are more .doc, .xls and .ppt files than any other editable format, which is unsurprising given that these formats have been in the market for over twenty years.
 
In the gov.uk domain there are 182 times as many DOCX files as ODT files, 12 times as many XLSX spreadsheets as ODS spreadsheets and 34 times as many PPTX presentations as ODP presentations.
 
In the nhs.uk domain there are 13,000 times as many DOCX files as ODT files, 1,910 times as many XLSX spreadsheets as ODS spreadsheets and 1,620 times as many PPTX presentations as ODP presentations.
 
In the org.uk domain there are 124 times as many DOCX files as ODT files, 61 times as many XLSX spreadsheets as ODS spreadsheets and 129 times as many PPTX presentations as ODP presentations.
 
In the co.uk domain there are 133 times as many DOCX files as ODT files, 50 times as many XLSX spreadsheets as ODS spreadsheets and 108 times as many PPTX presentations as ODP presentations.

And in the ac.uk domain there are over 200 times as many DOCX files as ODT files, over 40 times as many XLSX spreadsheets as ODS spreadsheets and over 30 times as many PPTX presentations as ODP presentations.

The table below shows the number of items found by Google’s search engine in each domain for each file type. File types are grouped into those related to textual documents (PDF, DOC, DOCX, ODT, TXT, RTF), those related to spreadsheets (XLS, XLSX, ODS, CSV) and those related to presentation graphics uses (PPT, PPTX, ODP).

(Table of data)

Type      gov.uk      nhs.uk      org.uk       co.uk       ac.uk

PDF   12,400,000   2,980,000  14,500,000   9,890,000   9,130,000
DOC    2,420,000     698,000   4,130,000   4,340,000   4,060,000
DOCX      43,400      13,000     261,000     456,000     413,000
ODT          238           1       2,100       3,440       1,970
TXT       41,800       5,260   3,180,000   2,370,000  13,700,000
RTF      127,000       2,220     109,000     335,000     148,000

XLS      561,000      28,300     191,000      76,400     235,000
XLSX       9,640       1,910      18,700      38,800      29,300
ODS          780           1         307         774         704
CSV       37,900       1,140      35,700      44,000     115,000

PPT       29,300      12,500     163,000     184,000      41,000
PPTX       1,690       1,620      16,300      27,800      41,500
ODP           50           1         126         258       1,340
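
For readers who wish to check the ratios quoted above against the counts in the table, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the counts are transcribed from the table, and the names COUNTS, PAIRS and format_ratios are our own and not part of any tool used to gather the data.

```python
# Counts transcribed from the table above (Google results, 5th February 2014).
COUNTS = {
    "gov.uk": {"DOCX": 43_400, "ODT": 238, "XLSX": 9_640, "ODS": 780, "PPTX": 1_690, "ODP": 50},
    "nhs.uk": {"DOCX": 13_000, "ODT": 1, "XLSX": 1_910, "ODS": 1, "PPTX": 1_620, "ODP": 1},
    "org.uk": {"DOCX": 261_000, "ODT": 2_100, "XLSX": 18_700, "ODS": 307, "PPTX": 16_300, "ODP": 126},
    "co.uk": {"DOCX": 456_000, "ODT": 3_440, "XLSX": 38_800, "ODS": 774, "PPTX": 27_800, "ODP": 258},
    "ac.uk": {"DOCX": 413_000, "ODT": 1_970, "XLSX": 29_300, "ODS": 704, "PPTX": 41_500, "ODP": 1_340},
}

# Each Open XML file type paired with its ODF counterpart.
PAIRS = [("DOCX", "ODT"), ("XLSX", "ODS"), ("PPTX", "ODP")]


def format_ratios(counts):
    """Return {domain: {"DOCX/ODT": ratio, ...}} for each Open XML / ODF pair."""
    return {
        domain: {f"{ooxml}/{odf}": c[ooxml] / c[odf] for ooxml, odf in PAIRS}
        for domain, c in counts.items()
    }


if __name__ == "__main__":
    for domain, ratios in format_ratios(COUNTS).items():
        summary = ", ".join(f"{pair} = {value:,.1f}" for pair, value in ratios.items())
        print(f"{domain}: {summary}")
```

Because search engine counts change from hour to hour, re-running the searches will give different inputs and therefore slightly different ratios.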
 
Similar results are produced by Microsoft Bing, but we have used Google to avoid any suggestion of a conflict of interest.

These results (dynamic and “live” as they are) illustrate (at this point in time) that the most popular current “revisable” format used by people across all five domains for each use type is Open XML.

We would add the following notes:

1. Strictly speaking, PDF is also a “revisable” format, but it is most often used to create documents for viewing. No-cost viewers are available for almost every device and platform allowing a PDF document to be viewed (read) and in some cases even allowing the reader to bookmark or annotate their copy of the document with comments. In general, free PDF readers do not allow the user any ability to modify the original content received.

2. According to the results above, Microsoft’s legacy binary formats actually remain the most popular “revisable” formats in use. Re-running the search above after even just a few hours showed that the number of items found in these old formats had increased. However, our focus here should be on “current” formats, i.e. those specified in an up-to-date, modern and international “open” standard. And while Microsoft has made these legacy binary format specifications available at no cost under our Open Specification Promise (one could describe them as “de facto open” formats), they are not the result of an open standardisation process, nor are they based on an extensible mark-up language (XML) schema in the way that both Open XML and ODF are.

Even before Microsoft made its legacy binary formats freely available under our Open Specification Promise (http://msdn.microsoft.com/en-us/library/gg615407(v=office.14).aspx), Microsoft's legacy binary file formats had become (and seem to remain) the most widely used and most widely implemented revisable document formats. Since publication in 2011, developers of other office productivity suites need not "reverse engineer" these formats. These formats still enjoy the widest range of use in both on-line services (some at no cost) and in applications which can be installed locally on a variety of devices (some at no cost). Such applications are all licensed in one way or another, whether use of the license is paid for or not and whether their source code is openly available or not, giving people tremendous choice in how they create and consume a variety of content.

APPLICATION SUPPORT FOR DIFFERENT FORMATS

There appears to be no authoritative source for the comparison of personal productivity applications and their capabilities, but perhaps, even as problematic as it might be, this reference on the ubiquitous Wikipedia (http://en.wikipedia.org/wiki/Comparison_of_office_suites) offers some indicative comparisons.

Following historical use of Microsoft’s legacy binary formats, which appears to keep them as the de facto first choice for applications seeking the widest compatibility, it seems that the second most widely used format (or format family) across such applications is (arguably) Open XML, with ODF a close third. Several of these applications are notable for their "genealogy", particularly amongst the open source products, where several applications now exist because of earlier "forks" in the code. Some of the branches have died out (e.g. KOffice), while others persist (e.g. forks of the now defunct Sun StarOffice exist now as Apache OpenOffice and LibreOffice). Some proprietary products listed on the reference above have left the market (e.g. Microsoft Works, which Microsoft discontinued in late 2009 and Lotus SmartSuite, which IBM ceased to market in June 2013 and for which all support will cease in September 2014), while others continue (e.g. Corel WordPerfect Suite). While the contents of the reference are not perfect (or up-to-date), they do show some broad trends.
 
For most of these applications, there is a continuing and building theme to support both Open XML and ODF. The most recent versions of OpenOffice and LibreOffice both claim additional development and interoperability testing for "round trip" Open XML support.
 
It is also notable that some on-line services have recently chosen to focus only on Open XML, abandoning ODF support (see https://productforums.google.com/forum/#!topic/docs/S6IkdnuH91E and http://www.muktware.com/2012/10/why-is-google-not-supporting-the-open-do...). And we note also that a number of free-to-use and paid-for applications (or “apps” as they’ve become known in many tablet or phone on-line stores) that support a range of mobile device platforms do not support ODF and have chosen to implement Open XML because of its broad popularity.
 
This would indicate that the market is still highly active and that support for these formats continues to evolve. Things do change and it is difficult to predict either how or when.

THE DYNAMIC NATURE OF STANDARDS

To illustrate this point, we take a closer look in this section at a format which is widely used now, is not yet “standardised” (as in approved and maintained by an appropriate Standards Development Organisation or SDO), and may become popular in the future. This is both the nature of the market and the process of establishing open standards. The lesson would seem to be that it is wiser to avoid limiting oneself to a narrow selection of standards.
 
It then follows that the safest way forward for any organisation seeking to ensure it has access to the maximum range of interoperability, the richest range of functionality and the widest common use of formats is to embrace multiple document formats.

Specifically, we would again recommend ODF (ISO/IEC 26300-1:2006 or OASIS ODF v1.1, with a path to accommodate the use of OASIS ODF v1.2 when ratified as an ISO/IEC standard), Open XML (ECMA-376, ISO/IEC 29500-1:2012 "transitional" and "strict") and PDF (ISO 32000 and its associated standards like ISO 19005 or PDF/A).
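
As an illustration of what accepting several formats can involve in practice, the following is a minimal sketch, in Python, of how a service receiving a file might tell these three format families apart. It relies only on well-known packaging conventions: a PDF file begins with the bytes "%PDF-", an ODF package is a ZIP archive carrying a "mimetype" entry, and an Open XML package is a ZIP archive carrying a "[Content_Types].xml" entry. The function name identify_format is hypothetical, and the check is a heuristic rather than a test of conformance to any of the standards named above.

```python
import io
import zipfile


def identify_format(data: bytes) -> str:
    """Best-effort guess at the format family of a received document.

    Hypothetical helper for illustration only: it inspects packaging
    conventions and does not validate the file against any standard.
    """
    # PDF files start with a "%PDF-" header.
    if data.startswith(b"%PDF-"):
        return "PDF"

    # ODF and Open XML documents are both ZIP packages.
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return "unknown"

    with zipfile.ZipFile(io.BytesIO(data)) as package:
        names = set(package.namelist())

        # ODF packages carry a "mimetype" entry naming the document type.
        if "mimetype" in names:
            mimetype = package.read("mimetype").decode("ascii", "replace").strip()
            if mimetype.startswith("application/vnd.oasis.opendocument."):
                return "ODF"

        # Open XML packages list their parts in "[Content_Types].xml".
        # Other packages built on the same conventions would also match here.
        if "[Content_Types].xml" in names:
            return "Open XML"

    return "unknown"
```

A sketch like this does not distinguish Open XML "transitional" from "strict", nor ODF v1.1 from v1.2; doing so would require reading the XML content of the package.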

In addition, we also concur with the proposal that many use cases can be best accommodated using a web-based HTML form, with W3C’s HTML v5 specification being the most up-to-date (though not yet finalized) version of that standard.
 
User adoption and application support then offers the prospect of the lowest operational cost in the long term and the greatest ability to provide or receive documents in a format that the citizen may prefer.
 
Contrary to the assertions of a number of commentators, both Open XML and ODF remain vibrant and live standards in active development. Nowhere is this revealed better than in the latest Business Plan from the ISO/IEC JTC1/SC34 ("SC34"), of which WG4 covers the development of ISO/IEC 29500 (Open XML) and WG6 covers the development of ISO/IEC 26300 (ODF).

(The latest SC34 Business Plan can be found at http://isotc.iso.org/livelink/livelink?func=ll&objId=15737505&objAction=...)
 
SC34's Business Plan for the Period October 2013 to September 2014 clearly details the close and cooperative working relationships between OASIS and WG6 for ODF and between ECMA and WG4 for Open XML. Achievements listed for the prior period (up to September 2013) include (quoting from the plan):
 
⦁ "SC 34 has been actively maintaining ISO/IEC 29500. WG4 has started a project for revising ISO/IEC 29500-3 in reply to far-reaching defect reports, and will publish a working draft in the very near future."

⦁ "WG 6 experts have assisted the OASIS ODF Technical Committee in work to consolidate the technical alignment of ISO/IEC 26300:2006 with Amendment 1 and ODF v1.1. Over the past year some further minor defects have been discovered in ODF v1.1, OASIS has published corrections to their own specification and work has now started on correcting ISO/IEC 26300 to align with these corrections."
 
Both statements show that ODF and Open XML are being actively maintained and improved by these communities.
 
Indeed, the Business Plan also notes that "SC34 looks forward to the anticipated PAS submission of ODF 1.2 during the next period", indicating that SC34 fully expects ODF v1.2 to update the ISO/IEC 26300 standard, and for it to begin its journey through the ISO/IEC acceptance process prior to September 2014. Microsoft expects this process to begin either late in the first quarter or early in the second quarter of 2014.

New standards are always being developed and sometimes these appear to offer similar functionality to existing standards, which themselves may also continue to be maintained and developed. A number of comments in the SC34 Business Plan relate to the standardisation of the specification of EPUB (in collaboration with the International Digital Publishing Forum or IDPF). EPUB is an XML-based digital publishing standard for e-books. If nothing else, consideration of this standard demonstrates that the ecosystem and enthusiasm for new document formats is healthy and vibrant, with this standard perhaps becoming a potential candidate for "viewing government documents" at some future time - a possibility that would place EPUB seemingly in direct competition with PDF.
 
Yet this illustrates an important conclusion we seek to emphasise here - that international standards can and do happily co-exist even where they seemingly compete. While those that do appear to compete cover many of the same "use cases" for similar groups of users, each also often exhibits unique features or user benefits that are both valuable and valid to retain.
 
At some future time the UK government may ask people to consider (or, for that matter, the people may ask the government to consider) the adoption of the EPUB standard for viewing government documents. At that time it will have to ask whether to do so would invalidate or deprecate the need for PDF. The answer should be a resounding "no". EPUB and PDF may offer greater benefits together to users (citizens and civil servants) for no significant or measurable increase in operating costs for the government. Applications that the government might then choose may (or should) be able to cope with both formats covering each format's shared and unique features.
 
The same is true right now and in a completely analogous way for document formats for sharing and collaboration. ODF and Open XML both have their place and their preferred usage - and that is a preference both of users (having chosen that format) and uses (being capable in some different and valuable ways).

IMPACT ON THE WIDER ECONOMY

We believe that including Open XML, along with ODF and HTML, in the standards for sharing and collaborating with government documents will allow the government to achieve all the objectives it has set out when setting this Challenge. Indeed, we would say that it will be better able to deliver what users need if Open XML is included, because it widens choice and interoperability opportunities across all IT systems, not merely the choice of a small number of specific productivity applications. Consideration of this latter outcome demands that before a final decision is taken, the government should appraise, and share its appraisal of, the impact on the wider IT economy of making the currently proposed narrow selection of ODF only.

We support and applaud the government’s G-Cloud programme and the progress it has made in opening up government IT business to small and medium sized enterprises (SMEs). Microsoft’s Partner ecosystem comprises over 35,000 partner organisations in the UK (and over half a million partners worldwide), employing over 250,000 people in the UK alone. Our partners tell us they are encouraged and excited by the progress the government is making in opening its markets and making them more accessible to them. The overwhelming majority of these partners are SMEs.

A significant proportion of these partners build or develop applications that run on or work with our products and services. The role of our support for open standards across our products, platforms and services is a factor that allows our partners to create interoperable products and services they can then sell to their customers across the UK and further afield – including into the Public Sector.

While Microsoft Office supports a wide range of open standards, including Open XML and ODF, our partners have a choice of which open standards they use in their products.

By making such a constrained choice, as in the current proposal, the government risks alienating many UK SME businesses that have based their products, quite freely and reasonably, on open standards like Open XML.

Open XML is not merely a standard for word processors, with documents only ever created and read in one or other word processor. It can be used by applications to assemble and communicate complex information which will eventually be read by people, but not necessarily by people using a word processor. This is a decision whose impact is much wider than how many different word processors the government has to choose from.
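
To make this point concrete, the following minimal sketch, in Python, shows a program assembling a small Open XML word-processing package directly, without any word processor being involved. It is an illustration only: the function name build_docx and the sample text are hypothetical, the sketch uses the namespaces commonly associated with Open XML "transitional", and the output has not been validated against the schemas in ISO/IEC 29500.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Namespace for the main word-processing part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Declares the content type of each part in the package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Points from the package root to the main document part.
ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)


def build_docx(paragraphs):
    """Return the bytes of a minimal package with one paragraph per string."""
    body = "".join(
        # escape() protects characters such as "&" and "<" in the text.
        f'<w:p><w:r><w:t xml:space="preserve">{escape(text)}</w:t></w:r></w:p>'
        for text in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample content, written to a local file.
    with open("example.docx", "wb") as handle:
        handle.write(build_docx(["First paragraph.", "Second paragraph."]))
```

The same approach applies to spreadsheet and presentation packages, which use different parts inside the same kind of ZIP container.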

We are surprised that throughout this consultation no mention or assessment has been made (other than by Microsoft) of the financial impact of choosing a narrow range of document format standards. We believe such an assessment, taking into account the wider impact on the UK IT ecosystem – particularly the SME IT suppliers in the UK (who may not be able to afford the investment needed to change their product to make it attractive to government) – should be thoroughly considered, through an open process and made public before any final choice is made.
 

CONCLUSIONS

We have shown here that Open XML is widely used across the major domains of interest to the government, including the major domains of the UK Public Sector (gov.uk and nhs.uk), of UK business (co.uk) and of the third sector (org.uk).

We have also shown that both standards (Open XML and ODF) are vibrant and developing, and that they are both supported by a range of tools (applications, apps, programs or services) widely available to the government, to business and to citizens in a range of circumstances to suit user needs.

And we have also shown that it is the nature of standards to develop and occasionally supersede one another, but that for a time, multiple standards that appear to do the same thing can and do exist and thrive in parallel for many years.
 
On this basis, we urge the government to avoid a costly and unnecessary focus on too narrow a selection of standards. Having already recognised in its proposal that no single standard will be adequate for all its needs, we now urge the government to include Open XML alongside ODF and HTML in its list of standards for sharing and collaborating with government documents.
