Sharing or collaborating with government documents: proposal

Short description: 

Update: The time period for commenting on this proposal has been extended to 17:00 GMT on Friday 28 February.

Citizens, businesses and delivery partners, such as charities and voluntary groups, need to be able to interact with government officials, sharing and editing documents. Officials within government departments also need to work efficiently, sharing and collaborating with documents. Users must not have costs imposed upon them due to the format in which editable government information is shared or requested.

User need approach: 

Users in the context of this proposal include citizens, businesses and delivery partners who need to share information with government in editable formats. Users are also officials within government departments who need to share and work on information together.

As technology progresses, government’s production of editable information in formats traditionally associated with documents will become less important for users. 

Government services are being redesigned to make them more straightforward and easier to use by making them digital by default. This will diminish the use of traditional government document formatting even further as information is published or collected directly on the web.

This proposal recognises that changes in technology and service delivery will therefore mean that document formats become less important as collaborative editing and transactions increasingly become an online experience. However, documents formatted in office software are still prevalent amongst users of government information and the formats used by government should meet user needs.

Users need to:

  • Open, edit and save information online and offline
  • Submit information in response to a request, to perform a transaction or to access a service
  • Share information with specific people
  • Publish information online so that a wide audience can access and work with it
  • Edit information and be confident that it remains usable and editable when saved and shared with other users
  • Create a new document with the same style as documents previously created
  • Export the documents created in a non-editable format so that they can share a document as they intend it to be presented
  • Export the documents they create in a format compatible with other software so that other people can use the information
  • Share information so that they can gather feedback
  • Share information so that they can respond to a request for information
  • View/edit the information shared with them so that they can read/act upon the content
  • Provide input on information created by someone else
  • Copy and paste content from one source to another so that they can quickly collate pieces of information in one place
  • Edit information created by an integrated system they work with so that they can add additional information
  • Gather feedback on information they have drafted so that they can apply other people’s recommendations to the content
  • See version updates so that they can be sure they’re working on the latest version of a document
  • Access information from any appropriate place so that they can get on with their work
  • Keep their devices from being clogged up with downloads
  • Ensure integrity of specific documents, e.g. audit trail for editing, versioning
  • Use the information on the device and platform of their choice, for example laptop, tablet or smartphone
  • Be able to use accessibility tools with information in online and offline formats

Achieving the expected benefits: 

  • Users are able to efficiently share and work on editable government information
  • Users are not required to buy new software to submit or work with government information
  • Users are able to re-use data and text, where licences permit

Functional needs: 

The format should support:

  • Characters associated with Unicode 6.2 for text based file formats (in accordance with the standards profile for cross-platform character encoding)
  • Digital continuity - having implementations that enable support for import of older formats
  • Use of metadata
  • Imports and exports to/from other applications
  • Fonts and graphics that are reusable in other formats
  • Creation of templates

Citizens, businesses and delivery partners must be able to interact with government officials and services, or those working on behalf of government, sharing appropriately formatted, editable information.

Users should be able to work on their device of choice and must not have costs imposed upon them due to the document format in which government information is provided or requested.

Documents should be editable on different devices without loss of integrity - the information should not become spoiled. Documents in this context include:

  • Word processed text
  • Spreadsheets
  • Presentations

When dealing with citizens, information should be digital by default and therefore should be published online. Browser-based editing is the preferred option for collaborating on published government information. HTML (4.01 or higher e.g. HTML5) is therefore the default format for browser-based editable text. Other document formats specified in this proposal - ODF 1.1 (or higher e.g. ODF 1.2), plain text (TXT) or comma separated values (CSV) - should be provided in addition. ODF includes filename extensions such as .odt for text, .ods for spreadsheets and .odp for presentations.

For statistical or numerical information, CSV is the required format, preferably with a preview provided in HTML (4.01 or higher e.g. HTML5).
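As an illustration only, an HTML preview of a CSV file could be produced along the following lines. This is a minimal sketch, not part of the proposal: the function name `csv_to_html_preview` and the `max_rows` limit are hypothetical, and it assumes the first CSV row is a header.

```python
import csv
import html
import io


def csv_to_html_preview(csv_text: str, max_rows: int = 10) -> str:
    """Render the header and first rows of a CSV document as a plain HTML table.

    Hypothetical helper: assumes the first row is a header row.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:max_rows + 1]
    parts = ["<table>", "<tr>"]
    # Escape every cell so that characters such as & and < stay valid HTML.
    parts += [f"<th>{html.escape(cell)}</th>" for cell in header]
    parts.append("</tr>")
    for row in body:
        parts.append("<tr>")
        parts += [f"<td>{html.escape(cell)}</td>" for cell in row]
        parts.append("</tr>")
    parts.append("</table>")
    return "".join(parts)


if __name__ == "__main__":
    print(csv_to_html_preview("name,value\nA & B,1\n"))
```

The CSV file remains the authoritative data; the table is only a convenience for readers who want to inspect it in a browser before downloading.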

Forms and information exchanges should be digital by default where this is enabled, therefore use of office formats should not be encouraged for the completion of forms.

For information being collaborated on between departments, browser-based editing is preferable but often not currently available. Therefore, information should be shared in ODF (version 1.1 or higher e.g. ODF 1.2). The default format for saving government documents must be one of the formats described in this proposal.

To avoid lock-in to a particular provider, it must be possible for documents being created or worked on in a cloud environment to be exported in at least one of the editable document formats proposed.

Information that is newly created or edited should be saved in one of the formats described in this proposal. There is no requirement to transfer existing information, unless it is newly requested by a user and shared.

Other steps to achieving interoperability: 

  • A government body must not refuse to accept or supply a document in at least one of the open formats described in this proposal 
  • Documents may be shared in other formats but only in response to a specific request from a user
  • Existing documents should be migrated to the formats specified in this proposal if they are re-opened for editing or are requested by a user
  • Government bodies should avoid bespoke implementations which may limit their ability to migrate information or to share it with other users
  • Macros should be avoided wherever possible, particularly when sharing documents
  • Government officials should engage with interoperability testing initiatives for document formats
  • Government officials should engage with standards bodies associated with the maintenance of standards that are agreed for document formats for use in government

This proposal, if agreed, would apply to information produced by or on behalf of central government departments, their agencies, non-departmental public bodies (NDPBs) and any other bodies for which they are responsible. These government bodies would need implementation advice to give clarity about when to use particular formats, the user needs they meet and the interoperability that can be expected.

A document metadata profile is outside the scope of this proposal, although this may be the subject of other challenges taken through the Standards Hub process.

Assessment of tools that can be used for providing multiple formats from a single standardised format is also outside the scope of this standards challenge.

Phase: 

Proposal

Comments

I agree with this approach. ODF is the best option available to provide device independence. Using online ODF tools (e.g. Google Docs) gives the ability to create in this format or convert to this format from almost all devices, not just desktop PCs.

I would suggest that use of Open Fonts should also be included in the proposal. A large number of format issues when converting document types (even between proprietary formats such as Microsoft Word for Windows and for OS/X) are down to different fonts being used for display to those used at creation, leading to font replacements. A freely downloadable set of cross-platform fonts would help resolve these problems. Careful selection of fonts with identical font metrics would also assist the conversion and replacement process.

I totally support the proposals and ask that they be given more publicity.

I have minor niggles: I'd prefer TSV to CSV (or consider allowing both), and I'd also like some clarification of the end-of-line character issue in TXT files. More thought needs to be given to the issue of character encoding in text-based data and how this might be indicated to the user. It is within the bounds of possibility that new characters (such as the euro symbol) will come into widespread use in the future, and thought should be given to the implications of such changes for the long-term readability of documents.
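To make the three concerns above concrete (delimiter, end-of-line characters, character encoding), here is a minimal sketch of a reader that treats each of them as an explicit choice. The function name `read_delimited` and its defaults are assumptions for illustration; nothing in the proposal prescribes them.

```python
import csv
import io


def read_delimited(data: bytes, delimiter: str = ",", encoding: str = "utf-8"):
    """Decode bytes with an explicit encoding and parse them as delimited rows.

    Hypothetical helper: pass delimiter="\t" for TSV instead of CSV.
    """
    text = data.decode(encoding)
    # newline="" leaves line endings untouched so the csv module can
    # recognise \r\n, \n and \r itself instead of relying on the platform.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


if __name__ == "__main__":
    # CSV with Windows line endings and a euro sign encoded as UTF-8.
    print(read_delimited("item,price\r\ntea,\u20ac2\r\n".encode("utf-8")))
    # The same kind of data as TSV with Unix line endings.
    print(read_delimited(b"item\tprice\ntea\t2\n", delimiter="\t"))
```

The point of the sketch is that the encoding cannot be guessed reliably from the bytes alone, so it has to be stated somewhere, which is the clarification being asked for.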

On the general issue of open standards I'd refer people to the dawn of the Internet when there were rival standards - ISO supported by government bodies, difficult to understand, difficult and expensive to obtain and the IETF standards, free, widely available in plain English. The then US government took the attitude that since the US taxpayer had already funded the development of early implementations, they should be free and freely available to all. The rest, as they say, is history.

I really welcome the proposal to use a clearly defined open standard. 

ODT ensures interoperability, and future legibility in a way OOXML sadly cannot do in its current format. ODT has the added benefit of removing the requirement to purchase proprietary software, which has the potential to save enormous sums of money.

Similarly, PDF has issues when using proprietary addons (by Adobe), so sticking to the "normal" formatting (is this part of the ISO?) is ideal.

There are issues with CSV which have been raised elsewhere; a clearer definition will be required.

Hi all,

This Opinion can be downloaded from the following web page:
http://www.jukkarannila.fi/lausunnot.html#nro_47

There is a PDF file in that web page.

I have copied here the text from the text file (ODF).

1. Some background

This opinion is about the following standards:

1) ODF 1.1 - ISO/IEC 26300: 2006/Amd 1: 2012 Open Document Format for Office Applications (OpenDocument) v1.1
2) ODF 1.2 - Open Document Format for Office Applications (OpenDocument) Version 1.2

This opinion will EXCLUDE discussion of the following standards:

3) HTML 4.01 - ISO/IEC 15445:2000 Information technology - Document description and processing languages - HyperText Markup Language (HTML)
4) HTML5

However, we cannot discuss ODF without some consideration of the following:

5) Standard ECMA-376: Office Open XML File Formats (OOXML)
6) ISO/IEC 29500 – standards series, based on ECMA-376

The following web pages should be consulted when discussing ODF / OOXML:

1-2)
Technical Committee
OASIS Open Document Format for Office Applications (OpenDocument) TC
https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office
5-6)
Standard ECMA-376: Office Open XML File Formats
http://www.ecma-international.org/publications/standards/Ecma-376.htm
7)
Freely Available Standards – ISO – ISO - International Organization for Standardization
http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html

All relevant standards are listed (7) on the ISO web page.

2. Amount of the documents and quality of the documents (ODF and OOXML)

From the ISO web page (7) we can download the following documents related to the 26300 series:
ISO/IEC 26300:2006
ISO/IEC 26300:2006/Amd 1:2012
ISO/IEC 26300:2006/Cor.1:2010
ISO/IEC 26300:2006/Cor.2:2011

In short: there is the base standard, one amendment and two corrigenda. Now we can add the number of pages in these documents:

728 pages: ISO/IEC 26300:2006
108 pages: ISO/IEC 26300:2006/Amd 1:2012
10 pages: ISO/IEC 26300:2006/Cor.1:2010
13 pages: ISO/IEC 26300:2006/Cor.2:2011

All together 859 pages – the 26300 series

From the ISO web page (7) we can download the following documents related to the 29500 series:

5030 pages: ISO/IEC 29500-1:2012
138 pages: ISO/IEC 29500-2:2012
46 pages: ISO/IEC 29500-3:2012
1550 pages: ISO/IEC 29500-4:2012

All together 6764 pages – the 29500 series
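The two totals can be checked with a few lines of arithmetic. The page counts below are copied from the lists above; the variable names are only illustrative.

```python
# Page counts as listed above for each document in the two series.
odf_pages = {
    "ISO/IEC 26300:2006": 728,
    "ISO/IEC 26300:2006/Amd 1:2012": 108,
    "ISO/IEC 26300:2006/Cor.1:2010": 10,
    "ISO/IEC 26300:2006/Cor.2:2011": 13,
}
ooxml_pages = {
    "ISO/IEC 29500-1:2012": 5030,
    "ISO/IEC 29500-2:2012": 138,
    "ISO/IEC 29500-3:2012": 46,
    "ISO/IEC 29500-4:2012": 1550,
}

odf_total = sum(odf_pages.values())      # 859
ooxml_total = sum(ooxml_pages.values())  # 6764

if __name__ == "__main__":
    print(f"26300 series: {odf_total} pages")
    print(f"29500 series: {ooxml_total} pages")
    print(f"ratio: {ooxml_total / odf_total:.1f}")
```

On these figures the 29500 series is nearly eight times the length of the 26300 series, before the electronic inserts are counted.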

However, the ISO web page (7) also contains electronic inserts for the 29500 series, and those inserts contain hundreds of different documents; altogether those electronic inserts are 6.64 MB.

3. Amount of the documents and quality of the documents should be manageable!!

As we can see, the quantity and quality of the documents vary in those two standards (ODF and OOXML).

Those two standards (ODF and OOXML) are meant for the same functionality: sharing or collaborating with (government) documents.

IF Cabinet Office decides something for OOXML, the quality and quantity for OOXML conformance is a serious issue; Is there enough market support for OOXML?

4. Conformance with OOXML (Office Open XML JTC 1/SC 34/WG4)

First we should consult the following web page:
http://www.jtc1sc34.org/wg4/ (Office Open XML JTC 1/SC 34/WG4)

This working group (WG 4) is dedicated to OOXML maintenance.

From the web page of WG 4 there is a link to the following web page:
http://www.29500sc34comments.org/
However, this link is not working. The missing web page should be about defect reports related to the 29500 standard series.

Therefore, we have to look up the defect reports indirectly from the search page:
http://lucia.itscj.ipsj.or.jp/itscj/servlets/ScmDoc10?Com_Id=w4
From this web page we can select “Defect reports”. There are fourteen (14) different “Defect reports” for OOXML.

The latest “Defect Report” is the document with number 0138.
http://kikaku.itscj.ipsj.or.jp/sc34/wg4/archive/sc34-wg4-2010-0138.zip
This latest “Defect Report” runs to 1018 pages covering 347 defects.

What am I actually saying? Conformance with OOXML means dealing with a long list of defect reports (hundreds of defects, in other words). It is unclear to me what the timetable is for dealing with ALL current defects and possible NEW defects.

If the Cabinet Office decides something about the OOXML conformance, the Cabinet Office has to be very clear about the current defect reports with the conformance.

Since the actual timetable for correcting ALL current defects in OOXML is unclear, this means that the Cabinet Office has to be very specific in requests for proposals, i.e. the actual version of OOXML and the actual defect reports, which affect the conformity of OOXML.

5. Standardisation efforts for OOXML and ODF (JTC 1/SC 34)

Personally, I attended JTC 1/SC 34 working group meetings (WGs 1, 4 and 5) in Helsinki (14-17 June 2010). I have written an opinion about the meeting
http://www.jukkarannila.fi/lausunnot.html#nro_24

Both ODF and OOXML have their own problems: that is my conclusion from the meeting(s).

Personally, I made the conclusion in June 2010 that the ultimate winner of the ODF and OOXML standardisation efforts will be PDF (Portable Document Format).

On 25 February 2014 I can conclude that PDF is still the ultimate winner (situation from June 2010 to February 2014).

The practical reality is that PDF has gained so much support that it is a de facto and partly de jure standard for viewing (government) documents.

PDF can handle the situation with non-editable documents, and therefore PDF should be endorsed in the first phase.

6. Selecting internal document format for internal usage

Based on the previously highlighted problems, I have made the conclusion that ODF has FEWER problems than OOXML.

ODF is NOT a perfect standard, but it has several advantages:
1) the page amount is manageable (859 vs. 6764 pages)
2) the number of defect reports is manageable when using ODF
3) it should be easier to conform to ODF – fewer pages and fewer defect reports.

7. Selecting ODF for internal usage and external usage (Cabinet Office)?

The practical reality in this case (standards endorsed by the Cabinet Office) is that the Cabinet Office has to be in touch with innumerable stakeholders in the near and distant future. Therefore, the Cabinet Office using an internal document format means that some internal documents will ultimately be distributed outside.

As said before, PDF can handle the situation with non-editable documents.

Based on these two main dimensions, i.e. number of pages and number of defects, I have to conclude that ODF has more advantages when compared to OOXML.

However, I have to reiterate that ODF is not perfect. PDF is still the winner.

8. Creating possible test suite for ODF conformance

Since ODF is not perfect, the Cabinet Office can use an existing test suite for ODF conformance or develop their own test suite of ODF conformance.

This proposed test suite of ODF should take care of reported defects in ODF.
This proposed test suite should take care of specific needs for the Cabinet Office usage.

With this test suite for ODF, different stakeholders can conform their products to the specific needs of the Cabinet Office.

Creating or selecting a specific test suite for ODF conformance means that in public procurement there are fair requirements for different vendors, since the test suite is crafted for Cabinet Office usage.

9. Instructing stakeholders to use ODF format

The practical reality is, that the Cabinet Office will receive documents in several forms, e.g. RTF, DOC, TXT, ODF and OOXML. Therefore, the Cabinet Office can convert those documents to ODF in several cases. It can be concluded, that it will take years of educating different stakeholders to use ODF as the selected format for sharing or collaborating with government documents.
Therefore, the Cabinet Office must have a clear marketing/educating strategy for ODF usage.

10. Good luck!!

This opinion is quite limited, and hopefully other opinions will result in some constructive ideas for selecting standards for sharing or collaborating with government documents.

With Kind Regards,

Jukka S. Rannila
http://www.jukkarannila.fi/

jukka, hi,

although complex and detailed, your comments make a lot of sense and have great merit. in effect what you are saying is, if i may summarise, that if a particular standard is selected, then it is *essential* that Government Offices run the documents through "validators" *before* allowing the documents to be published (or transferred to other departments).

and, that, although ODF is not perfect, the so-called "OOXML standard" is so riven with defects and ambiguities that it is flat-out impossible to carry out any such validation.

i would be interested to hear, in each case, the efforts of individuals or organisations to have fixed the defects within the various standards proposed. i'm referring here to the actual *text* of the standard (not the implementation). take the following cycle:

* write a validator for ODF (or OOXML)
* run documents through the validator prior to public release
* discover a serious flaw within the *validator* (due to ambiguity or the standard ODF or OOXML being plain wrong)
* REPORT that flaw to the Standards Body responsible for that document standard
* start the clock
* count how long it takes for the flaw to be fixed in the standard.
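The first two steps of that cycle can be sketched in a few lines. This is emphatically not a validator, only a minimal package-level sanity check: it assumes the ODF packaging convention that the ZIP archive's first entry is an uncompressed file named `mimetype` holding the document's media type. The function name `looks_like_odf_package` is hypothetical, and a real validator would go on to check the XML content against the published schemas.

```python
import io
import zipfile

# Media types for ODF documents share this prefix,
# e.g. application/vnd.oasis.opendocument.text for .odt files.
ODF_MIME_PREFIX = "application/vnd.oasis.opendocument."


def looks_like_odf_package(data: bytes) -> bool:
    """Return True if data is a ZIP whose first entry is an uncompressed ODF 'mimetype' file."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as package:
            entries = package.infolist()
            if not entries or entries[0].filename != "mimetype":
                return False
            if entries[0].compress_type != zipfile.ZIP_STORED:
                return False
            mimetype = package.read("mimetype").decode("ascii", errors="replace")
    except zipfile.BadZipFile:
        return False
    return mimetype.startswith(ODF_MIME_PREFIX)


if __name__ == "__main__":
    # Build a tiny stand-in package in memory to exercise the check.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(zipfile.ZipInfo("mimetype"),
                         "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", "<placeholder/>")
    print(looks_like_odf_package(buffer.getvalue()))
    print(looks_like_odf_package(b"not a zip archive"))
```

Any disagreement between a check like this and a document that real office software opens without complaint is exactly the kind of finding that would start the clock described above.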

it would also be instructive to run a separate clock counting how long the *implementation* of that fix to the standard takes to reach the end-user. given that microsoft's proprietary software is... well.... proprietary, in some ways it is not necessary to even run the exercise, because automatically the cost to end-users (Citizens of the United Kingdom) is prohibitive when compared to the completely zero cost of Software Libre such as LibreOffice.

I don't think there is anywhere to post problems with OOXML. There is no OOXML bug-tracker service that is open and transparent for normal users to just post their concerns to. Similarly with MS Office. There are 3rd party forums but nothing centralised. So I don't think there is any way of seeing a list of problems reported by other people or of getting an idea of how many people are affected.

Each other office suite and program does seem to have their own bug-trackers and a system for forwarding problems in the format itself to a central place.  So it's fairly easy to grumble about ODF and to find a list of problems and how they have been handled. 

Regards from
Tom Davies

This is an excellent proposal.

I think an important point to consider is the huge growth in diverse internet-aware platforms that are now available; primarily phones and tablets. The key word here is 'diverse'; virtually none of these devices support any kind of standard that is not freely available (for example, OOXML - Microsoft Office will never be made to work on these devices). However, they either support ODF directly or free software is available that easily allows them to.

Basically, things have moved on considerably since the 'old' PC days, and not keeping up with this shift is not an option.

By not standardising on a single, freely available and easily accessible format, one is denying a broad, and very quickly widening, group of people access to information that, by its very nature, should and must be made easily available.

In addition, standardising on ODF would bring these advantages at no disadvantage to users of any existing tools. It would promote competition in the business, which has been stagnant for many years, and costs would reduce (to zero in many cases) for end-users inter-operating with government offices (ie - virtually everyone in the country!).

To those that make claims of huge expense resulting from training, I would ask "what training?". 99% of people who use a word processor on a daily basis have never had any formal training, and they get by quite nicely. As for any disruption to businesses, I think this would be trivial and 6 months after the switch everyone will wonder what the fuss was all about.

Finally, whatever you decide, please do not allow OOXML (or whatever new format MS develop at the time) to be included as an 'alternative'. There is simply no need, and to allow it will completely negate the whole proposal and will mean that many many people are still denied access to government publications for decades to come.

We very much welcome the proposal for mandating use of ODF for document formats. Since ODF is an open standard supported by several providers, including several well established Open Source projects, it helps to minimise the risk of different types of lock-in, and promotes interoperability between software from different providers.

It should be noted that Microsoft (MS) has presented claims concerning the extent to which different versions of their software (MSO) provide support for the (strict) standard ISO/IEC 29500 (hereafter referred to as OOXML). For both MSO 2010 and MSO 2013 it has been claimed that there is support for reading strict OOXML. For MSO 2013, it has been claimed that there is also support for writing strict OOXML. Further, it has been claimed that several versions of MS software (including MSO 2007, MSO 2010 and MSO 2013) provide support for reading older ".docx" files (transitional OOXML).

However, initial observations from our ongoing research suggest that actual support is not provided at a level that can be expected by any professional organisation. For example, when using MSO 2013 for saving documents in strict OOXML, our results indicate that actual support for reading such files in MSO 2010 and MSO 2013 is poor and certainly not in line with what can be expected. Further, when using MSO 2013 for reading old docx files, it also seems that such files cannot be saved in strict OOXML without problems, something which negatively impacts any practical usage scenario in which documents are exchanged between organisations.

Related to this, one may wonder why the default setting in MSO 2013 does not promote the creation of new documents in a standardised document format (ODF or OOXML), given that by default it uses the transitional version of OOXML (which is for legacy documents).

Further, some Open Source projects have ambitions to provide support for OOXML, but our results show that such attempts have not managed to achieve professional quality. Consequently, it is clear that such ambitions for achieving interoperability for OOXML have not been successful. In some cases, the software even crashes when interpreting the OOXML file.

Consequently, expectations from any professional organisation concerning actual support for interoperability and longevity of documents are not a reality when trying to use OOXML.

I would agree with the use of ODF as a way forward for standardising documents, rather than the Microsoft-based OOXML. Although many people have an issue with upgrading their software to use ODF, at least they can do that and convert at no cost. As to the cost of retraining on newer software: having provided assistance for people changing word processors, I have found there is less disruption moving from Office 2003 to one of the OpenOffice clones than from Office 2003 to 2007.

I have made a considerable amount of money writing software to convert from Microsoft Word formats, both doc and docx, to other open standards (mostly XML for data extraction), and I know how hard it is to read this format in a way which is completely compatible with the layout within MS Word, as a lot of the standard is assumed rather than implied. I found it actually easier working with doc, as it was smaller.

When mandating formats, although some effort is to be put into backward compatibility, the number of documents produced increases year on year, so it is more important to establish something now to provide for documents yet to be made.

ODF is designed to be a format to share data, not a format to keep the user using one proprietary package. I think what matters is that the data is available to all in a format that is truly compatible and if it is the case that more people have access to software that can produce OOXML (which I don't know if it is true), I don't see that is a reason to pick that format, as I am sure that if the format is mandated to be 'X format', many manufacturers of office tools will then support 'X format' very quickly. I personally don't mind what word processor someone uses, as long as I don't have to buy that to deal with them.

I would agree that tab separated values is probably easier to work with than comma separated values, although the technicalities of dealing with it were solved a long time ago, and software that deals with one generally deals with the other, so I don't see it as a major issue either way.

I support the definition of an open standard for interacting with the public. ODF & standard HTML, in this case, mean:

  • various relatives of mine with PCs from the Dark Ages will still be able to read & respond to online government information in the same way I can on a tablet.
  • there's a real chance that currently-produced documentation can still be easily read by the public in 20 years' time, including long-lived public records such as births/deaths/marriages, Land Registry, Government proposals etc.; both ODF & HTML specifications are open and implemented over many programs.
  • and whilst governments and students can get discounts, I don't have to go buy Office every few years, or pay annual subscriptions for Office365, in order to interwork with the government (there will be more and more available online).

There's lots of other work needed too - like getting sensible internet access across all of the UK - but there's no sense at all in forcing documents into a format that feeds the coffers of Microsoft, makes migration hard and can only be read as long as a commercial company deems it profitable to support.

I find it curious that the only formats discussed are dynamic (editing) formats. 

A great deal of collaboration occurs on final-format documents, typically PDF files. The above proposal does not address this use-case at all.

PDF includes an annotation mechanism very commonly used for collaboration purposes.

As no other option (in file format terms) is presently included for sharing or collaboration on static documents, I strongly suggest the addition of ISO 32000-1 (PDF 1.7) to the list of standards to be used.

NOTE: These opinions are my own and not representative of the ISO 32000 committee.

hi duff,

starting here http://standards.data.gov.uk/comment/569 i am one of the few people who has also raised this issue (that non-final-format documents - good word! - do not necessarily contain the right fonts for example). the problem is that there are now hundreds of comments on here, and the good ones are being lost in the noise.

but hey: the poor buggers who have to read them all, they'll get there - hi guys :) like... they *have* to read them all, so don't worry, it's been mentioned already, and sound and reasonable technical justifications given in some detail.

PDF is only necessary because people have been using OOXML and found it so unreliable that they need something to show how they meant the file to look.

It is also OOXML usage that has driven some companies and rich people to buy expensive PDF editors. This all increases the "digital divide". This proposal seems to be aiming to decrease the digital divide, so it needs to move away from the idea of expensive editors such as the ones for PDF.

Regards from 
Tom Davies

I was hoping that the means to readily compare revisions of documents would be part of the criteria for document format selection, but I can't see any mention of it.

The ability to readily compare changes between versions of documents via a visual diff is more than just useful; it can be of utmost importance in some cases.

If there isn't a range of cross-platform tools able to compare different versions of a document in a given format, I believe that format is not fit for purpose.
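For formats whose text content can be extracted as plain lines, a basic comparison needs very little tooling. The sketch below is an illustration under that assumption, not a recommendation of a particular tool: it diffs two revisions that have already been reduced to plain text, and the function name `diff_revisions` and the labels `revision-1` and `revision-2` are hypothetical. It does not compare layout or styling, which is where a true visual diff is harder.

```python
import difflib


def diff_revisions(old_text: str, new_text: str) -> list:
    """Return a unified diff (as a list of lines) between two text revisions."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="revision-1",
            tofile="revision-2",
            lineterm="",  # input lines carry no newline, so add none to headers
        )
    )


if __name__ == "__main__":
    before = "Documents must be editable.\nMacros should be avoided.\n"
    after = "Documents must be editable.\nMacros must be avoided.\n"
    for line in diff_revisions(before, after):
        print(line)
```

Lines starting with `-` come from the older revision and lines starting with `+` from the newer one; identical revisions produce an empty list.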

I agree with this proposal.

At home I use ODF with OpenOffice/LibreOffice and have done so for the past few years, where it has done everything I've ever needed it to do. Having given up on using any Windows OS many years ago due to their restrictiveness / vendor lock-in nature, my machines at home vary from Ubuntu Linux and OS X to Android, and it's great to be able to read any of my documents on any of these devices. Microsoft has an intrinsic desire to spread its proprietary-but-called-open file format as far and as wide as possible to aid vendor lock-in ( http://en.wikipedia.org/wiki/Vendor_lock-in ). If it is included in this standard, this will come back to bite us in some way that benefits the Microsoft of the future.

Having said that, I've no problem with Microsoft attempting to compete on a level playing field with their office suite, and they are more than capable of doing this by making their ODF plugin work well, but they seem to prefer the lazier option of laying down the vendor lock-in trap, which then means we would be at the mercy of a single vendor - never a good idea. And let's not forget that they have previous form for this type of bullying: http://en.wikipedia.org/wiki/Browser_wars

In summary, my

In summary, my recommendations for editable documents: 1) choose ODF from version 1.2 and later for editable documents, 2) contribute to the ODF standard, 3) store ODF in PDF when precise rendering is required, 4) use RDF in ODF to convey as much original information as is appropriate for the purpose at hand. For HTML, I recommend explicitly stating which versions of CSS and JavaScript are supported.

Thank you for opening up this process to the public. Collecting feedback on a policy like this is of immense value. Or as they say in open software development: with many eyes, all bugs are shallow.

Standardizing on a document format is a big step towards a more efficient organization. All participants are clear on what documents are supported. In an open standard like ODF, all stakeholders can have a say in how the standard evolves instead of having to rely on one party with an incentive different from the other stakeholders.

A bit of background: I am a member of the ODF technical committee, a contributor to Calligra, an office suite, and the initiator of WebODF, a JavaScript library for viewing and editing ODF documents. I've also written software for generating PDF, HTML and ODF documents from XML input for the Dutch government, which has chosen ODF as a national standard for editable documents. In this work as a volunteer and independent software developer, I've learned that ODF is the best choice to move the information society forward at this time.

There is a healthy group of independent implementations of ODF including Microsoft Office, LibreOffice and OpenOffice, Calligra, AbiWord, WebODF. (There is currently no comprehensive automated method for determining to what extent a computer program implements ODF. The same is true for any slightly complex document format for which the implementation is separate from the specification.) Just like with HTML, the choice to move to an open format will combine the efforts of all market parties to improve and align implementations of that format. Adoption of ODF by the UK government will boost adoption of the format by others, further decreasing document exchange overhead.

HTML and ODF are both needed to fulfil the many functions that are expected of documents. Ideally, one standard would suffice. Pragmatically, standards with many common parts are used. ODF and HTML do have many parts in common. CSS and the way styling is done in ODF both inherit from XSL-FO. The image formats JPEG, PNG and SVG can be used in both formats. And both formats support the RDF standards, which have been embraced by the United Kingdom for publishing open data. MathML formulas can be embedded in ODF as well as in HTML.

The overlap between ODF and HTML is seen most clearly in web-based tools for editing ODF. For example, WebODF can load an ODF document in a browser for editing while leaving the document completely intact. This allows editing of documents without converting the contents to a document representation internal to a specific program. As such, ODF is eminently suited for a web-based workflow.

Some important requested features that are available in ODF: font-embedding, templating, arbitrary metadata, imports and exports (this is a feature of the ODF software, not the format). In addition to that, an ODF document is a single file. This is different from most HTML documents which have e.g. graphics external to the document. Archiving and versioning is much simpler for documents that are single files.
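As a rough illustration of the single-file point: an ODF document is a ZIP package whose members (content, styles, embedded pictures and so on) travel together. The sketch below uses Python's standard `zipfile` module to build a tiny stand-in package in memory and list its members. The function names and the placeholder member contents are made up for the example, and the result is not a valid ODF document (it has no manifest or real content); it only mimics the packaging idea.

```python
import io
import zipfile

# MIME type used for ODF text documents (.odt).
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def build_stand_in_package():
    """Build a tiny ZIP package in memory that mimics ODF packaging.

    Hypothetical helper: the members are placeholders, not real ODF content.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # ODF packages carry an uncompressed "mimetype" member first.
        package.writestr("mimetype", ODT_MIMETYPE,
                         compress_type=zipfile.ZIP_STORED)
        package.writestr("content.xml", "<placeholder/>",
                         compress_type=zipfile.ZIP_DEFLATED)
        # Embedded graphics live inside the same file.
        package.writestr("Pictures/logo.png", b"not a real image",
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_members(data):
    """Return the member names stored in a ZIP package."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.namelist()


package_bytes = build_stand_in_package()
members = list_members(package_bytes)
print(members)
```

Because everything sits in one file, archiving or versioning the document means handling a single object rather than an HTML page plus its external images.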

I recommend using ODF 1.2 and up and skipping ODF 1.1.

Since version 1.2, ODF supports the semantic web (as RDF). This means that any kind of information can be stored losslessly in an ODF document. ODF has the ability to store entire databases as RDF in a document. Individual words, paragraphs and many other entities can be linked to this knowledge graph. This makes ODF 1.2 suitable for publishing open data. It also makes it possible to annotate the contents of ODF documents.

Since version 1.2, ODF standardizes the spreadsheet formulas with OpenFormula. The older version ODF 1.1 does not have standardized formulas and is not suitable as a standard for spreadsheets. Since the current proposal also mentions CSV, this may not be an issue.

ODF 1.2 has better support for styling of lists. ODF 1.2 introduces the attribute text:style-override on text:list-item. This allows different styling for individual items in a list. I have used this feature a lot when generating ODF documents from source materials that require such individual styling.

I fully support the standards

I fully support the standards proposed, and agree with those insisting on truly open standards, rather than proprietary ones.

I also agree that there should be some clarification regarding the use of CSV, as regards the particular type of quoting to be used. TSV seems an appealing alternative, although, bearing in mind most users are on Windows/Office, if TSV isn't associated with Excel by default, that may introduce an additional source of confusion, while CSV "just works"...

I agree with the principle of

I agree with the principle of digital first, and the need for standardisation of sharing and collaboration formats.  However, as with most standardisation considerations there are two aspects to consider - long term standardisation goal and cost of change. 

From a perspective of long term standardisation the imperative should be minimum variation, as this simplifies interchange and minimises the costs of interfaces.

From a perspective of cost of change, the amount of change to existing investments in knowledge, systems and interfaces should be minimised to that which is necessary to move towards the first goal. Otherwise, we will have implemented a new standardisation directive which unnecessarily introduces costs without sufficient benefit.

In this respect, I think that this standard proposal should be expanded to include currently widely used and likely long-lived formats. The browser-based editing and statistical interchange formats appear to hit this mark. However, for document interchange both ODF and OOXML are widely used formats that organisations have invested significant effort and cost into familiarising themselves with and tooling up to use. Restricting one of these at this stage is unnecessary and introduces a cost of change to a broad range of organisations without corresponding benefit.

My view would be to keep both. If you want to declare a long term direction of travel, e.g. migrating to HTML5 only, mandating a particular minimum version of ODF or OOXML, or restricting to one in future, then fine - but don't force everyone to change on day 1.

For government to be full

For government to be fully open, citizens need to have unrestricted access to the format in which documents are exchanged. A proprietary format is unacceptable. We MUST be able to independently verify a document's contents before transmission to government. How else can we know that no tracking information has been secretly included?

The only way we can be sure of anything is through an ability for independent verification. Any closed or complicated standard makes this practically impossible and should be avoided.

Document formats have been used to control market share. When everyone is using the same format then we can have open competition at the level of the software, not at the format of the document.

This has to not only reduce cost but also give citizens more choice.

I'm a strong supporter of the

I'm a strong supporter of the proposal to move to open ODF standards for all documents. Anyone can download a free Office package such as LibreOffice or OpenOffice to read, edit and create ODF documents and there is no vendor lock-in (unlike with .docx) and no citizen will ever HAVE to pay Microsoft to read official documents. Microsoft Office is not available for Android devices or iPads, or for the growing number of Linux users. The web is slowly moving towards fully open and agreed standards that are device independent, and government documents must do the same.

As a Linux and Android user only, I have no good way of accessing and creating proprietary formats such as Microsoft's OOXML .docx format. If we do adopt that, will Microsoft change it again? Also, documentation for this standard is very poor indeed, unlike the excellent documentation for the ODF format.

Please do implement this.

Having open document formats

Having open document formats is important.

It's not just commercially important to avoid vendor lock in.
Given that some government records must remain readable for 30, 50, 70 or more
years, it is important that as we move to digital archiving, documents remain as readable in the future as their paper counterparts are today. Consider the difficulty right now of reading computer file formats from
20 years ago - current software can't read most of them, and software from the
time (if available) won't run on current operating systems. Specs and
source code for that software are unavailable, or even lost, with the result that information is trapped.

So it's important to use open standards, so that knowledge of how to read
documents is available, to avoid vendor lock in, and being trapped by
(sometimes enforced) obsolescence of proprietary software.
Open standards don't prevent use of proprietary software.
No-one is arguing that (apart from fearful proprietary software companies).
But they prevent abuse of the situation by any single dominant (or would be
dominant) software vendor, which is vitally important to keep the free market
working.

I think that it's important to have one standard internally. The proposal is
clear that this will not be achieved overnight, and that there's no
requirement to convert unmodified existing documents to any new format,
which I think is sensible.

The proposal is also clear that "Documents may be shared in other formats but
only in response to a specific request from a user". I think that this
covers interoperability well. Adopting the proposal and having one preferred
file format is not going to mandate that everyone interacting with
government uses that preferred file format (and incurs cost in doing so),
because the situation will be no different from today - today not everyone can
read whichever current default format information is interchanged in, and
might need to request that their contact converts it for them into something
mutually exchangeable. There's no cost requirement today to upgrade to
whichever format any government department defaults to, and there will be no
change in the future if the government adopts a single standard to which
departments must conform.

There seems to be lobbying that ODF is the wrong choice. Specifically, the
claim is that using Open XML as well (or instead) would impose less cost on
government and third parties, because the existing most popular package
(Microsoft Office) can already communicate in this format.

Implied is that this format is "open standard". This is disingenuous, and
carefully omits important details, by failing to note the distinction between
Transitional Open XML and Strict Open XML, and thereby attempting to exploit
confusion between the two.

Transitional Open XML is not an open standard. Only Strict Open XML is.
However, only the most recent version of Office can write Strict Open XML
(the standard). So it's a myth that the government choosing "Open XML"
instead of (or as well as) ODF would save costs, because the software in
question (installed versions of Office) would still need to be updated to
write open standard Open XML. There would actually be a cost of complying
with the new directive because all non-current versions of Office would need
to be upgraded. Whereas the three most recent versions of Office are
capable of saving ODF. So suggesting "Open XML" actually costs more, and
attempts to cause confusion by making the current vendor lock-in solution
seem to be the official government solution.

And even Strict Open XML is not a good idea, because it is not a good
standard. It was rushed through ISO by dubious means (which were well
reported at the time), and hence not enough care was taken in its design and
specification. Ambiguities in specifications make them hard or impossible to
implement reliably. ODF was not rushed. Implementations of it will be more
robust, and in particular different implementations are more likely to
interwork correctly (ie not misinterpret, misformat or mangle documents
created by someone else). This matters for reliable communication between
the government and its stakeholders, and for reliable archived documents.

In summary, I agree with the original proposal. I think that the choice of file formats is appropriate, and should not be changed or added to. And I think that the details of how departments should implement these plans, deal with legacy documents, and with third parties unable to read the preferred format are all well thought through, and likely to deliver the intended benefits (once any inevitable teething troubles are sorted out).

I am pleased that HM Government is taking the initiative on addressing these important but seemingly dull problems, and look forward to the benefits that will arise from implementing this proposal, and further proposals of similar quality and scope.

As someone who has to

As someone who has to actually process raw data from a range of sources to create meaningful reports, standards - open, documented and followed - ARE clearly the only way to go in supporting and promoting an open government framework which is not driven by commercial lobbying.

Flawed, closed or 'open but cryptic' standards simply lead to frustration, needless overhead, lost opportunities for niche products using said open standards and only play into the hands of those controlling and hence profiting from the use of proprietary standards.

I support and commend the Cabinet Office in opening this debate to public input. 

I think this comment box says

I think this comment box says it all: Text format: Filtered HTML or Plain Text. Raw text is the only format able to be read by anyone. All operating systems include one or more text readers/editors. Raw text can likewise be read with any browser and on any eBook reader, or sent directly to either the screen or a printer. Project Gutenberg has worked for years on raw text files.

Formatting (apart from line breaks) is not required for comprehension.

For spreadsheet/database data, comma/quote separated values (CSV) or tab separated values (TSV) should suffice as long as tabs, commas and quotes are removed from the data prior to export.

A fantastic step forward if

A fantastic step forward if ODF and HTML are to be used for government interaction. I hope it makes it through the usual lobbying from those interested in locking the government into a particular poorly documented or non-open standard.

As the specs of both the ODF and HTML standards are freely available, various software providers (free or paid for) can easily implement them. Indeed, with the increasing demand for online or digital submissions rather than paper-based ones, the implementation of ODF/HTML as a government standard would lower that hurdle, as you could use the likes of LibreOffice/OpenOffice at no cost. You shouldn't be forced to pay for additional software to use a particular format so you can complete any forms or other documents requested by government departments or agencies.

The suggestion of using OOXML is not a good one, as Microsoft is its only standard bearer and the documentation for the standard is poor. OOXML cannot even maintain compatibility between different versions of Microsoft Office, let alone be implemented reliably by others.

As someone working to help

As someone working to help implement open data standards in my work in the rail industry, I have to wholeheartedly support this proposal to use a small set of open standards for information exchange.  Open standards lower costs, improve the quality of information exchange and open the door to innovation and continuous improvement.

ODF is well supported by a number of free and actively-developed products such as LibreOffice.  The free open-source nature of this product means that it is responsive to user needs, emerging bugs and discovered security issues;  and the stability of the open format means that old documents will continue to be readable for a very long time. The support offered to the format by other tools will only improve as its use grows and the tool manufacturers realise the necessity of keeping up. 

I would echo the cautions expressed elsewhere about the use of .csv.  Some additional standardisation would be beneficial, such as in the structure of files, column headers, line terminators, text wrapping, escape characters and the format of specific data items such as dates and times.
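To make these cautions concrete, here is a minimal sketch of what pinning down such conventions could look like, using Python's standard `csv` module. The specific choices (a header row, double-quote escaping, CRLF line terminators, ISO 8601 timestamps) are assumptions made for the example rather than anything the proposal specifies, and the column names and sample row are made up.

```python
import csv
import io
from datetime import datetime, timezone


def write_rows(rows):
    """Serialise rows to CSV text under one explicit set of conventions.

    Hypothetical helper; the conventions are illustrative assumptions.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(
        buffer,
        delimiter=",",
        quotechar='"',               # embedded quotes are doubled
        quoting=csv.QUOTE_MINIMAL,   # quote only fields that need it
        lineterminator="\r\n",       # fixed line terminator
    )
    writer.writerow(["station", "note", "recorded_at"])  # header row
    for station, note, recorded_at in rows:
        # Dates and times are written in ISO 8601 form.
        writer.writerow([station, note, recorded_at.isoformat()])
    return buffer.getvalue()


# Made-up sample data containing a comma and quotes inside a field.
rows = [
    ("King's Cross", 'Delay, "signal fault"',
     datetime(2014, 2, 28, 17, 0, tzinfo=timezone.utc)),
]

text = write_rows(rows)
parsed = list(csv.reader(io.StringIO(text, newline="")))
print(text)
```

With the quoting rule stated up front, a field containing commas and quotes survives a round trip through a reader that follows the same rule, which is the kind of predictability additional standardisation would provide.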

Free Software Foundation Europe

Free Software Foundation Europe has long advocated the use of Open Standards in government. We applaud this proposal by the UK government.

Most governments are suffering the effects of lock-in in their IT infrastructure: high costs, dependence on a single ultimate supplier, no strategic freedom. This all but eliminates meaningful competition among suppliers, and stifles technological progress. In addition, these governments often end up imposing on the citizens they serve (and on other organisations they cooperate with) an obligation to acquire the same non-free programs that the government uses.

In contrast, the UK government stands out not just for its determination to break free and make real competition among suppliers possible, but also for having an integrated strategy for doing so. The present proposal is a central building block of this strategy, along with a clear and strong definition of Open Standards, the recently announced red lines for IT contracts, and other elements.

We applaud the UK Government's approach of focusing on standards rather than products, and relying on a strong definition of Open Standards to ensure that there will be significant competition among suppliers for any software products that the government may wish to use.

An important feature of the present proposal is that it relies on a thorough and comprehensive study of the actual user needs. This greatly increases the chances that the proposal can be successfully implemented, and that any new tools deployed will be well matched to the requirements of their users.

The proposed standards (HTML (4.01, 5 or higher); TXT; CSV; ODF (1.1 or higher)) each address a different technical need. The UK Government is correct in focusing on a single Open Standard for each category and purpose.

Competition takes place on top of standards, not between them. Especially with regards to documents produced in office suites, concentrating on a single Open Standard will ensure that all suppliers can compete on an equal basis. In the mid to long term, the demand created by the UK government, and any others following in its footsteps, is bound to lead to significant improvements in the way office suites work - an area where progress has been all but absent for about a decade.

We agree with Francis Maude's assessment, from a speech delivered on January 29 this year, that "the adoption of open standards in government threatens the power of lock-in to proprietary vendors yet it will give departments the power to choose what is right for them and the citizens who use their services."

In closing, we reiterate our support for the UK Government's proposed approach. Ultimately, any strategy is only as good as its implementation. We would thus like to express our hope that the government will follow through on implementing this approach across all of its branches. FSFE remains available to support this effort.
