
Sharing or collaborating with government documents: proposal

Challenge: 

Domain: 

Short description: 

Update: The time period for commenting on this proposal has been extended to 17:00 GMT on Friday 28 February.

Citizens, businesses and delivery partners, such as charities and voluntary groups, need to be able to interact with government officials, sharing and editing documents. Officials within government departments also need to work efficiently, sharing and collaborating with documents. Users must not have costs imposed upon them due to the format in which editable government information is shared or requested.

User need approach: 

Users in the context of this proposal include citizens, businesses and delivery partners who need to share information with government in editable formats. Users are also officials within government departments who need to share and work on information together.

As technology progresses, government’s production of editable information in formats traditionally associated with documents will become less important for users. 

Government services are being redesigned to make them more straightforward and easier to use by making them digital by default. This will diminish the use of traditional government document formatting even further as information is published or collected directly on the web.

This proposal recognises that changes in technology and service delivery will therefore mean that document formats become less important as collaborative editing and transactions increasingly become an online experience. However, documents formatted in office software are still prevalent amongst users of government information and the formats used by government should meet user needs.

Users need to:

  • Open, edit and save information online and offline
  • Submit information in response to a request, to perform a transaction or to access a service
  • Share information with specific people
  • Publish information online so that a wide audience can access and work with it
  • Edit information and be confident that it remains usable and editable when saved and shared with other users
  • Create a new document with the same style as documents previously created
  • Export the documents created in a non-editable format so that they can share a document as they intend it to be presented
  • Export the documents they create in a format compatible with other software so that other people can use the information
  • Share information so that they can gather feedback
  • Share information so that they can respond to a request for information
  • View/edit the information shared with them so that they can read/act upon the content
  • Provide input on information created by someone else
  • Copy and paste content from one source to another so that they can quickly collate pieces of information in one place
  • Edit information created by an integrated system they work with so that they can add additional information
  • Gather feedback on information they have drafted so that they can apply other people’s recommendations to the content
  • See version updates so that they can be sure they’re working on the latest version of a document
  • Access information from any appropriate place so that they can get on with their work
  • Their devices not to be clogged up with downloads
  • Ensure integrity of specific documents, e.g. audit trail for editing, versioning
  • Use the information on the device and platform of their choice, for example laptop, tablet or smartphone
  • Be able to use accessibility tools with information in online and offline formats

Achieving the expected benefits: 

  • Users are able to efficiently share and work on editable government information
  • Users are not required to buy new software to submit or work with government information
  • Users are able to re-use data and text, where licences permit

Functional needs: 

The format should support:

  • Characters associated with Unicode 6.2 for text based file formats (in accordance with the standards profile for cross-platform character encoding)
  • Digital continuity - having implementations that enable support for import of older formats
  • Use of metadata
  • Imports and exports to/from other applications
  • Fonts and graphics that are reusable in other formats
  • Creation of templates

Citizens, businesses and delivery partners must be able to interact with government officials and services, or those working on behalf of government, sharing appropriately formatted, editable information.

Users should be able to work on their device of choice and must not have costs imposed upon them due to the document format in which government information is provided or requested.

Documents should be editable on different devices without loss of integrity - the information should not become spoiled. Documents in this context include:

  • Word processed text
  • Spreadsheets
  • Presentations

When dealing with citizens, information should be digital by default and therefore should be published online. Browser-based editing is the preferred option for collaborating on published government information. HTML (4.01 or higher e.g. HTML5) is therefore the default format for browser-based editable text. Other document formats specified in this proposal - ODF 1.1 (or higher e.g. ODF 1.2), plain text (TXT) or comma separated values (CSV) - should be provided in addition. ODF includes filename extensions such as .odt for text, .ods for spreadsheets and .odp for presentations.

For statistical or numerical information, CSV is the required format, preferably with a preview provided in HTML (4.01 or higher e.g. HTML5).
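One way to provide such a preview is sketched below in Python, using only the standard library. This is a minimal illustration and not part of the proposal: the function name `csv_to_html_preview`, the `max_rows` limit and the sample data are all assumptions made for the example.

```python
import csv
import html
import io


def csv_to_html_preview(csv_text, max_rows=10):
    """Render the header and first rows of CSV text as a plain HTML table.

    Hypothetical helper for illustration; max_rows is an arbitrary limit.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:max_rows + 1]
    parts = ["<table>", "<tr>"]
    # Escape every cell so that characters such as & and < display safely.
    parts += [f"<th>{html.escape(cell)}</th>" for cell in header]
    parts.append("</tr>")
    for row in body:
        parts.append("<tr>")
        parts += [f"<td>{html.escape(cell)}</td>" for cell in row]
        parts.append("</tr>")
    parts.append("</table>")
    return "".join(parts)


# Made-up sample data: one quoted value containing a comma and an ampersand.
sample = 'name,value\n"Smith, J & Co",12\n'
preview = csv_to_html_preview(sample)
print(preview)
```

The CSV file remains the downloadable data; the HTML table is only a convenience for reading it in a browser.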

Forms and information exchanges should be digital by default where this is enabled, therefore use of office formats should not be encouraged for the completion of forms.

For information being collaborated on between departments, browser-based editing is preferable but often not currently available. Therefore, information should be shared in ODF (version 1.1 or higher e.g. ODF 1.2). The default format for saving government documents must be one of the formats described in this proposal.

To avoid lock-in to a particular provider, it must be possible for documents being created or worked on in a cloud environment to be exported in at least one of the editable document formats proposed.

Information that is newly created or edited should be saved in one of the formats described in this proposal. There is no requirement to transfer existing information, unless it is newly requested by a user and shared.

Other steps to achieving interoperability: 

  • A government body must not refuse to accept or supply a document in at least one of the open formats described in this proposal 
  • Documents may be shared in other formats but only in response to a specific request from a user
  • Existing documents should be migrated to the formats specified in this proposal if they are re-opened for editing or are requested by a user
  • Government bodies should avoid bespoke implementations which may limit their ability to migrate information or to share it with other users
  • Macros should be avoided wherever possible, particularly when sharing documents
  • Government officials should engage with interoperability testing initiatives for document formats
  • Government officials should engage with standards bodies associated with the maintenance of standards that are agreed for document formats for use in government

This proposal, if agreed, would apply to information produced by or on behalf of central government departments, their agencies, non-departmental public bodies (NDPBs) and any other bodies for which they are responsible. These government bodies would need implementation advice to give clarity about when to use particular formats, the user needs they meet and the interoperability that can be expected.

A document metadata profile is outside the scope of this proposal, although this may be the subject of other challenges taken through the Standards Hub process.

Assessment of tools that can be used for providing multiple formats from a single standardised format is also outside the scope of this standards challenge.

Standards to be used: 

Other standards to be used: 

Incorporated in: 

Phase: 

Proposal

Associated standard versions: 

Comments

I think the emphasis on

HTML and ODF

 

I think the emphasis on digital by default is absolutely correct, as is the implication that HTML should be the default format for browser-based editable text.  This then leaves those other situations where for whatever reason browser-based solutions are not available, and here the choice must clearly be truly open standards to avoid vendor lock-in and to promote wide interoperability.  Again, that leads inevitably to ODF as the default format, as suggested in the proposal.

What about TSV instead of CSV

What about TSV instead of CSV  (tab separated values) ?

Equally well supported, and much less likely to get confused. No need to escape commas, quotes, double-quotes and of course the escape character itself.

I wholeheartedly agree that

I wholeheartedly agree that CSV is fundamentally defective, because commas may appear in the data to be separated, whereas horizontal-tab characters almost never occur in the data to be separated.  Because of this, the horizontal tab (Unicode U+0009, Control-I) in TSV is far superior to the comma in CSV.  Broken ideas like CSV should not be standardized.  The whole point of standards is not sameness for sameness's sake, but rather the establishment & dissemination of best practices.

Sorry but this is nonsense.

Sorry but this is nonsense. If you have a comma in a CSV value then the string should be quoted, the same goes for tabs in TSV.

An unquoted string containing a tab in a TSV is no less dangerous than an unquoted comma in a CSV.

CSV is far more widely used, with good reason. Using whitespace characters for delimiters leads to errors that are more likely to be overlooked and can be more awkward to debug.
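The quoting point can be illustrated with a short Python sketch using the standard csv module. The rows are made-up sample data and `round_trip` is a hypothetical helper name; the sketch only shows that a value containing the delimiter survives when it is quoted, whichever delimiter is chosen.

```python
import csv
import io

# Made-up sample data: one value contains a comma, another contains a tab.
rows = [
    ["name", "address"],
    ["A. Example", "1 High Street, Anytown"],
    ["B. Example", "Flat 2\tRear entrance"],
]


def round_trip(rows, delimiter):
    """Write rows with the given delimiter, then parse the text back."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, quoting=csv.QUOTE_MINIMAL)
    writer.writerows(rows)
    text = buffer.getvalue()
    parsed = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return text, parsed


csv_text, csv_rows = round_trip(rows, ",")
tsv_text, tsv_rows = round_trip(rows, "\t")

# An unquoted tab inside a value breaks a naive TSV split,
# just as an unquoted comma breaks a naive CSV split.
naive_fields = "B. Example\tFlat 2\tRear entrance".split("\t")

print(csv_rows == rows, tsv_rows == rows, len(naive_fields))
```

With `csv.QUOTE_MINIMAL` the writer quotes only the fields that need it, so the comma-bearing address is quoted in the CSV output and the tab-bearing address is quoted in the TSV output.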

CSV is not a standard in that

CSV is not a standard in that there are many tools that save slightly incompatible "CSV" formats; indeed, TSV can be seen as, and is treated as, a form of "CSV" by some tools.

The Python programming language's csv module (see its documentation) has features to try to accommodate many dialects of CSV, including TSV, which makes working with CSV doable in most cases.

CSV does have its place though and I support its inclusion, but you might need to be wary of it being so loosely defined.
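As a rough illustration of the dialect support mentioned above, the sketch below reads the same made-up record from comma, tab and pipe separated text. `"excel"` and `"excel-tab"` are dialects built into Python's csv module; the `"pipe"` dialect name and the `read_rows` helper are assumptions made for this example.

```python
import csv
import io


def read_rows(text, dialect):
    """Parse delimited text using a named csv dialect."""
    return list(csv.reader(io.StringIO(text), dialect=dialect))


# The same made-up record in three "CSV" flavours.
comma_text = 'id,note\r\n1,"contains, a comma"\r\n'
tab_text = 'id\tnote\r\n1\tcontains, a comma\r\n'
pipe_text = 'id|note\r\n1|contains, a comma\r\n'

# "pipe" is a hypothetical dialect name registered for this sketch.
csv.register_dialect("pipe", delimiter="|")

comma_rows = read_rows(comma_text, "excel")
tab_rows = read_rows(tab_text, "excel-tab")
pipe_rows = read_rows(pipe_text, "pipe")

print(comma_rows == tab_rows == pipe_rows)
```

The module also provides `csv.Sniffer` for guessing a dialect from a sample, but guessing can fail on ambiguous input, so stating the dialect alongside the published file is the safer option.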

Csv is annoyingly non

CSV is annoyingly non-standard and it sometimes seems that each usage is different from every other.  I'm sure it's not quite that bad but it's definitely not ideal.  Sadly I don't think there is an alternative that is so widely used.  Many devices generate reports in CSV and will continue to do so for a long time yet, as not all of them are in easy reach.

Of course one alternative that could be used is ODS (the ODF spreadsheet format) but CSV files are smaller and more suitable for tiny embedded devices, satellites, North Sea marker buoys and other places which need the tiniest file sizes possible.

So I applaud the UK Government's decision to include this format in this consultation.

Regards from
Tom Davies

It's worth noting that CSV,

It's worth noting that CSV and decimal numbers don't mix too well in many other European countries, because they use the comma as the decimal separator where the UK uses the decimal point.

 

I would argue that CSV is not an appropriate format if quote or double quotes are part of the data. Subject to the note above about decimal points and commas, it should be solely for numeric data.
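The decimal comma issue is commonly worked around by using a semicolon as the field separator. The Python sketch below is a minimal illustration under that assumption: the sample data is made up, `parse_decimal_comma` is a hypothetical helper, and thousands separators are deliberately not handled.

```python
import csv
import io

# Made-up sample: semicolon as field separator, comma as decimal mark.
continental_text = "item;price\nwidget;1,50\ngadget;12,25\n"


def parse_decimal_comma(value):
    """Convert a decimal-comma string such as '1,50' to a float.

    Assumes there are no thousands separators in the value.
    """
    return float(value.replace(",", "."))


reader = csv.DictReader(io.StringIO(continental_text), delimiter=";")
prices = {row["item"]: parse_decimal_comma(row["price"]) for row in reader}

# Re-export with a comma separator and a decimal point.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["item", "price"])
for item, price in prices.items():
    writer.writerow([item, f"{price:.2f}"])
uk_text = out.getvalue()

print(uk_text)
```

Whichever convention is used, the file only stays unambiguous if the separator and the decimal mark are documented alongside the data.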

Just want to add support to

Just want to add support to using TSV. Using commas to separate data values always has been a stupid idea, in my experience because postal address lines sometimes contain them. Tab separation is the way to go.

However, I have never used a TSV file for permanent storage of any data. I've only ever used it as a way to transfer information from, say, a database to a mail merged document, or from a database to a spreadsheet. So as far as I'm concerned I'm not bothered about any rules and regulations governing its use. If it works, use it.

Any format where the document

Any format where the document control can be confused with content is severely broken. Horizontal Tab may be less common than comma in your application but not in somebody else's.  There have long been methods to avoid this problem and any of them could easily be added to the CSV or other description without making any existing problem worse (the crudest technique is generally called 'escaping'). It's futile to try to choose a character that isn't used much: fix the problem properly. It's not a lot of effort.
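Escaping of the kind described above can be sketched with Python's csv module. This is an illustration only: the backslash as escape character and the sample row are assumptions for the example, not something the proposal specifies.

```python
import csv
import io

# Made-up sample row: the second value contains the delimiter (a tab).
row = ["id-1", "tab\there", "comma, too"]

# Write tab-separated output with no quoting; the delimiter inside a value
# is prefixed with the escape character instead.
buffer = io.StringIO()
writer = csv.writer(
    buffer,
    delimiter="\t",
    quoting=csv.QUOTE_NONE,
    escapechar="\\",  # assumed escape character for this sketch
    lineterminator="\n",
)
writer.writerow(row)
escaped_line = buffer.getvalue()

# Read it back with the same settings to recover the original values.
reader = csv.reader(
    io.StringIO(escaped_line),
    delimiter="\t",
    quoting=csv.QUOTE_NONE,
    escapechar="\\",
)
restored = next(reader)

print(repr(escaped_line))
print(restored == row)
```

Writer and reader have to agree on the escape character, which is exactly the kind of detail a written format description would need to pin down.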

There seems to be some who

There seem to be some who are interpreting CSV strictly as "Comma separated values" rather than "Character separated values". Yes, commas do tend to be used by default in many end user applications but I've also frequently seen and used the '|' (ASCII 0x7c) character. The key is to enable the use of a separator that is appropriate to the values being delimited. Hence I am happy with the CSV format as it has been defined on the Standards Hub.

<this is not a comment, but a

<this is not a comment, but a meta-comment:> I wrote a comment yesterday, which was posted.  Then I edited it to remove a typo, and it has disappeared: any idea why? Do I need to write it again?

Am pretty much in full

Am pretty much in full agreement with the proposal above. Well done.

I would also say that csv can be a bit troublesome especially when you need to escape meta characters, but this is a well understood problem and is simple to handle - as opposed to issues with proprietary formats.

So please go ahead and implement this proposal as a Government standard, and explain to other public bodies, such as schools, local councils and quangos, why you have made it the standard.

Normal

"Citizens, businesses and delivery partners, such as charities and voluntary groups, need to be able to interact with government officials", but the authors of this biased proposal don't really care about the preferences and interests of the said citizens, businesses etc. - many of which are using other widespread formats.

Preference to ODF at the expense of OOXML won't harm Microsoft. It will only result in extra government expenses on re-training and document conversion, and might hamper information exchange with not so small non-ODF community.

With information technologies changing every 5-7 years, reliance on single unpopular format doesn’t seem very smart :)

This argument runs contrary

This argument runs contrary to the objectives of open standards.

The OOXML produced by MSOffice is non-standard "transitional" OOXML. Recent versions of MSOffice have support for standard OOXML but it's not the default, instead it's buried in the options.

That means that in order to consistently publish documents readable by standards-compliant software you'd need to take the following steps:

  • All government PCs that use MSOffice would have to be upgraded to recent versions
  • Any PCs running older versions of Windows would have to be upgraded to recent versions to support recent versions of MSOffice
  • Any PCs that were too old and slow to run recent versions of Windows would have to be replaced
  • There would need to be a government-wide Windows group policy implemented to enforce Standard OOXML over Transitional OOXML.

This doesn't seem like an effective course of action. It would also benefit Microsoft in several ways, at the expense of vendors who have invested in open formats.

 

With regards to ODT, so it is

With regard to ODT, is it right to push these costs onto businesses and individuals instead? This is completely contrary to two of the requirements in the above text:

  • Users are not required to buy new software to submit or work with government information
  • Users should be able to work on their device of choice and must not have costs imposed upon them due to the document format in which government information is provided or requested

If they are using Office 2007 SP2 or greater, then ODF is supported, but if not (and there are probably plenty still out there) then this will be forcing them to upgrade and to pay for any training needed to do so. And this is without the costs of the additional challenges of what does and does not convert well.

Furthermore, will I as a tablet or mobile phone user then need to buy and learn something else for my devices to be able to use them? At a time where new form factors are taking off, then this should at least be a consideration.

If this is really about open access, then the approach must allow for the widest used format, even if the "direction of travel" is towards a single format.

Could I make a quick comment

Could I make a quick comment on the fears of extra expense expressed here? 

I recently convinced my wife to try converting from Microsoft Word to LibreOffice (free to download, so no expense there apart from the 3 minutes it took).  She then tried using it and it took only a few minutes for her to feel competent to use most of the options, and she considered herself an expert after half an hour of trying all the options (half an hour of effort expended).

Andrew,

Andrew,

as ODT is a truly open standard and supported by various free software office suites, there is no need for you or your business to buy a new office suite. May I, for example, suggest the use of LibreOffice? It is not only free as in "no cost", but also free as in "freedom", meaning that you are not restricted by the authors in how you are allowed to use it. LibreOffice is a fully-fledged office suite, and you will probably find that it is quite easy to use.

Additionally, recall that the first version of Microsoft Office supporting OOXML files is Office 2007. Choosing this format instead would not improve the situation you are describing in any way. In fact, it would be worsened a lot: despite being a published standard, OOXML is not a truly open standard. Instead, it has lots of Microsoft-specific unpublished additions. At the end of the day, that means that everyone not using a current version of Microsoft Office will have to buy one. Free operating systems, such as Linux-based ones, are completely excluded through the non-availability of Microsoft Office for these platforms.

What you are writing is, in essence, the single best argument *for* truly open standards like ODT. As the entire standard is detailed in a precise manner, every office suite can relatively easily support it. This is true for both free and open source solutions like LibreOffice, and proprietary alternatives such as Microsoft Office. In contrast, for proper OOXML support, one is limited to the latter. Would you not agree that this clearly makes ODT the superior choice?

Thus, I express my utmost support for the proposal and hope that other entities will follow.

This is good in theory but

This is good in theory but the bottom line is that nothing is free to adopt - just think about training, support, interaction with other suppliers and clients that may not have adopted the standard, not to mention conversion of existing documents.  As a business owner who deals with government this just makes me groan - I know it will cause cost and headaches for us if mandated (of course the proposal doesn't really mandate it, it does allow for other formats if agreed, but that will also introduce more procedure and cost).  Overall open standards make sense, but just saying "open standard" doesn't mean it will be a success - this industry is littered with failures of standards being imposed by government, organisations or businesses.  In the end it is usually a de facto standard that wins out and I expect the same here... (and we will all have spent a lot of time and money on it anyway)

This is not only good in

This is not only good in theory, but it is fantastic in practice.

No more forced word processor or spreadsheet upgrades because a vendor has withdrawn support, no more lost documents and spreadsheets because the stored version is 'no longer compatible', no more wasting user time on vendor provided 'version upgrade' tools which take forever, and are often poorly implemented at best.

What makes me groan is the waste of taxpayer money on having to support proprietary file formats, locked-in to a particular vendor, because only that one vendor can properly support the necessary formats. 

Open standards give government and organisations, both profit-focused and non-profit, the flexibility to select a vendor when they choose, to "upgrade" (does anyone ever really want to upgrade their word processor?  I suspect not) on their own cycle rather than that of a vendor, and to go out to truly open competition for vendors to separately supply software, support, fixes, and more at a time of the organisation's choosing, not when a vendor sets the 'end of life' date.

ODF means lower costs, avoidance of lock-in and associated costs, less expensive government, great longevity of stored data, a genuine competitive marketplace for vendors, the possibility to separate licensing from support costs, development costs, bug fix costs and more, and the opportunity for all organisations to be in charge of their IT strategy, rather than in the thrall of some external organisation.

ODF means small organisations can just as readily tender into government as larger ones, as the entry barrier is lowered by the choice of suppliers.  ODF further means that vendor costs will be lower into government.

ODF makes financial sense to all taxpayers in the UK through cost reductions and greater competition, not just to a select few companies who might benefit from proprietary options.  ODF is democracy in action.

 

Oddly staying with MS Office

Oddly, staying with MS Office means constant re-training in the ever-changing interface.  Each new release means a redesign of the infamous "ribbon bar" and people need to spend time and resources on learning where things have been moved to.

By contrast moving to LibreOffice or OpenOffice or any of the other alternatives generally means you can choose a non-default interface so if a radically new redesign happens then it's fairly easy to select the older one.  Also the current interfaces are very familiar to almost all users, especially anyone used to MS Office 2003 or earlier which is only just being replaced in many organisations.  Occasionally you find the familiar drop-down menus gain an additional feature without that forcing a redesign of anything else. 

The ODF standard has been drawn up by a 3rd party organisation which is able to force vendors to comply with the ISO standard.  The organisation brings together representatives from many organisations including Microsoft.  Something shocking like 5,000 organisations send representation and develop ODF.  If one organisation drops out it hardly affects the rest of the organisation at all.  They are welcome to rejoin later.

The ODF ISO standard is around 1,200 pages and many programs use it or could start doing so easily as they were mostly built on it in their own early days.  These programs tend to work on many platforms, not just Windows (or Mac). 

The OOXML ISO standard is around 7,000 pages and NO programs seem to implement it properly except MS Office 2013 on certain Windows platforms.  The Mac version of MS Office seems to use a non-standard version of it.  It doesn't work on non-Windows platforms at all (except the one effort on Mac which doesn't follow the ISO standard)

So there is a LOT less retraining with ODF but tons in using OOXML. 

Regards from
Tom Davies

A simple script could easily

A simple script could easily be written and provided for upgrading old existing documents en-masse. 
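A rough sketch of such a script is given below in Python. It assumes LibreOffice is installed and its `soffice` command is on the PATH, and uses that command's headless conversion options; the `TARGETS` mapping and the function names are hypothetical, and the sketch defaults to a dry run that only lists the commands it would execute.

```python
import subprocess
from pathlib import Path

# Assumed mapping from legacy binary formats to ODF targets.
TARGETS = {".doc": "odt", ".xls": "ods", ".ppt": "odp"}


def build_command(source, out_dir):
    """Return the soffice command for one file, or None if it is not a legacy type."""
    target = TARGETS.get(Path(source).suffix.lower())
    if target is None:
        return None
    return [
        "soffice", "--headless",
        "--convert-to", target,
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_tree(root, out_dir, dry_run=True):
    """Walk a folder and convert each legacy document found.

    With dry_run=True nothing is executed; the commands are only collected.
    """
    commands = []
    for path in sorted(Path(root).rglob("*")):
        command = build_command(path, out_dir)
        if command is None:
            continue
        commands.append(command)
        if not dry_run:
            subprocess.run(command, check=True)
    return commands


print(build_command("report.doc", "converted"))
```

All output goes to a single folder here, so files with the same name in different subfolders would overwrite each other; a real migration would also need to check each converted document for formatting loss.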

Training is to do with the software used rather than the format used.  Existing software can use the format unless you are using such old versions that you would be forced to upgrade soon anyway.

Other suppliers and clients are likely to need to find some way to adopt the standard at around the same time as you and for much the same reasons. So the burden won't be on you, and because so many programs and suites use it already it will avoid the current mess of all trying to upgrade to the same software at the same time and grumbling about people who upgrade later or earlier than you.

There hasn't been an open standard before.  They have all been proprietary (unless you count the RTF made by Microsoft that apparently lost a court case due to not being quite so open as promised.  Hmmm, sounds familiar somehow!).

The current de facto standard only works if everyone is using the same version of the same software and that gets annoying if someone upgrades too early or late.

Regards from
Tom Davies

No one needs to *buy* new

No one needs to *buy* new software to support ODF - LibreOffice and OpenOffice are completely free, open source programs which have no restrictions on their use by anyone (which is a basic requirement of open source licenses) and run on Windows, OS X and Linux equally well.

By contrast using OOXML would force everyone who deals with the UK government onto the expensive MS Office upgrade treadmill, or a continuing Office 365 subscription. Not to mention that it's *only* MS Office 2013 that can save documents in the 'strict' format - 'strict' meaning in actual accordance with ISO/IEC 29500, the standard that underlies OOXML.

This means that to truly have an interoperable document format everyone would have to have bought MS Office 2013 and be running it on Windows, or have bought the most recent Office release for OS X which supports 'strict' format documents.

ODF v1.1 support for versions

ODF v1.1 support for versions of Microsoft Office that predate 2007 SP2 (specifically Office 2007 / 2003 / XP) has been freely available for some years now, via the SourceForge OpenXML/ODF Translator Add-in for Office. So for these users the 'cost' is the effort needed to install a piece of software, if they haven't already done so.

I think it is going to be impossible to totally avoid costs being incurred as these will be highly dependent upon the particular software suites and platforms a business/individual is using, their attitudes to change and the rate of change. As an interim measure the government should continue to permit the use of the Microsoft Office 97/2000/XP/2003 DOC/XLS/PPT file formats, given these are both reasonably stable and widely supported by third-party products.

An issue moving forward is the use of ODF 1.2 specifically for documents being distributed by government, given the limited support for ODF 1.2 in currently shipping products. Hence what is needed is the publication of a roadmap, to enable support to be introduced as part of a product's normal update.

Preference to ODF

"Preference to ODF at the expense of OOXML won't harm Microsoft. It will only result in extra government expenses on re-training and document conversion, and might hamper information exchange with not so small non-ODF community."

That's manifestly untrue. Even MS Office supports ODF. And the suggestion that using ODF will incur additional costs is a typical response from an incumbent like Microsoft who wishes to preserve their hegemony.

Each new version of MS Office has quirks and often breaks previous formats. Not to mention the constantly changing UIs (Ribbon anyone?). So the retraining costs argument is bogus and self-serving.

"With information technologies changing every 5-7 years, reliance on single unpopular format doesn’t seem very smart :)"

Popularity has nothing to do with it. It's the fact that the format is truly open and without proprietary extensions and patent issues that matters. OOXML isn't an open format by that definition. It's a proprietary format just like Microsoft's predecessors .xls, .doc etc. in the guise of something more open.

The fact that ODF is a vendor neutral standard, that is, no one vendor controls the standard, unlike OOXML, and it's universally supported in all major Office software should be reason enough to select it over OOXML.

 
