Sharing or collaborating with government documents: proposal

Date submitted:
Tue, 28/01/2014 - 6:05pm
Submitted by:
lhumphries
Update: The time period for commenting on this proposal has been extended to 17:00 GMT on Friday 28 February.

Citizens, businesses and delivery partners, such as charities and voluntary groups, need to be able to interact with government officials, sharing and editing documents. Officials within government departments also need to work efficiently, sharing and collaborating with documents. Users must not have costs imposed upon them due to the format in which editable government information is shared or requested.

Comments

HTML and ODF

I think the emphasis on digital by default is absolutely correct, as is the implication that HTML should be the default format for browser-based editable text.  This then leaves those other situations where for whatever reason browser-based solutions are not available, and here the choice must clearly be truly open standards to avoid vendor lock-in and to promote wide interoperability.  Again, that leads inevitably to ODF as the default format, as suggested in the proposal.

What about TSV (tab-separated values) instead of CSV?

Equally well supported, and much less likely to get confused. No need to escape commas, quotes, double-quotes and of course the escape character itself.

I wholeheartedly agree that CSV is fundamentally defective, because commas may appear in the data to be separated, whereas horizontal-tab characters almost never occur in the data to be separated. Because of this, the horizontal tab (Unicode U+0009, Control-I) in TSV is far superior to the comma in CSV. Broken ideas like CSV should not be standardized. The whole point of standards is not sameness for sameness's sake, but rather the establishment and dissemination of best practices.

Sorry but this is nonsense. If you have a comma in a CSV value then the string should be quoted, the same goes for tabs in TSV.

An unquoted string containing a tab in a TSV is no less dangerous than an unquoted comma in a CSV.

CSV is far more widely used, with good reason. Using whitespace characters for delimiters leads to errors that are more likely to be overlooked and can be more awkward to debug.
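The quoting rule described in this comment can be illustrated with a short sketch using Python's standard csv module. The sample row is invented for illustration; the point is only that a writer which quotes fields containing the delimiter lets both comma-separated and tab-separated files round-trip the same data.

```python
import csv
import io

# Hypothetical sample row: one field contains a comma, one a tab, one double quotes.
row = ["1 High Street, Flat 2", "tab\there", 'say "hi"']


def round_trip(fields, delimiter):
    """Write one row with the given delimiter, then read it back."""
    buffer = io.StringIO()
    # The default quoting mode quotes only fields that contain the
    # delimiter, the quote character, or a line break.
    csv.writer(buffer, delimiter=delimiter).writerow(fields)
    text = buffer.getvalue()
    buffer.seek(0)
    return text, next(csv.reader(buffer, delimiter=delimiter))


csv_text, csv_row = round_trip(row, ",")
tsv_text, tsv_row = round_trip(row, "\t")

print(repr(csv_text))
print(repr(tsv_text))
print(csv_row == row, tsv_row == row)
```

In the comma-separated output the address field is quoted; in the tab-separated output the field containing the tab is quoted instead. Either way the data survives only because the writer quotes it, which is the commenter's point.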

CSV is not a standard, in that there are many tools that save slightly incompatible "CSV" formats; indeed, TSV can be seen as, and is treated as, a form of "CSV" by some tools.

The Python programming language's csv module (documentation here) has features to try to accommodate many dialects of CSV, including TSV, which makes working with CSV doable in most cases.

CSV does have its place though and I support its inclusion, but you might need to be wary of it being so loosely defined.
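As a minimal sketch of the dialect support mentioned above, the following uses two of the dialects that ship with Python's csv module, "excel" (comma-separated) and "excel-tab" (tab-separated). The sample data and the `parse` helper are made up for illustration.

```python
import csv
import io

# The same two records, once comma-separated and once tab-separated.
comma_text = 'name,address\r\nAnn,"1 High Street, Leeds"\r\n'
tab_text = 'name\taddress\r\nAnn\t1 High Street, Leeds\r\n'


def parse(text, dialect):
    """Parse delimited text using one of the csv module's named dialects."""
    return list(csv.reader(io.StringIO(text, newline=""), dialect=dialect))


comma_rows = parse(comma_text, "excel")
tab_rows = parse(tab_text, "excel-tab")

print(csv.list_dialects())     # names of the registered dialects
print(comma_rows == tab_rows)  # both files describe the same records
```

The reader has to be told which dialect a file uses. That is the practical cost of CSV being loosely defined.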

CSV is annoyingly non-standard and it sometimes seems that each usage is different from every other. I'm sure it's not quite that bad but it's definitely not ideal. Sadly I don't think there is an alternative that is so widely used. Many devices generate reports in CSV and will continue to do so for a long time yet, as not all of them are in easy reach.

Of course one alternative that could be used is ODS (the ODF spreadsheet format), but CSV files are smaller and more suitable for tiny embedded devices, satellites, North Sea marker buoys and other places which need the tiniest file sizes possible.

So I applaud the UK Government's decision to include this format in this consultation.

Regards from
Tom Davies

It's worth noting that CSV and decimal numbers don't mix too well in other European countries, because they use the comma where the UK uses the decimal point.

I would argue that CSV is not an appropriate format if single or double quotes are part of the data. Subject to the note above about decimal points and commas, it should be solely for numeric data.
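The decimal-comma problem raised here can be shown with a small sketch, again using Python's csv module. The "widget" row and the `write_row` helper are invented for illustration. The sketch contrasts an unquoted decimal comma with two common workarounds: quoting the field, or switching to a semicolon delimiter.

```python
import csv
import io

# Intended content: a product name and the price 3,14 (decimal comma).
unquoted_line = "widget,3,14"
naive_fields = next(csv.reader([unquoted_line]))
print(naive_fields)  # the price has been split into two fields


def write_row(fields, delimiter):
    """Write a single row and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter).writerow(fields)
    return buffer.getvalue()


row = ["widget", "3,14"]
quoted = write_row(row, ",")     # the writer quotes the price field
semicolon = write_row(row, ";")  # no quoting needed with a semicolon

print(repr(quoted))
print(repr(semicolon))
```

Both workarounds keep the price in one field, but only if the reading side knows which convention was used.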

Just want to add support for using TSV. Using commas to separate data values has always been a stupid idea, in my experience because postal address lines sometimes contain them. Tab separation is the way to go.

However, I have never used a TSV file for permanent storage of any data. I've only ever used it as a way to transfer information from, say, a database to a mail-merged document, or from a database to a spreadsheet. So as far as I'm concerned I'm not bothered about any rules and regulations governing its use. If it works, use it.

Any format where the document control can be confused with content is severely broken. Horizontal tab may be less common than comma in your application but not in somebody else's. There have long been methods to avoid this problem, and any of them could easily be added to the CSV or other description without making any existing problem worse (the crudest technique is generally called 'escaping'). It's futile to try to choose a character that isn't used much: fix the problem properly. It's not a lot of effort.

There seem to be some who are interpreting CSV strictly as "Comma separated values" rather than "Character separated values". Yes, commas do tend to be used by default in many end-user applications, but I've also frequently seen and used the '|' (ASCII 0x7c) character. The key is to enable the use of a separator that is appropriate to the values being delimited. Hence I am happy with the CSV format as it has been defined on the Standards Hub.

ODF (supplementing CSV and TXT) seems a good choice and presents me with no problem in producing and receiving documents if this proposal were adopted.

<this is not a comment, but a meta-comment:> I wrote a comment yesterday, which was posted. Then I edited it to remove a typo, and it has disappeared. Any idea why? Do I need to write it again?

Editing the post meant that it was resubmitted for moderation, so it disappeared for a short time. It should now be visible again.

it is now - thanks

Am pretty much in full agreement with the proposal above. Well done.

I would also say that csv can be a bit troublesome especially when you need to escape meta characters, but this is a well understood problem and is simple to handle - as opposed to issues with proprietary formats.

So please go ahead and implement this proposal as a Government standard, and explain to other public bodies, such as schools, local councils and quangos, why you have made it the standard.

Is this intended to operate with HMRC?

Is it intended to operate with FCA/PRA who have just changed to XBRL?

The standards that are approved through this process apply to all government departments, including HMRC. 

I believe that the FCA/PRA is an independent body, not funded by government. It would therefore not be covered by this proposal.

This challenge is specifically about document formats for word-processed text, spreadsheets and presentations, and XBRL is not a standard that is used within this context.

"Citizens, businesses and delivery partners, such as charities and voluntary groups, need to be able to interact with government officials", but the authors of this biased proposal don't really care about the preferences and interests of the said citizens, businesses etc. - many of which are using other widespread formats.

Preference for ODF at the expense of OOXML won't harm Microsoft. It will only result in extra government expenses on re-training and document conversion, and might hamper information exchange with the not-so-small non-ODF community.

With information technologies changing every 5-7 years, reliance on a single unpopular format doesn’t seem very smart :)

This argument runs contrary to the objectives of open standards.

The OOXML produced by MSOffice is non-standard "transitional" OOXML. Recent versions of MSOffice have support for standard OOXML but it's not the default, instead it's buried in the options.

That means that in order to consistently publish documents readable by standards-compliant software you'd need to take the following steps:

  • All government PCs that use MSOffice would have to be upgraded to recent versions
  • Any PCs running older versions of Windows would have to be upgraded to recent versions to support recent versions of MSOffice
  • Any PCs that were too old and slow to run recent versions of Windows would have to be replaced
  • There would need to be a government-wide Windows group policy implemented to enforce Standard OOXML over Transitional OOXML.

This doesn't seem like an effective course of action. It would also benefit Microsoft in several ways, at the expense of vendors who have invested in open formats.

With regard to ODT, is it right to push these costs onto businesses and individuals instead? This is completely contrary to two of the requirements in the above text:

  • Users are not required to buy new software to submit or work with government information
  • Users should be able to work on their device of choice and must not have costs imposed upon them due to the document format in which government information is provided or requested

If they are using Office 2007 SP2 or greater, then ODF is supported, but if not (and there are probably plenty still out there) then this will force them to upgrade and to pay for any training needed to do so. And this is without the costs of the additional challenges of what does and does not convert well.

Furthermore, will I as a tablet or mobile phone user then need to buy and learn something else for my devices to be able to use them? At a time when new form factors are taking off, this should at least be a consideration.

If this is really about open access, then the approach must allow for the most widely used format, even if the "direction of travel" is towards a single format.

Could I make a quick comment on the fears of extra expense expressed here? 

I recently convinced my wife to try converting from Microsoft Word to LibreOffice (free to download, so no expense there apart from the 3 minutes it took). She then tried using it and it took only a few minutes for her to feel competent to use most of the options, and she considered herself an expert after half an hour of trying all the options (half an hour of effort expended).

Andrew,

as ODT is a truly open standard and supported by various free software office suites, there is no need for you or your business to buy a new office suite. May I, for example, suggest the use of LibreOffice? It is not only free as in "no cost", but also free as in "freedom", meaning that you are not restricted by the authors in how you are allowed to use it. LibreOffice is a fully-fledged office suite, and you will probably find that it is quite easy to use.

Additionally, recall that the first version of Microsoft Office supporting OOXML files is Office 2007. Choosing this format instead would not improve the situation you are describing in any way. In fact, it would be worsened a lot: despite being a published standard, OOXML is not a truly open standard. Instead, it has lots of Microsoft-specific unpublished additions. At the end of the day, that means that everyone not using a current version of Microsoft Office will have to buy one. Free operating systems, such as Linux-based ones, are completely excluded through the non-availability of Microsoft Office for these platforms.

What you are writing is, in essence, the single best argument *for* truly open standards like ODT. As the entire standard is detailed in a precise manner, every office suite can relatively easily support it. This is true for both free and open source solutions like LibreOffice, and proprietary alternatives such as Microsoft Office. In contrast, for proper OOXML support, one is limited to the latter. Would you not agree that this clearly makes ODT the superior choice?

Thus, I express my utmost support for the proposal and hope that other entities will follow.

This is good in theory, but the bottom line is that nothing is free to adopt: just think about training, support, interaction with other suppliers and clients that may not have adopted the standard, not to mention conversion of existing documents. As a business owner who deals with government, this just makes me groan. I know it will cause cost and headaches for us if mandated (of course the proposal doesn't really mandate it; it does allow for other formats if agreed, but that will also introduce more procedure and cost). Overall, open standards make sense, but just saying "open standard" doesn't mean it will be a success. This industry is littered with failures of standards being imposed by government, organisations or businesses. In the end it is usually a de facto standard that wins out and I expect the same here... (and we will all have spent a lot of time and money on it anyway)

This is not only good in theory, but it is fantastic in practice.

No more forced word processor or spreadsheet upgrades because a vendor has withdrawn support, no more lost documents and spreadsheets because the stored version is 'no longer compatible', no more wasting user time on vendor provided 'version upgrade' tools which take forever, and are often poorly implemented at best.

What makes me groan is the waste of taxpayer money on having to support proprietary file formats, locked-in to a particular vendor, because only that one vendor can properly support the necessary formats. 

Open standards give government and organisations, both profit-focused and non-profit, the flexibility to select a vendor when they choose, to "upgrade" (does anyone ever really want to upgrade their word processor? I suspect not) on their own cycle rather than that of a vendor, and to go out to truly open competition for vendors to separately supply software, support, fixes, and more at a time of the organisation's choosing, not when a vendor sets the 'end of life' date.

ODF means lower costs, avoidance of lock-in and associated costs, less expensive government, great longevity of stored data, a genuine competitive marketplace for vendors, the possibility to separate licensing from support costs, development costs, bug fix costs and more, and the opportunity for all organisations to be in charge of their IT strategy, rather than in the thrall of some external organisation.

ODF means small organisations can just as readily tender into government as larger ones, as the entry barrier is lowered by the choice of suppliers.  ODF further means that vendor costs will be lower into government.

ODF makes financial sense to all taxpayers in the UK through cost reductions and greater competition, not just to a select few companies who might benefit from proprietary options.  ODF is democracy in action.

Oddly, staying with MS Office means constant re-training in the ever-changing interface. Each new release means a redesign of the infamous "ribbon bar" and people need to spend time and resources on learning where things have been moved to.

By contrast moving to LibreOffice or OpenOffice or any of the other alternatives generally means you can choose a non-default interface so if a radically new redesign happens then it's fairly easy to select the older one.  Also the current interfaces are very familiar to almost all users, especially anyone used to MS Office 2003 or earlier which is only just being replaced in many organisations.  Occasionally you find the familiar drop-down menus gain an additional feature without that forcing a redesign of anything else. 

The ODF standard has been drawn up by a 3rd party organisation who are able to force vendors to comply with the ISO standard. The organisation brings together representatives from many organisations, including Microsoft. Something shocking like 5,000 organisations send representation and develop ODF. If one organisation drops out, it hardly affects the rest of the organisation at all. They are welcome to rejoin later.

The ODF ISO standard is around 1,200 pages and many programs use it or could start doing so easily as they were mostly built on it in their own early days.  These programs tend to work on many platforms, not just Windows (or Mac). 

The OOXML ISO standard is around 7,000 pages and NO programs seem to implement it properly except MS Office 2013 on certain Windows platforms.  The Mac version of MS Office seems to use a non-standard version of it.  It doesn't work on non-Windows platforms at all (except the one effort on Mac which doesn't follow the ISO standard)

So there is a LOT less retraining with ODF but tons in using OOXML. 

Regards from
Tom Davies

A simple script could easily be written and provided for upgrading old existing documents en-masse. 
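As a rough sketch of what such a script might look like, the following assumes LibreOffice is installed and its `soffice` command is on the PATH, and uses its headless `--convert-to` option. The function names and the restriction to `.doc` files are illustrative assumptions, not something specified in this thread, and conversion fidelity would still need checking document by document.

```python
import subprocess
from pathlib import Path


def conversion_command(source: Path, out_dir: Path) -> list[str]:
    """Build the LibreOffice command that converts one file to ODT."""
    return [
        "soffice", "--headless",
        "--convert-to", "odt",
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_all(in_dir: Path, out_dir: Path, dry_run: bool = True) -> list[list[str]]:
    """Convert every .doc file in in_dir; with dry_run, only list the commands."""
    commands = [conversion_command(p, out_dir) for p in sorted(in_dir.glob("*.doc"))]
    if not dry_run:
        out_dir.mkdir(parents=True, exist_ok=True)
        for command in commands:
            # Assumes soffice is installed; raises CalledProcessError on failure.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical folder names, for illustration only.
    for command in convert_all(Path("old_documents"), Path("converted")):
        print(" ".join(command))
```

The dry-run default prints the planned conversions without touching any files, which seems a sensible first step before converting an archive in bulk.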

Training is to do with the software used rather than the format used. Existing software can use the format unless you are using such old versions that you would be forced to upgrade soon anyway.

Other suppliers and clients are likely to need to find some way to adopt the standard at around the same time as you and for much the same reasons. So the burden won't be on you, and because so many programs and suites use it already, it will avoid the current mess of all trying to upgrade to the same software at the same time and grumbling about people who upgrade later or earlier than you.

There hasn't been an open standard before. They have all been proprietary (unless you count the RTF made by Microsoft that apparently lost a court case due to not being quite so open as promised. Hmmm, sounds familiar somehow!).

The current de facto standard only works if everyone is using the same version of the same software, and that gets annoying if someone upgrades too early or late.

Regards from
Tom Davies

No one needs to *buy* new software to support ODF - LibreOffice and OpenOffice are completely free, open source programs which have no restrictions on their use by anyone (which is a basic requirement of open source licenses) and run on Windows, OS X and Linux equally well.

By contrast, using OOXML would force everyone who deals with the UK government onto the expensive MS Office upgrade treadmill, or a continuing Office 365 subscription. Not to mention that it's *only* MS Office 2013 that can save documents in the 'strict' format - 'strict' meaning in actual accordance with the ISO/IEC 29500 standard that underlies OOXML.

This means that to truly have an interoperable document format everyone would have to have bought MS Office 2013 and be running it on Windows, or have bought the most recent Office release for OS X which supports 'strict' format documents.

ODF v1.1 support for versions of Microsoft Office that predate 2007 SP2 (specifically Office 2007 / 2003 / XP) has been freely available for some years now, via the Sourceforge OpenXML/ODF Translator Add-in for Office. So for these users the 'cost' is the effort needed to install a piece of software, if they haven't already done so.

I think it is going to be impossible to totally avoid costs being incurred as these will be highly dependent upon the particular software suites and platforms a business/individual is using, their attitudes to change and the rate of change. As an interim measure the government should continue to permit the use of the Microsoft Office 97/2000/XP/2003 DOC/XLS/PPT file formats, given these are both reasonably stable and widely supported by third-party products.

An issue moving forward is the use of ODF 1.2 specifically for documents being distributed by government, given the limited support for ODF 1.2 in currently shipping products. Hence what is needed is the publication of a roadmap, to enable support to be introduced as part of a product's normal update.

<blockquote>Preference to ODF at the expense of OOXML won't harm Microsoft. It will only result in extra government expenses on re-training and document conversion, and might hamper information exchange with not so small non-ODF community.</blockquote>

That's manifestly untrue. Even MS Office supports ODF. And the suggestion that using ODF will incur additional costs is a typical response from an incumbent like Microsoft who wishes to preserve their hegemony.

Each new version of MS Office has quirks and often breaks previous formats. Not to mention the constantly changing UIs (Ribbon, anyone?). So the retraining costs argument is bogus and self-serving.

<blockquote>With information technologies changing every 5-7 years, reliance on single unpopular format doesn’t seem very smart :)</blockquote>

Popularity has nothing to do with it. It's the fact that the format is truly open and without proprietary extensions and patent issues that matters. OOXML isn't an open format by that definition. It's a proprietary format, just like Microsoft's earlier .xls, .doc etc., in the guise of something more open.

The fact that ODF is a vendor neutral standard, that is, no one vendor controls the standard, unlike OOXML, and it's universally supported in all major Office software should be reason enough to select it over OOXML.

Have you bothered to attempt to open earlier MS documents on new versions of MS Office? I have found they will not open. MS standards at play.
