The power of open data

This essay was written for the Transparency Camp of Vilnius, Lithuania, where I was, together with Anne-Lise Bouyer, invited to give a lecture on “the power of open data”.

Open data is information collected by an administration and offered for free to anyone in a machine-readable format. The concept gained popularity in the late 2000s and is now hailed as the best path towards a “more efficient and transparent government”1.

There is some truth in this belief, but the way the idea of open data is framed is wrong. It is wrong because open data is not an issue of costs and benefits; it is a political issue: whether the state should be an economic agent among others or whether it should provide the foundations for others to operate. More importantly, open data has nothing to do with transparency and the two concepts should never be mixed.

Open data should be considered infrastructure

One big mistake is to see open data as pure economics. Of course, there is an appealing economic argument for open data. A data set can be sold once for one million euros, but that might be less than what it would bring in as open data. If you give the data set away for free, many companies might use it in unforeseeable ways and, because they generate economic activity, they might give the government more than one million euros back in the form of taxes and social contributions. One example of this logic is Climate Corporation. The start-up, acquired in 2013 by Monsanto, uses data on soil quality offered by the United States government and provides farmers with services that help them increase yields2. Had the US government not given the data away for free, Climate Corporation might not have been created. FourmiSanté, in France, used public data on medical doctors to create an online directory. There are many more examples.

Despite these success stories, not all public administrations have opened their data. Many still charge for data collected with public funds. Early open data activists asked the question ‘Why should data produced by the administration, with taxpayers’ money, not be given back to the same taxpayers?’ to push for open data3. The main reason administrations fight open data has more to do with experience than with ideology. In the 1990s and 2000s, it was fashionable for governments to organize their services as if they were private sector companies. Administrations were told they had to develop new revenue streams, which included the sale of data. Of course, such a requirement came with a cut in public funding. The French geographical institute (IGN), for instance, saw its statutes changed in 2004 to encourage it to sell more data4. Less than 10 years later, it was asked again to give its data away for free. Even if the revenues administrations gain from the sale of public data are relatively modest, many employees would be made redundant if the sales ceased. It is only natural that they refuse to open their data. Without strong financial support from the central government, administrations will not contribute to the trend, and sensibly so5.

To counter the view of public data as a source of revenue for the administration, several estimates of the global value of open data have been made. None is convincing6. Despite this lack of evidence, governments should provide certain data sets free of charge, as a public service. Governments have extended the realm of what constitutes a public service as the needs of the population evolved. Just 200 years ago, most bridges and roads charged a toll. Today, except for highways, toll roads have almost disappeared. Governments should consider some data sets, especially geographical, geological, cadastral and meteorological data, to be public infrastructure and manage them as such. Just like free roads and free law enforcement, providing all entrepreneurs with a level playing field when it comes to infrastructure data is a political decision. Framing it as an economic issue does a disservice to all, because it implies that the economic value of each data set needs to be weighed before it can be opened. It allows for endless bickering, the only winners of which are the heavyweights (such as large corporations or administrations) who can push for a decision in one direction or another.

A big misunderstanding

Such infrastructure data, however, has nothing to do with transparency. Just as a dictatorship such as Singapore can have great bridges and world-class public transit, a government can provide top-notch weather data and remain fully unaccountable to its citizens.

Coupling the provision of infrastructure data with the freedom to access public documents was a mistake that early open data activists made. It might have been a confusion over the meaning of open, conflating “technologically open” (machine-readable data) with “actually open” (accessible). It might also have been a tactical choice. Because they knew that governments would listen to economic arguments more easily than to democratic ones, open data activists associated the two in public discourse. Hopefully, historians of the open data movement will come up with an answer in the future. Today, this combination of open data with transparency has disastrous consequences, for several reasons.

First, it lets authoritarian governments boast democratic credentials by engaging in open data. Azerbaijan is a case in point. The Caucasian dictatorship took part in the Open Government Partnership, a gathering of countries officially committed to open data7, while clamping down on every freedom in the country8.

More importantly, several organizations that fought for open-data-as-transparency became consultancies for governments looking to develop open-data-as-infrastructure, thereby entering conflicts of interest so big that they cannot fulfill their main mission anymore.

Finally, linking open data to transparency lets governments bring their existing transparency initiatives under the open data label, thereby suffocating the first to the benefit of the second.

A net loss for transparency

The concrete implementation of open data policies was the open data portal, where an administration uploads the data sets it chooses. Absent a clear guideline from the central government as to what must be published, political constraints within administrations pushed the people responsible for open data to choose the least politically risky data sets as candidates for release. This is why so many data sets on open data portals are about the location of public toilets, the number of trees in parks or other trivial policy areas. When more interesting data sets are published, the few columns that would let citizens hold their government accountable are removed9. That the government pretends to be transparent when it isn’t makes it harder for transparency activists to raise awareness of the issue.

In other words, the process chosen by almost all administrations to “open” data ensured that only useless data sets would be published. There are a few notable exceptions. Some cities, such as Bath (United Kingdom), chose to develop open data portals jointly with local communities of activists, with a focus on creating “useful things”10.

Despite their inability to fulfill their avowed goals, open data policies are not all bad. The hackathons they organize and the discussions they foster help people get together and remind decision-makers that computers can be used for something other than checking their Facebook and sending mass emails to voters. Such positive outcomes have little to do with transparency.

What about the data sets that are not on the open data portals? In theory, citizens can request them under freedom of information legislation. The idea that citizens can demand information from the administration dates back to the 18th century and to article 15 of the Declaration of the Rights of Man and of the Citizen, which states that “society has the right to require of every public agent an account of his administration”. Most countries adopted specific legislation to organize this basic right. In Europe, Spain was the last country to pass a freedom of information act (FOIA), in 2014. Such laws can be used to obtain contracts, accounting documents, the details of public tenders, archives etc. In other words, unlike weather data, these documents let citizens check what their government actually does. This is the fuel of transparency and accountability.

Smokescreen

A quick look at the indexes that attempt to measure access to information and open data shows that there is absolutely no correlation between the two. Taiwan tops the open data index and is eighth from last on the transparency index11. Austria, Germany and Italy are among the least transparent countries, yet all three are among the top 20 open data countries.
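Readers who want to check the "no correlation" claim themselves could compute a rank correlation between the two indexes. The snippet below is only a minimal sketch: it implements Spearman's rank correlation in plain Python, and the scores in it are made-up placeholders, not values taken from either index. The function names are illustrative as well.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Assumes each list contains at least two distinct values
    (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# PLACEHOLDER scores for five hypothetical countries, NOT real index values.
open_data_scores = [70, 55, 40, 65, 30]
right_to_information_scores = [60, 110, 75, 90, 120]

rho = spearman(open_data_scores, right_to_information_scores)
print(f"Spearman rho on placeholder data: {rho:.2f}")
```

A value of rho close to zero on the real scores would support the claim; a value close to 1 or -1 would contradict it. Rank correlation is used here rather than raw scores because the two indexes are measured on different scales.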

Open data is used as a smokescreen by many European governments to hide their absence of transparency. The decision by the European Commission to answer freedom of information requests on paper only12, the French ‘Higher Authority for Transparency’ that only provides scanned images of manually filled out forms13, the way the British government tried, in 2015, to restrict freedom of information during a law review14 and many more examples show that government transparency is, at the very least, not on the rise.

The French government, which passed an advanced transparency law in 1978, is the best example of the smokescreen strategy. It is taking steps to merge the body responsible for freedom of information requests (known as CADA) with the one responsible for open data (known as AGD). It went so far as to duplicate the legal process for freedom of information requests by setting up a parallel track run by AGD, despite CADA’s legal responsibility for the matter15. In other words, it encourages citizens not to use the track enshrined in law and to use, instead, one set up arbitrarily by the government. Needless to say, a body created by law has more power and legitimacy than one created on a whim by a prime minister.


Infrastructure data is needed, and it should be free. However, it does not imply in the least that the government providing it commits to transparency. The two issues, infrastructure data and transparency, must be addressed independently. If not, open data offers governments a handy excuse to clamp down on existing transparency laws.

Citizens who take article 15 of the Declaration of the Rights of Man and of the Citizen seriously must not fall for the open data trap. Accountability cannot be handed out by the government on a data portal. Accountability comes from the ceaseless exchanges between government and citizens. Only freedom of information ensures that such exchanges are fair and do not overly favor the former.

Special thanks to Anne-Lise Bouyer for her precious feedback.

Notes

1. This is precisely what the ‘opendata’ section of the White House says.

2. I read about it in Open Data for Economic Growth.

3. Give us back our crown jewels was the rallying cry of open data activists in 2006.

4. That’s the little sentence ‘ l’institut peut concevoir et commercialiser, dans le respect des règles de concurrence, tout produit ou service à partir des données recueillies dans le cadre de ses missions de service public’ to be found in Décret n° 2004-1246 du 22 novembre 2004 modifiant le décret n° 81-505 du 12 mai 1981 relatif à l’Institut géographique national.

5. The French experience is enlightening. In 2007, a national plan was launched to encourage administrations to sell as much data as possible - they could keep 100% of the revenue (read the 2008 report of APIE for the details). Just 5 years later, the official policy made a U-turn. In 2013, the prime minister scolded administrations for not opening enough data (read Open data: le rappel à l’ordre du Premier ministre Ayrault aux ministères). Faced with such contradictions, administrations have no incentive to follow the official line.

6. To make the case for open data, some tried to put a price tag on it. However, the evidence of the economic benefits of open data is thin. Most of the reports on the issue take anecdotal evidence and multiply it to come up with unrealistically large amounts. This 2013 review in the United Kingdom, for instance, announces billions in benefits but does not disclose its methodology. The widely quoted 2011 Vickery report only mashes estimates together. Most of these studies quote each other as the basis for their estimates. Some simply create estimates out of thin air, such as this 2013 McKinsey report. Importantly, all of these reviews have been made by consultancies that have a clear interest in pushing for open data - they also sell open data services.

7. It took five years for the OGP to get rid of Azerbaijan. Instead of expelling the country, they declared it ‘inactive’.

8. Take a quick look at Freedom House’s archive on the country to convince yourself.

9. This is the case, for instance, of the French data set on car accidents, where the date of each accident was carefully removed so that no one can check the effectiveness of automated speed controls.

10. See BathHacked.org

11. The indices are OKFN’s index for open data and the RTI ratings for transparency.

12. Read European Commission attempting to block citizens’ requests via AsktheEU.org

13. You have to see it to believe it: hatvp.fr.

14. Read Freedom of Information Act review ‘may curb access to government papers’

15. The Administrateur général des données can be asked for data instead of CADA

