Should Data.gov visualize? Probably not.
Written by: Clay Johnson
Date: 04/23/2009 1:40 p.m.
A few people who saw our Data.gov design post asked for ways to visualize the data on Data.gov. As an organization that advocates not only for data being free but also for using design to give that data context, why don't we advocate for Data.gov to offer visualizations that help citizens make sense of the data?
We didn't leave it out because we failed to think of it. We left it out on purpose, along with lots of other feature ideas and concepts. We think that providing a centralized repository of government data in modern, developer-friendly formats is a hard enough problem for government to solve. Vivek Kundra and his team should be focused on making data available and making it as accessible as possible to people via a small, uniform set of developer-friendly formats (say: JSON, XML, KML and CSV).
We think that building a system for delivering that data in those formats across the federal government and building a mechanism for people to report back errors, integrity issues or simply comment on the data is far more important than building visualization tools on top of this data. Why?
This chart should explain it:
In short: because other people will do that and probably do it better. If the goal is to get the data in front of the most eyeballs possible, government should be providing the data in usable formats and focusing primarily on that. External entities will always give the data more exposure and treatment than government can.
The second reason government should avoid spending time on visualizations or other bells and whistles for Data.gov is that doing so actually hurts transparency. Visualizations, like any other form of news product, can be editorial, even inadvertently. If government puts a higher priority on producing great visualizations and user experience than on providing accurate, high-quality data with a great feedback loop, then it runs a pretty good chance of failing at the goal of being actually transparent.
You can see this on Recovery.gov right now. You get a sense that there's a lot of data underneath, but they've spent a lot of time on user-interface development. Check out, for instance, the agency summary page for the SSA. Looks great! Neat Charts!
The raw data tells a different story though. In this case, the data is powered by spreadsheets available at the bottom of the page. Open one up and you're likely going to see something like this:
So you can see why we're significantly less interested in government "totally nailing" putting bar charts on a website and far more interested in saying "hey, eye on the ball! make the data come out clean, and reliable, and give us a way to tell you when it isn't."
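To make the "tell you when it isn't" idea concrete, here is a minimal Python sketch of the kind of check an outside developer might run on a spreadsheet exported as CSV before reporting problems back. Everything in it is an assumption made for illustration: the function name `find_issues`, the column names, and the sample rows are invented and are not taken from Recovery.gov or Data.gov.

```python
import csv
import io


def find_issues(csv_text, numeric_columns):
    """Return a list of (line_number, message) pairs for rows that look wrong."""
    issues = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return [(1, "file is empty")]
    for row in reader:
        line = reader.line_num
        # A row with the wrong number of fields cannot be matched to the header.
        if len(row) != len(header):
            issues.append((line, f"expected {len(header)} fields, found {len(row)}"))
            continue
        for name, value in zip(header, row):
            if value.strip() == "":
                issues.append((line, f"missing value for '{name}'"))
            elif name in numeric_columns:
                try:
                    float(value.replace(",", "").replace("$", ""))
                except ValueError:
                    issues.append((line, f"'{name}' is not a number: {value!r}"))
    return issues


# Hypothetical sample data, made up for this sketch.
SAMPLE = """agency,program,amount
Example Agency,Program A,1000000
Example Agency,Program B,
Example Agency,Program C,n/a
Example Agency,Program D
"""

for line, message in find_issues(SAMPLE, {"amount"}):
    print(f"line {line}: {message}")
```

A list like this is exactly what a feedback mechanism would need to accept: a pointer to the file, the line, and a plain description of what looks wrong.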
Discussion

I totally agree: it's not dataviz.gov, after all. You made some good points about the interpretation that goes into visualization. As any statistician knows, there is an element of interpretation that goes into the collection of data. But adding another layer of interpretation is something to avoid, I think. Thanks for speaking loudly about this.
In the library world that I work in, an organization called JISC came up with a slogan: "The coolest thing to do with your data will be thought of by someone else." I know I'm preaching to the choir, but dang if some of the people in the choir just don't want to sing :-)
I'm all for providing public access to the raw data, and in a hypothetical scenario where we have to choose between raw or pre-chewed data, raw data wins any day. My company depends on it.
But I don't buy the core argument here that we shouldn't demand both, or that one impedes the other. The government has a responsibility to make its communications comprehensible to the broader public, not just dataviz geeks, and just because we expect and hope that the Fourth Estate will step up doesn't mean the government shouldn't have to try. A government visualization is in no way an obstacle to a newspaper doing a better job. And just because "the coolest thing to do with your data will be thought of by someone else" doesn't mean you shouldn't be expected to think. :)
The government's viewpoint will of course be editorial; any agency analysis is potentially biased, especially when the stakes are high. But the potential for bias doesn't mean the government should communicate only raw data. Even biased analyses are usually rich with content, often content that's not present or obvious in the raw data. (There's a lot of signaling in the editorial choices the presenter makes.)
Finally, not all important data will make economic sense to re-crunch privately. It costs money for the NYTimes and firms like us to do our thing, and we have to pick a few high-value targets; we can't always do it all.
- Peter, Verifiable.com
Looking at Recovery.gov, I think we should applaud their efforts. Just attempting to present the data in a way most people can easily digest is a great step toward transparency in government spending.
Wouldn't it be wonderful if everyday citizens could go directly to the government's site and immediately see, in an easily digestible manner, where their tax money is going or how far along the bills that would affect their futures are? That is the goal, right? (Feel free to disagree.)
I echo Peter's sentiment that government and independent non-government groups are not mutually exclusive. Having the data displayed visually alongside the raw data is the ideal. It's good to remember that no matter what the current administration is or is doing, there will be skeptics who don't believe anything pre-digested or served to them on any type of platter. And all the better for those non-believers that the raw data is provided and that there are watchdogs, skeptics and data geeks to look at things at a granular level.
I think the government putting out its own visualizations raises an important ideal: making visual data displays as adjustable by the viewer as possible. Being able to move the baseline or slice the data in ever finer degrees highlights the fact that the people who make the "lens" can easily change the focus.
What about those citizens who are excited about the availability of datasets but don't have the technical expertise to do something with them? Is it transparency for all, or just for some?