Surge of EPA data in Data.gov

Late yesterday afternoon, Data.gov went from 81 feeds to 261, and the EPA overtook the USGS as the agency providing the most data. The EPA added 180 new data files: the Toxics Release Inventory data for each state and territory, as well as for federal agencies, for 2005, 2006 and 2007.

This data is interesting stuff: dozens of CSV files (still in .exe compressed archives, ick) that speak to where corporations and government are managing toxic chemicals. But it isn't a clear win. These are poorly documented, byte delimited text files. We do have some headers to get us started, but no real description of the actual files.

If you do end up working with this data for your [Apps for America 2: The Data.gov Challenge] entry, make some notes on how you parsed the data and let's create our own documentation for this data source.
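As a starting point for those notes, here is a minimal sketch in Python of how a per-column summary could be generated from one of these files once it has been extracted. Everything in it is an assumption for illustration: the function name `describe_columns`, the comma delimiter, and especially the sample column names and rows, which are made up and are not the actual TRI file layout.

```python
import csv
import io


def describe_columns(text, delimiter=","):
    """Summarize each column of a delimited text file that has a header row.

    Returns one dict per column with its name, the number of non-empty
    values, and the first non-empty value seen (as a sample).
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    summary = {
        name: {"column": name, "non_empty": 0, "sample": None}
        for name in (reader.fieldnames or [])
    }
    for row in reader:
        for name, value in row.items():
            # Skip overflow fields (key None) and short rows (value None).
            if name is None or value is None:
                continue
            value = value.strip()
            if value:
                info = summary[name]
                info["non_empty"] += 1
                if info["sample"] is None:
                    info["sample"] = value
    return list(summary.values())


# Hypothetical sample data, NOT the real TRI schema.
SAMPLE = (
    "FACILITY_NAME,STATE,CHEMICAL,TOTAL_RELEASES\n"
    "Example Plant,VA,Ammonia,120\n"
    "Other Plant,VA,,\n"
)

if __name__ == "__main__":
    for info in describe_columns(SAMPLE):
        print(info["column"], info["non_empty"], info["sample"])
```

A table like this (column name, how often it is filled in, one example value) is a reasonable skeleton for shared documentation. The meaning of each column still has to be worked out and written in by hand.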

Here's a breakdown of the data in Data.gov as of today:


Discussion

  1. Ben Jefferies 06/15/2009 8:26 p.m. (permalink)

    Would the sunlight organization be interested in gov't-produced open source that links together data from different agencies?

  2. Chris Wolz 06/16/2009 9:19 a.m. (permalink)

    Hi Clay -

    Great to see the new data being added to the Data.gov catalog.

    Do you have any indication of what other data sources are slated to be added into Data.gov in the coming weeks?

    The more data sources the better for Apps for Democracy 2, but if we do not know what is going to be added and when, it makes it a bit of a moving target!

    Thanks -

    Chris
