We're Opening America's Government
Join Us
We're a community of open source developers and designers dedicated to opening up our government to make it more transparent, accountable and responsible. We need your help.
Recent Posts
Don't Use Zip Codes Unless You Have To
- Written by
- Tom Lee
- Date
- 01/19/2012 11:19 a.m.
Many of us in the labs found it thrilling to watch the internet community unite around opposition to the SOPA and PIPA bills yesterday. Even more gratifying was seeing how many participating websites used our APIs to help visitors find their elected representatives. This kind of use is exactly why we built those tools, and why we'll always make them freely available to anyone who wants to make government more accessible to its citizens.
Still, I'd be lying if I said we don't occasionally wince when we see someone using our services in a less-than-ideal way. It's completely understandable, mind you: the problem of figuring out who represents a given citizen is tougher than you might think. But we hate to think that anyone is getting bad information about which office to call -- talking to the people who represent you should be simple and easy! Since this comes up with some frequency, it's probably worth talking about the nature of these problems and how to avoid them.
TL;DR: Looking up congressional districts by zip code is inherently problematic. Our latitude/longitude-based API methods are much more accurate, and should be used whenever possible.
The first complication is probably obvious: zip codes and congressional districts aren't the same thing. A zip code can span more than one district (or even more than one state!), so if you want to support zip lookups for your users, you'll have to support cases where more than one matching district is returned. Our API accounts for this, but it's important that your code do so, too. We err on the side of returning inclusive results when a zip might belong to multiple congressional districts.
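To make the "more than one matching district" case concrete, here is a minimal Python sketch of how calling code might treat every zip lookup as returning a list. Nothing here is our actual API: the table, the zip codes, the district labels and the function names (`districts_for_zip`, `describe_lookup`) are all made up for illustration, and a real application would fill the list from whatever lookup service it uses.

```python
from typing import Dict, List, Tuple

District = Tuple[str, str]  # (state abbreviation, district number)

# Hypothetical sample data, invented for illustration only.
SAMPLE_ZIP_TABLE: Dict[str, List[District]] = {
    "00001": [("XX", "1")],               # zip inside a single district
    "00002": [("XX", "1"), ("XX", "2")],  # zip spanning two districts
    "00003": [("XX", "3"), ("YY", "1")],  # zip spanning two states
}


def districts_for_zip(zip_code: str,
                      table: Dict[str, List[District]] = SAMPLE_ZIP_TABLE
                      ) -> List[District]:
    """Return every district that may contain the zip (possibly none)."""
    five_digit = zip_code.strip()[:5]  # ignore any +4 suffix
    return list(table.get(five_digit, []))


def describe_lookup(zip_code: str,
                    table: Dict[str, List[District]] = SAMPLE_ZIP_TABLE
                    ) -> str:
    """Turn a lookup into a message, without assuming a single match."""
    matches = districts_for_zip(zip_code, table)
    if not matches:
        return "No district found; ask for a full address."
    if len(matches) == 1:
        state, number = matches[0]
        return f"Your district is {state}-{number}."
    options = ", ".join(f"{state}-{number}" for state, number in matches)
    return f"This zip spans several districts ({options}); ask for a full address."


if __name__ == "__main__":
    for z in ("00001", "00002", "00003", "99999"):
        print(z, "->", describe_lookup(z))
```

The point of the design is simply that the ambiguous case gets its own branch: rather than silently picking the first district, the code asks the user for more information.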
Unfortunately, things are actually more complicated than that. Most people don't realize it, but zip codes describe postal delivery routes -- the actual routes that mail carriers travel -- not geographically bounded areas. Zip codes are lines, in other words, while congressional districts are polygons. This means that mapping zips to congressional districts is an inherently imperfect process. The government uses something called a zip code tabulation area (ZCTA) to approximate the geographic footprint of a given zip as a polygon, and this is what we use to map zip codes to congressional districts. But it really is just an approximation -- it's far from perfect.
It's much better to skip the zip code step entirely and simply look up your location against the congressional district shapefiles published by the Census Bureau using a precise geographic coordinate pair instead of a hazy, vague zip code. Thanks to the Chicago Tribune News App Team's excellent Boundary Service project, we offer exactly this capability. If you can, we strongly encourage you to get a precise latitude/longitude pair from your users (either by geolocating them or geocoding their full address), then use it to determine their representatives.
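For readers curious what "looking up a location against a polygon" involves, here is a rough Python sketch using the classic ray-casting point-in-polygon test. It is not how the Boundary Service or our API is implemented; the two square "districts" and the names `point_in_polygon` and `district_for_point` are hypothetical. Real district shapes come from shapefiles and can have multiple parts and holes, so a production system would normally lean on a spatial database or geometry library instead.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray heading east crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > y) != (y2 > y):
            # Longitude at which this edge crosses the point's latitude.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical districts: two adjacent unit squares, invented for illustration.
SAMPLE_DISTRICTS: Dict[str, List[Point]] = {
    "XX-1": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "XX-2": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}


def district_for_point(point: Point,
                       districts: Dict[str, List[Point]] = SAMPLE_DISTRICTS
                       ) -> Optional[str]:
    """Return the first district whose polygon contains the point, if any."""
    for name, polygon in districts.items():
        if point_in_polygon(point, polygon):
            return name
    return None


if __name__ == "__main__":
    print(district_for_point((0.5, 0.5)))
    print(district_for_point((1.5, 0.5)))
    print(district_for_point((3.0, 3.0)))
```

Unlike a zip lookup, a coordinate away from a boundary falls in exactly one polygon, which is why this approach avoids the multiple-district ambiguity.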
"But what about house.gov's ZIP+4 congressional lookup tool?" I hear you asking. It's true, many House offices use this tool to determine who your representative is (and whether you're allowed to email them). Unfortunately, just because this tool is on an official site doesn't mean it's perfect. Here in the Labs, Kaitlin (who lives in Maryland) can't write her representative because the ZIP+4 tool gives incorrect results. Besides, not that many people know their full nine-digit ZIP+4 code.
So if you can, use latitude/longitude pairs. If you can't, and have to depend on zips, we'll supply results that are very, very good -- but not as good as real coordinates would allow.
Broadcasters' Public Files Should Be Published Online (and it's absurd that we're even having this conversation)
- Written by
- Tom Lee
- Date
- 01/17/2012 12:11 p.m.
Luigi passed along a couple of links to a great/infuriating On the Media segment about the new rules the FCC is considering related to the online disclosure of political ad purchases.
To run through the issue quickly: every broadcast station is required to keep a "public file" of paper records related to campaign ad purchases. These records show basic information about how an ad was purchased, who bought it and when it aired. As the name implies, the file is available for public inspection, but only if you show up at the station and ask for it.
The FCC has proposed a rule that would require the public file to be posted online. We feel that this is an obvious and overdue step, and have submitted comments to the rulemaking saying as much. After all, it's 2012--it's absurd to claim that information is "public" if it isn't also online. And this information is particularly important: with Citizens United enabling a new flood of money into our political system--with less accountability!--keeping track of the ways in which wealth is deployed to move political opinion is more important than ever. The public file is a vital source of this kind of information.
The first OTM segment, which features Steven Waldman, does a good job of explaining all of this. The second one mostly just makes your blood boil. In it, Jack Goodman, a lobbyist for the National Association of Broadcasters, makes the case that posting the public file online would represent an onerous burden on broadcast stations.
Clearly, this is nonsense. As Waldman notes, Goodman is claiming that his would be "the first industry to use the internet to become less efficient." I've seen what the public file looks like. Yeah, there's a bunch of stuff in there, but obviously not too much to fax to the FCC once a day (or, preferably, enter into a modern electronic record-keeping system--perhaps one supplied by the FCC--instead of continuing to record everything on paper like it's 1970).
But forget for a moment how ridiculous Goodman's argument is. Consider how outrageous it is that he's even making it. This is one of the underappreciated pathologies that lobbying produces. If you're an organization like the NAB and you have a staff lobbyist, whenever an issue comes along--however minor--your lobbyist can be counted on to make a fuss about it. That's what they're paid to do, right? Here we have a disclosure burden that is basically the bureaucratic equivalent of your office manager announcing that expense reports have to be filed using a webform. Yet for some reason we're now having a national conversation about it.
It's absolutely dumbfounding to have an effort to make money in politics more transparent weighed against someone not wanting to use the fax machine. And yet here we are. That's the magic of the lobbying industry.
The FEC's New Mobile Site Could Use Some Work
- Written by
- Tom Lee
- Date
- 01/03/2012 5:10 p.m.
Last Friday the Federal Election Commission announced the launch of a new mobile interface. You should try it for yourself at http://fec.gov/mobile/. The site declares itself to be a beta, which I suspect you'll agree is something of an understatement.
Let's call a spade a spade: there's no use pretending this is good. To begin with, there are obvious superficial problems: graphs lack units, graphics have been resized in a lossy way, and the damn thing doesn't work on most Android devices.
Worse, there are substantive errors. Look at Herman Cain's cash on hand. Why are debts listed as a share of positive assets? Look at the Bachmann campaign's receipts. Why is "total contributions"--which should reflect the entire pie--just a slice? (It's not 50% because other slices seem to have incorrectly counted overlap, too.) Why don't any of the line items below the graphs reflect the fact that some are components of others?
We asked the FEC for comment, but so far they've declined. Once the powers that be over there have a closer look, I'm confident they'll agree that the mobile site is a mess.
It's hard to know what to say about all of this. Part of Sunlight's mission is to encourage government agencies to embrace technology more fully. We don't want to send mixed messages by jumping down their throats when they actually try to do so. Sure, we gave FAPIIS a hard time, but that was because the site's creators were obviously and deliberately undermining the idea of public oversight. By contrast, I don't think anyone who worked on the FEC Mobile site intended to do a bad job.
And of course there's a fundamental question. Obviously the bits that are relaying incorrect information are a problem. But assuming those get fixed, is a half-hearted attempt like this better than nothing? I suppose there might be some poor, twisted soul who will enjoy listening to FEC meeting audio while they're at the gym (though frankly, if such a person existed I suspect they'd already be working here). But as a general matter it's difficult to imagine anyone needing a mobile interface to a set of campaign finance data that's as narrowly conceived as this one.
To their credit, it doesn't seem as if this mobile interface was created at the expense of the organization's much more important responsibility to publish data--a mission that, by and large, the FEC fulfills ably and with steadily increasing sophistication. There's always room for improvement, but the truly pressing needs, like reliable identifiers for contributors and meaningful enforcement of campaign finance law, are beyond the reach of the organization's technical staff.
Still, it's a bit amazing to see obviously wrong numbers attached to a product that Chairperson Bauerly has been quoted as endorsing appreciatively. Among those of us concerned about America's campaign finance system and the effect it has on our democracy, there is a sense that the FEC's leadership does not take its mission particularly seriously. The release of shoddy work like this mobile site does little to dispel that impression.
The data behind Capitol words
- Written by
- Dan
- Date
- 12/21/2011 10:06 a.m.
Last Monday we launched an update to our Capitol Words project, which indexes and tokenizes the Congressional Record daily. With the launch behind us and the dust starting to settle, I'd like to walk through how we get from raw text to attributed, searchable quotations, and provide some examples of how you can interact with the data directly.
Before delving into how it works, though, it's important to acknowledge the myriad developers whose work on this project has made it possible. I'm only the most recent steward of the site; the bulk of the data legwork for this iteration was handled by Aaron Bycoffe and Jessy Kate Schingler, and the web interface owes its beauty to Caitlin Weber and Ali Felski. Timball provided the hardware, and the list continues from contributions to the scrapers all the way back to the original conception and implementation of the idea by Josh Ruihley and Garrett Schure. It's the combined efforts of everyone involved that brought us the site that's available today.
Now, without further ado...
House Approves Sweeping Open Data Standards
- Written by
- Eric
- Date
- 12/19/2011 1:24 p.m.
At a Friday hearing, the House of Representatives significantly raised the bar on open data by passing a resolution requiring that a wide variety of crucial House legislative information be published online, in open formats, and at permanent predictable URLs. Daniel Schuman covered this on the Sunlight Foundation blog on Friday.
The new standards create a new central website, run by the Clerk of the House, that will host all House bills, resolutions, amendments, and conference reports. These documents will be online on January 1, 2012, and will be in XML.
Beyond that, the standards require committees to post their amendments, votes, hearing notices, which bills and resolutions they're considering, and lots of other documents. The Clerk is charged with building tools for committees to post this information to the new website; in the meantime, committees must post them to their own website, in PDF. Committees are also encouraged to post this information in XML, and "should expect XML formats to become mandatory in the future".
This is hugely valuable information that, to date, has been extremely difficult to discover in a reliable way. To get House legislation, one either needs to scrape THOMAS.gov (a Sisyphean ordeal), or to rely on the good work of people who've already done it. Committee information is terribly fragmented, and in some cases (such as committee votes and amendments) there is no way to get it at all, short of hiring people to go sit in committee rooms and record what goes on (a practice that forms the basis for a number of business models here in DC). This is the beginning of bringing much needed order to chaos, and sunlight to the legislative process.
These standards demonstrate excellent leadership on the part of the House, and offer a modern vision for how a legislative body should view its responsibilities to the public. The Senate should hear the sound of a gauntlet being thrown. The Committee's action is in keeping with Speaker Boehner's and Majority Leader Cantor's April call for the House Clerk to release legislative data in machine readable formats. It is very gratifying to see this call taken so seriously.