We're Opening America's Government
Join Us
We're a community of open source developers and designers dedicated to opening up our government to make it more transparent, accountable and responsible. We need your help.
Recent Posts
The Tech Behind TransparencyCamp
- Written by
- Jeremy
- Date
- 05/10/2012 10:28 a.m.
TransparencyCamp, Sunlight's open government unconference, is one of the few chances the Labs gets each year to go crazy with tech. Our goal is to use technology to enhance the conference experience and set the expectation for the type of "maker" culture we have here at Sunlight. Read on to find out about some of the technology that makes TransparencyCamp run.
Web and Mobile Sites

TransparencyCamp is something of a hybrid unconference. A small number of sessions are planned in advance and we try to keep the session board as-is once it is initially set. One reason for this is that we have the sessions listed on the web site, mobile web app and screens at the venue. We just don't have the resources to constantly monitor the board for changes and reflect them in all of the other places sessions are listed.
When someone submits a session, it is manually entered into the TCamp database and a physical print-out is placed on the schedule board. The TransparencyCamp codebase includes an undocumented API that provides feeds of all upcoming sessions as well as the full conference schedule. The backend service also pulls in tweets from Twitter that match event-related hashtags and messages from the official TCampDC account.
The mobile app is an HTML-based site that has been tested on both iOS and Android devices. The app is built on a long-outdated version of Backbone.js and pulls sessions, tweets and photos from the TransparencyCamp API. The social feeds are updated every minute so attendees can watch the stream as it happens.
Etherpad
Each year at TCamp we want to provide a way for attendees to take notes during sessions. Last year we turned session pages into mini-wikis where users could click to edit the page to add notes. The usage, as we mostly expected, was disappointing. The user would have to click edit, make their changes, hope someone else hadn't saved other changes in the meantime and then hit save. While not the most laborious process ever created, it was enough of a barrier to keep people from participating in note taking.
We've had great success with an internal instance of Etherpad here at Sunlight so Eric suggested we incorporate it into the TransparencyCamp site. If you are not familiar with it, Etherpad is a collaborative document editor much like Google Docs. We used the embeddable view and slapped a collaboratively editable document right onto each session page. Attendees could then immediately take notes without clicking around and without worrying about clobbering other people's changes.
We found that many more people participated in note taking, and those who did had nothing but great things to say about the experience. Etherpad really hit the sweet spot of collaboration that a wiki just couldn't reach.
Optimizing Registration
It's the little things that count. Most people, when setting up a 4-lane registration table, would just divide last names by first letter into four even groups. But what if there isn't an even distribution of last names? Andrew saw this potential inefficiency and sprang into action.
Armed with our list of registrants, he calculated the frequency of the first letter of the last names of attendees. The frequency results were then fed into a script that iterated through the possible partitions of the alphabet into four contiguous letter ranges, selecting the partition that minimized the standard deviation of the percentage of attendees falling into each range.
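For the curious, here is a minimal Python sketch of how such a search could look. It is not Andrew's actual script: the function names, the use of the population standard deviation, the assumption that each lane covers one contiguous run of letters and the sample names are all ours.

```python
from collections import Counter
from itertools import combinations
from statistics import pstdev
import string

LETTERS = string.ascii_uppercase


def letter_shares(last_names):
    """Share of registrants whose last name starts with each letter, A through Z."""
    initials = [name.strip()[:1].upper() for name in last_names]
    counts = Counter(initial for initial in initials if initial and initial in LETTERS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no last names starting with A-Z")
    return [counts[letter] / total for letter in LETTERS]


def best_partition(shares, lanes=4):
    """Try every way to cut A-Z into `lanes` contiguous ranges and keep the most even one."""
    n = len(shares)
    best_bounds, best_spread = None, float("inf")
    for cuts in combinations(range(1, n), lanes - 1):
        bounds = (0, *cuts, n)
        lane_shares = [sum(shares[a:b]) for a, b in zip(bounds, bounds[1:])]
        spread = pstdev(lane_shares)  # lower means more evenly loaded lanes
        if spread < best_spread:
            best_bounds, best_spread = bounds, spread
    ranges = [(LETTERS[a], LETTERS[b - 1]) for a, b in zip(best_bounds, best_bounds[1:])]
    return ranges, best_spread


# Made-up registrant list, for illustration only.
names = ["Adams", "Alvarez", "Brown", "Chen", "Garcia", "Lee", "Miller", "Smith", "Stone", "Wu"]
lanes, spread = best_partition(letter_shares(names))
print(lanes, round(spread, 4))
```

With four lanes there are only 2,300 ways to place three cuts between 26 letters, so brute force is plenty fast.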
Nicko approves of the optimized registration lanes. Photo by stereogab.
Photo Booth

While not necessarily new, Tim set up another instance of our Sunlight Photo Booth. It's really just an iMac running our web-based Photo Booth software, but use your imagination here. The HTML user interface communicates with the backend over a WebSocket connection, which invokes isightcapture to take each photo. A Python script then uses PIL to add a Lomo-esque effect to each photo and combine them into a single strip. The generated strip is then returned to the web-based UI and uploaded to Flickr, and a QR code is displayed that links to the photo strip's Flickr page. Whew!
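The strip-building step can be sketched roughly like this. This is not the Photo Booth's real code: it assumes Pillow (the maintained fork of PIL), the function names and border size are ours, and a simple contrast and saturation boost stands in for whatever the actual Lomo-esque effect does. The capture, WebSocket, Flickr and QR code steps are left out.

```python
from PIL import Image, ImageEnhance


def lomo_ish(photo):
    """Rough stand-in for a Lomo-style look: boost contrast, then saturation."""
    photo = ImageEnhance.Contrast(photo).enhance(1.3)
    return ImageEnhance.Color(photo).enhance(1.4)


def build_strip(photos, border=20, background="white"):
    """Stack equally sized photos vertically, with a border around each one."""
    width, height = photos[0].size
    strip_size = (width + 2 * border, len(photos) * (height + border) + border)
    strip = Image.new("RGB", strip_size, background)
    for i, photo in enumerate(photos):
        top = border + i * (height + border)
        strip.paste(lomo_ish(photo.convert("RGB")), (border, top))
    return strip


# Solid-color placeholders stand in for the captured photos.
photos = [Image.new("RGB", (320, 240), color) for color in ("red", "green", "blue", "yellow")]
strip = build_strip(photos)
print(strip.size)
```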
2013?
We've already been discussing ideas for new tech at TransparencyCamp 2013: our own registration and payment processing system, wall-crawling robots to scan the schedule, RFID implants (badges, not people… okay, maybe people) and more. What will make the cut? Stay tuned!
Labs Update: May 2012
- Written by
- Jeremy
- Date
- 05/09/2012 10:13 a.m.
Like a phoenix rising from its ashes on a monthly basis, it's Labs Update time!
TransparencyCamp 2012
It may be cliché to say, but TransparencyCamp 2012 was the best TCamp ever! GROUP HUG! We doubled attendance from last year with over 400 attendees from 26 US states and 27 countries. Anything I write here won't do the awesomeness of the event any justice so just watch the video:
TransparencyCamp wouldn't have been possible without the effort and expertise of the entire Sunlight Foundation staff, but I want to highlight the work of our newest designer, Amy Cesal. Event branding was her first task here in the Labs and I think it's pretty clear that she knocked it out of the park. Great work, Amy!

Open State Project
Sunlight Boston got the chance to spend a week with us at the DC headquarters during TransparencyCamp. It was great having them in the office, even if Paul is a tab zealot.
Paul and James have done a lot of work on the API side of the project, implementing full-text search and enhanced event support as it relates to committees and bills. Thom has been focused on getting the public site closer to launch, working on the new design with Ali and refining news/blog aggregation.
James also released a new version of scrapelib. The update features FTP and retry support, optionally obeying robots.txt and a pluggable caching layer. scrapelib is now based on requests, Kenneth Reitz's ubiquitous HTTP library.
Influence Explorer
It's non-stop data with the Influence Explorer team. Ethan worked to add Super PAC and independent expenditure sections on profiles. Alison processed updates to Contractor Misconduct data from POGO. Andrew did more work on the new regulatory filings section, which is planned to launch sometime in July.
Scout

Eric recently launched an open beta of Scout, an alert system for the things you care about in state and national government. It covers Congress, regulations across the whole executive branch and legislation in all 50 states.
You can read more details about the project in Eric's launch blog post, but here is a quick rundown:
- notifications via email, SMS, RSS and JSON
- searching for keywords and phrases in bills, speeches and regulations
- detailed activity on specific bills
Scout is yet another new Sunlight project that is built almost exclusively on our public API services including Open States, Capitol Words and Real Time Congress.
Team Journalism
Ryan investigated the exciting and fast-paced world of tariff suspensions for a piece she wrote on the miscellaneous tariff bill process.
Lee has been running a grade-level analysis of congressional speeches, whose grade level has been declining over the last seven years. The piece, which I hope scores higher than congressional speeches, should be published within the next week or two.
Jacob crunched third quarter independent expenditure numbers after monthly and quarterly filers posted results this month, began work on a Party Time redesign that will take place this summer and threw together a real-time FEC filing system monitor.
Open Source
Now that we are up to 186 open source projects on GitHub, I figure it's about time we feature the best of what we've got. Newly released projects include:
- citation is a JavaScript library for extracting US Code citations from blocks of text. Eric has also provided citation-api, a small node.js wrapper to provide citation as a service.
- bill-nicknames is a project to crowdsource popular names for bills. The goal is to map popular-but-unofficial names like 'Obamacare' to the official bills to which they refer.
- oyster is a service for tracking regularly-accessed pages. It will cache pages that are frequently scraped, downloading new versions when page content changes.
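To give a feel for the kind of extraction citation does, here is a toy Python sketch. It is not the library's implementation (citation itself is JavaScript): the pattern and function name are ours, and it only handles the plain "title U.S.C. section" form.

```python
import re

# Matches forms such as "5 U.S.C. 552(a)(2)" or "42 USC § 1983".
USC_PATTERN = re.compile(
    r"\b(?P<title>\d+)\s+U\.?\s?S\.?\s?C\.?\s+(?:§+\s*)?"
    r"(?P<section>\d+[A-Za-z]*(?:\([0-9A-Za-z]+\))*)"
)


def find_usc_citations(text):
    """Return (title, section) pairs for each US Code citation found in text."""
    return [(int(m.group("title")), m.group("section")) for m in USC_PATTERN.finditer(text)]


sample = "Requests under 5 U.S.C. 552(a)(2) and suits under 42 USC § 1983 are common."
print(find_usc_citations(sample))
```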
Tidbits
- Our pals at Cubox are working to get DataJam ready for public use.
- Daniel has been working on tools for the manual collection of political ad buy files at TV stations around the country.
- Drew and Kaitlin have been working on SuperFastMatch and related tools, including a browser extension.
- Ali and team have been designing for a bunch of projects including the new Open States public site, Sunlight Academy, the Sunlight Foundation redesign, Scout and Party Time.
- Dan crunched numbers for a bunch of stories based on Capitol Words and has been looking into new technologies and data sets to be included in the project.
- Tom has been helping to manage the third Knight app's progress, working on some new project proposals and desperately clawing his way out of a huge pile of email that accumulated during TCamp.
- A Sunlight Olympics hack, but not the one mentioned in the post, has grown into a full project! We'll have more details next month and an announcement at Personal Democracy Forum in June.
- May's album of the month is Threads by Now Now. I'm sure some of my coworkers may disagree, but they have no say in this post… so there!
Scout, in Open Beta
- Written by
- Eric Mill
- Date
- 04/24/2012 3:12 p.m.
We're opening a new tool to the public today for beta testing, called Scout.
Scout is an alert system for the things you care about in state and national government. It covers Congress, regulations across the whole executive branch, and legislation in all 50 states.
You can set up notifications for new things that match keyword searches. Or, if you find a particular bill you want to keep up with, we can notify you whenever anything interesting happens to it -- or is about to.
Just to emphasize, this is a beta - it functions well and looks good, but we're really hoping to hear from the community on how we can make it stronger. You can give us feedback by using the Feedback link at the top of the site, or by writing directly to scout@sunlightfoundation.com.
Shouldn't Robots Be Doing My Taxes By Now?
- Written by
- Tom Lee
- Date
- 04/17/2012 10:39 a.m.
It's Tax Day, and if you're a software developer, I'll bet you find it as mystifying as I do. Not the actual tax preparation (mine are still pleasantly straightforward, I'm happy to say), but the general awfulness of the experience. Why am I responsible for collecting PDFs (or worse, paper) from a half-dozen institutions, then manually reentering that data? Why am I paying a vendor $50 for what amounts to some unit tests and an electronic transaction or two?
It makes no sense. Government uses technology for a lot of things, and some of those things are very hard [insert requisite reference to the Apollo Program here]. But filling out forms is not a hard thing. In fact, it's one of the problems that web technology has tackled first and most comprehensively. The first thing you learn in most web frameworks is how to make forms! It's hard to think of any other part of the government's mission that affects so many people negatively and could so easily and obviously be improved by better technology.
The IRS is trying to make progress on this score, of course. E-Filing has been with us since 1986. And they seem excited about the new version of their IRS2Go mobile app. But why on earth would I want a mobile app to help me find the IRS's YouTube channel?
Here's a better idea: instead of assuming I want to learn more about how to do my taxes, why not make it so that I can afford to know less about the process? Five minutes in a text editor tells me that my W-2 can be represented in less than 300 bytes -- a fraction of a QR code's capacity. How about promulgating some data standards that would make it easier for me to digitize all those 1099-INTs saying that I earned thirty cents on a checking account? Surely TurboTax or H&R Block would be willing to create some mobile apps that let me input my information by scanning a matrix barcode with my phone.
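As a back-of-the-envelope check on that size claim, here is a small Python sketch. The field names, the pipe-delimited layout and every value are placeholders of ours, not a real data standard or a real W-2; the capacity figure is the byte-mode maximum for the largest QR code (version 40) at the lowest error-correction level.

```python
# All values below are made-up placeholders, not real taxpayer data.
W2_FIELDS = {
    "employee_ssn": "000-00-0000",
    "employer_ein": "00-0000000",
    "employer_name": "Example Employer Inc",
    "employee_name": "Pat Example",
    "wages": "52000.00",
    "federal_tax_withheld": "6100.00",
    "ss_wages": "52000.00",
    "ss_tax_withheld": "3224.00",
    "medicare_wages": "52000.00",
    "medicare_tax_withheld": "754.00",
    "state": "DC",
    "state_wages": "52000.00",
    "state_tax_withheld": "2900.00",
}

QR_MAX_BYTES = 2953  # version 40, error-correction level L, byte mode


def encode_w2(fields):
    """Join the values in a fixed field order (hypothetical format; values must not contain '|')."""
    return "|".join(fields.values()).encode("utf-8")


payload = encode_w2(W2_FIELDS)
print(len(payload), "bytes,", f"{len(payload) / QR_MAX_BYTES:.1%} of QR capacity")
```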
Better yet: since the agency is already receiving that data from all those financial institutions through a separate stream, how about organizing the data for me and simply letting me sign off on my automatically-generated return? I suspect that a lot of people would like that, given that the alternative is spending a spring day doing paperwork.
Naturally, this is not an original idea. As you'll see in these fine pieces from United Republic and the New York Times, many people feel that lobbying by firms like Intuit (the makers of TurboTax) has stopped efforts to make filing your taxes less unbearable.
Is this a case of malign influence peddling to prop up an industry that should be partially automated away, or is it just another example of government technology badly lagging behind that of the private sector? Whatever the case might be, here's hoping something changes soon. The fact that we're still doing our taxes this way is ridiculous.
Data for Better Bill Searching
- Written by
- Eric Mill
- Date
- 04/10/2012 11:34 a.m.
I've put up a dataset on GitHub that maps popular search terms to bills in Congress. It's a simple, 5-column CSV designed to help people create better search engines that take in user input to search for bills. The idea is that this will be useful to, and get contributions from, the community of people out there working with legislation and building tools around it.
It's humble - I started it out with a mere 7 rows, assigning the keywords "Obamacare", "SOPA", "PIPA", and "PPACA" to the appropriate bills. There are certainly more good candidates than that, so please contribute via pull request, or if you don't know how to do that, open an issue and talk about it with words.
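Here is a rough Python sketch of how a search tool might use a file like this. The column names and the sample rows are illustrative guesses, not the dataset's actual header or contents, so check the repository before relying on them.

```python
import csv
import io

# Hypothetical layout: five columns, but these names are our own guesses.
SAMPLE_CSV = """term,bill_type,bill_number,congress,note
obamacare,hr,3590,111,Patient Protection and Affordable Care Act
ppaca,hr,3590,111,Patient Protection and Affordable Care Act
sopa,hr,3261,112,Stop Online Piracy Act
pipa,s,968,112,PROTECT IP Act
"""


def load_nicknames(csv_text):
    """Build a lowercase term -> list of (bill_type, number, congress) index."""
    index = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        bill = (row["bill_type"], int(row["bill_number"]), int(row["congress"]))
        index.setdefault(row["term"].strip().lower(), []).append(bill)
    return index


def bills_for_query(index, query):
    """Return the bills whose nickname appears as a word in the user's query."""
    return [bill for word in query.lower().split() for bill in index.get(word, [])]


index = load_nicknames(SAMPLE_CSV)
print(bills_for_query(index, "What is Obamacare"))
```

A search engine could run a lookup like this first and fall back to ordinary full-text search when nothing matches.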