ryanwold.net

A civic-minded citizen seeking the singularity

National Data Catalog

Date: 2012-01-01
Status: draft
Tags: api catalog data ruby transparency

UPDATE: Jan 1, 2012

The National Data Catalog was retired by Sunlight Labs some time ago. Please check out other data catalogs like CKAN and offerings from CivicCommons.

This summer, I'll be working on the National Data Catalog with Sunlight Labs, a division of the Sunlight Foundation.


The National Data Catalog is designed to be "an open platform for government data sets and APIs. It makes it easy to find datasets by and about government, across all levels (federal, state, and local) and across all branches (executive, legislative, and judicial)."

The purpose of this post is to outline how to get the code up and running so you can contribute to the development of the open source platform. Getting the code set up locally takes a bit of work, but is certainly doable. It starts with cloning 3 repositories from GitHub to your local machine, followed by these steps:

  1. Get an API key from the National Data Catalog API's site.
  2. Install gems and their dependencies.
  3. Configure the several .yml config files located in /config.

    rake db:create                  # creates the development database
    rake db:migrate
    RAILS_ENV=test rake db:create
    RAILS_ENV=test rake db:migrate
    rake test                       # all tests should pass
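
Before running the rake tasks above, it can help to confirm that the .yml files in /config are at least well-formed YAML, since a stray tab or unclosed bracket tends to surface later as a confusing database error. The following is a minimal sketch, not part of the National Data Catalog code; the helper name `broken_yaml_files` is made up for illustration, and it only checks syntax, not whether the settings are correct.

```ruby
require "yaml"

# Returns a hash of { file path => error message } for every .yml file
# directly inside config_dir that fails to parse as YAML.
# An empty hash means every file parsed cleanly.
# (Hypothetical helper for illustration; not part of the project.)
def broken_yaml_files(config_dir)
  errors = {}
  Dir.glob(File.join(config_dir, "*.yml")).sort.each do |path|
    begin
      # parse_file checks syntax only; it does not build Ruby objects,
      # so YAML aliases and custom tags do not cause failures here.
      YAML.parse_file(path)
    rescue Psych::SyntaxError => e
      errors[path] = e.message
    end
  end
  errors
end

# Example usage from the application root:
#   broken_yaml_files("config").each { |path, msg| warn "#{path}: #{msg}" }
```

An empty result only means the files parse; the values inside them still need to match your local database setup.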

Developer References

With the basics installed, it is time to actually start using the catalog, which means running the "Importers". The importers also require the Data Importer (http://github.com/sunlightlabs/datacatalog-importer). I started with the DataSF.org importer (http://github.com/sunlightlabs/datacatalog-imp-datasf). Each importer also needs a /config.yml file containing an API key. The key for a specific importer comes from the data-catalog's /config/users.yml file: paste it into that importer's /config.yml file.
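
Pasting the key by hand works fine, but the step could also be scripted. The sketch below is only an illustration: the helper name `set_api_key` is invented, and the field name `api_key` is an assumption, since this post does not show the layout of the importer's /config.yml. Check the importer's own example config for the real field name before relying on it.

```ruby
require "yaml"

# Writes api_key into the YAML file at config_path under the given field
# and returns the updated settings hash. Existing settings are preserved.
# ASSUMPTION: the field name "api_key" is a guess; the importer's real
# config may use a different name. Pass it as the third argument.
def set_api_key(config_path, api_key, field = "api_key")
  settings = File.exist?(config_path) ? YAML.load_file(config_path) : {}
  # An empty or non-mapping file is treated as having no settings.
  settings = {} unless settings.is_a?(Hash)
  settings[field] = api_key
  File.write(config_path, settings.to_yaml)
  settings
end

# Example usage (the key value here is a placeholder):
#   set_api_key("config.yml", "paste-your-key-here")
```

Note that rewriting the file this way drops any comments it contained, which is one reason editing it by hand may still be preferable.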

Then, from within the specific importer, run the following commands:

    IMPORTER_ENV=local
    rake pull
    rake push
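
If you run the importer often, the pull and push steps can be wrapped in a small Ruby helper. This is a rough sketch, assuming IMPORTER_ENV is meant to be visible to both rake tasks and that a failed pull should stop the push; the method name `run_importer` and its `runner` parameter are invented here for illustration and are not part of the importer code.

```ruby
# Runs `rake pull` and then `rake push` with IMPORTER_ENV set for both.
# Returns true only if both tasks succeed; push is skipped if pull fails.
# The `runner` parameter is a hypothetical hook so the logic can be
# exercised without invoking rake; by default it uses Kernel#system.
def run_importer(importer_env = "local", runner: nil)
  runner ||= ->(env, *cmd) { system(env, *cmd) }
  env = { "IMPORTER_ENV" => importer_env }
  # all? stops at the first task that returns false or nil.
  %w[pull push].all? do |task|
    runner.call(env, "rake", task)
  end
end

# Example usage from the importer's directory:
#   run_importer("local") or abort("import failed")
```

Passing the environment as a hash to `system` sets the variable for the child process only, so nothing needs to be exported in your shell.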