Data Sources


The internet is full of data that is ripe for analysis, but sometimes information is hard to find, or is published in a form that requires a frustrating amount of transformation.

The increased availability of data mining techniques and web scraping tools is making this process a lot easier. But why waste time putting together your own dataset when content creators are increasingly publishing their raw data, and sometimes even releasing APIs?

This page lists some of the handy data sources that we’ve come across, but please post a comment at the bottom of the page or send us a message if there are any other sources that you think should be included.

International

  • United Nations: http://data.un.org
      The site belongs to the UN Statistics Division (UNSD) and contains a huge range of UN statistical databases, although some of them haven’t been updated since 2008. Data is available as a direct download or through the API on the site.
    • Food and Agriculture Organization of the United Nations: http://faostat3.fao.org
          Although the FAO is part of the UN, it has a separate site with more up-to-date data (and it’s got a pleasing bootstrap-style aesthetic). The site presents a lot more detail about the FAO’s own analysis, methods and standards. The entire database is available as a single download, as individual datasets, through the API, or as a package in R:
install.packages("FAOSTAT") # install
library(FAOSTAT) # load
vignette(topic = "FAOSTAT") # introductory document
demo(topic = "FAOSTATdemo") # Demo of the FAOSTAT package
  • The World Bank: http://data.worldbank.org
      The site contains data about development in countries around the globe, covering everything from agriculture, climate, education and finance to health and the Millennium Development Goals. Data is available as a direct download or through the API on the site.
  • Open Oil: http://openoil.net
      Open Oil publishes a lot of info about international oil contracts, some of which would otherwise be very difficult to find. Data is available through their web map or through their API.
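
Several of the sources above offer an API as well as direct downloads. As a rough illustration of what API access can look like, here is a minimal Python sketch for the World Bank. The base URL, the `SP.POP.TOTL` indicator code and the `[metadata, rows]` response layout are assumptions based on the World Bank's public Indicators API and are not taken from this page, and the function names are made up for the example, so check the API documentation on the site before relying on any of it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the World Bank Indicators API (v2); not given on this page.
BASE_URL = "https://api.worldbank.org/v2"


def build_indicator_url(country, indicator, per_page=100):
    """Build a request URL for one indicator in one country, asking for JSON."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def parse_indicator_response(payload):
    """Turn the assumed [metadata, rows] payload into {date: value}.

    Rows without a value are skipped. Anything that does not match the
    assumed layout (for example an error message) yields an empty dict.
    """
    if not isinstance(payload, list) or len(payload) < 2 or payload[1] is None:
        return {}
    return {
        row["date"]: row["value"]
        for row in payload[1]
        if row.get("value") is not None
    }


def fetch_indicator(country, indicator):
    """Download and parse one indicator series (needs network access)."""
    with urlopen(build_indicator_url(country, indicator)) as response:
        return parse_indicator_response(json.load(response))
```

Calling `fetch_indicator("AU", "SP.POP.TOTL")` should then return Australia's total population keyed by year, assuming the API behaves as described. Keeping the parsing separate from the download means the parser can be checked against a saved response without touching the network.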

Australian

  • GovHack: https://www.govhack.org/2015-data/
      Each year the team at GovHack sets up a competition for teams to participate in a data mashup event. As part of the competition they publish a loooooong and very comprehensive list of government data sets (Commonwealth, state and regional). This is basically your one-stop shop.
