Across disciplines, from healthcare and justice to poverty and housing, many projects benefit greatly from the administrative datasets that the government makes available. Our project seeks to clean and combine two of these datasets and then build an easily accessible, open-source, cloud-based platform that social scientists can use to analyze the data. This is a joint effort between Urban's Housing Finance Policy Center and Urban's Data Science and Technology team.

This project is funded by the Alfred P. Sloan Foundation.

Linked Data

The initial phase of this project explores data collection and sampling from public sources that offer robust information about mortgages, people, and places. We have standardized and linked key variables over time from two government data sources: Home Mortgage Disclosure Act (HMDA) data and the Census Bureau's American Community Survey (ACS).
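As a rough illustration of what linking two sources on shared geography and time can look like, here is a minimal sketch in plain Python. The field names (`state_fips`, `year`, `loan_count`, `households`) and all values are hypothetical toy examples, not the variables or figures in the actual database; the code used to generate the real data is linked further down this page.

```python
# Toy records only: field names and values are illustrative assumptions,
# not taken from the real HMDA or ACS files.
hmda_rows = [
    {"state_fips": "48", "year": 2015, "loan_count": 100},
    {"state_fips": "48", "year": 2016, "loan_count": 120},
    {"state_fips": "12", "year": 2015, "loan_count": 80},
]
acs_rows = [
    {"state_fips": "48", "year": 2015, "households": 1000},
    {"state_fips": "12", "year": 2015, "households": 700},
]


def link_on_geography_and_year(left, right, keys=("state_fips", "year")):
    """Inner-join two lists of dicts on the given key fields."""
    # Index the right-hand rows by their key tuple for constant-time lookup.
    index = {tuple(row[k] for k in keys): row for row in right}
    linked = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            # Combine the fields of both rows into one linked record.
            linked.append({**match, **row})
    return linked


linked = link_on_geography_and_year(hmda_rows, acs_rows)
```

Only rows whose geography and year appear in both inputs are kept, so the 2016 toy record drops out because it has no ACS counterpart in this example.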


Urban ADRF 101 Slides

Download Data Dictionary


Citation: Urban Institute Sloan ADRF Database. Retrieved from http://adrf.urban.org. 2017.


Linked HMDA and ACS Database

Code Used to Generate the Data


State-level Data

Metro-level Data (CBSA)

County-level Data

PUMA-level Data

Zip Code-level Data

Tract-level Data


Geographic Crosswalks Used to Create the Data

Crosswalk data sourced from: Missouri Census Data Center, MABLE/Geocorr2k and MABLE/Geocorr14, Version 1.0: Geographic Correspondence Engine. Web application accessed August 2017 at: http://mcdc.missouri.edu/websas/geocorr14.html
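For readers unfamiliar with geographic crosswalks, the sketch below shows one common way such a file can be applied: each source area's count is split across target areas according to an allocation weight. This is an assumption about the general technique, not a description of the exact procedure used for this database. The column names (`zip`, `county`, `afact`) and all values are hypothetical toy examples.

```python
from collections import defaultdict

# Toy crosswalk: each row gives the share of a source area (zip)
# that falls within a target area (county). Values are made up.
crosswalk = [
    {"zip": "A", "county": "X", "afact": 0.75},
    {"zip": "A", "county": "Y", "afact": 0.25},
    {"zip": "B", "county": "Y", "afact": 1.0},
]
zip_counts = {"A": 200, "B": 50}


def reallocate(counts, crosswalk, source="zip", target="county", weight="afact"):
    """Distribute source-area counts to target areas using allocation weights."""
    totals = defaultdict(float)
    for row in crosswalk:
        # Source areas missing from `counts` contribute nothing.
        totals[row[target]] += counts.get(row[source], 0) * row[weight]
    return dict(totals)


county_counts = reallocate(zip_counts, crosswalk)
```

When the weights for each source area sum to 1, the overall total is preserved: the 250 units spread across the toy zip codes still sum to 250 across the toy counties.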



Sample Research Products


Housing Profile of Areas Affected by Hurricane Harvey
Bing Bai, Sarah Strochak, Bhargavi Ganesh 
October 27, 2017

Housing Profile of Areas Affected by Hurricane Irma
Bing Bai, Sarah Strochak, Bhargavi Ganesh 
October 27, 2017

Housing Affordability: Local and National Perspectives
Laurie Goodman, Wei Li, Jun Zhu 
March 28, 2018

Is Limited English Proficiency a Barrier to Homeownership?
Edward Golding, Laurie Goodman, Sarah Strochak
March 26, 2018

Spark for Social Science

Urban has developed an elastic and powerful approach to analyzing massive datasets using Amazon Web Services’ Elastic MapReduce (EMR) and the Spark framework for distributed memory and processing. For tutorials, and to use Spark to analyze the linked HMDA and ACS datasets we’ve created, visit our project on GitHub.


Spark for Social Science on GitHub


Contact Us

If you’d like to give us feedback on the datasets or the Spark platform, please fill out the form below.