Social Sector Data Sources
The Rustandy Center for Social Sector Innovation provides researchers with access to data and database administrative support. We can also help Chicago Booth researchers secure new datasets.
Below is a list of nonprofit sector data sources accessible for academic use. Due to license agreements, some data is restricted to the Booth and University of Chicago communities.
Access Our Resources
Bloomberg’s Environmental, Social & Governance dataset offers ESG metrics and ESG disclosure scores for more than 11,500 companies in over 80 countries. The dataset includes as-reported data and derived ratios as well as sector- and country-specific data points. Historical data is available from 2006.
Booth faculty, MBA and PhD students, and alumni may access the data via Bloomberg stations in Harper Center and Gleacher Center computing labs. For questions about accessing the data stations, email Booth’s Information Technology staff.
View the Bloomberg ESG User Guide
In addition, the Rustandy Center can support faculty and PhD research by executing customized data pulls. For an overview of the Bloomberg ESG data and acquisition inquiries, contact the Rustandy Center Research Team.
COVID-19 Cases by Zip Code
The COVID-19 Cases by Zip Code dataset lists the number of COVID-19 cases at the zip code level. It includes the zip codes for which data were made publicly available via government-sponsored sources.
COVID-19 Stay At Home Orders
The COVID-19 Stay At Home Orders dataset contains a summary of shelter-in-place, stay-at-home, and other orders at the state, city, and county levels. Additionally, it contains information on school closures, public-health declarations, and other COVID-19-related actions at all levels.
COVID-19 Business Regulations
The COVID-19 Business Regulations dataset contains a summary of business regulations related to COVID-19 at the state and city levels. Data fields captured focus on the rollout of restrictions stipulating which types of businesses are allowed to operate.
For more information, contact the Rustandy Center Research Team.
Foundation Directory Online (FDO) by Candid is an online database that profiles grant makers and their grant recipients. Profiles on grant makers include searchable 990s or 990-PFs, their total giving and assets, funding interests, officers and trustees, and the number and size of grants awarded.
The FDO also offers profiles on grant recipients, including their sector, geographic and population focus areas, and the number of grants and grant dollars received. Grant recipient profiles may also include 990 forms. Historical data is available dating back to 2003, and the database is updated on an ongoing basis.
For more information, contact the Rustandy Center Research Team.
The Entrepreneurship Database program at Emory University works in partnership with the Aspen Network of Development Entrepreneurs as part of the recently formed Global Accelerator Learning Initiative. Its aim is to collaborate with a growing number of accelerator programs around the world to collect and collate comparable longitudinal data that describes early-stage entrepreneurs and their ventures.
GuideStar by Candid is a large source of information on nonprofit organizations. GuideStar gathers and disseminates information about IRS-registered nonprofit organizations, and provides information about each nonprofit’s mission, finances, programs, transparency, governance, and more. GuideStar provides complimentary access to a mix of digitized and nondigitized data. Booth offers access to the following proprietary, digitized datasets:
- Tax Form 990s from 2003 through 2019
- Tax Form 990EZs from 2003 through 2017
- Tax Form 990PFs from 2014 through 2017
This dataset is available for academic research only by Chicago Booth and University of Chicago faculty, PhD students, and staff. For more information, contact the Rustandy Center Research Team.
Data are also available from the Internal Revenue Service Form 990s filed by a sample of 394 501(c)(3) organizations from 2004 to 2010 (unbalanced panel). The nonprofits were sampled from the set of nonprofits rated by Charity Navigator, and were chosen to obtain variation within the sample in the date on which each nonprofit's first rating was published. Access is available for Booth faculty and students. Request access by emailing the Fama-Miller Center.
Homebase offers a comprehensive national employment dataset sourced from a free employee scheduling and time-tracking tool utilized by over 100,000 local businesses. The participating businesses are primarily owner-and-operator managed and belong to the restaurant, food and beverage, retail, and service industries. Examples of the data content include hours worked, locations open, employees on duty, and type of business.
National Center for Charitable Statistics is a clearinghouse for data on the nonprofit sector in the United States. Working closely with the IRS, other government agencies, nonprofits, and the scholarly community, NCCS builds compatible national, state, and regional databases, and develops uniform standards for reporting on the activities of charitable organizations.
The Protest Response Orders dataset lists recorded orders and, in some cases, press releases related to the Black Lives Matter movement, protests, and racial equality and equity. It covers the period from the death of George Floyd on May 25, 2020, through August 2020. Data are recorded for all 50 states and a limited set of counties and cities.
For more information, contact the Rustandy Center Research Team.
Loans and Deposits
Since 1997, have surveyed more than 100,000 financial institution locations in the United States to compile advertised depository interest rates (weekly) and loan interest rates (monthly). The dataset covers a large cross section of all branches and depository institutions, sampling a variety of types and sizes. Institutions include banks, credit unions, savings and loan companies, brokers, trust companies, and others. It includes full-service and limited-service facilities and offices, be it brick-and-mortar offices, cyber offices, or home/phone banking.
Surveyed interest rates include a large number of standardized deposit and loan products such as checking and savings accounts; auto loans; certificates of deposit (CDs) of different sizes and maturities; home-equity and mortgage loans of different sizes, characteristics, and maturities; and other commercial lending products. Within each category, details are available for multiple terms and/or dollar tiers. The Loans and Deposits data from 2001 until September 2019 are available.
This data is available for academic research only by Chicago Booth faculty, PhD students, and staff. For more information, contact the Rustandy Center Research Team.
Service Charges and Fees
Since 2000, have compiled service charges and fees from nearly 100,000 financial institution locations across the United States on a weekly basis. The dataset covers a large cross section of all branches and depository institutions, sampling a variety of types and sizes, and as of 2013, the dataset covers more than 50 percent of institutions in the United States. Institutions include banks, credit unions, savings and loan companies, brokers, trust companies, and others. It also includes full-service and limited-service facilities and offices, be it brick-and-mortar offices, cyber offices, or home/phone banking. The Service Charges and Fees dataset includes retail, cash management, and loan fee information nationwide.
The dataset includes over 60 product types across both personal and commercial banking, including interest checking, online cash management, domestic wire transfers, safe deposit boxes, corporate and business interest checking, and others. There are also more than 190 product subcategories, including information on loan terms, bill pay, loan fee, and monthly charges. The dataset also includes details on reporting institutions, including, but not limited to, their certification number, FDIC unique number, Federal Reserve ID, routing number, contact information, MSA, and longitude and latitude for mapping purposes. Fees are reported in percentages and in dollars. The Service Charges and Fees data from 2001 until February 2020 are available.
This data is available for academic research only by Chicago Booth faculty, PhD students, and staff. For more information, contact the Rustandy Center Research Team.
Refinitiv, an LSEG business, is a major provider of financial market data and infrastructure, helping wealth advisors maintain a competitive edge by identifying opportunities and gaining insights through AI, analytics, data feeds, and workflow tools. Refinitiv's data catalog includes a wide range of content, including machine-readable, real-time, reference, pricing, and time series data. It also offers various datasets that cover company data, news, research, and Environmental, Social, and Governance (ESG) information.
Since 2002, Refinitiv has offered one of the most comprehensive ESG databases in the industry, consisting of more than 630 ESG metrics that span over 80 percent of the global market cap. Reflecting the underlying ESG data framework, Refinitiv ESG scores are transparent, data-driven assessments of companies’ relative ESG performance and capacity that integrate and account for industry materiality and company size biases.
Booth faculty, MBA, and PhD students may access the data via Refinitiv Workspace and WRDS. See Booth’s or the for additional access information.
As a distinctive dataset, RepRisk tracks business conduct risks, controversial activities, and risk incidents. Since 2007, RepRisk has annually monitored data for more than 200,000 listed and non-listed companies around the globe and across all sectors. Unlike other datasets, RepRisk does not rely on self-disclosed information. Instead, it uses an event-driven approach, screening public media sources in 23 languages to identify risks related to 28 ESG issues. These issues are selected based on key international standards such as the Equator Principles, IFC Performance Standards, ILO Conventions, OECD Guidelines for Multinational Enterprises, and World Bank Group Environmental, Health, and Safety Guidelines.
In addition to the core ESG issues, RepRisk covers 73 ESG “hot topics,” which extend and complement the main ESG issues by providing detailed, theme-based analyses. This adaptable approach has evolved to incorporate new trends and client feedback. It is important to note that RepRisk does not verify or validate risk incident allegations; rather, it serves as a critical source of transparent information that systematically identifies and evaluates risk incidents based on established methodologies.
Chicago Booth faculty, PhD students, and staff researchers can access the RepRisk data through WRDS. For more information or to access the data, visit the Booth or the . For specific inquiries, please contact the Rustandy Center Research Team.
The 2017 Corporate Social Responsibility (CSR) Metrics is a dataset constructed by the Rustandy Center for Social Sector Innovation using CSR reports from S&P 500 companies. The digitized dataset includes all the S&P 500 companies (as of 2017) and covers their performance on a wide range of CSR metrics:
• CSR Reporting: 1) Overall, 2) Standards, and 3) Goals.
• Social Metrics: 1) Diversity, 2) Safety, 3) Community Engagement, and 4) Suppliers.
• Environmental Metrics: 1) Greenhouse Gas (GHG), 2) Energy, 3) Water, 4) Waste, and 5) Accidents and Fines.
The dataset is available for academic research by Chicago Booth and University of Chicago faculty, PhD students, and staff. For more information, contact the Rustandy Center Research Team.
Sustainalytics is an independent source for environmental, social, and governance (ESG) and corporate governance research and ratings. This dataset contains ESG data for firms worldwide, with more than 4,000 firms included in the dataset since 2013.
Sustainalytics provides overall ESG scores as well as constituent scores at the indicator level, of which there are 70 core and industry-specific indicators. Each company in the dataset is classified into one of 42 industry peer groups. Sustainalytics factors in varying degrees of materiality and exposure to risks in their ratings and assesses firms on the basis of their preparedness, disclosure, and quantitative and qualitative performance.
Chicago Booth faculty, PhD students, and staff researchers can access the data through WRDS. See Booth’s or the for additional access information.
The database consists of three parts:
1) ESG Scores data
ESG Scores data provides one aggregated score, three dimension-level scores and more than 30 criteria-level scores, produced using the S&P Global Corporate Sustainability Assessment process and other sources.
2) Trucost Environmental data
Trucost Environmental data measures companies’ environmental impact across key dimensions. This data can be used to assess the environmental costs, identify and manage environmental and climate risk, and conduct peer and portfolio analyses from a climate and environmental perspective.
3) Climate Analytics data
Climate Analytics data provides company-level data, which measures companies’ exposure to seven climate-change physical risks, analyzes companies’ exposure to carbon pricing risk under different possible climate change scenarios, and evaluates companies’ alignment with the Paris Agreement goal.
Chicago Booth faculty, PhD students, and staff researchers can access the data through WRDS. See Booth’s or the for additional access information.
The University of Chicago Library offers access to a variety of sources for researching nonprofit organizations and charitable giving, including industry overviews, data and statistics, and directories of organizations.
Learn More about Our Data Sources
For more information on social sector data sources, contact the Rustandy Center's Research Team. For more social sector data resources, go to . (Please note: the research computing site is only available to those with access to Booth’s intranet.)