How to Find Data & Statistics: Data vs. Statistics: Finding Data

Reused and adapted with permission from the LibGuide of Hailey Mooney of Michigan State University Libraries at http://libguides.lib.msu.edu/datastats

Introduction to Finding Data

Start by defining your topic

Be specific about your topic so that you can narrow your search, but be flexible enough to tailor your needs to existing sources.

Identify the Unit of Analysis

This is what you should be able to define:

#1 - Who or What?

Social Unit: This is the population that you want to study.
It can be...

  • People
    For example: individuals, couples, households
  • Organizations and Institutions
    For example: companies, political parties, nation states
  • Commodities and Things
    For example: crops, automobiles, arrests

#2 - When?

Time: This is the period of time you want to study.
Things to think about...

  • Point in time
    A "snapshot" or one-time study
  • Time Series
    Study changes over time
  • Current information
    Keep in mind that there is usually a time lag before data is published. The most current information available may be a couple of years old.
  • Historical information

#3 - Where?

Space: Geography or place.
There are two main types of geographic classifications...

  • Political boundaries
    For example: nation, state, county, school district, etc.
  • Statistical/census geography
    For example: metropolitan statistical areas, tracts, block groups, etc.

Remember to define your topic with enough flexibility to adapt to available data!
Data is not available for every conceivable topic. Some data is hidden (behind a paywall, for example), was never collected, or is otherwise unavailable. Be prepared to try alternative data.
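
As a rough illustration, the three parts of the unit of analysis can be written down as a small record before you start searching. This is only a sketch: the `DataNeed` class, its field names, and the `search_terms` helper are hypothetical and are not part of any library tool or catalog.

```python
from dataclasses import dataclass


@dataclass
class DataNeed:
    """Hypothetical record of the three parts of a unit of analysis."""
    social_unit: str   # Who or What? For example: "households"
    time_period: str   # When? For example: "2000-2010"
    geography: str     # Where? For example: "county"

    def search_terms(self, topic: str) -> list[str]:
        # Combine the topic with each dimension, plus the keyword "data",
        # to form a starting point for a catalog or archive search.
        return [topic, self.social_unit, self.geography, self.time_period, "data"]


# Example values are made up for illustration.
need = DataNeed(social_unit="households", time_period="2000-2010", geography="county")
terms = need.search_terms("income")
print(" ".join(terms))
```

Writing the need down this way makes it easier to see which part (who, when, or where) you can relax when the exact data you want does not exist.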

Search Strategies

Search Strategy #1: Search in a Data Archive

Look within a data archive that collects within the general subject area that you are searching for.

  • Data Repositories (Open Access Directory)
    Open data repositories from multiple academic disciplines.
  • Databib
    "A collaboration between the Purdue University Libraries and Penn State University to create a community-driven, annotated bibliography of research data repositories."

 


Search Strategy #2: Identify Potential Producers

Ask yourself: Who might collect and publish this type of data?

Then visit the organization’s website and see if you're right! Or, search for them as an author in the library catalog.

These are some of the main types of data producers:

Government Agencies

The government collects data to aid in policy decisions and is the largest producer of data overall. For example, the U.S. Census Bureau, Federal Election Commission, Federal Highway Administration, and many other agencies collect and publish data. To better understand the structure of government agencies, read the U.S. Government Manual and browse FedStats. Government data is generally free and publicly available, but access may require library resources or a special request.

Non-Government Organizations

Many independent non-commercial and nonprofit organizations collect and publish data that supports their social platform. For example, the International Monetary Fund, United Nations, World Health Organization, and many others collect and publish data. For more information about NGOs, visit Duke Libraries NGO Research Guide. Data from NGOs may be free or fee-based. The library subscribes to many NGO data resources, so be sure to check the library’s e-resources pages or catalog.

Academic Institutions

Academic research projects funded by public and private foundations create a wealth of data. For example, the Michigan State of the State Survey, Panel Study of Income Dynamics, American National Election Studies, and many other research projects collect and publish data. Much of this type of data is free and publicly available, but may require access through library resources. Access to smaller original research projects may be dependent upon contacting individual researchers.

Private Sector

Commercial firms collect and publish data as a paid service to clients or to sell broadly. Examples include marketing firms, pollsters, trade organizations, and business information providers. This information is almost always fee-based and may not be available for public release. The library does subscribe to some commercial data services, particularly through the business library.

 


Search Strategy #3: Turn to the literature

Search for research studies based on secondary analysis of publicly available data sets.

Unfortunately, citation of research data is often incomplete. Sometimes the best you will get is the title of the data set used, but check to see if the data or a related publication is cited and follow it up. Don't make the same mistake when you publish: cite your data.

Data Archive Bibliographies

  • ICPSR Bibliography of Data-Related Literature
    "A continuously-updated database of thousands of citations of works using data held in the ICPSR archive. The works include journal articles, books, book chapters, government and agency reports, working papers, dissertations, conference papers, meeting presentations, unpublished manuscripts, magazine and newspaper articles, and audiovisual materials."

Library Indexes

Library Catalog

  • Use the ISU Catalog interfaces (Classic Catalog or Fusion Catalog) as part of your literature review to find books on your topic that may cite relevant data providers or for books of statistical tables to identify sources of data. The library also has some data sets on CD-ROM. Try adding keywords such as “data” or “statistics” to your search. To expand your search to include other libraries, look in WorldCat (request outside materials through Interlibrary Loan).

Books on Research Methods


Search Strategy #4: Statistics lead to Data

Search for statistics and follow them to the source.

Try the search strategies for statistics detailed on the "Finding Statistics" tab of this guide.  Where does the statistic you find come from?  Can you track it down to the source survey or other data set?

 


 

Recap: Access to Data Sets

Depending on which search strategy you used, you may have already found the dataset file download link directly on a website.  Or, you may have just a reference/citation to a dataset or producer.  Here are some common ways to find the dataset files themselves.

  • Government agencies and universities will often post dataset files directly on their websites.
  • Check to see if the dataset has been archived in ICPSR or another topical data archive.
  • The library has many datasets on CD-ROM, especially in the Government Documents collection.  Search the library catalog for the study title.
  • Contact the data producer directly.

Evaluate Data


Once you’ve chosen a data set that you believe will work, evaluate it carefully. Is it appropriate? Does it come from an authoritative source? Does it fit your needs? Does it cover your Where, When, and Who or What requirements? Are you willing to compromise your requirements or manipulate the data to fit your needs? Always read the documentation and codebook to ensure that the analysis you are planning really measures what you want it to.
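
The Where, When, and Who or What check can also be sketched as a simple comparison between what you need and what a candidate data set covers. This is a minimal sketch under stated assumptions: the `Coverage` class and the `unmet_requirements` function are hypothetical names, the example values are made up, and a real evaluation still depends on reading the documentation and codebook.

```python
from dataclasses import dataclass


@dataclass
class Coverage:
    """Hypothetical summary of what a study needs or what a data set covers."""
    social_unit: str   # Who or What?
    start_year: int    # When? (first year)
    end_year: int      # When? (last year)
    geography: str     # Where?


def unmet_requirements(need: Coverage, dataset: Coverage) -> list[str]:
    """Return the requirements that the data set does not cover."""
    problems = []
    if dataset.social_unit != need.social_unit:
        problems.append("Who or What")
    # The data set must span the whole period of interest.
    if dataset.start_year > need.start_year or dataset.end_year < need.end_year:
        problems.append("When")
    if dataset.geography != need.geography:
        problems.append("Where")
    return problems


# Made-up example: the candidate starts too late and uses a different geography.
need = Coverage("households", 2000, 2010, "county")
candidate = Coverage("households", 2005, 2010, "state")
gaps = unmet_requirements(need, candidate)
print(gaps)
```

Any requirement that comes back in the list is a point where you would need to compromise, manipulate the data, or look for an alternative source.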

Analyze Data

You may contact the ISU Math & Writing Center for help with data analysis. Or you may ask your professor for assistance. 

Tutorials

This is a short list of tutorials for learning more about the technicalities of secondary data manipulation and codebooks.

Cite Data

Be sure to provide a proper citation for any data that you use.  The How to Cite Data research guide will help you determine how to format your citation.