Data Sources

These are free Internet sources for real data and datasets available for public use. Great for use by Statistics teachers, by students, in the classroom, for projects, or for additional exploration. You may care to read some of my comments about these sources first. Please make sure to cite data appropriately. Feel free to email me if you have corrections, suggestions, or kind comments.

Arts and Entertainment
Education
  • The Chronicle of Higher Education – data on higher ed institutions, students, and staff; some with interactive features
  • College Board – AP, SAT, PSAT/NMSQT data
  • College InSight – higher education data inc. student debt, financial aid, cost of attendance, economic diversity, racial diversity, enrollment, and student success.
  • College Scorecard Data – from the U.S. Department of Education; comprehensive and well-documented data from higher ed institutions for the past 20 years inc. financial aid, earnings info, performance of schools, student outcomes, etc. 
  • PayScale – Salary Data & Career Research Center (U.S.); searchable database of salary info by job title, degree/major, employer name or type, state, city, years of experience, etc. Available for other countries as well.
  • See NYC and NYS education data links below
Health, Medicine, and Biological Sciences
  • American Cancer Society – facts and statistics reports give global, national, and state info on cancer inc. cases, deaths, risk factors, prevention, cancer screening, treatment, breast cancer, colorectal cancer, and statistics specific to African Americans and Hispanics/Latinos. Bonus: Infographics Gallery.
  • Centers for Disease Control and Prevention – CDC data organized by topic; inc. vital statistics, state data, tools, and resources
  • Columbia Prediction of Infectious Diseases – CPID from Columbia University has compiled up-to-date Ebola data for Guinea, Liberia, Sierra Leone (as well as by province) from the World Health Organization and included three interactive predictive models for infection and mortality numbers: an optimized scenario, a no-change scenario, and a degraded scenario. Data is conveniently organized in tabular form. Models and maps for influenza incidence as well.
  • Cornell Bird Data – From Cornell's Lab of Ornithology; crowd-sourced bird data including bird species abundance, distribution, movements, breeding, population trends, and spread of diseases. Additional education guides; info about data collection, bird songs, feathers, and plumages.
  • – From the U.S. Dept. of Health and Human Services (in beta); data and datasets from various federal and state sources inc. the CDC, NIH, and CMS. Filters for topic, publisher, and format.
  • National Cancer Institute Fast Stats – From the NCI Surveillance, Epidemiology, and End Results (SEER) Program via the NIH, "Fast Stats is an interactive tool for quick access to key SEER and US cancer statistics for major cancer sites by age, sex, race/ethnicity and data type. Statistics are presented as graphs and tables."
  • Nutritionix – Large nutrition database with information on common foods, packaged products, and restaurant items. 
  • Shark Attack Data – shark/human interaction data and graphs compiled from the Shark Research Institute; sortable by country, U.S. state, provocation status, and fatality status; incident details available; 1900 to present. Note: although very interesting, these data are not in a format that can be easily analyzed without compiling first.
  • World Health Organization – WHO data; some interactive graphs available
  • USDA National Nutrient Database for Standard Reference – food nutrient information
  • See NYC health data links below
  • – lots of information from this Pearson Education site: everything from a word of the day to calendars to biographies to maps to general summary statistics. Lots of quantitative and qualitative info about business, economy, geography, sports, medicine, science, U.S. history and government, world history, technology, the 50 U.S. States, and more.
New York and Federal
  • Citi Bike – data from NYC Citi Bike inc. trip histories, daily ridership and membership, real-time data, and monthly operating reports.
  • – U.S. government open data site for federal, state, and local data; 100,000+ datasets searchable by topic and keyword; resources and tools available for data visualizations
  • NYC Department of Education – NYCDOE data inc. school reports, graduation rates, and test results
  • NYC Department of Health and Mental Hygiene – inc. vital statistics, HIV/AIDS data
  • New York State Education Department – NYSED public data inc. enrollment, teacher performance review ratings, and school report cards
  • NYC Open Data – NYC government open data site; 1300+ data sets, maps, and charts from various New York City agencies; to be more complete by 2018.
  • New York Police Department – NYPD site with current and historical crime data, traffic data, crime statistics, reports, and information.
Social Sciences
  • Bureau of Labor Statistics – BLS databases, tables, and calculators organized by topic
  • Bureau of Justice Statistics – BJS databases and tables organized by topic inc. courts, corrections, crime, and law enforcement data
  • Census Bureau – US census data tables (inc. those from the American Community Survey) organized by topic and geography; tools, infographics, and some visualizations available.
  • Encyclopedia Titanica – data and information from the RMS Titanic and its 1912 sinking: deckplans, passenger and crew lists; statistics grouped by passenger class, boarding location, nationality, gender, age group, and survival status.
  • Federal Bureau of Investigation – FBI statistics site inc. data on hate crimes, terrorism, white-collar crimes, and campus crimes
  • General Social Survey – GSS data via the University of Chicago; since 1972, data collected regarding American people and society. Explore GSS questions, variables, and publications by subject, year, or keyword; downloadable in various formats.  
  • Gun Violence Archive – a not-for-profit group that provides free online public access to accurate information about gun-related violence in the United States. They claim to check for accuracy and provide some information about gun-related incidents.
  • Inside Airbnb – data sourced from Airbnb listings in major world cities on neighborhood locations, listing prices, room types, availability, etc. Interactive dataviz available via the site for each city, neighborhood, etc.
  • Pew Research Center – downloadable datasets from nonpartisan public opinion polls, demographics, and social science research; some interactive visualizations available
  • Tax Foundation – non-profit think tank; includes federal and state data on tax rates, revenue, and burdens; interactive tools and calculators; discussion of international taxes as well.
  • United Nations – UN searchable data on a wide variety of topics
  • US Mass Shootings, 1982-2018 – Data From Mother Jones’ Investigation. Available in multiple formats with variables including date, location, number of fatalities, type of weapon, etc.
  • World Bank – international data organized by country, topic, and indicator
Sports
  • Advanced Football Analytics – NFL team and player stats; team, player, salary, and game analysis; game probabilities; playoff projections; stats glossary; calculators, visualization, and other tools; discussions about home field advantage, correlations, football fallacies, and more.
  • Baseball Prospectus – baseball data galore; data organized by offense, pitching, general, team, manager, and splits categories. 1871 to present.
  • Major League Baseball (MLB) – lots o' baseball data inc. pitching, hitting, batting, by team, by player, etc. 1876 to present.
  • Major League Soccer (MLS) – U.S. soccer data inc. team, player, leader, and all-time stats.
  • National Basketball Association – NBA data; team stats, player stats, lineups, etc.
  • National Collegiate Athletic Association (NCAA) – men's and women's college sports stats searchable by sport, student-athlete, and team; records data, championship summaries, and championship results.
  • National Football League (NFL) – American football data; player stats, team stats, player info, history, rules, etc.
  • Sports Reference – Links to baseball, pro basketball, pro football, college basketball, college football, hockey, and Olympic sports reference sites. Excellent for summary and micro data.
  • USA Today Sports – data for various professional and college sports inc. scores, schedules, player stats, team stats, salaries, odds, standings, and more.
Weather and Physical Sciences
  • Climate Stations – historical temperature and precipitation data for various cities
  • MySound – From UConn; real-time weather, water quality, and wave data for Long Island Sound.
  • National Aeronautics and Space Administration (NASA) – Catalog of aerospace, applied science, earth science, management/operations, and space science data.
  • National Oceanic and Atmospheric Administration – NOAA's National Climatic Data Center inc. marine/ocean, satellite, radar, climate, and weather data.
  • Roller Coaster Database – world roller coaster information inc. name, park, location, operating status, number of inversions, speed, height, length, layout shape, manufacturer, and more.
  • US Geological Survey – USGS data searchable by geographic location, keyword, USGS mission area, data source, and scientist.
Data Libraries and Archives – These types of sources contain datasets that may be good for teaching but may not be suitable for projects and assignments for which you wish students to produce work that is original in nature.
Minor updates made 09/09/2018.