Awesome Public Datasets
This is a list of high-quality, topic-centric public data sources, collected and tidied from blogs, answers, and user responses. Most of the data sets listed below are free, but some are not. This project was incubated at OMNILab, Shanghai Jiao Tong University, during Xiaming Chen's Ph.D. studies. OMNILab is now part of the BaiYuLan Open AI community. Other awesome lists can be found in sindresorhus's awesome list.
NOTICE: This repo is automatically generated by apd-core. Please DO NOT modify this file directly. We have provided a new way to contribute to this repo. Join the Slack community for prompt updates on high-quality data.
Table of Contents
- Agriculture
- Architecture
- Biology
- Chemistry
- Climate+Weather
- ComplexNetworks
- ComputerNetworks
- CyberSecurity
- DataChallenges
- EarthScience
- Economics
- Education
- Energy
- Entertainment
- Finance
- GIS
- Government
- Healthcare
- ImageProcessing
- MachineLearning
- Museums
- NaturalLanguage
- Neuroscience
- Physics
- ProstateCancer
- Psychology+Cognition
- PublicDomains
- SearchEngines
- SocialNetworks
- SocialSciences
- Software
- Sports
- TimeSeries
- Transportation
- eSports
- Complementary Collections
Agriculture
Architecture
- Swiss Apartment Models - This dataset contains detailed data on 42,207 apartments (242,257 [...] [Meta]
Biology
Chemistry
Climate+Weather
ComplexNetworks
ComputerNetworks
CyberSecurity
DataChallenges
EarthScience
Economics
Education
Energy
Entertainment
Finance
GIS
Government
- Alberta, Province of Canada [Meta]
- Antwerp, Belgium [Meta]
- Argentina (non official) [Meta]
- Datos Argentina - Open data portal of the Argentine Republic. Find public data [...] [Meta]
- Austin, TX, US [Meta]
- Australia (abs.gov.au) [Meta]
- Australia (data.gov.au) [Meta]
- Austria (data.gv.at) [Meta]
- Baton Rouge, LA, US [Meta]
- Beersheba, Israel - Open Data Portal (Smart7 OpenData) [Meta]
- Belgium [Meta]
- City of Berkeley Open Data [Meta]
- Brazil [Meta]
- Buenos Aires, Argentina [Meta]
- Calgary, AB, Canada [Meta]
- Cambridge, MA, US [Meta]
- Canada [Meta]
- Chicago [Meta]
- Chile [Meta]
- China [Meta]
- Dallas Open Data [Meta]
- DataBC - data from the Province of British Columbia [Meta]
- Debt to the Penny - The Debt to the Penny dataset provides information about the total [...] [Meta]
- Denver Open Data [Meta]
- Durham, NC Open Data [Meta]
- Edmonton, AB, Canada [Meta]
- England LGInform [Meta]
- EuroStat [Meta]
- EveryPolitician - Ongoing project collating and sharing data on every politician. [Meta]
- Federal Committee on Statistical Methodology (FCSM) (formerly FedStats) [Meta]
- Finland [Meta]
- France [Meta]
- Fredericton, NB, Canada [Meta]
- Gatineau, QC, Canada [Meta]
- Germany [Meta]
- Ghent, Belgium [Meta]
- Glasgow, Scotland, UK [Meta]
- Greece [Meta]
- Guardian world governments [Meta]
- Halifax, NS, Canada [Meta]
- Helsinki Region, Finland [Meta]
- Hong Kong, China [Meta]
- Houston, TX, US [Meta]
- Indian Government Data [Meta]
- Indonesian Data Portal [Meta]
- Iowa - Welcome to the State of Iowa's data portal. Please explore data about Iowa and your [...] [Meta]
- Ireland's Open Data Portal [Meta]
- Israel's Open Data Portal [Meta]
- Istanbul Municipality Open Data Portal [Meta]
- Italy - The dati.gov.it portal is the national catalog of metadata relating to data [...] [Meta]
- Jail deaths in America - The U.S. government does not release jail by jail mortality data, [...] [Meta]
- Japan [Meta]
- Laval, QC, Canada [Meta]
- Lexington, KY [Meta]
- London Datastore, UK [Meta]
- London, ON, Canada [Meta]
- Los Angeles Open Data [Meta]
- Luxembourg - Luxembourgish Open Data Portal [Meta]
- MassGIS, Massachusetts, U.S. [Meta]
- Metropolitan Transportation Commission (MTC), California, US [Meta]
- Mexico [Meta]
- Mississauga, ON, Canada [Meta]
- Moldova [Meta]
- Moncton, NB, Canada [Meta]
- Montreal, QC, Canada [Meta]
- Mountain View, California, US (GIS) [Meta]
- NYC Open Data [Meta]
- NYC betanyc [Meta]
- Netherlands [Meta]
- New York Department of Sanitation Monthly Tonnage - DSNY Monthly Tonnage Data provides [...] [Meta]
- New Zealand [Meta]
- OECD [Meta]
- Oakland, California, US [Meta]
- Oklahoma [Meta]
- Open Data for Africa [Meta]
- Open Government Data (OGD) Platform India [Meta]
- OpenDataSoft's list of 1,600 open data [Meta]
- Oregon [Meta]
- Ottawa, ON, Canada [Meta]
- Palo Alto, California, US [Meta]
- OpenDataPhilly - OpenDataPhilly is a catalog of open data in the Philadelphia region. In [...] [Meta]
- Portland, Oregon [Meta]
- Portugal - Pordata organization [Meta]
- Puerto Rico Government [Meta]
- Quebec City, QC, Canada [Meta]
- Quebec Province of Canada [Meta]
- Regina SK, Canada [Meta]
- Rio de Janeiro, Brazil [Meta]
- Romania [Meta]
- Russia [Meta]
- San Diego, CA [Meta]
- San Antonio, TX - Community Information Now - CI:Now is a nonprofit serving Bexar (San [...] [Meta]
- San Francisco Data sets [Meta]
- San Jose, California, US [Meta]
- San Mateo County, California, US [Meta]
- Saskatchewan, Province of Canada [Meta]
- Seattle [Meta]
- Singapore Government Data [Meta]
- South Africa Trade Statistics [Meta]
- South Africa [Meta]
- State of Utah, US [Meta]
- Switzerland [Meta]
- Taiwan gov [Meta]
- Taiwan [Meta]
- Tel-Aviv Open Data [Meta]
- Texas Open Data [Meta]
- The World Bank [Meta]
- Toronto, ON, Canada [Meta]
- Tunisia [Meta]
- U.K. Government Data [Meta]
- U.S. American Community Survey [Meta]
- U.S. CDC Public Health datasets [Meta]
- U.S. Census Bureau [Meta]
- U.S. Department of Housing and Urban Development (HUD) [Meta]
- U.S. Federal Government Agencies [Meta]
- U.S. Federal Government Data Catalog [Meta]
- U.S. Food and Drug Administration (FDA) [Meta]
- U.S. National Center for Education Statistics (NCES) [Meta]
- U.S. Open Government [Meta]
- UK 2011 Census Open Atlas Project [Meta]
- US Counties - This is a repository of various data, broken down by US county. While most of [...] [Meta]
- U.S. Patent and Trademark Office (USPTO) Bulk Data Products [Meta]
- Uganda Bureau of Statistics [Meta]
- Ukraine [Meta]
- United Nations [Meta]
- Uruguay [Meta]
- Valley Transportation Authority (VTA), California, US [Meta]
- Vancouver, BC Open Data Catalog [Meta]
- Victoria, BC, Canada [Meta]
- Vienna, Austria [Meta]
- Statistics from the General Statistics Office of Vietnam - Data in different categories are [...] [Meta]
- U.S. Congressional Research Service (CRS) Reports [Meta]
Healthcare
ImageProcessing
MachineLearning
Museums
NaturalLanguage
Neuroscience
Physics
ProstateCancer
- EOPC-DE-Early-Onset-Prostate-Cancer-Germany - Early Onset Prostate Cancer - Germany. [...] [Meta]
- GENIE - Data from the Genomics Evidence Neoplasia Information Exchange (GENIE) project of the [...] [Meta]
- Genomic-Hallmarks-Prostate-Adenocarcinoma-CPC-GENE - Comprehensive genomic profiling of 477 [...] [Meta]
- MSK-IMPACT-Clinical-Sequencing-Cohort-MSKCC-Prostate-Cancer - Targeted sequencing of clinical [...] [Meta]
- Metastatic-Prostate-Adenocarcinoma-MCTP - Comprehensive profiling of 61 prostate cancer [...] [Meta]
- Metastatic-Prostate-Cancer-SU2CPCF-Dream-Team - Comprehensive analysis of 150 metastatic [...] [Meta]
- NPCR-2001-2015 - Database from CDC's National Program of Cancer Registries (NPCR). The [...] [Meta]
- NPCR-2005-2015 - Database from CDC's National Program of Cancer Registries (NPCR). The [...] [Meta]
- NaF-Prostate - NaF Prostate is a collection of F-18 NaF positron emission tomography/computed [...] [Meta]
- Neuroendocrine-Prostate-Cancer - Whole exome and RNA Seq data of castration resistant [...] [Meta]
- PLCO-Prostate-Diagnostic-Procedures - The Prostate Diagnostic Procedures dataset (95,837 [...] [Meta]
- PLCO-Prostate-Medical-Complications - The Prostate Medical Complications dataset (3,350 [...] [Meta]
- PLCO-Prostate-Screening-Abnormalities - The Prostate Screening Abnormalities dataset (10,527 [...] [Meta]
- PLCO-Prostate-Screening - The Prostate Screening dataset (177,315 records, 35,875 subjects, [...] [Meta]
- PLCO-Prostate-Treatments - The Prostate Treatments dataset (13,409 records, 7,614 subjects, [...] [Meta]
- PLCO-Prostate - The Prostate dataset is a comprehensive dataset that contains nearly all the [...] [Meta]
- PRAD-CA-Prostate-Adenocarcinoma-Canada - Prostate Adenocarcinoma - Canada. Collected by the [...] [Meta]
- PRAD-FR-Prostate-Adenocarcinoma-France - Prostate Adenocarcinoma - France. Collected by ten [...] [Meta]
- PRAD-UK-Prostate-Adenocarcinoma-United-Kingdom - Prostate Adenocarcinoma - United Kingdom. [...] [Meta]
- PROSTATEx-Challenge - Retrospective set of prostate MR studies. All studies included [...] [Meta]
- Prostate-3T - The Prostate-3T project provided imaging data to TCIA as part of an ISBI [...] [Meta]
- Prostate-Adenocarcinoma-Broad-Cornell-2012 - Comprehensive profiling of 112 prostate cancer [...] [Meta]
- Prostate-Adenocarcinoma-Broad-Cornell-2013 - Comprehensive profiling of 57 prostate cancer [...] [Meta]
- Prostate-Adenocarcinoma-CNA-study-MSKCC - Copy-number profiling of 103 primary prostate [...] [Meta]
- Prostate-Adenocarcinoma-Fred-Hutchinson-CRC - Comprehensive profiling of prostate cancer [...] [Meta]
- Prostate Adenocarcinoma (MSKCC/DFCI) - Whole Exome Sequencing of 1013 prostate cancer samples. [Meta]
- Prostate-Adenocarcinoma-MSKCC - MSKCC Prostate Oncogenome Project. 181 primary, 37 metastatic [...] [Meta]
- Prostate-Adenocarcinoma-Organoids-MSKCC - Exome profiling of prostate cancer samples and [...] [Meta]
- Prostate-Adenocarcinoma-Sun-Lab - Whole-genome and Transcriptome Sequencing of 65 Prostate [...] [Meta]
- Prostate-Adenocarcinoma-TCGA-PanCancer-Atlas - Comprehensive TCGA PanCanAtlas data from 11k [...] [Meta]
- Prostate-Adenocarcinoma-TCGA - Integrated profiling of 333 primary prostate adenocarcinoma samples. [Meta]
- Prostate-Diagnosis - PCa T1- and T2-weighted magnetic resonance images (MRIs) were acquired [...] [Meta]
- Prostate-Fused-MRI-Pathology - The Prostate Fused-MRI-Pathology collection is a combination [...] [Meta]
- Prostate-MRI - The Prostate-MRI collection of prostate Magnetic Resonance Images (MRIs) was [...] [Meta]
- Prostate-R - The R package 'ElemStatLearn' contains a prostate cancer dataset from Stamey et [...] [Meta]
- QIN-PROSTATE-Repeatability - The QIN-PROSTATE-Repeatability dataset is a dataset with [...] [Meta]
- QIN-PROSTATE - The QIN PROSTATE collection of the Quantitative Imaging Network (QIN) contains [...] [Meta]
- SEER-YR1973_2015.SEER9 - The SEER November 2017 Research Data files from nine SEER registries [...] [Meta]
- SEER-YR1992_2015.SJ_LA_RG_AK - The SEER November 2017 Research Data files from the San Jose- [...] [Meta]
- SEER-YR2000_2015.CA_KY_LO_NJ_GA - The SEER November 2017 Research Data files from the Greater [...] [Meta]
- SEER-YR2000_2015.CA_KY_LO_NJ_GA - The July - December 2005 diagnoses for Louisiana from their [...] [Meta]
- TCGA-PRAD-US - TCGA Prostate Adenocarcinoma (499 samples). [Meta]
Psychology+Cognition
PublicDomains
SearchEngines
SocialNetworks
SocialSciences
Software
Sports
TimeSeries
Transportation
eSports
Complementary Collections
- Data Packaged Core Datasets
- OpenDataMonitor: An overview of available open data resources in Europe
- Quora: Where can I find large datasets open to the public?
- RS.io: 100+ Interesting Data Sets for Statistics
- CVonline: Image Databases
- InnoTrek: Leveraging open data to understand urban lives
- CV Papers: CV Datasets on the web
Special thanks to