DOWNLOADABLE DATA


  Search results
The data files presented here include the data available in the Human Protein Atlas version 13. A subset of this data can also be downloaded from the Search page, restricted to the genes in the current search result, in three formats: XML, RDF and TAB.
 
  Single entry
Data in XML, RDF and TAB format can be accessed at single-entry level using the URL structure below:
/ENSG00000106631.xml
/ENSG00000106631.trig
/ENSG00000106631.tab
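
The URL pattern above can be sketched as a small helper. This is only an illustration: the base URL is an assumption (the page shows relative paths only), and `entry_url` and `EXTENSIONS` are hypothetical names introduced here.

```python
from urllib.parse import urljoin

# Assumption: the site's base URL. The page lists only relative paths.
BASE_URL = "https://www.proteinatlas.org/"

# File extensions shown above for the XML, RDF and TAB formats.
EXTENSIONS = {"xml": "xml", "rdf": "trig", "tab": "tab"}


def entry_url(ensembl_gene_id, fmt, base_url=BASE_URL):
    """Return the single-entry URL for one gene in the requested format."""
    try:
        ext = EXTENSIONS[fmt.lower()]
    except KeyError:
        raise ValueError(f"unknown format: {fmt!r}") from None
    return urljoin(base_url, f"{ensembl_gene_id}.{ext}")
```

For example, `entry_url("ENSG00000106631", "rdf")` yields a URL ending in `/ENSG00000106631.trig`.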

 
  Archived data
Data from version 12 of the Human Protein Atlas can be retrieved via the archive page.

 
1 Normal tissue data
Expression profiles for proteins in human tissues based on immunohistochemistry using tissue microarrays. The comma-separated file includes Ensembl gene identifier ("Gene"), tissue name ("Tissue"), annotated cell type ("Cell type"), expression value ("Level"), the type of annotation (annotated protein expression (APE), based on more than one antibody, or staining, based on one antibody only) ("Expression type"), and the reliability or validation of the expression value ("Reliability"). The data is based on The Human Protein Atlas version 13 and Ensembl version 75.37.
normal_tissue.csv.zip
CSV-file, 5.4 MB
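
As a minimal sketch of how the zipped file might be read without unpacking it first, the following uses only the Python standard library. The column names come from the description above; the UTF-8 encoding, the choice of the first archive member, and the helper names `read_zipped_csv` and `levels_for_gene` are assumptions made for illustration.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_source):
    """Yield one dict per row from the first member of a zip archive.

    zip_source may be a path or a binary file-like object.
    Assumptions: the archive's first member is the CSV file and it is UTF-8.
    """
    with zipfile.ZipFile(zip_source) as archive:
        member = archive.namelist()[0]
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)


def levels_for_gene(rows, gene_id):
    """Map (tissue, cell type) to the expression level for one gene."""
    return {
        (row["Tissue"], row["Cell type"]): row["Level"]
        for row in rows
        if row["Gene"] == gene_id
    }
```

A call such as `levels_for_gene(read_zipped_csv("normal_tissue.csv.zip"), "ENSG00000106631")` would then collect the annotated levels for a single gene.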
 
2 Cancer tumor data
Staining profiles for proteins in human tumor tissue based on immunohistochemistry using tissue microarrays. The comma-separated file includes Ensembl gene identifier ("Gene"), tumor name ("Tumor"), staining value ("Level"), the number of patients with this staining value ("Count patients"), the total number of patients for this tumor type ("Total patients") and the type of annotation staining ("Expression type"). The data is based on The Human Protein Atlas version 13 and Ensembl version 75.37.
cancer.csv.zip
CSV-file, 5.3 MB
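
Because each row carries both a patient count and a patient total, the share of patients at each staining level can be derived directly. The sketch below assumes the rows are dicts keyed by the column names above (for instance as produced by `csv.DictReader`) and that both patient columns hold whole numbers; `staining_fractions` is a hypothetical helper name.

```python
def staining_fractions(rows):
    """Map (gene, tumor, level) to the fraction of patients with that level."""
    fractions = {}
    for row in rows:
        count, total = row["Count patients"], row["Total patients"]
        # Assumption: both columns hold non-negative integers.
        # Rows where they do not, or where the total is zero, are skipped.
        if not (count.isdigit() and total.isdigit()) or int(total) == 0:
            continue
        key = (row["Gene"], row["Tumor"], row["Level"])
        fractions[key] = int(count) / int(total)
    return fractions
```

Skipping malformed rows rather than raising is a design choice for exploratory use; a stricter pipeline may prefer to fail loudly.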
 
3 Subcellular location data
Subcellular localization of proteins based on immunofluorescently stained cells. The comma-separated file includes Ensembl gene identifier ("Gene"), main subcellular location of the protein ("Main location"), other locations ("Other location"), the type of annotation (annotated protein expression (APE), based on more than one antibody, or staining, based on one antibody only) ("Expression type"), and the reliability or validation of the expression value ("Reliability"). The data is based on The Human Protein Atlas version 13 and Ensembl version 75.37.
subcellular_location.csv.zip
CSV-file, 53.7 KB
 
4 RNA data
RNA levels in 44 cell lines and 32 tissues based on RNA-Seq. The comma-separated file includes Ensembl gene identifier ("Gene"), analysed cell line ("Cell line"), fragments per kilobase of transcript per million fragments mapped ("FPKM"), and abundance class ("Abundance"). The data is based on The Human Protein Atlas version 13 and Ensembl version 75.37.
RNA sequencing data for human tissue
RNA sequencing data for human cell lines


rna.csv.zip
CSV-file, 8.1 MB
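
The file is in long format, with one row per gene and sample, so a common first step is to group the FPKM values by gene. The following is a sketch only: it assumes rows are dicts keyed by the column names above and that "FPKM" parses as a floating-point number. The helper names and the `sample_column` parameter are hypothetical.

```python
from collections import defaultdict


def fpkm_by_gene(rows, sample_column="Cell line"):
    """Group long-format rows into {gene: {sample: FPKM}}."""
    table = defaultdict(dict)
    for row in rows:
        # Assumption: the FPKM column always holds a parseable number.
        table[row["Gene"]][row[sample_column]] = float(row["FPKM"])
    return dict(table)


def top_sample(profile):
    """Return the (sample, FPKM) pair with the highest FPKM in one profile."""
    return max(profile.items(), key=lambda item: item[1])
```

With the grouped table in hand, `top_sample(table[gene_id])` gives the sample in which a gene is most highly expressed.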
 
5 Data from the Human Protein Atlas in XML format
The XML file contains most of the data in the Human Protein Atlas version 13, including protein expression data (in normal and tumor tissues and in cell lines), antigen sequences, Western blot data for antibodies, protein array data for antibodies, RNA-Seq data, external references such as UniProt identifiers, and more. The data is based on Ensembl version 75.37. The file structure is presented in the XSD-schema. This data can also be downloaded for a resulting gene set when using the search function (via the xml link on the result page).
The XML file presented here is compressed with gzip due to its size. It can be uncompressed with an archive program like 7‑zip.
proteinatlas.xml.gz
XML-file (gzip compressed), 275 MB
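
Given the size of the file, it may be preferable to stream it rather than uncompress and load it in full. The sketch below combines the standard library's `gzip` and `xml.etree.ElementTree.iterparse` for this purpose. The element name passed as `tag` is an assumption to be checked against the XSD schema, and `iter_elements` is a hypothetical helper name.

```python
import gzip
import xml.etree.ElementTree as ET


def iter_elements(gz_source, tag):
    """Yield each completed element named `tag` from a gzip-compressed XML file.

    gz_source may be a path or a binary file-like object.
    The tag must match the element name exactly, including any namespace
    prefix in {uri}name form; take the real names from the XSD schema.
    """
    with gzip.open(gz_source, "rb") as handle:
        for _event, elem in ET.iterparse(handle, events=("end",)):
            if elem.tag == tag:
                yield elem
                # Drop the element's children once the caller is done with it
                # so memory use stays bounded on large files.
                elem.clear()
```

Each yielded element must be processed before advancing the iterator, since it is cleared as soon as the loop resumes.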
 
6 Data from the Human Protein Atlas in RDF format
This file contains a subset of the data in the Human Protein Atlas version 13 corresponding to the tissue annotations on gene level. This data can also be downloaded for a resulting gene set when using the search function (via the RDF link on the result page). This RDF release is BETA and will be extended and developed in coming releases. We thank Mark Thompson, Rajaram Kaliyaperumal and Eelke van der Horst (LUMC, The Netherlands), and Christine Chichester (SIB, Switzerland) for providing templates for generating the first beta-release of HPA nanopublications. Their contribution was made possible by IMI project Open PHACTS and EU FP7 project RD-Connect. This beta was developed within an ELIXIR collaboration.
proteinatlas.trig.gz
RDF trig-file (gzip compressed), 109.8 MB
 
7 Data from the Human Protein Atlas in TAB format
This file contains a subset of the data in the Human Protein Atlas version 13 corresponding to the data seen in the search result. This data can also be downloaded for a resulting gene set when using the search function (via the TAB link on the result page).
proteinatlas.tab.gz
TAB-file (gzip compressed), 1.3 MB
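
The TAB file can be read in place with the standard library. This sketch assumes that the file is UTF-8, tab-delimited, and starts with a header row; the column names are not listed here, so the code takes them from the file itself. `read_gzipped_tab` is a hypothetical helper name.

```python
import csv
import gzip


def read_gzipped_tab(gz_source):
    """Yield one dict per row from a gzip-compressed, tab-delimited file.

    gz_source may be a path or a binary file-like object.
    Assumptions: UTF-8 text with a header row as the first line.
    """
    with gzip.open(gz_source, "rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")
```

Inspecting the keys of the first yielded row is a quick way to discover which columns the file provides.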