
WIKIPEDIA ARTICLE

From Wikipedia, the free encyclopedia
UniProt

Content
Description: UniProt is the Universal Protein resource, a central repository of protein data created by combining the Swiss-Prot, TrEMBL and PIR-PSD databases.
Data types captured: Protein annotation
Organisms: All

Contact
Research center: EMBL-EBI, UK; SIB, Switzerland; PIR, US.
Primary citation: UniProt Consortium [1]

Access
Data format: Custom flat file, FASTA, GFF, RDF, XML.
Website: www.uniprot.org, www.uniprot.org/news/
Download URL: www.uniprot.org/downloads; complete data sets from ftp.uniprot.org
Web service URL: Yes – Java API and REST

Tools
Web: Advanced search, BLAST, ClustalO, bulk retrieval/download, ID mapping

Miscellaneous
License: Creative Commons Attribution-NoDerivs
Versioning: Yes
Data release frequency: 4 weeks
Curation policy: Yes – manual and automatic. Rules for automatic annotation generated by database curators and computational algorithms.
Bookmarkable entities: Yes – both individual protein entries and searches

UniProt is a freely accessible database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature.

The UniProt consortium

The UniProt consortium comprises the European Bioinformatics Institute (EBI), the Swiss Institute of Bioinformatics (SIB), and the Protein Information Resource (PIR). EBI, located at the Wellcome Trust Genome Campus in Hinxton, UK, hosts a large resource of bioinformatics databases and services. SIB, located in Geneva, Switzerland, maintains the ExPASy (Expert Protein Analysis System) servers that are a central resource for proteomics tools and databases. PIR, hosted by the National Biomedical Research Foundation (NBRF) at the Georgetown University Medical Center in Washington, DC, USA, is heir to the oldest protein sequence database, Margaret Dayhoff's Atlas of Protein Sequence and Structure, first published in 1965.[2] In 2002, EBI, SIB, and PIR joined forces as the UniProt consortium.[3]

The roots of UniProt databases

Each consortium member is heavily involved in protein database maintenance and annotation. Before UniProt was launched, EBI and SIB together produced the Swiss-Prot and TrEMBL databases, while PIR produced the Protein Sequence Database (PIR-PSD).[4][5][6] These databases coexisted with differing protein sequence coverage and annotation priorities.

Swiss-Prot was created in 1986 by Amos Bairoch during his PhD. It was developed by the Swiss Institute of Bioinformatics and subsequently by Rolf Apweiler at the European Bioinformatics Institute.[7][8][9] Swiss-Prot aimed to provide reliable protein sequences associated with a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and a high level of integration with other databases. Recognizing that sequence data were being generated at a pace exceeding Swiss-Prot's ability to keep up, TrEMBL (Translated EMBL Nucleotide Sequence Data Library) was created to provide automated annotations for those proteins not in Swiss-Prot. Meanwhile, PIR maintained the PIR-PSD and related databases, including iProClass, a database of protein sequences and curated families.

The consortium members pooled their overlapping resources and expertise, and launched UniProt in December 2003.[10]

Organization of UniProt databases

UniProt provides four core databases: UniProtKB (with sub-parts Swiss-Prot and TrEMBL), UniParc, UniRef, and UniMes.

UniProtKB

UniProt Knowledgebase (UniProtKB) is a protein database partially curated by experts, consisting of two sections: UniProtKB/Swiss-Prot (containing reviewed, manually annotated entries) and UniProtKB/TrEMBL (containing unreviewed, automatically annotated entries).[11] As of 19 March 2014, release "2014_03" of UniProtKB/Swiss-Prot contains 542,782 sequence entries (comprising 193,019,802 amino acids abstracted from 226,896 references) and release "2014_03" of UniProtKB/TrEMBL contains 54,247,468 sequence entries (comprising 17,207,833,179 amino acids).[12][13]
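The release figures above imply an average entry length of roughly 356 residues for Swiss-Prot and roughly 317 for TrEMBL, with reviewed entries making up about 1% of UniProtKB. The short sketch below only redoes that arithmetic from the quoted numbers; the variable and function names are illustrative.

```python
# Entry and residue counts quoted in the text for release 2014_03.
SWISSPROT_ENTRIES = 542_782
SWISSPROT_RESIDUES = 193_019_802
TREMBL_ENTRIES = 54_247_468
TREMBL_RESIDUES = 17_207_833_179


def mean_length(residues: int, entries: int) -> float:
    """Average number of amino acids per sequence entry."""
    return residues / entries


swissprot_mean = mean_length(SWISSPROT_RESIDUES, SWISSPROT_ENTRIES)
trembl_mean = mean_length(TREMBL_RESIDUES, TREMBL_ENTRIES)

# Share of all UniProtKB entries that are reviewed (Swiss-Prot).
reviewed_share = SWISSPROT_ENTRIES / (SWISSPROT_ENTRIES + TREMBL_ENTRIES)

print(f"Swiss-Prot mean length: {swissprot_mean:.1f} residues")
print(f"TrEMBL mean length:     {trembl_mean:.1f} residues")
print(f"Reviewed share:         {reviewed_share:.2%}")
```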

UniProtKB/Swiss-Prot

UniProtKB/Swiss-Prot is a manually annotated, non-redundant protein sequence database. It combines information extracted from scientific literature and biocurator-evaluated computational analysis. The aim of UniProtKB/Swiss-Prot is to provide all known relevant information about a particular protein. Annotation is regularly reviewed to keep up with current scientific findings. The manual annotation of an entry involves detailed analysis of the protein sequence and of the scientific literature.[14]

Sequences from the same gene and the same species are merged into the same database entry. Differences between sequences are identified, and their cause documented (for example, alternative splicing, natural variation, incorrect initiation sites, incorrect exon boundaries, frameshifts, unidentified conflicts). A range of sequence analysis tools is used in the annotation of UniProtKB/Swiss-Prot entries. Computer predictions are manually evaluated, and relevant results selected for inclusion in the entry. These predictions include post-translational modifications, transmembrane domains and topology, signal peptides, domain identification, and protein family classification.[14][15]
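The merging rule (one entry per gene and species) can be pictured as a grouping step. The sketch below is only an illustration of that idea, not the curators' actual workflow, which is manual. The record fields (`gene`, `species`, `sequence`, `source`) and the choice to display the longest sequence are assumptions made for the example.

```python
from collections import defaultdict


def merge_by_gene_and_species(records):
    """Group records that share (gene, species) into one merged entry.

    Each record is a dict with the hypothetical keys
    'gene', 'species', 'sequence' and 'source'.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["gene"], rec["species"])].append(rec)

    merged = []
    for (gene, species), recs in groups.items():
        # Distinct sequences, longest first (assumption: show the longest one).
        sequences = sorted({r["sequence"] for r in recs},
                           key=lambda s: (-len(s), s))
        merged.append({
            "gene": gene,
            "species": species,
            "displayed_sequence": sequences[0],
            # Differing sequences are kept so their cause can be documented.
            "other_sequences": sequences[1:],
            "sources": sorted(r["source"] for r in recs),
        })
    return merged


example_records = [
    {"gene": "TP53", "species": "Homo sapiens", "sequence": "MEEPQSDPSV", "source": "rec1"},
    {"gene": "TP53", "species": "Homo sapiens", "sequence": "MEEPQSDP", "source": "rec2"},
    {"gene": "TP53", "species": "Mus musculus", "sequence": "MTAMEESQSD", "source": "rec3"},
]
merged_entries = merge_by_gene_and_species(example_records)
```

The example sequences are short made-up strings; the two human records collapse into one entry, and the mouse record stays separate.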

Relevant publications are identified by searching databases such as PubMed. The full text of each paper is read, and information is extracted and added to the entry. Annotation arising from the scientific literature includes, but is not limited to:[10][14][15]

Annotated entries undergo quality assurance before inclusion into UniProtKB/Swiss-Prot. When new data becomes available, entries are updated.

UniProtKB/TrEMBL

UniProtKB/TrEMBL contains high-quality computationally analyzed records, which are enriched with automatic annotation. It was introduced in response to increased dataflow resulting from genome projects, as the time- and labour-consuming manual annotation process of UniProtKB/Swiss-Prot could not be broadened to include all available protein sequences.[10] The translations of annotated coding sequences in the EMBL-Bank/GenBank/DDBJ nucleotide sequence database are automatically processed and entered in UniProtKB/TrEMBL. UniProtKB/TrEMBL also contains sequences from PDB, and from gene prediction, including Ensembl, RefSeq and CCDS.[16]
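To make "translation of a coding sequence" concrete, here is a minimal sketch that translates a nucleotide coding sequence into a protein sequence using the standard genetic code. It is not the TrEMBL pipeline; it assumes an in-frame sequence containing only unambiguous bases, and the function name is illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Raises KeyError on ambiguous bases such as 'N'; trailing bases that do
    not form a complete codon are ignored.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTGA"))  # ATG -> M, GCC -> A, TGA -> stop
```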

UniParc

UniProt Archive (UniParc) is a comprehensive and non-redundant database, which contains all the protein sequences from the main, publicly available protein sequence databases.[17] Proteins may exist in several different source databases, and in multiple copies in the same database. To avoid redundancy, UniParc stores each unique sequence only once. Identical sequences are merged, regardless of whether they are from the same or different species. Each sequence is given a stable and unique identifier (UPI), making it possible to identify the same protein from different source databases. UniParc contains only protein sequences, with no annotation. Database cross-references in UniParc entries allow further information about the protein to be retrieved from the source databases. When sequences in the source databases change, these changes are tracked by UniParc and a history of all changes is archived.
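The store-once behaviour described above can be sketched as a dictionary keyed by sequence. This is a toy model under stated assumptions, not UniParc's implementation: the class name, the identifier format generated here, and the source and accession names in the usage lines are all hypothetical.

```python
class SequenceArchive:
    """Toy archive: one identifier per unique sequence, many cross-references."""

    def __init__(self):
        self._id_by_sequence = {}  # sequence -> identifier
        self._xrefs = {}           # identifier -> list of (source_db, accession)

    def add(self, source_db, accession, sequence):
        """Register a sequence seen in a source database; return its identifier."""
        sequence = sequence.upper()
        upi = self._id_by_sequence.get(sequence)
        if upi is None:
            # New unique sequence: mint an identifier (format is an assumption).
            upi = f"UPI{len(self._id_by_sequence) + 1:010X}"
            self._id_by_sequence[sequence] = upi
            self._xrefs[upi] = []
        xref = (source_db, accession)
        if xref not in self._xrefs[upi]:
            self._xrefs[upi].append(xref)
        return upi

    def cross_references(self, upi):
        """Source-database records that carry this exact sequence."""
        return list(self._xrefs[upi])


archive = SequenceArchive()
first = archive.add("source_a", "A0001", "MKTAYIAK")
second = archive.add("source_b", "B0042", "MKTAYIAK")  # identical sequence
third = archive.add("source_a", "A0002", "MKTAYIAR")   # differs by one residue
```

Here `first` and `second` are the same identifier because the sequences are identical, while `third` gets a new one; `archive.cross_references(first)` lists both source records.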

Source databases

Currently UniParc contains protein sequences from the following publicly available databases:

UniRef

The UniProt Reference Clusters (UniRef) consist of three databases of clustered sets of protein sequences from UniProtKB and selected UniParc records.[18] The UniRef100 database combines identical sequences and sequence fragments (from any organism) into a single UniRef entry. The sequence of a representative protein, the accession numbers of all the merged entries and links to the corresponding UniProtKB and UniParc records are displayed. UniRef100 sequences are clustered using the CD-HIT algorithm to build UniRef90 and UniRef50.[18][19] Each cluster is composed of sequences that have at least 90% or 50% sequence identity, respectively, to the longest sequence. Clustering sequences significantly reduces database size, enabling faster sequence searches.
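The idea of clustering against the longest sequence at a fixed identity threshold can be illustrated with a greedy sketch. This is not CD-HIT itself, which uses short-word filtering and alignment; the `identity` function below is a deliberately crude stand-in (ungapped, position-by-position comparison over the shorter sequence), and both function names are illustrative.

```python
def identity(a: str, b: str) -> float:
    """Crude identity: matching positions divided by the shorter length.

    Stand-in for a real alignment-based identity; no gaps are considered.
    """
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / shorter


def greedy_cluster(sequences, threshold):
    """Greedy incremental clustering.

    Sequences are processed longest first. Each one joins the first cluster
    whose representative (its longest member, stored at index 0) it matches
    at or above the threshold; otherwise it starts a new cluster.
    """
    clusters = []
    for seq in sorted(sequences, key=len, reverse=True):
        for cluster in clusters:
            if identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


example = ["ACDEFGHIKV", "ACDEFGHIKLMN", "WWWWWWWWWW"]
clusters_90 = greedy_cluster(example, 0.90)
clusters_95 = greedy_cluster(example, 0.95)
```

With these made-up sequences, the 90% threshold groups the two related sequences under the 12-residue representative and leaves the third alone, while the 95% threshold keeps all three separate.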

UniRef is available from the UniProt FTP site.

Funding for UniProt

UniProt is funded by grants from the National Human Genome Research Institute, the National Institutes of Health (NIH), the European Commission, the Swiss Federal Government through the Federal Office of Education and Science, NCI-caBIG, and the US Department of Defense.[11]

References

  1. UniProt Consortium (January 2015). "UniProt: a hub for protein information". Nucleic Acids Research. 43 (Database issue): D204–12. PMID 25348405.
  2. Dayhoff, Margaret O. (1965). Atlas of protein sequence and structure. Silver Spring, Md: National Biomedical Research Foundation.
  3. http://www.genome.gov/page.cfm?pageID=10005283
  4. O'Donovan, C.; Martin, M. J.; Gattiker, A.; Gasteiger, E.; Bairoch, A.; Apweiler, R. (2002). "High-quality protein knowledge resource: SWISS-PROT and TrEMBL". Briefings in Bioinformatics. 3 (3): 275–284. PMID 12230036. doi:10.1093/bib/3.3.275.
  5. Wu, C. H.; Yeh, L. S.; Huang, H.; Arminski, L.; Castro-Alvear, J.; Chen, Y.; Hu, Z.; Kourtesis, P.; Ledley, R. S.; Suzek, B. E.; Vinayaka, C. R.; Zhang, J.; Barker, W. C. (2003). "The Protein Information Resource". Nucleic Acids Research. 31 (1): 345–347. PMC 165487. PMID 12520019. doi:10.1093/nar/gkg040.
  6. Boeckmann, B.; Bairoch, A.; Apweiler, R.; Blatter, M. C.; Estreicher, A.; Gasteiger, E.; Martin, M. J.; Michoud, K.; O'Donovan, C.; Phan, I.; Pilbout, S.; Schneider, M. (2003). "The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003". Nucleic Acids Research. 31 (1): 365–370. PMC 165542. PMID 12520024. doi:10.1093/nar/gkg095.
  7. Bairoch, A.; Apweiler, R. (1996). "The SWISS-PROT protein sequence data bank and its new supplement TREMBL". Nucleic Acids Research. 24 (1): 21–25. PMC 145613. PMID 8594581. doi:10.1093/nar/24.1.21.
  8. Bairoch, A. (2000). "Serendipity in bioinformatics, the tribulations of a Swiss bioinformatician through exciting times!". Bioinformatics. 16 (1): 48–64. PMID 10812477. doi:10.1093/bioinformatics/16.1.48.
  9. Séverine Altairac, "Naissance d’une banque de données: Interview du prof. Amos Bairoch". Protéines à la Une, August 2006. ISSN 1660-9824.
  10. Apweiler, R.; Bairoch, A.; Wu, C. H. (2004). "Protein sequence databases". Current Opinion in Chemical Biology. 8 (1): 76–80. PMID 15036160. doi:10.1016/j.cbpa.2003.12.004.
  11. UniProt Consortium (2009). "The Universal Protein Resource (UniProt) in 2010". Nucleic Acids Research. 38 (Database issue): D142–D148. PMC 2808944. PMID 19843607. doi:10.1093/nar/gkp846.
  12. UniProtKB/Swiss-Prot release statistics
  13. UniProtKB/TrEMBL release statistics
  14. Annotation of UniProtKB
  15. Apweiler, R.; Bairoch, A.; Wu, C. H.; Barker, W. C.; Boeckmann, B.; Ferro, S.; Gasteiger, E.; Huang, H.; Lopez, R.; Magrane, M.; Martin, M. J.; Natale, D. A.; O'Donovan, C.; Redaschi, N.; Yeh, L. S. (2004). "UniProt: The Universal Protein knowledgebase". Nucleic Acids Research. 32 (Database issue): D115–D119. PMC 308865. PMID 14681372. doi:10.1093/nar/gkh131.
  16. Where do UniProtKB sequences come from
  17. Leinonen, R.; Diez, F. G.; Binns, D.; Fleischmann, W.; Lopez, R.; Apweiler, R. (2004). "UniProt archive". Bioinformatics. 20 (17): 3236–3237. PMID 15044231. doi:10.1093/bioinformatics/bth191.
  18. Suzek, B. E.; Huang, H.; McGarvey, P.; Mazumder, R.; Wu, C. H. (2007). "UniRef: Comprehensive and non-redundant UniProt reference clusters". Bioinformatics. 23 (10): 1282–1288. PMID 17379688. doi:10.1093/bioinformatics/btm098.
  19. Li, W.; Jaroszewski, L.; Godzik, A. (2001). "Clustering of highly homologous sequences to reduce the size of large protein databases". Bioinformatics. 17 (3): 282–283. PMID 11294794. doi:10.1093/bioinformatics/17.3.282.

Wikipedia content is licensed under the GFDL and (CC) license