Bioinformatics Online Home
   Chapters Links Problems Enroll for Updates Help


Table 10.4. Databases of patterns and sequences of protein families
Name   Web address   Description   Reference
3D-Ali aligned protein structures and related sequences using only secondary structures assigned by author of the structures Pascarella and Argos (1992)
3DCA Cluster Analysis integrating structural and sequence information to obtain predictions about functionally relevant clusters of residues see Web site
3D-PSSM uses a library of scoring matrices based on structural similarity given in the SCOP classification scheme (p. 433) for alignment with matrices based on sequence similarity Kelley et al. (2000)
BIND (Biomolecular Interaction Network Database) includes information on protein–protein interactions, molecular complexes, and pathways see Web site
BLOCKS ungapped blocks in families defined by the Prosite catalog Henikoff and Henikoff (1996); Henikoff et al. (1998)
cluSTr clustering of all proteins in PIR, TrEMBL, and SwissProt based on pair-wise similarity Kriventseva et al. (2003)
COGS (Clusters of Orthologous Groups database and search site) clusters of similar proteins in at least three species collected from available genomic sequences Tatusov et al. (1997)
CONSURF (Protein Surface Contact Analysis) mapping of functional regions on surface of proteins using conserved amino acid patterns Glaser et al. (2003)
DiffTool clustering of proteins based on similarity Chetouani et al. (2002)
DIP (Database of Interacting Proteins) database of interacting proteins Xenarios et al. (2000)
eMOTIF http://dna.Stanford.EDU/emotif/ common and rare amino acid motifs in the BLOCKS and HSSP databases Nevill-Manning et al. (1998)
HOMSTRAD structure-based alignments organized at the level of homologous families (note a) Mizuguchi et al. (1998a)
HSSP sequences similar to proteins of known structure Dodge et al. (1998)
INTERPRO resource of protein domains and functional sites (note b); combination of Pfam, PRINTS, Prosite, and current SwissProt/TrEMBL sequences see Web site
a library of protein family cores based on msa of protein cores using amino acid substitution matrices based on structure (see Chapter 3) see Web page
NCBI search conserved domain database (rpsblast) or for domain architecture (cdart) see Web page
NetOGly 2.0 server predicts glycosylation sites in mammalian proteins by NN analysis Hansen et al. (1997)
NNPSL predicts subcellular location of proteins by NN see Web site
Pfam profiles derived from alignment of protein families, each one composed of similar sequences and analyzed by HMMs Sonnhammer et al. (1998)
PIR family and superfamily classification based on sequence alignment Barker et al. (1996)
PRINTS protein fingerprints or sets of unweighted sequence motifs from aligned sequence families Attwood et al. (1999)
PROCLASS database organized by Prosite patterns and PIR superfamilies; NN system for protein classification into superfamily Wu (1996); Wu et al. (1996)
groups of sequence segments or domains from similar sequences found in SwissProt database by BLASTP algorithm; aligned by msa Corpet et al. (1998)
ProtoNet automatic hierarchical clustering of SwissProt Sasson et al. (2003)
Prosite groups of proteins of similar biochemical function on basis of amino acid patterns Bairoch (1991); Hofmann et al. (1999)
ProtoMap classification of SwissProt and TrEMBL proteins into clusters Yona et al. (1999)
PSORT predicts presence of protein localization signals in proteins see Web site
SignalP Web server predicts presence and location of signal peptide cleavage sites in proteins of different organisms by NN analysis Nielsen et al. (1997)
SMART database of signaling domain sequences with accurate alignments Schultz et al. (1998)
SYSTERS classification of all sequences in the SwissProt database into clusters based on sequence similarity Krause et al. (2000)
TargetDB database of peptides that target proteins to cellular locations see Web site
Uniprot combined protein sequence database of PIR, SwissProt, and TrEMBL see Web site
    A list of Web sites with protein sequence/structure databases is maintained at Many protein family databases are accessible through the European Bioinformatics Institute ( List of protein–protein interaction databases is maintained at
    a Sequence alignments of each family shown with residues labeled by solvent accessibility, secondary structure, H bonds to main-chain amide or carbonyl group, disulfide bond, and positive Φ angle.
    b A combination of Pfam 5.0, PRINTS 25.0, Prosite 16, and current SwissProt and TrEMBL data. Additional merges with other protein pattern databases are planned.
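
Several entries in the table, Prosite in particular, describe protein families by short amino acid patterns. As a rough illustration of how such a pattern can be used to scan a sequence, the sketch below translates a PROSITE-style pattern into a Python regular expression. It is not the code used by Prosite or any database listed here. The function names `prosite_to_regex` and `find_sites` and the example sequence are hypothetical, and only the basic pattern syntax is handled (`x`, `[..]`, `{..}`, repeat counts, and the `<` and `>` terminal anchors).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regex string."""
    pattern = pattern.strip().rstrip(".")  # patterns may end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchor to the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchor to the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":  # x means any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # {..} lists excluded residues
        # Single letters and [..] sets are already valid regex syntax.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_sites(sequence: str, pattern: str):
    """Return (0-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# N-glycosylation site pattern (Prosite entry PS00001).
n_glyc = "N-{P}-[ST]-{P}."
# Made-up sequence, for illustration only.
hits = find_sites("MKNATQLNPSA", n_glyc)
print(prosite_to_regex(n_glyc), hits)
```

Here `NATQ` is reported as a candidate site, while `NPSA` is skipped because proline is excluded at the second position. Short patterns of this kind match by chance fairly often, so a hit is best read as a candidate site rather than evidence of function.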


© 2004 by Cold Spring Harbor Laboratory Press. All rights reserved.
No part of these pages, either text or image, may be used for any purpose other than personal use. Therefore, reproduction, modification, storage in a retrieval system, or retransmission, in any form or by any means, electronic, mechanical, or otherwise, for reasons other than personal use, is strictly prohibited without prior written permission.

