Abstract

ProClass is a protein family database that organizes non-redundant sequence entries into families defined collectively by PROSITE patterns and PIR superfamilies. By combining global similarities and functional motifs into a single classification scheme, ProClass helps to reveal domain and family relationships and to classify multi-domain proteins. The database currently consists of more than 120 000 sequence entries, approximately 60% of which are classified into about 3500 families. To maximize family information retrieval, the database provides links to various protein family/domain and structural class databases and contains multiple motif alignments of all PROSITE patterns as well as global alignments of PIR superfamilies. The motif sequences are retrieved from both the PIR-International and SWISS-PROT databases and include a large number of new members detected by our GeneFIND family identification system. Because of its high classification rate, ProClass can be used to support full-scale genomic annotation. The ProClass database is available for on-line search and record retrieval from our WWW server at http://diana.uthct.edu/proclass.html
