PlantTribes - Home
With rapidly growing numbers of whole genome and expressed sequence tag (EST) sequences in our public databases, sequence-based protein classification systems are providing foundations for gene annotation, functional genomics, and comparative investigations of gene and genome evolution.
PlantTribes is an objective classification system for plant proteins based on cluster analyses of the inferred proteomes of the sequenced angiosperms
Arabidopsis thaliana v. Columbia,
Oryza sativa v. japonica (rice), and
Populus trichocarpa (poplar).
Sequence data for Carica papaya (papaya v. 1.0; R. Ming et al., in preparation) and Medicago truncatula (barrel medic, 60% complete) are also included in the current version of PlantTribes (v. 1.0). Results for these two species are currently masked from view but will be made available when the genomes are publicly released. In addition to the genome-based tribe scaffold, unigenes from the TIGR Transcript Assemblies of more than 200 plant and algal species have been associated with the tribes (see documentation), resulting in a global classification of about 4 million putative plant protein sequences.
PlantTribes 1.0 incorporates an extensive collection of expression data from Arabidopsis microarray experiments (see documentation). Expression data are linked to the individual genes in PlantTribes and can be accessed through any result that includes Arabidopsis gene sequences.
PlantTribes uses the similarity-based clustering procedure TribeMCL (Enright et al., 2002, 2003) to classify protein-coding genes into putative gene families (tribes). MCL classifications have been constructed at three clustering stringencies, allowing the user to explore the stability of the protein classification. A second round of MCL clustering identifies SuperTribes that approximate objective superfamilies. PlantTribes also includes information about domains, traditional gene family names, and a unified nomenclature based on common terms.
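To make the clustering idea concrete, here is a minimal sketch of Markov clustering (MCL) on a toy similarity graph. It is not the TribeMCL implementation or the PlantTribes pipeline. The similarity scores, the function names, and the three inflation values are hypothetical. The sketch also assumes that clustering stringency corresponds to MCL's inflation parameter, where higher inflation generally yields finer clusters.

```python
"""Toy Markov clustering (MCL) on a small, made-up protein similarity graph."""


def normalize_columns(matrix):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    result = [row[:] for row in matrix]
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                result[i][j] = matrix[i][j] / total
    return result


def expand(matrix):
    """Expansion step: square the matrix (flow along two-step random walks)."""
    n = len(matrix)
    return [
        [sum(matrix[i][k] * matrix[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def inflate(matrix, inflation):
    """Inflation step: raise entries to a power, then renormalize columns."""
    return normalize_columns([[x ** inflation for x in row] for row in matrix])


def mcl(similarity, inflation=2.0, max_iter=200, tol=1e-9):
    """Cluster a symmetric, non-negative similarity matrix.

    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(similarity)
    matrix = [[float(x) for x in row] for row in similarity]
    # Add self-loops, weighted like the strongest edge of each node.
    for j in range(n):
        strongest = max((matrix[i][j] for i in range(n) if i != j), default=0.0)
        matrix[j][j] = strongest if strongest > 0 else 1.0
    matrix = normalize_columns(matrix)

    # Alternate expansion and inflation until the matrix stops changing.
    for _ in range(max_iter):
        updated = inflate(expand(matrix), inflation)
        change = max(
            abs(updated[i][j] - matrix[i][j]) for i in range(n) for j in range(n)
        )
        matrix = updated
        if change < tol:
            break

    # Rows that keep non-zero entries belong to attractors; the columns
    # with non-zero entries in such a row form one cluster.
    clusters = {
        frozenset(j for j in range(n) if matrix[i][j] > 1e-6) for i in range(n)
    }
    clusters.discard(frozenset())
    return sorted(sorted(cluster) for cluster in clusters)


# Hypothetical scores: two tight groups of three proteins (0-2 and 3-5)
# joined by a single weak similarity between proteins 2 and 3.
example_similarity = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]

if __name__ == "__main__":
    # Illustrative inflation values only, not the PlantTribes settings.
    for inflation in (1.4, 2.0, 4.0):
        print(inflation, mcl(example_similarity, inflation))
```

Running the same graph at several inflation values and comparing the cluster memberships is one simple way to see which groupings are stable across stringencies.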
Phylogenetic analyses of exemplar gene families show a strong, but not perfect, correspondence between tribe membership and cladistic relationships. The results of these analyses provide insights into the Arabidopsis, rice, and poplar genomes, gene family evolution, and the evolutionary dynamics of functional domains among gene families. In addition, the resulting classification schemes provide scaffolds for sorting protein sequences from other plant species.
How to cite PlantTribes
We hope you find PlantTribes useful in your research and teaching. Please cite the following paper, which includes the technical details of the database and user interface. As of August 15, 2007, the paper is under review. A copy of the manuscript is available on request from Kerr Wall (pkerwall@psu.edu) or Claude dePamphilis (cwd3@psu.edu):
P. Kerr Wall, Jim Leebens-Mack, Kai Muller, Dawn Field, Naomi S. Altman, Claude W. dePamphilis. PlantTribes: A gene and gene family resource for comparative genomics in plants. Submitted: Nucleic Acids Research. August 15, 2007.
Please see the Documentation page for help, and send comments, suggestions, questions, and bug reports to pkerrwall at psu . edu