Elucidating the key molecular players that mediate complex stress responses in plants is an important step towards developing improved varieties of stress-tolerant crops. Understanding the effects of different types of biotic and abiotic stress is a rapidly emerging domain of plant research. Information about transcription factors, transcription factor binding sites, and functional annotation of the proteins encoded by genes expressed during abiotic stress responses (for example, to drought, cold, salinity, excess light, abscisic acid, and oxidative stress) will provide a better understanding of this phenomenon. STIFDB is a database of abiotic stress responsive genes and their predicted abiotic stress responsive transcription factor binding sites in Arabidopsis thaliana. We integrated 2269 genes upregulated in different stress-related microarray experiments, surveyed their 1000 bp and 100 bp upstream regions and 5′ UTR regions using the STIF algorithm, and identified putative abiotic stress responsive transcription factor binding sites, which are compiled in STIFDB. STIFDB provides extensive information about stress responsive genes and stress inducible transcription factors of Arabidopsis thaliana. It will be a useful resource for researchers seeking to understand the abiotic stress regulome and transcriptome of this important model plant system.
STIFDB-Arabidopsis Stress Responsive Transcription Factor DataBase.
Int J Plant Genomics. 2009; 2009: 583429.
National Centre for Biological Sciences, Tata Institute of Fundamental Research, UAS, GKVK Campus, Bellary Road, Bangalore 560 065, India
Department of Crop Physiology, UAS, GKVK Campus, Bellary Road, Bangalore 560 065, India
Received 2009 February 28; Accepted 2009 June 29.
Copyright 2009 K. Shameer et al.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The challenge of maintaining a balance between a swelling population and the capacity to produce food is increasing day by day. Consequently, food security has become a burning issue for agricultural scientists and economists alike. Increasing crop productivity in view of the escalating population and diminishing cultivable land and natural resources has become vital. However, environmental stresses like drought, salinity, high and low temperatures, high light, and so forth, along with biotic agents like pests and diseases, reduce agricultural yields significantly and consequently affect food security. Developing crops that tolerate environmental stresses while maintaining productivity will become a critical requirement for enhancing agriculture in the twenty-first century. Understanding the molecular mechanisms that underlie stress tolerance would be the first step in the generation of abiotic stress tolerant crops. To understand plant stress responses, unravelling the mechanisms that regulate stress responsive genes assumes paramount importance. Gene regulation by Transcription Factors (TFs) is an important facet of stress responsive signal transduction cascades. Transcription factors are regulatory proteins that bind directly to the promoters of target genes in a sequence-specific manner to either activate or repress the transcription of downstream target genes. Transcriptional regulation of genes in response to abiotic stresses like drought, cold, salinity, high light, abscisic acid (ABA), oxidative stress, and so forth, is an emerging area of plant research. As it is impractical to perform biochemical identification and validation of every individual gene involved in such complex regulatory events, bioinformatics approaches that integrate diverse data sets and tools are useful for acquiring this information.
Large-scale data integration from multiple experimental and bioinformatics resources provides a robust platform for understanding the major molecular players behind a biological problem. In this paper, we describe a database designed around genes involved in abiotic stress regulation in Arabidopsis thaliana. Apart from the database, we also discuss the general trends of the genes and transcription factors available in STIFDB (Stress responsive TranscrIption Factor DataBase). STIFDB, which is available online, is a database of stress-related genes that are upregulated in abiotic stress-related microarray experiments. These genes are further analysed to predict all probable abiotic stress responsive Transcription Factor Binding Sites (TFBS) in their regulatory regions, using an efficient, context-specific, stress responsive transcription factor binding site prediction algorithm called STIF. We have also integrated Gene Ontology associations, gene descriptions from TAIR, transcription factor-related information from DATF, the 1000 and 100 base pair upstream sequences and the 5′ UTR sequence of each gene for further analysis, and stress profiles that show how individual genes respond to different stress signals.
The list of 2269 genes in STIFDB has been compiled from abiotic stress-related microarray experiments. Genes were obtained from gene expression databases: the Nottingham Arabidopsis Stock Centre (NASC), the Database Resource for Analysis of Signal Transduction in Cells (DRASTIC), the Microarray Expression Data Search of the RIKEN Arabidopsis Genome Encyclopaedia (RARGE-MAEDA), and the StressLink Database. Genes that are consistently upregulated (in at least 3 replicates of microarray experiments) in response to stress treatments such as dehydration, drought, osmotic stress, salinity stress, ABA, cold, high light, and oxidative stress were considered stress responsive and included in the database. Where fold increases in expression levels were available, a 4-fold expression change was used to consider a gene as a probable candidate for STIFDB. Sequence segments (1000 bp upstream, 100 bp upstream, and 5′ UTR) of the genes were obtained from TAIR. The collected sequences were then scanned to identify potential abiotic stress responsive TFBS using the STIF algorithm. Ten specific families of transcription factors are known to be involved in the response to abiotic stresses like drought, cold, salinity, high light, heat, and so forth. STIF uses 22 HMM-based models of these 10 families, including subfamilies, to scan for binding sites (see Tables 1 and 2). We have also consulted the literature to cross-validate the transcription factor binding sites predicted by the STIF algorithm for 29 genes. STIFDB provides the 1000 bp promoter regions, along with their 5′ UTR sequences, extracted from TAIR, and identifies known transcription factor binding sites (cis-elements) bound by abiotic stress responsive transcription factors. A flow chart of the steps involved in the development of STIFDB is provided in Figure 1.
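The inclusion criteria described above can be illustrated with a short Python sketch. This is not the authors' pipeline: the `ExpressionRecord` type, the field names, and the gene identifiers are hypothetical, and the way the two criteria are combined (both must hold whenever a fold change is reported, with "4-fold" read as "at least 4-fold") is an assumption, since the text does not spell it out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set

# Thresholds taken from the text; how they combine is an assumption.
MIN_UPREGULATED_REPLICATES = 3
MIN_FOLD_CHANGE = 4.0


@dataclass(frozen=True)
class ExpressionRecord:
    """One gene/stress observation from a microarray source (hypothetical layout)."""
    gene_id: str
    stress: str
    upregulated_replicates: int
    fold_change: Optional[float] = None  # None when the source reports no fold change


def is_candidate(record: ExpressionRecord) -> bool:
    """Apply the replicate criterion, plus the fold-change criterion when available."""
    if record.upregulated_replicates < MIN_UPREGULATED_REPLICATES:
        return False
    if record.fold_change is not None and record.fold_change < MIN_FOLD_CHANGE:
        return False
    return True


def select_candidates(records: Iterable[ExpressionRecord]) -> Dict[str, Set[str]]:
    """Map each accepted gene to the set of stress signals that qualified it."""
    selected: Dict[str, Set[str]] = {}
    for record in records:
        if is_candidate(record):
            selected.setdefault(record.gene_id, set()).add(record.stress)
    return selected


if __name__ == "__main__":
    # Made-up records, for illustration only.
    demo = [
        ExpressionRecord("GENE_A", "drought", 3, 5.2),
        ExpressionRecord("GENE_A", "cold", 4),
        ExpressionRecord("GENE_B", "salinity", 2, 9.0),  # too few replicates
        ExpressionRecord("GENE_C", "cold", 3, 2.5),      # fold change too low
    ]
    print(select_candidates(demo))
```

Keeping the stress signals per gene, rather than a flat gene list, mirrors the stress profiles that the database exposes for each entry.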
Figure 1: Flow chart of steps involved in the development of STIFDB.
Table of transcription factors considered.
Details of transcription factors and subfamily members available in STIFDB.
2.1. STIF Algorithm for Prediction of Transcription Factor Binding Sites
STIF is an HMM-based algorithm developed to predict transcription factor binding sites in the upstream and 5′ UTR regions of genes extracted from TAIR. The statistical significance of each prediction is calculated using a Z-score and a normalization score. The program based on the STIF method accepts a DNA sequence (upstream region + 5′ UTR) in FASTA format as input. Extensive experimental results show that abiotic stress responsive transcription factors fall into ten transcription factor families: ABI3/VP1, AP2/EREBP, ARF, bHLH, bZIP, HB, HSF, Myb, NAC, and WRKY. Together these families comprise 22 subfamilies, and abiotic stress responsive transcription factors largely belong to one of them (see Tables 1 and 2). The input sequence is scanned for matches against a library of 22 preconstructed stress responsive transcription factor HMMs obtained from the literature. After the HMM search, the scores of all possible matches in the forward and reverse orientations in the upstream regions of stress genes are calculated, along with their average and standard deviation. In the final step, the standard deviation, the average, and the scores of the hits are used to calculate the Z-score and the normalization score.
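The scanning step can be sketched in Python as follows. This is a simplified illustration and not the STIF implementation: a position-specific log-odds matrix stands in for the profile HMMs, the example sites are toy inputs built around the DRE core CCGAC, and every function name is hypothetical. It shows only the idea of scoring each window of the query in both orientations.

```python
import math
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def log_odds_matrix(sites: List[str], pseudocount: float = 1.0,
                    background: float = 0.25) -> List[Dict[str, float]]:
    """Build per-position log2-odds scores from aligned, equal-length sites.

    A stand-in for a profile HMM without insert or delete states.
    """
    matrix = []
    total = len(sites) + 4 * pseudocount
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        matrix.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def score_window(window: str, matrix: List[Dict[str, float]]) -> float:
    """Sum the per-position scores of one window."""
    return sum(column[base] for base, column in zip(window, matrix))


def scan(sequence: str, matrix: List[Dict[str, float]]) -> List[Tuple[int, str, str, float]]:
    """Score every window on both strands.

    Returns (start, strand, matched_site, score); start is always given on the
    forward strand, and matched_site is read 5' to 3' on the reported strand.
    """
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if set(window) - set("ACGT"):
            continue  # skip windows containing ambiguous bases such as N
        hits.append((start, "+", window, score_window(window, matrix)))
        rc = reverse_complement(window)
        hits.append((start, "-", rc, score_window(rc, matrix)))
    return hits


if __name__ == "__main__":
    toy_model = log_odds_matrix(["ACCGAC", "GCCGAC"])  # toy sites, not a STIF model
    best = max(scan("AAAGTCGGTAAA", toy_model), key=lambda hit: hit[3])
    print(best[:3])
```

Reporting the start coordinate on the forward strand for both orientations keeps hits from the two strands comparable when they are later drawn on a single promoter map.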
STIFDB offers several unique features as well as integrated data from public resources that will be useful for the better understanding of the TFBS and function of the downstream genes.
TFmap is a graphical representation of the upstream regions of the stress genes in Arabidopsis thaliana, with the predicted and the validated transcription factor binding sites marked along with their Z-scores. TFmaps are generated using the Bio::Graphics module from Bioperl.
The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. The TAIR ID is used in STIFDB to access the gene-based contents. Users can query the database using a TAIR ID.
GO annotations for the genes in STIFDB are obtained from TAIR. GO annotations help users understand the known functional associations of genes in STIFDB.
Gene description is another functional annotation feature; it provides a short description of each gene along with predicted domain associations from the InterPro database. Gene descriptions for genes reported in STIFDB are obtained from TAIR.
Users can access STIFDB using standard gene names or their aliases as reported in the TAIR database. For example, TAIR ID AT4G23600 refers to a single entry in the database with several aliases: CORI3, CORONATINE INDUCED 1, JASMONIC ACID RESPONSIVE 2, and JR2.
Chromosome Position refers to the exact location of the given stress gene among the 5 Arabidopsis thaliana chromosomes.
References to publications and related resources are provided on each gene-related information page.
Transcription Factor Family refers to the family whose binding site sequence has been located or predicted on a given promoter sequence. The database identifies binding sites of the ten stress responsive transcription factor families and their subfamilies.
Binding site refers to the core binding sequence to which a transcription factor binds. The binding site sequences have been characterized in literature reports, and the accompanying references are provided.
Orientation of Binding Sites refers to the DNA strand on which the transcription factor binding site has been located. It can be either on the forward strand or on the reverse DNA strand.
Stress Signal refers to the type of stress that, according to literature reports, regulates the transcription factor. Most of the transcription factors dealt with here are regulated by various abiotic stress signals like drought, cold, heat, light, and so forth. A URL is provided to access the database based on different stress signals.
The Z-score of a hit is calculated as

$$Z = \frac{\text{Score} - \text{Mean}}{\text{Std deviation}}$$

where Z is the Z-score, Score is the HMM score of the hit, Mean is the mean of the scores of all window slides of the query sequence (the window size depends on the transcription factor binding site), and Std deviation is the standard deviation of the scores of all window slides of the query sequence.
The algorithm was validated with an experimental data set of 27 stress genes from Arabidopsis thaliana. In that data set, we observed Z-scores above 2.0 for the 100 bp upstream region with its 5′ UTR, and above 1.5 for the 1000 bp upstream region with its 5′ UTR.
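As a worked illustration of the Z-score and the two cut-offs reported above, the following Python sketch standardizes a list of window scores and keeps the windows that exceed a threshold. The function names are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, because the text does not say whether the population or the sample form is used.

```python
from statistics import mean, pstdev
from typing import List, Tuple

# Cut-offs observed on the validation set, as reported in the text.
Z_CUTOFF_100BP = 2.0    # 100 bp upstream region plus 5' UTR
Z_CUTOFF_1000BP = 1.5   # 1000 bp upstream region plus 5' UTR


def z_scores(window_scores: List[float]) -> List[float]:
    """Standardize each window score against all windows of the same query."""
    mu = mean(window_scores)
    sigma = pstdev(window_scores)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0] * len(window_scores)  # all windows score the same
    return [(score - mu) / sigma for score in window_scores]


def significant_windows(window_scores: List[float],
                        cutoff: float) -> List[Tuple[int, float]]:
    """Return (window_index, z) for every window whose Z-score exceeds the cutoff."""
    return [(index, z) for index, z in enumerate(z_scores(window_scores)) if z > cutoff]


if __name__ == "__main__":
    # Nine background windows and one strong hit; the values are made up.
    scores = [1.0] * 9 + [10.0]
    print(significant_windows(scores, Z_CUTOFF_100BP))
```

For these made-up scores the mean is 1.9 and the population standard deviation is 2.7, so the last window has Z = (10 − 1.9) / 2.7 = 3.0 and passes both cut-offs, while the background windows have Z ≈ −0.33.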
The normalization score is built from two factors. The first factor is the top-ranked Z-score of a binding site for a given TFBS and stress gene, divided by the total number of binding sites for that TFBS. The second factor is the total number of binding sites for the whole TFBS library on that stress gene, divided by the total number of binding sites for the whole TFBS library on all stress genes. The normalization score describes the distribution of a particular TFBS (Transcription Factor Binding Site) across the whole data set of stress genes. A low normalization score means that the TFBS is well distributed across the data set.
STIFDB is organised so that users can browse it by four criteria: chromosome number, transcription factor, stress signal profile, and a sorted list of TAIR locus IDs. Users can search STIFDB using TAIR locus IDs, gene alias names, and stress signals. A BLAST-based (blastn) search tool is also implemented to search the database of 1000 bp promoter sequences of 2629 genes in STIFDB. A detailed screenshot of STIFDB with its various features is provided in Figure 2.
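The browse and search options described above amount to a few lookups keyed on the TAIR locus ID. The Python sketch below shows one way such an index could be organised in memory. It is not the STIFDB schema: the `GeneIndex` class and its method names are hypothetical, and the stress signal attached to the example gene is a placeholder rather than a value taken from the database. The chromosome is read from the locus ID itself, relying on the standard TAIR locus format (AT, then the chromosome, then G and five digits).

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set

# Standard TAIR locus format: chromosomes 1-5, plus C (chloroplast) and M (mitochondrion).
_TAIR_LOCUS = re.compile(r"^AT([1-5CM])G\d{5}$")


def chromosome_of(tair_id: str) -> str:
    """Extract the chromosome label encoded in a TAIR locus ID."""
    match = _TAIR_LOCUS.match(tair_id.upper())
    if match is None:
        raise ValueError(f"not a TAIR locus ID: {tair_id!r}")
    return match.group(1)


class GeneIndex:
    """Minimal in-memory index supporting lookup by ID, alias, chromosome, and stress."""

    def __init__(self) -> None:
        self.by_name: Dict[str, str] = {}
        self.by_chromosome: Dict[str, Set[str]] = defaultdict(set)
        self.by_stress: Dict[str, Set[str]] = defaultdict(set)

    def add(self, tair_id: str, aliases: Iterable[str] = (),
            stresses: Iterable[str] = ()) -> None:
        tair_id = tair_id.upper()
        self.by_chromosome[chromosome_of(tair_id)].add(tair_id)
        for name in (tair_id, *aliases):
            self.by_name[name.upper()] = tair_id  # case-insensitive lookup
        for stress in stresses:
            self.by_stress[stress.lower()].add(tair_id)

    def resolve(self, query: str) -> Optional[str]:
        """Return the TAIR locus ID for an ID or alias, or None if unknown."""
        return self.by_name.get(query.strip().upper())


if __name__ == "__main__":
    index = GeneIndex()
    index.add(
        "AT4G23600",
        aliases=["CORI3", "CORONATINE INDUCED 1", "JASMONIC ACID RESPONSIVE 2", "JR2"],
        stresses=["placeholder-stress"],  # placeholder, not a STIFDB annotation
    )
    print(index.resolve("jr2"), sorted(index.by_chromosome["4"]))
```

Mapping every alias to a single locus ID keeps one record per gene, which matches the text's example of AT4G23600 being a single entry with several aliases.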
With a growing world population, food security is a major concern, as the cultivation of food crops is at risk from various biotic and abiotic stress factors. A better understanding of plant stress response mechanisms, and the application of knowledge derived from integrated experimental and bioinformatics approaches, are gaining importance. Abiotic stresses cause up to 30% yield losses, and hence explicit data organisation and a clearer understanding of the regulation of abiotic stress responsive genes have become crucial. With genomic sequence data available, bioinformatics tools have been valuable for large-scale analyses of genes and for understanding gene regulation. STIFDB is a database of abiotic stress responsive genes, identified as responsive to various abiotic stress signals based on publicly available, genome-wide stress microarray data. STIFDB is a useful resource for analysing the promoters of these genes for potential stress-specific transcription factor binding sites, which would provide insights into their regulation by upstream transcription factors. It also provides clues about the stress signal that affects the transcription of each gene, which might clarify the signal-specific regulation of these genes. The list of genes in STIFDB indicates that abiotic stress responsive genes occur in roughly the same numbers on all chromosomes. The chromosome-wise distribution of abiotic stress responsive genes in STIFDB is provided in Figure 3. The distribution of genes responsive to specific abiotic stress signals indicates that numerous genes are regulated in response to cold, drought, salinity, light, and external ABA, while a smaller subset of genes responds to oxidative stress and rehydration. The distribution of individual stress signals that affect genes in STIFDB is provided in Figure 4. There are also 41 genes that are expressed in response to multiple abiotic signals: cold, drought, and salinity.
Analysing these genes as subsets or individually would offer clues for understanding the individual stress transcriptomes better, and analysing their promoters could provide insights into how these genes are regulated in response to their specific stress signal. We have further analysed the number of TFBS on the promoters of these abiotic stress responsive genes and have identified varying numbers of stress-specific TFBS. This gives a broad indication that the genes in STIFDB are indeed abiotic stress responsive. Certain TFBS appear in greater numbers than others. This could partly be due to differences in the length of these cis-elements. The frequency of individual transcription factor binding sites on 2629 genes in STIFDB is provided in Figure 5. STIFDB would be a very useful tool for understanding the abiotic stress transcriptome and the regulatory events of abiotic stress genes in Arabidopsis.
Figure 3: Chromosome-wise distribution of abiotic stress responsive genes in STIFDB.
Figure 4: Distribution of individual stress signal that affects genes in STIFDB.
Figure 5: Frequency of transcription factor binding sites in STIFDB.
Experimental evidence on how many of these TFBS actually bind a TF in vivo to regulate their downstream genes is still lacking, which suggests that we need to be cautious about the hits and possible false positives. It also remains to be determined whether a greater number of stress-specific TFBS on the promoter of a particular gene means a greater role for that particular TF in its regulation. It would also be worthwhile to analyse the promoters of subsets of genes that are regulated by specific stresses, to identify patterns of TFBS that could have roles in the regulation of downstream genes responsive to a particular stress. STIFDB therefore provides a platform for understanding the stress regulome of abiotic stress responsive genes in plants, and it will be a useful resource for researchers working on abiotic stress responses in plants.
The project is funded by a Grant from the Department of Biotechnology, India. Susan Mary Varghese acknowledges the CSIR for the Senior Research Fellowship during the course of this research work. The authors thank UAS and NCBS (TIFR) for infrastructural support.