Plant and Cell Physiology Advance Access originally published online on June 15, 2009
Plant and Cell Physiology 2009 50(7):1249-1259; doi:10.1093/pcp/pcp086
This article appears in the following Plant and Cell Physiology issue: Special Issue Articles: Omics and Bioinformatics
Special Issue - Regular Paper
PosMed-plus: An Intelligent Search Engine that Inferentially Integrates Cross-Species Information Resources for Molecular Breeding of Plants
1Bioinformatics And Systems Engineering (BASE) division, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045 Japan
2Plant Science Center (PSC), RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045 Japan
*Corresponding author: E-mail, toyoda{at}base.riken.jp; Fax: +81-45-503-9553.
Abstract
Molecular breeding of crops is an efficient way to upgrade plant functions useful to mankind. A key step is forward genetics or positional cloning to identify the genes that confer useful functions. In order to accelerate the whole research process, we have developed an integrated database system powered by an intelligent data-retrieval engine termed PosMed-plus (Positional Medline for plant upgrading science), allowing us to prioritize highly promising candidate genes in a given chromosomal interval(s) of Arabidopsis thaliana and rice, Oryza sativa. By inferentially integrating cross-species information resources including genomes, transcriptomes, proteomes, localizomes, phenomes and literature, the system compares a user's query, such as phenotypic or functional keywords, with the literature associated with the relevant genes located within the interval. By utilizing orthologous and paralogous correspondences, PosMed-plus efficiently integrates cross-species information to facilitate the ranking of rice candidate genes based on evidence from other model species such as Arabidopsis. PosMed-plus is a plant science version of the PosMed system widely used by mammalian researchers, and provides both a powerful integrative search function and a rich integrative display of the integrated databases. PosMed-plus is the first cross-species integrated database that inferentially prioritizes candidate genes for forward genetics approaches in plant science, and will be expanded for wider use in plant upgrading in many species.
Keywords: Omics - Omic space - Superbrain - Web application
Abbreviations: NER, named entity recognition; QTL, quantitative trait locus; RFLP, restriction fragment length polymorphism; SSR, simple sequence repeat.
Introduction
Molecular breeding is an efficient way to upgrade plant functions that are pertinent to solving a variety of problems facing mankind, including changes in the environment and shortages of food, energy and bio-based materials. The efficiency of molecular breeding depends on our ability to access and utilize the available molecular information from the viewpoint of these valuable phenotypes. The recent accumulation of plant genome sequences, and increasing knowledge of gene functions in the published literature and various omics databases, is expected to accelerate the efficiency of molecular breeding. Plant-upgrading science, or molecular-based science for upgrading plant functions, requires a coherent information platform that integrates the available molecular knowledge and plant functions that have been investigated with forward genetics approaches.
To construct such a coherent integrated information tool for plants, we focused on Arabidopsis thaliana and rice, Oryza sativa, both well-studied land plants. In particular, the genomic sequence for Arabidopsis has been determined (Arabidopsis Genome Initiative 2000) and various levels of genome-wide information have been accumulated, such as a collection of full-length cDNAs (Seki et al. 2002, Yamada et al. 2003), expression profiles (Goda et al. 2008, Matsui et al. 2008), proteomes (Baerenfaller et al. 2008) and interactomes (Cui et al. 2008). Reverse genetics approaches, whereby each gene is systematically knocked out or knocked in to observe the resulting phenotypic changes, have elucidated the relationships between phenotype and the responsible genes genome-wide (Kuromori et al. 2004, Kondou et al. 2009). These reverse genetics experiments are particularly efficient for model plants, such as Arabidopsis and rice, and the resulting data are accumulating in both public databases (Hirochika et al. 2004, Kuromori et al. 2006, Swarbreck et al. 2008) and the literature (Coletti et al. 2001). The results are useful not only for understanding gene functions, but also for upgrading the valuable phenotypes of crops by altering their genomes: molecular breeding. On the other hand, forward genetics approaches have elucidated many quantitative trait loci (QTLs) of agriculturally important traits, such as increasing grain number (Ashikari et al. 2005), grain width and weight (Song et al. 2007, Shomura et al. 2008), salt tolerance (Ren et al. 2005) and reducing heading date (Takahashi et al. 2001, Doi et al. 2004).
Thus, the coherent information system needed for molecular breeding must be capable of realizing the intelligent inferential association of phenotypes and genes, through various types of input information including experimental data described in the literature and molecular networks inferred by bioinformatics. This system is necessary to fill the gap that exists between the knowledge accumulated by reverse genetics experiments in model plants and the QTL knowledge narrowed down by forward genetics approaches for useful crops (Fig. 1). Our proposed data integration model is analogous to an artificial intelligence-oriented artificial neural network approach, categorized as connectionist–symbolic hybrid integration (Sun et al. 1997). Here we introduce PosMed-plus (Positional Medline for plant-upgrading science), which is an application of the GRASE search engine (Kobayashi et al. 2008) to the linked data generated from various plant science databases, in order to accelerate the prioritization of candidate genes for positional cloning in crops.
PosMed-plus is a plant science version of PosMed, which was initially established to assist in candidate selection for positional cloning work in mice, humans and rats (Yoshida et al. 2009).
PosMed-plus is the first information tool to prioritize candidate genes for forward genetics approaches in plant science, and will contribute to a wider use of such systems in plant-upgrading sciences in many plant species. PosMed-plus is also integrated seamlessly with other data-browsing systems to supply both a powerful integrative search function and a rich integrative display of the integrated databases. For each candidate gene, the accumulated omics information from genomes to phenomes is displayed by OmicBrowse (Toyoda et al. 2007) and can be downloaded by OmicDownload, which joins the table of candidate genes selected by PosMed-plus with tables of other annotations such as full-length cDNAs, microarray and whole-genome tiling array data, genome annotations, genetic markers, polymorphisms and gene ontology (Matsushima et al. 2009). PosMed-plus will be continuously maintained by RIKEN in order to make a significant contribution to a wide range of plant sciences, and expanded to utilize other plant information resources such as data for wheat and poplar and a manual association of literature references with genes of other plant species. PosMed-plus is available at http://omicspace.riken.jp/.
Results
A neural network representation of the statistical algorithm for searching complex literature and omics data
PosMed-plus prioritizes candidate genes for positional cloning by employing our original database search engine GRASE (Kobayashi et al. 2008).
PosMed-plus is, therefore, a powerful tool that immediately ranks the candidate genes by connecting phenotypic keywords to the genes, with connections representing both gene–gene interactions and other biological interactions such as metabolite–gene, phenotype–gene, subcellular localization–gene, co-expression, protein–protein interactions (PPIs), and ortholog and paralog data. By utilizing orthologous and paralogous connections, PosMed-plus can facilitate the ranking of rice genes based on evidence found in other plant species. The system is an artificial superbrain (Yoshida et al. 2009).
Manual curation work connecting Arabidopsis genes to the literature
The accuracy of PosMed-plus is strongly correlated with its ability to make correct associations between each gene and documents. This is because GRASE utilizes these associations to execute direct searches and inference searches that are supported by co-citations. To increase the accuracy of PosMed-plus, we employed manual curation to make connections between Arabidopsis genes and the literature. Our original curation method is based on named entity recognition (NER; see Materials and Methods for details). Rather than connecting every literature reference to genes by hand, specialized curators create search rules to retrieve all the correct references from titles, abstracts and MeSH terms. In order to validate the effectiveness of our method, we compared our curation results with TAIR annotation (Table 1).
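As a rough illustration of rule-based gene–reference mapping, the following Python sketch applies curator-style search rules (regular expressions over gene names and synonyms) to abstract text. The rule table, the placeholder gene IDs and the function name `map_genes_to_abstracts` are hypothetical; the actual curation rules used for PosMed-plus are not given in this paper.

```python
import re

# Hypothetical curator-written rules: gene ID -> regex patterns for the gene
# name and its synonyms. The IDs are placeholders, not real AGI codes.
RULES = {
    "AT0G00001": [r"\bABC1\b", r"\bexample kinase 1\b"],
    "AT0G00002": [r"\bXYZ2\b"],
}


def map_genes_to_abstracts(rules, abstracts):
    """Return the set of (gene_id, pmid) pairs whose abstract matches a rule."""
    pairs = set()
    for pmid, text in abstracts.items():
        for gene_id, patterns in rules.items():
            if any(re.search(p, text, flags=re.IGNORECASE) for p in patterns):
                pairs.add((gene_id, pmid))
    return pairs


# Toy abstracts (invented text, for illustration only).
abstracts = {
    "pmid-1": "We show that ABC1 is required for leaf development.",
    "pmid-2": "The XYZ2 mutant and the example kinase 1 mutant are sterile.",
    "pmid-3": "ABC10 is unrelated.",
}
pairs = map_genes_to_abstracts(RULES, abstracts)
```

The word-boundary anchors keep `ABC1` from matching `ABC10`, which is one simple way to reduce the false-positive hits caused by similar abbreviated names.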
The number of total MEDLINE hits for PosMed-plus was 28% larger than the TAIR data set. This is because our method has an advantage over TAIR in terms of updating new data. On the other hand, TAIR has 2.4 times more genes with associated MEDLINE records. One reason for this is that TAIR annotators extract gene–reference relationships not only based on abstracts but also using the whole article text. As a result, TAIR sometimes connects many genes to each article describing omic research. In contrast, PosMed-plus only focuses on literature with gene and/or synonym names in the abstract. Generally, non-omic research that addresses the functions of a small number of genes will mention the gene names in the abstract. Our curation results suggest that only around 15% of the total genes have been functionally analyzed in Arabidopsis. We also compared the number of gene–reference pairs. Approximately 65% of PosMed-plus data matched with TAIR data. Although PosMed-plus has more MEDLINE references, TAIR has a greater number of gene–reference pairs. This is because TAIR tends to extract many genes from a single source of literature.
General search paths of PosMed-plus
Using the search functionalities of GRASE, PosMed-plus supports the following four types of searches:
- (i) Direct search: GRASE searches genes located in the user's chromosomal interval by performing a full-text search against the set of databases with the user's keyword, i.e. the following search path is realized: keyword → document (e.g. literature) → gene → chromosomal interval (Fig. 2b, top).
- (ii) Inference search: by applying gene–gene relationships to the genes extracted by a direct search that are not located in the user's chromosomal interval, GRASE discovers further genes that are indirectly related to the keyword via gene–gene relationships, i.e. the following search path is realized: keyword → document (e.g. literature) → gene1 → gene2 → chromosomal interval. The link between gene1 and gene2 is supported by omics data (Fig. 2b, bottom).
- (iii) Cross-species search: this is an extension of the direct search (i) to the rice genome. The connections from Arabidopsis genes to rice genes are supported by ortholog and paralog data (Fig. 2c, top).
- (iv) Cross-species inference search: this is an extension of the inference search (ii) to the rice genome. As for (iii) above, ortholog and paralog data connect Arabidopsis genes to rice genes (Fig. 2c, bottom).
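The four search paths listed above can be sketched over a toy data set as follows. This is a minimal illustration, not the GRASE implementation: all gene names, documents, links, homolog pairs and coordinates are invented placeholders, and keyword matching is reduced to a naive substring test.

```python
# Toy data; every identifier below is an illustrative placeholder.
DOCS = [
    {"text": "The mutant shows enhanced salt tolerance.", "genes": {"athA"}},
    {"text": "This gene regulates flowering time.", "genes": {"athB"}},
]
# gene1 -> gene2 links (standing in for co-expression, PPI, co-citation data)
GENE_LINKS = {"athA": {"athC"}}
# Arabidopsis gene -> rice homologs (standing in for ortholog/paralog data)
HOMOLOGS = {"athA": {"osX"}, "athC": {"osY"}}
# gene -> (species, chromosome, start, end)
LOCATION = {
    "athA": ("arabidopsis", "1", 100, 200),
    "athB": ("arabidopsis", "1", 300, 400),
    "athC": ("arabidopsis", "2", 50, 90),
    "osX": ("rice", "6", 1000, 2000),
    "osY": ("rice", "6", 5000, 6000),
}


def keyword_genes(keyword):
    """keyword -> document -> gene, via a naive substring match."""
    hits = set()
    for doc in DOCS:
        if keyword.lower() in doc["text"].lower():
            hits |= doc["genes"]
    return hits


def in_interval(gene, interval):
    """True if the gene overlaps interval = (species, chromosome, start, end)."""
    species, chrom, start, end = LOCATION[gene]
    i_species, i_chrom, i_start, i_end = interval
    same_chromosome = (species, chrom) == (i_species, i_chrom)
    return same_chromosome and start <= i_end and end >= i_start


def linked_genes(genes):
    """gene1 -> gene2 for every gene1 in the input set."""
    return {g2 for g1 in genes for g2 in GENE_LINKS.get(g1, ())}


def homologs(genes):
    """Arabidopsis gene -> rice gene for every gene in the input set."""
    return {h for g in genes for h in HOMOLOGS.get(g, ())}


def direct_search(keyword, interval):
    # (i) keyword -> document -> gene -> interval
    return {g for g in keyword_genes(keyword) if in_interval(g, interval)}


def inference_search(keyword, interval):
    # (ii) gene1 is a keyword hit lying outside the interval, as in the text
    gene1 = {g for g in keyword_genes(keyword) if not in_interval(g, interval)}
    return {g for g in linked_genes(gene1) if in_interval(g, interval)}


def cross_species_search(keyword, interval):
    # (iii) keyword -> document -> Arabidopsis gene -> rice gene -> interval
    return {g for g in homologs(keyword_genes(keyword)) if in_interval(g, interval)}


def cross_species_inference_search(keyword, interval):
    # (iv) keyword -> document -> gene1 -> gene2 -> rice gene -> interval
    gene2 = linked_genes(keyword_genes(keyword))
    return {g for g in homologs(gene2) if in_interval(g, interval)}
```

For example, `cross_species_search("salt tolerance", ("rice", "6", 0, 3000))` follows the keyword to the placeholder Arabidopsis gene `athA` and then to its rice homolog `osX`.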
In the final stage, these types of search results are integrated into a ranked gene list by species. In the following section we describe examples to illustrate the powerful applications of PosMed-plus.
1. Search with user-specified keywords and chromosomal intervals. A typical application of PosMed-plus is searching with phenotypic keywords and chromosomal intervals suggested by linkage analysis. As an example, we retrieved drought tolerance-related genes in the chromosomal interval from 0 to 10 Mbp on chromosome 6 in the rice genome (Fig. 3A). In this example, PosMed-plus retrieved 24 candidate genes ranked by the statistical significance of the association between the user's keyword and each gene. Although PosMed-plus found >200,000 documents, it returned results within 1 s. Users can download all the candidate genes together with the associated gene annotations, using the download rank list button in the left blue box (Fig. 3D). PosMed-plus also supports an expert mode that allows users to select possible search paths and confirm the number of resulting genes for each search path. By clicking on a gene name listed in the gene search result page shown in Fig. 3B, PosMed-plus shows the supporting evidence for each candidate gene. To confirm the expression pattern of candidate genes with a genome browser, we provide a link to OmicBrowse from the gene location (Fig. 3C). OmicBrowse covers four genome versions for Arabidopsis and two for rice, and each genome is mapped to omic-type databases and a total of 78 data sources (Table 2).
For chromosomal intervals, users can select according to location (e.g. 10 Mbp) or in relation to restriction fragment length polymorphism (RFLP) and simple sequence repeat (SSR) markers.
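The two ways of specifying an interval can be reduced to the same representation, as in the short sketch below. The marker names, their coordinates and both function names are hypothetical; real marker positions would come from the integrated marker databases.

```python
# Hypothetical marker table: marker name -> (chromosome, position in bp).
MARKER_POS = {
    "markerA": ("1", 1_000_000),
    "markerB": ("1", 4_500_000),
    "markerC": ("2", 300_000),
}


def interval_from_location(chrom, start_mbp, end_mbp):
    """Build (chromosome, start_bp, end_bp) from coordinates given in Mbp."""
    return chrom, int(start_mbp * 1_000_000), int(end_mbp * 1_000_000)


def interval_from_markers(marker1, marker2, marker_pos=MARKER_POS):
    """Build (chromosome, start_bp, end_bp) from two flanking markers."""
    chrom1, pos1 = marker_pos[marker1]
    chrom2, pos2 = marker_pos[marker2]
    if chrom1 != chrom2:
        raise ValueError("markers lie on different chromosomes")
    # Accept the markers in either order.
    return chrom1, min(pos1, pos2), max(pos1, pos2)
```

Either function yields an interval that a downstream gene filter can consume, so the search itself does not need to know how the interval was specified.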
2. Search with phenotypic keywords. PosMed-plus also allows users to find genes related to phenotypic keywords. For example, searching with the keyword rumpled leaves, PosMed-plus shows four known cases via the direct search and one new candidate gene via the inference search. For the four known cases, PosMed-plus shows the link to RAPID (RIKEN Arabidopsis Phenome Information Database) and users can confirm the phenotypes with pictures. PosMed-plus also shows the evidence documents in the inference path to the AT1G51500 candidate gene. In this case, AT1G51500 is retrieved via the AT1G17840 gene that is one of the four known genes found in the direct search. They are highly connected with co-expression, PPI and co-citation data.
3. Reference search with gene IDs. It is difficult to retrieve all the appropriate references based on gene names, because of the wide variation of synonyms. Moreover, sometimes the same abbreviated names are used for functionally different genes, causing false-positive hits. In PosMed-plus, we carefully extracted these gene–reference relationships manually. Therefore, users can retrieve the curated results with the gene ID (e.g. AGI code) even if abstracts do not contain the gene ID itself.
4. Search for omics data. As shown in Fig. 4, PosMed-plus integrates various data such as gene annotations, co-expressions, subcellular localizations, phenotypes and PPIs. Users can select any document set (the default setting is to search everything) and retrieve the required data, all with the same interface. PosMed-plus links not only to the original databases but also to our genome browser, OmicBrowse. OmicBrowse also assists users in accessing various omics data and in downloading the data (Matsushima et al. 2009).
In silico positional cloning after QTL analysis in rice
To validate the efficacy of PosMed-plus, we checked whether PosMed-plus could successfully retrieve correct genes that have been identified by QTL analysis. Three examples are described below.
Ren et al. (2005) isolated the SKC1 gene and through QTL analysis found that it encoded an Na+-selective transporter. In this example, we need to prioritize candidate genes without the functionally related keyword transporter. Instead of the functional keyword, we retrieved genes with the phenotypic keyword salt tolerance and selected the genomic interval between the markers C955 and E50811 on chromosome 1. PosMed-plus returned the Os01g0307500 (cation transporter family protein) gene with a high ranking. This is because the keyword salt tolerance was mapped to the sodium ion transmembrane transporter gene AT4G10310, and Os01g0307500 was suggested as a homolog of AT4G10310.
Using a no-pollen type of male-sterile mutant (xs1), Zuo et al. (2008) revealed that mutant microspores are abnormally condensed and agglomerated to form a deeply stained cluster at the late microspore stage. This results in cessation of the microspore vacuolation process, and, therefore, the mutant forms lack functional pollen. This mutation is controlled by a single recessive gene, termed VR1 (vacuolation retardation 1), which is located between the molecular markers RM17411 and RM5030 on chromosome 4. We searched candidate genes with a phenotypic keyword sterility in the suggested chromosome region. PosMed-plus suggested the Os04g0605500 gene (similar to calcium-transporting ATPase) as the homolog of the Arabidopsis calcium-transporting ATPase, AT3G21180. Since Schiøtt et al. (2004) found that mutation of AT3G21180 results in partial male sterility, we conclude that PosMed-plus found an appropriate candidate.
Lastly, Zhang et al. (2008) found a male sterility mutant of anther dehiscence in advance, add(t), between the markers R02004 and RM300 on chromosome 2. In this search, PosMed-plus returned RNA-binding region RNP-1, Os02g0319100 and Disease resistance protein family protein, Os02g0301800 with strong homology with Arabidopsis genes. PosMed-plus retrieved the Os02g0319100 gene as a homolog of Arabidopsis mei2-like protein 5, AT1G29400. As supportive evidence, Kaur et al. (2006) showed that multiple mutants of all the Arabidopsis mei2-like (AML) genes displayed a sterility phenotype. The other candidate gene, Os02g0301800, was derived via an inference search. First, PosMed-plus retrieved the keyword sterility in a document describing the AT2G26330 gene. Then, AT2G26330 was linked to AT5G43470 supported by three co-citations. Finally, Os02g0301800 was returned as a homolog of AT5G43470. PosMed-plus originally suggested the Os02g0301800 gene because AT2G26330 is linked to the keyword sterility in a document. However, this document states that AT2G26330 causes aberrant ovule development and female-specific sterility. Since Zhang et al. (2008) focused on male sterility, we conclude that Os02g0319100 is the appropriate candidate.
Discussion
PosMed has been widely used to prioritize candidate genes after QTL analysis in mice and successfully identify responsible genes, as reported previously. In this paper, we aimed to create a supportive tool for molecular breeding in plants, and describe an extension of PosMed to the model plants A. thaliana and O. sativa. PosMed-plus is a useful tool to assist positional cloning in silico. At the same time, PosMed-plus integrates various kinds of omics data and assists users by allowing them to access several omics databases at a time.
In order to expand our positional cloning support system to other important crop plants, we have been preparing ortholog and paralog information (Hanada et al. 2008). We hope and expect that PosMed-plus will contribute towards solving many of the world's environmental and food problems by supporting QTL analysis for useful plants.
Materials and Methods
Data source
In order to construct PosMed-plus in Arabidopsis, we combined the following four kinds of omics data (Table 2). First, genome and related functional annotations were obtained from TAIR (Swarbreck et al. 2008).
Since OmicBrowse is designed as a scalable system for maintaining numerous genome annotation data sets, it already combined 74 databases and their different versions (Table 3). In addition to genome and transcriptome data, OmicBrowse contains marker, ontology, polymorphism and other data (Toyoda et al. 2007, Matsushima et al. 2009).
Manual high-accuracy curation for mapping from Arabidopsis gene to MEDLINE abstract
In order to develop a set of document databases for our original search engine for PosMed, we developed a method for mapping between Arabidopsis genes and MEDLINE abstracts, based on an NER (Leser et al. 2005).
PosMed-plus ranking
In order to prioritize the positional candidate genes, PosMed-plus first evaluates the statistical significance of the association between the user's keyword and each gene. For each keyword–gene pair, a 2 × 2 contingency table is generated that consists of the following:
- the number of documents that match both the keyword and the gene
- the number of documents that match the keyword but not the gene
- the number of documents that match the gene but not the keyword
- the number of documents that match neither the keyword nor the gene.
The P-value is then computed using Fisher's exact test.
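The calculation can be sketched in Python as below, using only the standard library. The paper does not state whether a one-sided or two-sided test is used, so the one-sided (enrichment) form here is an assumption, and the function name `fisher_exact_greater` is hypothetical.

```python
from math import comb


def fisher_exact_greater(both, keyword_only, gene_only, neither):
    """One-sided Fisher's exact test for a 2 x 2 keyword-by-gene table.

    both:         documents matching the keyword and the gene
    keyword_only: documents matching the keyword but not the gene
    gene_only:    documents matching the gene but not the keyword
    neither:      documents matching neither
    Returns P(X >= both) under the hypergeometric null (assumed one-sided).
    """
    keyword_docs = both + keyword_only
    gene_docs = both + gene_only
    total_docs = both + keyword_only + gene_only + neither
    denominator = comb(total_docs, gene_docs)
    largest_overlap = min(keyword_docs, gene_docs)
    numerator = sum(
        comb(keyword_docs, k) * comb(total_docs - keyword_docs, gene_docs - k)
        for k in range(both, largest_overlap + 1)
    )
    return numerator / denominator


# Toy counts (invented): 3 shared documents out of 8 in total.
p_value = fisher_exact_greater(3, 1, 1, 3)
```

A smaller P-value indicates that the keyword and the gene co-occur in more documents than expected by chance, which is what places a gene higher in the ranked list.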
For an inference search, we statistically evaluate the relevance between gene1 and gene2 using Fisher's exact test. Thereafter, we compute the total P-value as P = 1 − (1 − Ps)(1 − Pr), where Ps is the P-value of the first association search between the user's keyword and each gene, and Pr is the P-value of the gene–gene relationship applied in the second association search.
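The combination formula is simple enough to check numerically. The sketch below implements it directly; the function name `combined_p_value` and the example values are illustrative only.

```python
def combined_p_value(p_s, p_r):
    """Total P-value of an inference path: P = 1 - (1 - Ps)(1 - Pr).

    p_s: P-value of the keyword-gene1 association (first search)
    p_r: P-value of the gene1-gene2 relationship (second search)
    """
    for value in (p_s, p_r):
        if not 0.0 <= value <= 1.0:
            raise ValueError("P-values must lie in [0, 1]")
    return 1.0 - (1.0 - p_s) * (1.0 - p_r)


# Invented example values for the two steps of an inference path.
total = combined_p_value(0.01, 0.02)
```

Because 1 − (1 − Ps)(1 − Pr) = Ps + Pr − Ps·Pr, the total is never smaller than either input, so an inferred candidate cannot appear more significant than the weaker of its two supporting associations.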
To treat biological data such as PPIs using this method, all biological data are described as sentences (e.g. protein A interacts with protein B) and they are stored as document sets in PosMed-plus.
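A minimal sketch of this conversion is shown below. The record layout (`id`, `text`, `genes`), the function names and the protein names are assumptions made for illustration; the paper only states that such data are stored as sentences in document sets.

```python
def interaction_to_sentence(protein_a, protein_b, relation="interacts with"):
    """Describe one pairwise relationship as a plain-text sentence."""
    return f"{protein_a} {relation} {protein_b}."


def build_document_set(interactions):
    """Turn (protein_a, protein_b) pairs into searchable document records."""
    documents = []
    for index, (protein_a, protein_b) in enumerate(interactions, start=1):
        documents.append(
            {
                "id": f"ppi-{index}",  # hypothetical identifier scheme
                "text": interaction_to_sentence(protein_a, protein_b),
                "genes": {protein_a, protein_b},
            }
        )
    return documents


# Placeholder protein names, not taken from any database.
ppi_documents = build_document_set([("proteinA", "proteinB"), ("proteinB", "proteinC")])
```

Stored this way, an interaction record can be counted in the same contingency tables as a literature abstract, which is why a single statistical test can cover both literature and omics data.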
Implementation
PosMed-plus was developed as a web-oriented tool using Java and Java Servlet. Detailed information is provided in Kobayashi et al. (2008) and Yoshida et al. (2009). Users can freely access PosMed-plus with a conventional web browser, and no plug-ins need to be installed. However, for Windows we recommend the use of Microsoft Internet Explorer 7 or later, or Firefox 2 or later, and for Macintosh we recommend Safari 2 or later, or Firefox 2 or later. PosMed-plus is freely available at http://omicspace.riken.jp/PosMed-plus/.
Funding
The Japanese Ministry of Education, Culture, Sports, Science and Technology Special Coordination Funds.
Acknowledgements
We thank Koji Doi and Michiel J. L. de Hoon for critically reading the manuscript.
Footnotes
3These authors contributed equally to this work.
References
Adie E, Adams R, Evans K, Porteous D, Pickard B. SUSPECTS: enabling fast and effective prioritization of positional candidates. Bioinformatics (2006) 22:773–774.
Aerts S, Lambrechts D, Maity S, Van Loo P, Coessens B, et al. Gene prioritization through genomic data fusion. Nat. Biotechnol. (2006) 24:537–544.
Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature (2000) 408:796–815.
Ashikari M, Sakakibara H, Lin S, Yamamoto T, Takashi T, Nishimura A, et al. Cytokinin oxidase regulates rice grain production. Science (2005) 309:741–745.
Baerenfaller K, Grossmann J, Grobei M, Hull R, Hirsch-Hoffmann M, Yalovsky S, et al. Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science (2008) 320:938–941.
Coletti M, Bleich H. Medical subject headings used to search the biomedical literature. J. Amer. Med. Inform. Assoc. (2001) 8:317–323.
Cui J, Li P, Li G, Xu F, Zhao C, Li Y, et al. AtPID: Arabidopsis thaliana protein interactome database—an integrative platform for plant systems biology. Nucleic Acids Res. (2008) 36:D999–D1008.
Doi K, Izawa T, Fuse T, Yamanouchi U, Kubo T, Shimatani Z, et al. Ehd1, a B-type response regulator in rice, confers short-day promotion of flowering and controls FT-like gene expression independently of Hd1. Genes Dev. (2004) 18:926–936.
Goda H, Sasaki E, Akiyama K, Maruyama-Nakashita A, Nakabayashi K, Li W, et al. The AtGenExpress hormone and chemical treatment data set: experimental design, data evaluation, model data analysis and data access. Plant J. (2008) 55:526–542.
Hanada K, Zou C, Lehti-Shiu M, Shinozaki K, Shiu S. Importance of lineage-specific expansion of plant tandem duplicates in the adaptive response to environmental stimuli. Plant Physiol. (2008) 148:993–1003.
Harushima Y, Yano M, Shomura A, Sato M, Shimano T, Kuboki Y, et al. A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics (1998) 148:479–494.
Heazlewood J, Verboom R, Tonti-Filippini J, Small I, Millar A. SUBA: the Arabidopsis subcellular database. Nucleic Acids Res. (2007) 35:D213–D218.
Hirochika H, Guiderdoni E, An G, Hsing Y, Eun M, Han CD, et al. Rice mutant resources for gene discovery. Plant Mol. Biol. (2004) 54:325–334.
Kato N, Watanabe Y, Ohno Y, Inoue T, Kanno Y, Suzuki H, et al. Mapping quantitative trait loci for proteinuria-induced renal collagen deposition. Kidney Int. (2008) 73:1017–1023.
Kaur J, Sebastian J, Siddiqi I. The Arabidopsis-mei2-like genes play a role in meiosis and vegetative growth in Arabidopsis. Plant Cell (2006) 18:545–559.
Kobayashi N, Toyoda T. Statistical search on the Semantic Web. Bioinformatics (2008) 24:1002–1010.
Kondou Y, Higuchi M, Takahashi S, Sakurai T, Ichikawa T, Kuroda H, et al. Systematic approaches to using the FOX hunting system to identify useful rice genes. Plant J. (2009) 57:883–894.
Kuromori T, Hirayama T, Kiyosue Y, Takabe H, Mizukado S, Sakurai T, et al. A collection of 11,800 single-copy Ds transposon insertion lines in Arabidopsis. Plant J. (2004) 37:897–905.
Kuromori T, Wada T, Kamiya A, Yuguchi M, Yokouchi T, Imura Y, et al. A trial of phenome analysis using 4,000 Ds-insertional mutants in gene-coding regions of Arabidopsis. Plant J. (2006) 47:640–651.
Leser U, Hakenberg J. What makes a gene name? Named entity recognition in the biomedical literature. Brief Bioinform. (2005) 6:357–369.
Matsui A, Ishida J, Morosawa T, Mochizuki Y, Kaminuma E, Endo TA, et al. Arabidopsis transcriptome analysis under drought, cold, high-salinity and ABA treatment conditions using a tiling array. Plant Cell Physiol. (2008) 49:1135–1149.
Matsushima A, Kobayashi N, Mochizuki Y, Ishii M, Kawaguchi S, Ishii M, et al. OmicBrowse: a Flash-based high-performance graphics interface for genomic resources. Nucleic Acids Res. (2009) (in press).
McCouch S, Teytelman L, Xu Y, Lobos K, Clare K, Waltam M, et al. Development and mapping of 2240 new SSR markers for rice (Oryza sativa L.). DNA Res. (2002) 9:199–207.
Moritani M, Togawa K, Yaguchi H, Fujita Y, Yamaguchi Y, Inoue H, et al. Identification of diabetes susceptibility loci in db mice by combined quantitative trait loci analysis and haplotype mapping. Genomics (2006) 88:719–730.
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. (2009) 37:D987–D991.
Ren Z, Gao J, Li L, Cai X, Huang W, Chao D, et al. A rice quantitative trait locus for salt tolerance encodes a sodium transporter. Nat. Genet. (2005) 37:1141–1146.
Schiøtt M, Romanowsky S, Baekgaard L, Jakobsen M, Palmgren M, Harper J. A plant plasma membrane Ca2+ pump is required for normal pollen tube growth and fertilization. Proc. Natl Acad. Sci. USA (2004) 101:9502–9507.
Seelow D, Schwarz J, Schuelke M. GeneDistiller—distilling candidate genes from linkage intervals. PLoS ONE (2008) 3:e3874.
Seki M, Narusaka M, Kamiya A, Ishida J, Satou M, Sakurai T, et al. Functional annotation of a full-length Arabidopsis cDNA collection. Science (2002) 296:141–145.
Shomura A, Izawa T, Ebana K, Ebitani T, Kanegae H, Konishi S, et al. Deletion in a gene associated with grain size increased yields during rice domestication. Nat. Genet. (2008) 40:1023–1028.
Song X, Huang W, Shi M, Zhu M, Lin H. A QTL for rice grain width and weight encodes a previously unknown RING-type E3 ubiquitin ligase. Nat. Genet. (2007) 39:623–630.
Sun R, Alexandre F. Connectionist–Symbolic Integration: From Unified to Hybrid Approaches. (1997) London: Lawrence Erlbaum Associates Inc.
Swarbreck D, Wilks C, Lamesch P, Berardini T, Garcia-Hernandez M, Foerster H, et al. The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. (2008) 36:D1009–D1014.
Takahashi Y, Shomura A, Sasaki T, Yano M. Hd6, a rice quantitative trait locus involved in photoperiod sensitivity, encodes the alpha subunit of protein kinase CK2. Proc. Natl Acad. Sci. USA (2001) 98:7922–7927.
Tanaka T, Antonio B, Kikuchi S, Matsumoto T, Nagamura Y, Numa H, et al. The Rice Annotation Project Database (RAP-DB): 2008 update. Nucleic Acids Res. (2008) 36:D1028–D1033.
Thornblad T, Elliott K, Jowett J, Visscher P. Prioritization of positional candidate genes using multiple web-based software tools. Twin Res. Hum. Genet. (2007) 10:861–870.
Toyoda T, Mochizuki Y, Player K, Heida N, Kobayashi N, Sakaki Y. OmicBrowse: a browser of multidimensional omics annotations. Bioinformatics (2007) 23:524–526.
UniProt Consortium. The Universal Protein Resource (UniProt) 2009. Nucleic Acids Res. (2009) 37:D169–D174.
van Driel M, Cuelenaere K, Kemmeren P, Leunissen J, Brunner H, Vriend G. GeneSeeker: extraction and integration of human disease-related information from web-based genetic databases. Nucleic Acids Res. (2005) 33:W758–W761.
Yamada K, Lim J, Dale J, Chen H, Shinn P, Palm CJ, et al. Empirical analysis of transcriptional activity in the Arabidopsis genome. Science (2003) 302:842–846.
Yoshida Y, Makita Y, Heida N, Asano S, Matsushima A, Ishii M, et al. PosMed (Positional Medline): prioritizing genes with an artificial neural network comprising medical documents to accelerate positional cloning. Nucleic Acids Res. (2009) (in press).
Zhang Y, Li Y, Zhang J, Shen F, Huang Y, Wu Z. Characterization and mapping of a new male sterility mutant of anther advanced dehiscence (t) in rice. J. Genet. Genomics (2008) 35:177–182.
Zuo L, Li S, Chu M, Wang S, Deng Q, Ding L, et al. Phenotypic characterization, genetic analysis, and molecular mapping of a new mutant gene for male sterility in rice. Genome (2008) 51:303–308.
(Received March 31, 2009; Accepted June 10, 2009)