Wheat is one of the leading crops in Russia and worldwide (http://www.fao.org/fileadmin/templates/est/meetings/wto_comm/Trade_Policy_Brief_Russia_final.pdf). In most agricultural regions of the Russian Federation, the wheat pathogen complex affecting chlorophyll-containing plant organs is represented by the following species: Puccinia triticina, causing leaf rust; P. graminis, causing stem rust; and Blumeria graminis, causing powdery mildew. The population structure of fungal pathogens depends on the environment and the wheat genotype. Given the evolution of host-pathogen interactions [1, 2], the genetic diversity of both wheat and its fungal pathogens must be monitored. Plant disease resistance genes detect pathogen attacks and facilitate a counterattack against pathogens [3,4,5,6]. Information on avirulence (Avr) genes in migrating fungal populations, together with resistance (R) genes and translocations in wheat lines, is important to incorporate into breeding programs. However, this pool of data is hard to use without an integrative system. No database is available for Russian wheat varieties and breeding lines genotyped for resistance genes, and there is no open access to information on virulence genes in pathogen populations spreading on wheat crops.
Regarding the diversity of Russian wheat germplasm, there is a unique database, GRIS, developed at the Vavilov Research Institute. GRIS (http://wheatpedigree.net) is a Genetic Resources Information System for Wheat and Triticale that provides information services for breeding and research programs and contains data on more than 170,000 genotypes (varieties and breeding lines) from national germplasm collections and research laboratories. Although it is of undoubted value to wheat geneticists, this database does not fully meet the needs of breeders and plant pathologists involved in wheat breeding programs for immunity.
Now that Marker Assisted Selection (MAS) complements conventional selection methods, there is an urgent need to make MAS techniques widely available. A good example of such an information resource is the open database MASwheat, developed in the USA. MASwheat (http://maswheat.ucdavis.edu/) provides an extensive list of protocols for more than 40 molecular markers for resistance genes in wheat. However, before being applied in a regional breeding program, the molecular markers listed in MASwheat should be verified on an extended array of Russian germplasm.
To catalogue information serving the interests of Russian wheat pathologists and breeders who use conventional methods and MAS techniques, we have developed the MIGREW (Molecular Identification of Genes for Resistance in Wheat) database. The main goal of the database is to increase the potential of national wheat breeding for immunity to rusts and powdery mildew. MIGREW also records the effectiveness of wheat resistance genes in different regions of Russia, so that the database keeps pace with the rapidly changing population structure of pathogens.
Database organization and content
MIGREW follows the classical MVC (Model-View-Controller) architecture. The Model layer is a PostgreSQL database containing 16 tables. The View layer is a web application written in JavaScript with the Webix (webix.com) libraries. The Controller layer is a Java application built with the spring.io libraries that provides REST API access to the data. The simplified database scheme (see Fig. 1) shows the data structures and the main connections between objects.
The database contains the following objects: diseases, plants, chromosomes, genes, markers, protocols, pathogens and papers. We have built advanced relations between these objects, which makes it possible to retrieve information in the following manner: for stem rust, the database holds X genes, each with its own resistance scores in selected regions, and the supporting evidence is given in the linked list of publications. One can also go deeper: for each object in the database, all levels of related objects can be obtained. Some typical use cases are implemented in the web application.
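The chain of relations described above (disease, genes, regional scores, publications) can be illustrated with a minimal Python sketch. The class names, field names and records below are assumptions made for illustration only; they do not reproduce the actual 16-table MIGREW schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Paper:
    title: str


@dataclass
class RegionalScore:
    region: str  # placeholder region label
    score: int   # effectiveness on the 0-4 scale
    papers: List[Paper] = field(default_factory=list)


@dataclass
class Gene:
    symbol: str
    scores: List[RegionalScore] = field(default_factory=list)


@dataclass
class Disease:
    name: str
    genes: List[Gene] = field(default_factory=list)


def genes_scored_in(disease: Disease, region: str) -> List[str]:
    """Symbols of genes that have at least one score in the given region."""
    return [
        gene.symbol
        for gene in disease.genes
        if any(score.region == region for score in gene.scores)
    ]


def papers_for(disease: Disease, region: str) -> List[str]:
    """Walk disease -> genes -> regional scores -> papers, without duplicates."""
    titles: List[str] = []
    for gene in disease.genes:
        for score in gene.scores:
            if score.region != region:
                continue
            for paper in score.papers:
                if paper.title not in titles:
                    titles.append(paper.title)
    return titles


# Placeholder records (hypothetical, not taken from MIGREW).
paper_1 = Paper("Paper 1")
paper_2 = Paper("Paper 2")
stem_rust = Disease(
    "Stem rust",
    genes=[
        Gene("GeneA", scores=[RegionalScore("Region 1", 3, [paper_1])]),
        Gene("GeneB", scores=[RegionalScore("Region 2", 1, [paper_2])]),
        Gene("GeneC", scores=[RegionalScore("Region 1", 0, [paper_1, paper_2])]),
    ],
)
```

With these placeholder records, `genes_scored_in(stem_rust, "Region 1")` lists the genes scored in that region, and `papers_for(stem_rust, "Region 1")` collects the publications behind those scores.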
In addition, the MIGREW REST API gives direct access to the data from any programming or modeling tool that supports REST service calls (Python, R or Matlab, for instance). The web application uses the same API, so users can build their own tools for exploring MIGREW data. An example of using the MIGREW REST API with Python can be found in Additional file 1.
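As a rough illustration of what such a call might look like, the following Python sketch uses only the standard library. The API root, the resource name `genes` and the query parameter `disease` are hypothetical placeholders; the actual paths and parameters should be taken from Additional file 1.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def build_url(base_url, resource, **params):
    """Compose a request URL from an API root, a resource name and filters."""
    url = urljoin(base_url, resource)
    if params:
        # Sort the parameters so that the resulting URL is deterministic.
        url += "?" + urlencode(sorted(params.items()))
    return url


def decode(payload):
    """Decode a JSON response body into Python objects."""
    return json.loads(payload)


def fetch(base_url, resource, **params):
    """Perform a GET request and return the decoded JSON response."""
    with urlopen(build_url(base_url, resource, **params), timeout=30) as response:
        return decode(response.read().decode("utf-8"))


# Hypothetical call: the API root and names are placeholders, not MIGREW paths.
example_url = build_url("https://example.org/api/", "genes", disease="stem rust")
```

Keeping URL construction and JSON decoding separate from the network call makes the client easy to check offline, and the decoded lists of records can be handed to any data analysis library.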
MIGREW was initially loaded with published data [7,8,9,10,11,12,13,14,15] along with our original unpublished data. Most of the entities in MIGREW are linked to their data sources. Many of these publications have not yet been digitized (books and articles that have not been scanned into .pdf or another digital format) and perhaps never will be.
At present, the database contains information extracted from 50 papers published between 1973 and 2018. It covers 4 diseases and 3 genes associated with them; 45 pyramided genotypes (listed as gene pyramids in the database) that combine at least one of those genes; 11 markers identified for these genes; and 502 plant varieties.
Utility and discussion
The MIGREW home page (migrew.sysbio.cytogen.ru) provides easy access to any part of the database. It contains basic information such as the Database Statistics, About and Feedback pages (Fig. 2).
The home page presents the typical directions for data exploration: Localization of genes, Protocols of molecular markers, and Wheat diseases, resistance and virulence genes. On the left side, there is a main navigation menu with the data domains: “home, protocols, genes, chromosomes, markers, plants, pyramids, diseases, resistance, pathogens, and papers”. Each domain has its own tab that presents the stored information related to it. All tabs share a similar structure of data visualization: a list of the available domain objects on the main page, and a description of a single object that appears after double-clicking a list element. The Description page also contains tabs with lists of related objects. The “Diseases” domain (Fig. 3) is a good illustration of this interface.
Selecting the “Diseases” option shows the list of fungal diseases of wheat stored in the database. Clicking a specific disease opens its description page (Fig. 3, right), which provides the following information: the list of wheat resistance genes and their effectiveness in different regions of the Russian Federation; the wheat varieties and breeding lines carrying the resistance genes, either singly or in pyramids with other genes; and the distribution of virulence in the pathogen population on the map of the Russian Federation.
Choosing the “Genes” tab in the menu (Fig. 4) shows the list of resistance genes available in the database together with their basic properties, such as translocation state and localization on the chromosome. Across the top of the page, there is a search bar for gene symbols. Double-clicking the tag of a gene of interest opens a page with additional information: publications in which the gene is mentioned; effectiveness against diseases, estimated on a 0–4 scale; molecular markers for identifying the gene; pyramids containing the gene; and plants in which the gene was identified, with links to the corresponding publications.
Another way to obtain gene information is to request it via the chromosome, by choosing the “Chromosome” tab in the menu. Across the top of the page, there is a search bar for chromosomes. The bar also supports filtering chromosomes by genome (A, B, D), chromosome number (1–7) and chromosome arm (long or short). The description page of the selected chromosome lists the genes carried on the target fragment of the wheat genome.
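The filtering logic can be sketched in a few lines of Python. This is only an illustration: it assumes that chromosomes are labelled in the usual wheat notation (for example `1AL` for the long arm of chromosome 1A, with the arm letter optional), which may differ from the labels actually stored in MIGREW.

```python
import re

# Assumed label format: number 1-7, genome A/B/D, optional arm L (long) or S (short).
CHROMOSOME = re.compile(r"^([1-7])([ABD])([LS])?$")


def parse_chromosome(name):
    """Split a label such as '1AL' into number, genome and arm."""
    match = CHROMOSOME.match(name.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognized chromosome label: {name!r}")
    number, genome, arm = match.groups()
    return {"number": int(number), "genome": genome, "arm": arm}


def filter_chromosomes(names, genome=None, number=None, arm=None):
    """Keep the labels that match every filter that was supplied."""
    selected = []
    for name in names:
        parsed = parse_chromosome(name)
        if genome is not None and parsed["genome"] != genome:
            continue
        if number is not None and parsed["number"] != number:
            continue
        if arm is not None and parsed["arm"] != arm:
            continue
        selected.append(name)
    return selected


labels = ["1AL", "1BS", "7DS", "2B"]
b_genome = filter_chromosomes(labels, genome="B")
short_arms = filter_chromosomes(labels, arm="S")
```

Filters left as `None` are ignored, so the same function covers a search by genome only, by number only, or by any combination of the three criteria.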
Choosing the “Markers” tab in the menu shows the list of markers stored in the database and their basic properties, such as inheritance type, structure and a short description (Fig. 5, left). Across the top of the page, there is a search bar for marker symbols. Double-clicking the tag of a marker of interest opens a page with more information (Fig. 5, right): the corresponding publications; the genes that it marks; the marker protocol; and the plants in which the marker was identified. Markers verified by the Laboratory of Plant Molecular Genetics and Cytogenetics at the Institute of Cytology and Genetics SB RAS are illustrated with original electropherograms. MIGREW also offers an option to contribute data on markers verified in other laboratories.
Choosing the “Protocols” tab in the menu (Fig. 6, left) shows the list of molecular markers and the corresponding resistance genes. Double-clicking a marker or gene tag shows the protocol data, which include the amplification cycles, annealing temperature, visualization system and length of the diagnostic PCR fragment (Fig. 6, right).
Choosing the “Plants” tab in the menu (Fig. 7, left) shows the list of plants stored in the database. Double-clicking a plant opens a page with additional information (Fig. 7, right), such as the originator of the variety, the genes and markers identified in this plant genotype, effectiveness against diseases, and links to the corresponding publications.
Choosing the “Pyramids” tab in the menu (Fig. 8, left) shows the list of gene pyramids represented in wheat varieties. At the top of the page, there is a search bar for gene symbols, which finds the gene pyramids that include a gene of interest (Fig. 8, right). The description page of the selected pyramid contains not only the corresponding paper and plant, but also links to the genes forming the pyramid.
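Searching for pyramids by gene symbol amounts to a set-membership test, as the following Python sketch shows. The pyramid names and gene symbols are placeholders chosen for illustration and do not correspond to MIGREW records.

```python
def pyramids_with_gene(pyramids, gene):
    """Return the names of the pyramids whose gene set contains `gene`.

    `pyramids` maps a pyramid name to the set of gene symbols it combines.
    The comparison ignores letter case, as a search bar typically would.
    """
    wanted = gene.strip().lower()
    return sorted(
        name
        for name, genes in pyramids.items()
        if wanted in {symbol.lower() for symbol in genes}
    )


# Placeholder data (hypothetical pyramids and gene symbols).
example_pyramids = {
    "Pyramid 1": {"GeneA", "GeneB"},
    "Pyramid 2": {"GeneB", "GeneC"},
    "Pyramid 3": {"GeneC"},
}

with_gene_b = pyramids_with_gene(example_pyramids, "geneb")
```

Here `with_gene_b` contains the two placeholder pyramids that include `GeneB`; a gene that occurs in no pyramid yields an empty list.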
Choosing the “Resistance” tab in the menu leads to the “Resistance” page or to the “Virulence” page, each equipped with a search bar and an interactive map (Fig. 9).
The “Resistance” page provides data on the effectiveness of a wheat resistance gene or a group of co-segregating resistance genes in the regions of Russia. The “Virulence” page provides information on the occurrence rate of a virulence gene in the pathogen populations of Russia. The maps show the distributions of wheat resistance genes or pathogen virulence genes. Regions on the map are coded according to the ISO 3166-2 standard. A region is highlighted and a tooltip appears only if MIGREW contains data on resistance or virulence for that region. Users can submit updates through the Feedback option. Information on pathogens is available in the additional menu tab “Pathogens”, which shows the list of pathogens stored in the database and a search bar for virulence gene symbols.
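The map rule (highlight a region and show a tooltip only when data exist for it) can be expressed as a lookup keyed by ISO 3166-2 code. The Python sketch below is an illustration under stated assumptions: the data layout, the gene symbol and the scores are hypothetical, and only the region codes (`RU-NVS`, `RU-OMS`) are real ISO 3166-2 identifiers.

```python
def region_tooltip(records, region_code, gene):
    """Return tooltip text for a region, or None when no data are stored.

    `records` maps (ISO 3166-2 region code, gene symbol) to an
    effectiveness score on the 0-4 scale.
    """
    score = records.get((region_code, gene))
    if score is None:
        return None  # no data: the region stays unhighlighted, no tooltip
    return f"{region_code}: {gene} effectiveness {score}/4"


def highlighted_regions(records, gene):
    """List the region codes that hold data for the gene, in sorted order."""
    return sorted({code for (code, symbol) in records if symbol == gene})


# Placeholder scores (hypothetical values, not taken from MIGREW).
example_records = {
    ("RU-NVS", "GeneA"): 3,
    ("RU-OMS", "GeneA"): 0,
    ("RU-OMS", "GeneB"): 2,
}

regions_for_gene_a = highlighted_regions(example_records, "GeneA")
```

Returning `None` rather than an empty string keeps "no data" distinct from a stored score of 0, which is a valid value on the scale.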
To search for information from a particular paper, one can use the “Papers” menu tab. It contains the list of publications stored in the database and a set of filters that help to find the paper(s) of interest. The paper description page shows the data associated with the selected paper, such as Genes, Markers and Protocols.