Research Projects

Scientists at the National Center for Soybean Biotechnology (NCSB) are committed to developing modern genomic tools to support soybean research. NCSB scientists are working on a number of projects in collaboration with other members of the soybean community. The projects underway are too numerous to describe in detail, so a few key examples are given below.

Genome Sequencing and Analysis

NCSB scientists are making a significant effort to sequence a large number of diverse soybean germplasm lines and to create more soybean reference genomes.

NCSB scientists contributed to the sequencing of the first soybean reference genome, Williams 82. The article titled “Genome sequence of the palaeopolyploid soybean” describes the importance of this reference genome for soybean research.

A total of 350 diverse soybean germplasm lines are being sequenced as part of a project titled “Large Scale Sequencing of Germplasm to Develop Genomic Resources for Soybean Improvement”. This project was funded by the United Soybean Board and three private companies: Bayer CropScience, Dow AgroSciences, LLC, and Monsanto. Dr. Henry Nguyen’s Laboratory at the University of Missouri is coordinating this research. This is the first public-private partnership of this magnitude in the soybean genetic research arena. The results of this project will lay the groundwork for future soybean genetic and breeding research. As part of this project, the southern U.S. soybean cultivar “Lee” (PI 548656) was selected for sequencing to create a second soybean reference genome. It will complement the first reference genome, “Williams 82”, which was chosen to represent the northern U.S. germplasm.

In collaboration with USDA-ARS scientists, the most diverse soybean lines in the U.S. germplasm collection were selected for sequencing. The list of the 350 soybean lines that have been sequenced is available at SoyBase, SoyKB, and on the Nguyen Lab Website News Page.

The United Soybean Board provided additional support to Dr. Henry Nguyen’s Laboratory to sequence a further 300 diverse soybean lines. The Nguyen lab is also sequencing a wild soybean (Glycine soja) line to create another reference genome. Glycine soja is the wild relative of Glycine max, the modern-day cultivated soybean. This genome will help soybean researchers discover how the soybean grown today has evolved over millions of years.

The data generated by these projects will benefit the soybean community by giving both public and private soybean breeders and researchers new resources for improving soybean varieties for United States farmers. This resource will contribute to the 1,000 soybean genome initiative, which will lead to the development of the next-generation soybean HapMap and genomic tools for soybean improvement in the United States and around the world.

Soybean Improvement

The overall goal of the National Center for Soybean Biotechnology (NCSB) is to integrate genomics and breeding research to develop superior soybean cultivars that help U.S. farmers maintain their global competitiveness and expand utilization of the soybean crop. The Center will develop and utilize new technologies from a broad range of laboratory and field research studies. The Soybean Genomics and Biotechnology program will develop genomic maps and focus on seed quality and on understanding the genetic control of yield, environmental stress tolerance, and pest resistance in soybean crops.

A key goal of this group is the development of value-added soybeans with improved functionality (e.g., improved oil content, increased health benefits, modified proteins) for broader use in food, feed, biofuels and industrial products. The Soybean Breeding program will use molecular biology (e.g., marker-assisted selection, MAS) and genomic technologies (e.g., transcriptome, proteome and metabolome analysis) to enhance the soybean germplasm base, which will be useful for developing superior cultivars for soybean producers. This research is expected to maximize production efficiency, enhance nutritional value, and lead to new industrial uses of soybean. Central to these efforts is the development and refinement of the breeder's toolbox for soybean improvement.

Biotic stress: The NCSB has a large program devoted to developing new tools for plant breeding and gene discovery, particularly for disease resistance breeding. More than 100 diseases affect soybeans, of which 10 are major diseases that cause severe damage. For example, the soybean cyst nematode (SCN) (Heterodera glycines) is the leading cause of yield reduction in the U.S. and the world. Current SCN research has focused on discovering new genes for resistance to this pest, complementing plant breeding research that will incorporate SCN resistance genes into productive varieties. We will apply similar approaches to discover new quantitative trait loci for resistance to other diseases such as root-knot nematode [Meloidogyne incognita (Kofoid & White) Chitwood], reniform nematode (Rotylenchulus reniformis), soybean rust (Phakopsora pachyrhizi Sydow and Phakopsora meibomiae Arthur), frogeye leaf spot (Cercospora sojina K. Hara) and charcoal rot (Macrophomina phaseolina). These diseases are some of the major biotic factors that significantly affect soybeans.

Abiotic stress: Abiotic stresses such as drought, flooding, salinity, temperature stress and factors associated with climate change have received significant attention from scientists at the NCSB. Developing soybean germplasm with tolerance to an array of these factors is very important because abiotic stress causes significant soybean yield losses every year. Breeding for increased abiotic stress tolerance in soybean is a long-term and difficult task, in part because tolerance is controlled by many genes. Scientists at the NCSB are identifying new sources of tolerance to abiotic stresses and have developed mapping populations for identifying major abiotic stress tolerance genes and QTLs. Specific DNA markers can pinpoint the location of these genes, assist in sequencing and cloning genes of economic significance, and be used in marker-assisted breeding. Ultimately, incorporating new alleles for tolerance to drought, flooding and alkaline soil conditions will lead to productive soybeans that suffer smaller losses from these stresses.
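The idea that DNA markers can point to the location of a tolerance gene can be illustrated with a minimal single-marker sketch. It is not the NCSB's actual analysis: real QTL mapping typically uses interval mapping and formal significance tests. The marker names, genotype codes ("A" and "B" for the two parental alleles) and tolerance scores below are hypothetical toy values chosen only to show the logic: lines are grouped by their genotype at each marker, and the marker whose genotype classes differ most in mean phenotype is the one most likely to sit near a gene affecting the trait.

```python
from statistics import mean


def marker_effect(genotypes, phenotypes):
    """Difference in mean phenotype between the two genotype classes at one marker."""
    groups = {}
    for genotype, phenotype in zip(genotypes, phenotypes):
        groups.setdefault(genotype, []).append(phenotype)
    if len(groups) != 2:
        raise ValueError("this sketch expects exactly two genotype classes")
    (_, values_a), (_, values_b) = sorted(groups.items())
    return mean(values_a) - mean(values_b)


def strongest_marker(marker_table, phenotypes):
    """Return the marker with the largest absolute difference between class means."""
    effects = {
        name: marker_effect(genotypes, phenotypes)
        for name, genotypes in marker_table.items()
    }
    best = max(effects, key=lambda name: abs(effects[name]))
    return best, effects


# Hypothetical toy data: six lines from a mapping population,
# scored for a tolerance trait (higher = more tolerant).
tolerance_scores = [7.0, 8.0, 4.0, 5.0, 9.0, 3.0]
markers = {
    "marker_1": ["A", "A", "B", "B", "A", "B"],  # tracks the trait closely
    "marker_2": ["A", "B", "A", "B", "A", "B"],  # only weakly associated
}

best, effects = strongest_marker(markers, tolerance_scores)
print(best, effects[best])
```

In a breeding program, a marker showing a consistent effect of this kind could then be used to select seedlings by genotype rather than waiting to score the trait in the field, which is the basic idea behind marker-assisted breeding.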

Seed composition: Another goal of scientists at the NCSB is to improve the functionality of soybean protein and oil for greater utility in food, feed and industrial markets. For example, modifying the fatty acid profile of soybean oil will lead to greater use of soy oil in more products. We currently have experimental germplasm lines that are low in saturated fatty acids (palmitic and stearic acid) and in the polyunsaturated fatty acid linolenic acid. Oil low in saturated fatty acids is associated with a reduced risk of heart disease, and a low linolenic acid concentration improves the oxidative stability and shelf life of the oil. Recently, we discovered new alleles that increase the monounsaturated fatty acid oleic acid in soy oil from 23% to about 80%. This will make soybean oil more like olive oil, with improved health benefits and greater heat and oxidative stability for use in frying, biofuels and lubricants. NCSB scientists are also working to reduce anti-nutritional factors in soybeans by lowering the seed content of the indigestible carbohydrates stachyose and raffinose, and to understand the genetic regulation of health-related compounds in soybean seeds such as isoflavones, saponins, and phytosterols.


A hallmark of modern biology is large-scale -omics data. These data are massive and often very complex, which creates a need for extensive storage, detailed computational analysis, fast retrieval and efficient integration in order to understand the data and generate hypotheses about the underlying biological system. Our efforts to study the systems biology of the soybean root hair cell have generated very large datasets. In addition, other laboratories are now applying these methods to soybean to address a variety of biological questions. To meet the need for web resources that can handle the complex task of integrating soybean -omics data and provide data annotation (e.g., pathway information), we developed the Soybean Knowledge Base.

The Soybean Knowledge Base (SoyKB) is a comprehensive web resource for soybean. SoyKB is designed to handle the storage and integration of genomics, microarray, transcriptomics, proteomics and metabolomics data along with function and pathway information. It has four modules. The main module is a MySQL database at the back end that incorporates and integrates soybean genomics and other -omics data from various sources. It is designed to hold information on four entities: genes, miRNAs, metabolites and SNPs. The other three modules are front ends: a web interface, a genome browser and pathway integration.
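As a rough illustration of the integration idea described above, the following sketch models the four entity types and attaches data from different -omics layers to a single record. It is not SoyKB's actual schema or code: the class names, field names, identifiers and layer labels are all hypothetical, and an in-memory dictionary stands in for the MySQL back end.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityType(Enum):
    """The four entity types the text says the database holds."""
    GENE = "gene"
    MIRNA = "miRNA"
    METABOLITE = "metabolite"
    SNP = "SNP"


@dataclass
class Entity:
    entity_id: str                                  # hypothetical identifier
    entity_type: EntityType
    omics: dict = field(default_factory=dict)       # layer name -> list of records
    pathways: list = field(default_factory=list)    # pathway annotations


class KnowledgeBase:
    """In-memory stand-in for the back-end database module."""

    def __init__(self):
        self._entities = {}

    def add(self, entity):
        self._entities[(entity.entity_type, entity.entity_id)] = entity

    def attach(self, entity_type, entity_id, layer, record):
        """Link one record from an -omics layer to an existing entity."""
        entity = self._entities[(entity_type, entity_id)]
        entity.omics.setdefault(layer, []).append(record)

    def lookup(self, entity_type, entity_id):
        return self._entities.get((entity_type, entity_id))


kb = KnowledgeBase()
kb.add(Entity("example_gene_1", EntityType.GENE, pathways=["example pathway"]))
kb.attach(EntityType.GENE, "example_gene_1", "transcriptomics", {"sample": "s1", "value": 12.5})
kb.attach(EntityType.GENE, "example_gene_1", "proteomics", {"sample": "s1", "value": 3.1})

gene = kb.lookup(EntityType.GENE, "example_gene_1")
print(sorted(gene.omics))
```

Keying every record by entity type and identifier is what lets a front end such as a web interface or genome browser pull all data layers for one gene in a single lookup.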

SoyKB has four tiers of registration, which control access to the public and private experimental datasets. Users can add comments, download data for multiple genes, and submit their own datasets. Key features of SoyKB include protein 3D-structure and pathway viewers, gene family browsers and a BLAST sequence similarity tool.

SoyKB can be accessed at