News Release

Graph-based pan-genome: A new key to unlocking genetic variation in chickens?

Peer-Reviewed Publication

Higher Education Press

Credit: Yiming WANG, Zijia NI, Yinhua HUANG

Chickens are among the most important livestock species globally. They are a primary source of high-quality protein for humans, and different breeds carry distinct genetic characteristics: Leghorn chickens are renowned for their high egg production, Tibetan chickens can adapt to hypoxic plateau environments, and Silkie chickens attract attention for their unusual melanin deposition. Previous studies have largely relied on a single “linear reference genome”, which struggles to capture the full range of genetic differences among breeds. When next-generation sequencing (NGS) is used to detect “structural variants (SVs)” longer than 50 base pairs, many variants are missed because the reference genome does not adequately represent the breeds being studied. Third-generation sequencing identifies SVs more accurately, but its high cost limits its use in large-scale population studies. Deciphering genetic variation in chickens efficiently while controlling costs has therefore become a key challenge in livestock and poultry breeding.

Professor Yinhua Huang’s team at China Agricultural University has, for the first time, constructed a “graph-based pan-genome” for chickens, providing a new tool for efficiently mining genetic variation in the species. The study has been published in Frontiers of Agricultural Science and Engineering (DOI: 10.15302/J-FASE-2024591).

The “graph-based pan-genome” is a non-linear genetic reference system. Built upon the linear genome GRCg6a of the red jungle fowl, it integrates high-quality genomic data from 12 samples across 2 commercial breeds and 9 local breeds. Genetic variations among breeds are stored as “nodes” and “edges”, forming a non-linear structure with multiple genetic pathways. This design overcomes the “one-size-fits-all” limitation of traditional linear genomes, more comprehensively reflecting the genetic diversity of chicken populations.
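To make the idea of “nodes”, “edges” and multiple pathways concrete, here is a minimal Python sketch of a toy sequence graph. It is an illustration only, not the data structure or software used in the study: the class name `SequenceGraph`, its methods and the DNA segments are all hypothetical, and the segments are far shorter than a real structural variant (which is longer than 50 base pairs).

```python
from collections import defaultdict


class SequenceGraph:
    """Toy sequence graph: nodes hold DNA segments, edges link adjacent segments.

    Hypothetical illustration; not the structure used by the study's authors.
    """

    def __init__(self):
        self.nodes = {}                # node id -> DNA segment
        self.edges = defaultdict(set)  # node id -> ids of successor nodes

    def add_node(self, node_id, sequence):
        self.nodes[node_id] = sequence

    def add_edge(self, src, dst):
        self.edges[src].add(dst)

    def spell(self, path):
        """Return the sequence spelled by a path, checking every step is an edge."""
        for src, dst in zip(path, path[1:]):
            if dst not in self.edges[src]:
                raise ValueError(f"no edge {src} -> {dst}")
        return "".join(self.nodes[n] for n in path)

    def paths(self, start, end):
        """Enumerate all paths from start to end (only suitable for a small acyclic graph)."""
        if start == end:
            return [[end]]
        return [[start] + rest
                for nxt in sorted(self.edges[start])
                for rest in self.paths(nxt, end)]


graph = SequenceGraph()
graph.add_node(1, "ACGT")  # flanking segment shared by all samples
graph.add_node(2, "TTTT")  # made-up segment present in only some breeds
graph.add_node(3, "GGCA")  # flanking segment shared by all samples
graph.add_edge(1, 2)
graph.add_edge(2, 3)
graph.add_edge(1, 3)       # bypass edge for genomes that lack segment 2

reference_path = [1, 3]     # the linear reference skips the extra segment
alternate_path = [1, 2, 3]  # a breed carrying the insertion

print(graph.spell(reference_path))  # ACGTGGCA
print(graph.spell(alternate_path))  # ACGTTTTTGGCA
print(graph.paths(1, 3))            # [[1, 2, 3], [1, 3]]
```

In this toy example, a read from a bird carrying the extra segment can align along the path 1, 2, 3, whereas a purely linear reference offers only the sequence spelled by 1, 3.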

Results show that, compared with a traditional linear genome, the graph-based pan-genome achieves higher alignment efficiency for NGS data: even with low-depth (7–15×) NGS data, the median alignment rate exceeds 98.8%. It performs particularly well in structural variant detection. When analyzing NGS data from breeds such as Leghorns and Rhode Island Reds, it detected far more SVs than Lumpy, a tool that calls SVs against the linear genome (for example, 9944 versus 3246 SVs in Leghorns, roughly three times as many), making variant discovery considerably more comprehensive.

Using this graph-based pan-genome, the researchers went on to identify key variants associated with important traits. In Leghorns, for example, they found 666 breed-specific high-frequency SVs, some located in regions related to follicle development (e.g., the MKI67 gene) and circadian rhythm regulation (e.g., the CLOCK gene), which may be linked to the breed’s high egg production. In Tibetan chickens, which are adapted to plateau environments, combining the SV calls with transcriptome data revealed SVs in the promoter regions of mitochondrial protein synthesis genes (e.g., MRPS24); these may aid hypoxic adaptation by influencing gene expression. These findings provide important clues for deciphering the genetic mechanisms underlying traits such as egg production and environmental adaptability in chickens.

This study not only offers a more efficient tool for genetic research in chickens but also provides methodological references for constructing pan-genomes in other agricultural animals. In future research, scientists will further validate the functions of these variants and optimize graph-based pan-genome construction techniques to promote their practical applications in livestock and poultry breeding.

