Why many genome changes go unnoticed: A new look at tomato DNA
Nanjing Agricultural University The Academy of Science
Image: Definition of SV types.
Credit: Horticulture Research
Structural variations (SVs), large-scale changes in DNA sequence, play a crucial role in shaping crop traits such as yield, quality, and environmental adaptation. However, many of these variations remain poorly characterized, especially in genomes rich in repetitive DNA. This study systematically resolves complex SVs in tomato genomes at single-base-pair resolution by combining multiple detection strategies with intensive manual curation. The researchers establish a high-confidence benchmark set of structural variants and show that commonly used computational tools frequently misidentify or misclassify these variants. By clarifying variant boundaries, breakpoints, and types, the work provides a more accurate framework for interpreting genome variation and lays the foundation for improved genetic analysis and crop improvement.
Structural variations (SVs), including insertions, deletions, inversions, and substitutions, can profoundly influence gene regulation and phenotype. They are particularly hard to study in plants because plant genomes often contain large proportions of repetitive sequences, which complicate accurate genome alignment and variant detection. Although long-read sequencing has improved genome assembly, current SV detection algorithms still struggle to resolve complex variants, especially in repetitive regions. As a result, many SVs are inaccurately located, incorrectly classified, or entirely missed, limiting their usefulness in genetic studies and breeding programs. Given these challenges, there is a clear need to systematically characterize complex SVs and establish reliable standards for their detection and interpretation.
Researchers from Northeast Agricultural University and collaborating institutions reported (DOI: 10.1093/hr/uhaf107) on April 16, 2025, in Horticulture Research, that they have generated the first base-pair–resolution benchmark of complex SVs in tomato genomes. By integrating 14 variant-detection pipelines with extensive manual inspection, the team precisely resolved thousands of ambiguous genomic regions. Their findings show that most existing detection methods perform poorly in repetitive plant genomes, highlighting the need for new standards and improved algorithms to accurately capture genome diversity relevant to crop breeding and functional genomics.
The study began with the construction of a high-quality tomato genome assembly using long-read sequencing, providing a reliable foundation for variant analysis. Researchers then compared this genome with the reference tomato genome using 14 widely used SV detection pipelines, initially identifying more than 30,000 candidate variants. Through careful visualization and manual consolidation, these were refined into 4,532 structurally complex regions.
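The consolidation step described above can be illustrated with a minimal sketch. It assumes each pipeline reports a candidate as a (chromosome, start, end) interval and that overlapping candidates are collapsed into one region. The `merge_calls` function, the `max_gap` parameter, and the toy coordinates are hypothetical and are not taken from the study, which relied on visualization and manual consolidation rather than a purely automatic merge.

```python
from typing import List, Tuple

# Hypothetical call format: (chromosome, start, end), 0-based, end-exclusive.
Call = Tuple[str, int, int]


def merge_calls(calls: List[Call], max_gap: int = 0) -> List[Call]:
    """Collapse calls on the same chromosome that overlap or lie within max_gap bp."""
    merged: List[Call] = []
    for chrom, start, end in sorted(calls):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2] + max_gap:
            # Extend the previous region instead of opening a new one.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Toy candidates, as if pooled from several callers (made-up coordinates).
calls = [
    ("chr1", 100, 200),
    ("chr1", 150, 260),
    ("chr1", 500, 520),
    ("chr2", 100, 200),
]
regions = merge_calls(calls)
print(regions)  # [('chr1', 100, 260), ('chr1', 500, 520), ('chr2', 100, 200)]
```

Sorting first means each call only needs to be compared with the most recent region, which keeps the merge simple. A larger `max_gap` groups nearby calls more aggressively, at the risk of joining unrelated variants.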
A major finding was that repetitive DNA caused widespread errors in variant detection. Misaligned copies often led to false deletions, insertions, or inversions, while breakpoint positions varied substantially among algorithms. To overcome this, the team anchored variant boundaries using uniquely aligned sequences flanking repetitive regions, enabling precise breakpoint identification.
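The anchoring idea can be sketched as follows, under stated assumptions. The sketch supposes that alignment blocks between the two genomes are already available and that each block carries a flag saying whether it aligns uniquely. The `Block` type, the `anchor_interval` function, and the example coordinates are hypothetical. The release does not describe the team's exact procedure, which also involved manual inspection.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    start: int    # reference start, 0-based
    end: int      # reference end, exclusive
    unique: bool  # True if the block aligns to a single location (assumed precomputed)


def anchor_interval(
    blocks: List[Block], sv_start: int, sv_end: int
) -> Optional[Tuple[int, int]]:
    """Widen a candidate interval to the nearest uniquely aligned flanks.

    Returns (left_anchor, right_anchor): the end of the closest unique block
    to the left and the start of the closest unique block to the right.
    Returns None when either flank has no unique block.
    """
    left = [b.end for b in blocks if b.unique and b.end <= sv_start]
    right = [b.start for b in blocks if b.unique and b.start >= sv_end]
    if not left or not right:
        return None
    return max(left), min(right)


# Toy example: a caller reports 1200-1500, but the surrounding blocks are
# repetitive; the nearest unique flanks end at 1000 and start at 1800.
blocks = [
    Block(0, 1000, True),
    Block(1000, 1400, False),
    Block(1400, 1450, False),
    Block(1800, 3000, True),
]
print(anchor_interval(blocks, 1200, 1500))  # (1000, 1800)
```

The anchored interval is deliberately conservative: it marks the span within which the true breakpoints must lie, so different callers that disagree inside the repeat still map to the same region.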
Ultimately, 1,635 bona fide structural variants were resolved at base-pair resolution. These included insertions, deletions, inversions, and, importantly, substitutions, which the authors propose as a fundamental SV type often overlooked in plant genomics. The study also revealed that SVs preferentially occur in AT-rich regulatory regions rather than coding sequences and frequently overlap genes involved in defense responses. When evaluated against this benchmark, existing detection tools achieved surprisingly low accuracy, underscoring a critical gap between current methods and the true complexity of plant genomes.
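To make the idea of scoring a tool against a benchmark concrete, here is a minimal sketch of one common approach: a call counts as correct if its chromosome and type match a benchmark variant and both breakpoints fall within a tolerance. The matching rule, the `evaluate` function, the `tol` parameter, and the toy variants are assumptions for illustration. The release does not state how the authors scored accuracy, and none of the numbers below come from the study.

```python
from typing import List, Tuple

# Hypothetical record: (chromosome, start, end, type)
SV = Tuple[str, int, int, str]


def matches(call: SV, truth: SV, tol: int) -> bool:
    """Same chromosome and type, and both breakpoints within tol bp."""
    return (
        call[0] == truth[0]
        and call[3] == truth[3]
        and abs(call[1] - truth[1]) <= tol
        and abs(call[2] - truth[2]) <= tol
    )


def evaluate(calls: List[SV], truth: List[SV], tol: int = 0) -> Tuple[float, float, float]:
    """Return (precision, recall, F1); each benchmark variant is matched at most once."""
    used = set()
    tp = 0
    for call in calls:
        for i, t in enumerate(truth):
            if i not in used and matches(call, t, tol):
                used.add(i)
                tp += 1
                break
    precision = tp / len(calls) if calls else 0.0
    recall = tp / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up benchmark and caller output, for illustration only.
truth = [("chr1", 1000, 1800, "DEL"), ("chr2", 500, 500, "INS")]
calls = [
    ("chr1", 1000, 1800, "DEL"),  # exact match
    ("chr1", 1200, 1500, "INV"),  # wrong type and wrong breakpoints
    ("chr2", 505, 505, "INS"),    # right type, breakpoints 5 bp off
]
print(evaluate(calls, truth, tol=0))   # exact matching: 1 of 3 calls is correct
print(evaluate(calls, truth, tol=10))  # 10 bp tolerance: 2 of 3 calls are correct
```

Comparing the two runs shows how sensitive the score is to the matching rule: at base-pair resolution a call that is only a few bases off counts as an error, which is one reason a precisely curated benchmark is a stricter test than a tolerance-based one.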
“Structural variation has long been recognized as important, but its real complexity has been underestimated in plant genomes,” said one of the study’s senior authors. “By resolving these variants at base-pair resolution, we show that many apparent genome changes reported by algorithms are artifacts of repetitive sequence misalignment. Our benchmark provides a clear standard for evaluating detection methods and highlights the urgent need for algorithms specifically designed for complex plant genomes. This work moves us closer to accurately linking genome variation with agronomic traits.”
Accurately resolving structural variation is essential for modern crop genetics, from identifying trait-associated loci to building reliable pangenomes. The benchmark developed in this study offers a critical reference for improving SV detection algorithms and training artificial intelligence–based tools tailored to plant genomes. By clarifying how and where structural variants arise, the findings also enhance our understanding of genome evolution and adaptation. In practical terms, this work supports more precise genome-wide association studies and breeding strategies, enabling researchers to better exploit hidden genetic diversity for crop improvement, resilience, and quality enhancement in tomatoes and other plant species.
###
References
DOI: 10.1093/hr/uhaf107
Original Source URL: https://doi.org/10.1093/hr/uhaf107
Funding information
This work was partially supported by grants from the National Natural Science Foundation of China (grant no. U22A20495 and grant no. 32072588 to A.W.), the Intelligent Molecular Breeding project of the Department of Agriculture and Rural Affairs of Heilongjiang Province (to A.W.), the Science & Technology Specific Projects in Agricultural High-tech Industrial Demonstration Area of the Yellow River Delta (grant no. 2022SZX13 to Y.Z.), and the Strategic Priority Research Program of the Chinese Academy of Sciences (grant no. XDA26030102 to Y.Z.).
About Horticulture Research
Horticulture Research is an open access journal of Nanjing Agricultural University and was ranked number one in the Horticulture category of the Journal Citation Reports™ from Clarivate, 2023. The journal is committed to publishing original research articles, reviews, perspectives, comments, correspondence articles and letters to the editor related to all major horticultural plants and disciplines, including biotechnology, breeding, cellular and molecular biology, evolution, genetics, inter-species interactions, physiology, and the origination and domestication of crops.