cacGMS: A Clustering Algorithm for Plant Genetic Resources Based on Characterization Data
cacGMS: An Algorithm Cluster Germplasm based on Categorical Genetic Traits
- Korean Society of Breeding Science
- Korean Journal of Breeding Science
- Vol.54 No.1
- KCI-listed
- 2022.03
- 16 - 24 (9 pages)
Plant germplasm comprises living genetic resources, including seeds and plant materials such as roots, leaves, and stems, and should be conserved and managed to maintain ecological biodiversity and to ensure a stable production and supply of food crops. Plant germplasm can be categorized by various genetic traits, such as race, and clustering germplasms with similar genetic traits is an efficient way to manage large collections. We therefore developed an algorithm, termed cacGMS (Clustering Analysis for Categorical genetic traits of germplasms in Genebank Management System), that operates on categorical variables, the statistical data type of genetic traits such as seed-coat color, seed shape, and flower color. Briefly, using Newman’s modularity method, cacGMS combines a hierarchical clustering algorithm based on the Ward2 method with representative-based algorithms such as K-medoids, and it regroups all germplasms using germplasm core sets. We tested cacGMS on 2,378 pepper germplasms with 46 categorical genetic traits; it outperformed the hierarchical and K-medoids algorithms in terms of the average distance among clusters (0.4534) and entropy (1.2672). cacGMS also performed better than the other algorithms across genetic-trait thresholds (from 15 to 30) and gave similar results in a test run using tomato germplasm. Based on these results, we expect cacGMS to be a useful tool for managing groups within large plant germplasm collections and to facilitate further studies, such as analyses of the representative characteristics of clustered germplasms and of correlations among germplasms within a particular cluster.
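As a rough illustration of the representative-based component mentioned in the abstract, the sketch below runs a plain alternating K-medoids on categorical trait vectors. It is not cacGMS itself: the simple-matching (Hamming) distance, the choice of the first k rows as starting medoids, the function names `hamming` and `k_medoids`, and the six toy accessions are all assumptions made for this example, since the abstract does not state the distance measure or initialization used.

```python
def hamming(a, b):
    """Number of traits on which two accessions differ (simple matching)."""
    return sum(x != y for x, y in zip(a, b))


def k_medoids(rows, k, max_iter=100):
    """Alternating k-medoids on categorical rows; returns (labels, medoids)."""
    n = len(rows)
    dist = [[hamming(rows[i], rows[j]) for j in range(n)] for i in range(n)]

    def assign(medoids):
        # Each accession joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    medoids = list(range(k))  # assumption: start from the first k rows
    for _ in range(max_iter):
        labels = assign(medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # empty cluster: keep the previous medoid
                updated.append(medoids[c])
                continue
            # New medoid = member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if updated == medoids:  # converged: medoids no longer change
            break
        medoids = updated
    return assign(medoids), medoids


# Hypothetical accessions: (seed-coat color, seed shape, flower color)
accessions = [
    ("red", "round", "white"),
    ("red", "round", "purple"),
    ("red", "oval", "white"),
    ("yellow", "flat", "purple"),
    ("yellow", "flat", "white"),
    ("yellow", "oval", "purple"),
]

labels, medoids = k_medoids(accessions, k=2)
print(labels, medoids)
```

On this toy input the three red-seeded and the three yellow-seeded accessions end up in separate clusters, each represented by one actual accession (its medoid). According to the abstract, cacGMS goes further by combining this kind of step with Ward2 hierarchical clustering through Newman’s modularity method and regrouping on core sets, none of which is reproduced here.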
Introduction
Algorithm and Evaluation Methods
Results and Discussion
Summary
Acknowledgments
Supplementary Materials
REFERENCES