TY - JOUR
T1 - Intelligent deconvolution algorithm for mixed STR profiles based on locus association modeling
AU - Yu, Shanping
AU - Mao, Zhehua
AU - Yang, Xinyu
AU - Xu, Zhen
AU - Yang, Fan
AU - Zhao, Xingchun
AU - Zeng, Liang
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
PY - 2025
Y1 - 2025
N2 - As forensic examination technology and public safety needs advance, DNA analysis has become key for individual identification. STR typing directly impacts solving major criminal cases and identifying mass disaster victims. Probabilistic genotyping algorithms based on fully continuous models have been widely applied in mixed DNA analysis, but they assume independence among loci and ignore their correlations. To address this, we propose a deep learning-based approach specifically designed to capture and utilize statistical associations among loci. The model is trained on numerous single-contributor STR profiles to learn inter-locus dependencies, which are then integrated with the results of the fully continuous model to refine mixed profile deconvolution. Experimental validation on the PROVEDIt dataset demonstrates that the method achieves accuracies of 57.5%, 46.3%, and 41.1% for 2-, 3-, and 4-person mixtures, respectively, representing improvements of up to 30 percentage points over conventional probabilistic models. A real case study further confirms the method’s practical effectiveness, showing closer agreement with manual identification than the fully continuous model. Moreover, experiments on cross-platform transferability reveal that the model performs well when trained and tested on the same sequencer but exhibits a significant performance decline across different platforms. Training on a mixed dataset from multiple sequencers improves generalization, highlighting the importance of multi-platform training. These results confirm that the proposed model provides a robust and accurate solution for forensic STR mixture interpretation.
AB - As forensic examination technology and public safety needs advance, DNA analysis has become key for individual identification. STR typing directly impacts solving major criminal cases and identifying mass disaster victims. Probabilistic genotyping algorithms based on fully continuous models have been widely applied in mixed DNA analysis, but they assume independence among loci and ignore their correlations. To address this, we propose a deep learning-based approach specifically designed to capture and utilize statistical associations among loci. The model is trained on numerous single-contributor STR profiles to learn inter-locus dependencies, which are then integrated with the results of the fully continuous model to refine mixed profile deconvolution. Experimental validation on the PROVEDIt dataset demonstrates that the method achieves accuracies of 57.5%, 46.3%, and 41.1% for 2-, 3-, and 4-person mixtures, respectively, representing improvements of up to 30 percentage points over conventional probabilistic models. A real case study further confirms the method’s practical effectiveness, showing closer agreement with manual identification than the fully continuous model. Moreover, experiments on cross-platform transferability reveal that the model performs well when trained and tested on the same sequencer but exhibits a significant performance decline across different platforms. Training on a mixed dataset from multiple sequencers improves generalization, highlighting the importance of multi-platform training. These results confirm that the proposed model provides a robust and accurate solution for forensic STR mixture interpretation.
KW - DNA
KW - Deep learning
KW - Forensic genetics
KW - Mixed STR profiles
UR - https://www.scopus.com/pages/publications/105026302660
U2 - 10.1007/s00414-025-03677-x
DO - 10.1007/s00414-025-03677-x
M3 - Article
AN - SCOPUS:105026302660
SN - 0937-9827
JO - International Journal of Legal Medicine
JF - International Journal of Legal Medicine
ER -