Lu-Chuan Tian,
Ya-Bin Zhao*,
Shi-Si Tian,
Yanda Zheng,
Shuo Zhang and
Linyuan Fan
People's Public Security University of China, Beijing, 100038, People's Republic of China. E-mail: 20052172@ppsuc.edu.cn
First published on 20th August 2025
Chemical residues in fingermarks have been proven to assist in suspect tracing and population profiling. However, the composition and levels of these chemicals are derived from complex metabolic systems and are easily influenced by biological activities, which has hindered judicial institutions worldwide from establishing standardized analytical procedures. To develop a rapid, accurate, and straightforward analytical method, this study employed UPLC-QqQ-MS/MS to quantify amino acid levels in fingermark residues, integrating machine learning techniques and intelligent optimization algorithms for gender prediction. We evaluated whether the relative concentrations of amino acids in fingermark residues (normalized to endogenous serine) could reliably serve as indicators for gender determination, while also examining the effects of donors' physical activity levels, living regions, and fingermark aging periods (0–64 days) on gender classification. The results revealed significant differences between genders. Under various physical activity frequencies, leucine and valine consistently exhibited statistically significant differences, while across different living regions, valine and phenylalanine remained significant. Moreover, a comprehensive Mann–Whitney significance analysis, followed by Bonferroni correction on all measured fingermarks, revealed that the relative concentrations of Phe, Ile, Leu, Val, Pro, Asn, Glu, His, and Asp differ significantly between genders. Four classification models were developed based on the relative abundances of amino acids in fingermark residues, and their hyperparameters were optimized using the particle swarm optimization (PSO) algorithm. Ultimately, the PSO-BP model achieved the highest accuracy of 84.49%. In summary, this study introduces a novel approach utilizing the relative concentrations of amino acids in fingermarks for gender determination.
The established method is simple, accurate, and does not require derivatization, making it less susceptible to transfer loss, aging time, or individual factors. The developed models exhibit high classification accuracy and robust generalization ability. The conclusions from this study may provide valuable references for the development of sensitive amino acid reagents and also address a gap in the stability discussion of fingermark residue research.
With the iterative development of modern detection instruments, the accuracy and detection limits of these devices have greatly improved. Consequently, an increasing number of forensic researchers have shifted their focus to the compositional analysis of fingermark residues. Fingermark residues are composed of endogenous secretions from sweat and sebaceous glands, as well as exogenous substances that reflect the behavior of the individual. Endogenous secretions mainly include amino acids, proteins, glucose, lactic acid, urea, triglycerides, wax esters, fatty acids, phospholipids, sterols, sterol esters, and squalene, while exogenous substances include hand creams, toxins, drugs, lubricants, and gunshot residues.4 These trace components can be used to infer characteristics such as the individual's age,5 gender,6 personal habits,7 and medication status.8
Although fingermark residue analysis can provide insight into the characteristics of the individual, the composition and concentration of these residues are influenced by various factors such as gender, age, race, health status, and medication use. Individuals within the same demographic group may have vastly different amino acid levels, while individuals from different groups may exhibit similar fingermark amino acid profiles. In addition to the complexities of human metabolism, factors such as the force applied when leaving the fingermark,9,10 sweating, and the contact surface area between the fingers and the object11 can also influence the concentration of substances in crime scene fingermarks. Moreover, another potential limitation of many age estimation methods is that the initial concentration of residues at the time of deposition is often unknown. The discovery, collection, and examination of fingermarks at crime scenes are often delayed, and as fresh fingermarks undergo evaporation, oxidation, decomposition, and other aging processes over time, their composition inevitably changes,12,13 which could significantly affect the accuracy of estimations.
Gender is an important indicator in population profiling, and numerous studies have reported the use of fingermark residues to determine the gender of donors,14 with these investigations primarily examining gender-related differences in various fingermark constituents, such as lipids and small proteins. In previous studies, besides traditional analytical techniques such as GC-MS15 and Raman spectroscopy,16 researchers have also introduced various novel techniques and methods, including LDI-TOF/MS,17 MALDI MS,18 and DESI-MSI.19 However, the majority of these studies have overlooked the influence of time on fingermark composition and have not considered factors such as the pressure applied during fingermark deposition or incomplete fingermarks. In real crime scenarios, law enforcement often encounters fingermarks that were deposited days or even months earlier, or those that are incomplete or distorted. This indicates that conclusions drawn solely from fresh and clearly defined fingermark samples may not effectively address practical investigative challenges. Additionally, gender alone as a classification indicator encompasses considerable diversity within populations. For example, previous studies have shown that women tend to have higher levels of arginine, serine, and aspartic acid compared to men,20,21 while alanine, aspartic acid, proline, and tyrosine levels increase in young adults suffering from obesity and insulin resistance.22 Consequently, obese males may have amino acid profiles very similar to those of non-obese females, potentially complicating accurate gender determination. Moreover, our previous studies revealed substantial variations in fingermark compositions among volunteers from different geographic regions,23 indicating that gender classification could pose further challenges in crimes involving cross-regional suspects.
Thus, when using fingermark residues—particularly their concentrations—to classify gender, it is crucial to determine whether the differences in concentrations are primarily or exclusively attributable to the gender itself and not influenced by other factors such as physical activity or living region. Additionally, it is essential to confirm whether the differences caused by gender remain stable over time.
Amino acids are major organic components of sweat in fingermarks, and their concentrations can serve as indicators in applications such as health monitoring, sports medicine, and metabolic status assessment. Currently, most forensic research using amino acids for gender determination focuses on visualization techniques and colorimetric methods,24 such as staining with reagents like 1,2-indanedione25 or Coomassie Brilliant Blue,26 and using fluorescence intensity to discriminate gender. However, fewer studies have used mass spectrometry to analyze amino acid concentrations for gender classification. Ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) is a mainstream method for amino acid analysis, offering high sensitivity without the need for derivatization. It can simultaneously detect dozens of amino acids, significantly reducing experimental time, which makes it well suited to the detection and screening of amino acids in fingermarks.
Van Dam et al.27 and Weyermann et al.28 have suggested that using relative concentrations between targeted fingermark components could address the issues of extraction loss. To avoid losses caused by incomplete fingermarks and during the extraction process, this study utilized the relative concentrations of amino acids to serine to characterize gender in population profiling. It also examined the stability of using amino acids for gender determination under varying physical activity levels and fingermark aging periods. Furthermore, machine learning and statistical methods were employed to identify stable and independent indicators for gender differentiation.
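The rationale for serine normalization can be illustrated with a minimal sketch. It assumes that a partial fingermark or a lossy extraction reduces every analyte by the same fraction, which is an idealization; the function name `relative_to_serine` and all concentration values are hypothetical and are not data from this study.

```python
def relative_to_serine(concentrations, reference="Ser"):
    """Divide each amino acid concentration by the reference (serine) concentration."""
    ref = concentrations[reference]
    if ref <= 0:
        raise ValueError("reference concentration must be positive")
    return {name: value / ref
            for name, value in concentrations.items()
            if name != reference}


# Hypothetical absolute concentrations (arbitrary units), for illustration only.
full_mark = {"Ser": 10.0, "Val": 1.2, "Leu": 0.9}

# Assumed uniform loss: only 40% of every analyte is recovered.
partial_mark = {name: 0.4 * value for name, value in full_mark.items()}

full_ratios = relative_to_serine(full_mark)
partial_ratios = relative_to_serine(partial_mark)

for name in full_ratios:
    print(name, round(full_ratios[name], 4), round(partial_ratios[name], 4))
```

Under the uniform-loss assumption the ratios are identical for the full and partial marks, whereas the absolute concentrations differ by the loss factor. If individual amino acids were lost at different rates, the ratios would shift as well.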
Target compound | Precursor ion (m/z) | Fragmentation voltage (V) | Product ion A (m/z) | Collision energy A (V) | Product ion B (m/z) | Collision energy B (V) |
---|---|---|---|---|---|---|
Alanine | 90.1 | 45 | 44.1 | 9 | — | — |
Arginine | 175.1 | 105 | 116.1 | 2 | 70.1 | 8 |
Asparagine | 133 | 70 | 87.1 | 5 | 74 | 15 |
Aspartic acid | 134.1 | 70 | 88 | 9 | 74 | 13 |
Glutamic acid | 148.1 | 75 | 130 | 5 | 84 | 17 |
Histidine | 156.1 | 80 | 110.1 | 13 | 83.1 | 29 |
Isoleucine | 132.1 | 75 | 86.1 | 9 | 30 | 17 |
Leucine | 132.1 | 75 | 86.1 | 9 | 30 | 17 |
Lysine | 147.1 | 85 | 130.1 | 0 | 84.1 | 6 |
Methionine | 150.1 | 75 | 104.1 | 0 | 61 | 15 |
Phenylalanine | 166.1 | 80 | 120.1 | 13 | 103.1 | 29 |
Proline | 116.1 | 75 | 70.1 | 17 | 43.1 | 35 |
Serine | 106.1 | 67 | 60 | 15 | 42.2 | 24 |
Threonine | 120 | 75 | 74.1 | 9 | 56.1 | 17 |
Tyrosine | 182.1 | 85 | 136.1 | 13 | 91.1 | 33 |
Valine | 118.1 | 70 | 72.1 | 9 | 55.1 | 25 |
A mixed standard solution of 16 amino acids was purchased from Beijing North Weiye Metrology Technology Research Institute, and the asparagine standard was obtained from J&K Scientific Ltd.
In the experiment on the effect of exercise on gender classification, 30 volunteers who lived in the same area and had similar dietary habits were divided into three groups based on their exercise habits. The first group engaged in little to no physical activity each week; the second group engaged in physical activity 3–5 times per week; and the third group engaged in physical activity more than 5 times per week.
In the experiment on the effect of living region on gender classification, a total of 34 volunteers (21 males and 13 females) were recruited from three provinces in China: Guangdong, Jiangsu, and Yunnan. All volunteers had resided in their respective regions for an extended period, with no significant lifestyle or environmental changes reported prior to sample collection. Their dietary habits were consistent with the local customs, ensuring regional representativeness and minimizing confounding variables.
In the experiment on the effect of deposition time on gender classification, 30 volunteers were instructed to press their fingermarks onto 2 × 2 cm PVC plastic sheets. Each volunteer provided 10 fingermarks within one hour. The fingermarks were stored at room temperature in darkness for 0, 7, 11, 14, 18, 24, 34, 44, 54, and 64 days. All fingermarks were collected anonymously using non-invasive methods, and the experiment complied with relevant legal regulations and received ethical approval prior to sample collection. Volunteers were instructed to wash their hands with water 30 minutes before fingermark collection, ensuring no cleaning products (such as soap or hand sanitizer) were used, to avoid interference from active agents. After allowing their hands to air-dry, volunteers wore disposable plastic gloves and continued with normal activities until their hands were slightly sweaty. Then, the fingermarks of all five fingers of the right hand were pressed onto the PVC plastic sheets, which were subsequently stored in a dark room under the specified conditions for the assigned time periods.
Concentration levels of 0.05 pmol μL−1 (Q1), 0.1 pmol μL−1 (Q2), and 0.5 pmol μL−1 (Q3) were selected as low, medium, and high concentrations, respectively. Using a pipette, 100 μL of the mixed standard solution of 16 amino acids, corresponding to the concentrations of Q1, Q2, and Q3, was dropped onto 1.5 × 1.5 cm2 plastic sheets. After drying, the plastic sheets were transferred to 5 mL round-bottom centrifuge tubes, and 1 mL of ultrapure water was added to dissolve the amino acids from the plastic sheets. The tubes were vortexed at 1500 rpm for 10 minutes and left to stand for 1 minute. Then, 0.5 mL of the solution was taken and mixed with 0.5 mL of acetonitrile, and the mixture was vortexed again at 1500 rpm for 10 minutes. The concentrations of Q1, Q2, and Q3 were measured using the aforementioned method, and recovery rates were calculated. The Q2 concentration was measured in parallel six times to assess repeatability. Repeatability was expressed as the relative standard deviation (RSD), which refers to the ratio of the average absolute difference between each measurement and the mean value, to the overall mean of the six measurements.
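The recovery and repeatability calculations can be sketched as follows. This is an illustration under stated assumptions rather than the processing actually used: the helper names are hypothetical, the replicate values are invented, and the expected concentration assumes that volumes are additive and that the dried spike dissolves completely. The repeatability metric is computed as worded above (mean absolute deviation over the mean); the conventional RSD, based on the sample standard deviation, is shown alongside for comparison.

```python
from statistics import mean, stdev


def expected_final_concentration(spike_conc, spike_vol_ul=100.0,
                                 water_vol_ul=1000.0, dilution_factor=2.0):
    """Concentration (pmol/uL) expected at 100% recovery.

    Assumes the dried spike redissolves fully in the water and that the
    1:1 mix with acetonitrile halves the concentration (additive volumes).
    """
    return spike_conc * spike_vol_ul / water_vol_ul / dilution_factor


def recovery_percent(measured, expected):
    """Recovery rate as a percentage of the expected concentration."""
    return 100.0 * measured / expected


def relative_mean_abs_deviation(values):
    """Repeatability as described in the text: mean |x - mean| / mean, in percent."""
    m = mean(values)
    return 100.0 * mean([abs(v - m) for v in values]) / m


def relative_standard_deviation(values):
    """Conventional RSD: sample standard deviation / mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical six parallel Q2 measurements (pmol/uL), not data from the study.
q2_replicates = [0.0048, 0.0050, 0.0051, 0.0049, 0.0052, 0.0050]

q2_expected = expected_final_concentration(0.1)
print("expected Q2:", q2_expected)
print("recovery of first replicate (%):", recovery_percent(q2_replicates[0], q2_expected))
print("relative mean abs. deviation (%):", relative_mean_abs_deviation(q2_replicates))
print("conventional RSD (%):", relative_standard_deviation(q2_replicates))
```

The two repeatability figures differ for the same data, so it is worth stating which definition a reported value follows.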
The particle swarm optimization (PSO) algorithm integrates mechanisms of social learning and individual exploration, enabling its particles to perform both global and local searches in the solution space, thereby identifying the optimal solution.30–32 In this study, the PSO algorithm was configured with learning factors c1 and c2 set to 2, and an inertia weight ranging from 0.4 to 0.9, with cross-validation error employed as the fitness function. After initializing the population and the model hyperparameters, the algorithm was applied to the four classification models. At each iteration, the fitness value was recorded and evaluated to determine whether the termination condition had been met. If the termination condition was not satisfied, the particles' positions and velocities were updated iteratively until the condition was fulfilled.
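The update loop described above can be sketched in plain Python. Only c1 = c2 = 2 and the inertia range 0.4–0.9 come from the text; the linearly decreasing inertia schedule, the velocity clamp, the fixed iteration count used as the termination condition, the swarm size, and the toy quadratic fitness standing in for cross-validation error are all assumptions made for illustration.

```python
import random


def pso_minimize(fitness, bounds, n_particles=20, n_iter=100,
                 c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, seed=0):
    """Minimize `fitness` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Assumed velocity clamp: 20% of each dimension's range.
    v_max = [0.2 * (hi - lo) for lo, hi in bounds]

    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness after initialization and each iteration

    for it in range(n_iter):
        # Assumed schedule: inertia decreases linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * it / max(n_iter - 1, 1)
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])   # individual exploration
                     + c2 * r2 * (gbest[d] - pos[i][d]))     # social learning
                v = max(-v_max[d], min(v_max[d], v))
                vel[i][d] = v
                pos[i][d] = max(lo, min(hi, pos[i][d] + v))
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        history.append(gbest_fit)
    return gbest, gbest_fit, history


def toy_cv_error(params):
    """Toy stand-in for cross-validation error (hypothetical, minimum at (3, -1))."""
    x, y = params
    return (x - 3.0) ** 2 + (y + 1.0) ** 2


search_bounds = [(-10.0, 10.0), (-10.0, 10.0)]
best_pos, best_fit, history = pso_minimize(toy_cv_error, search_bounds)
print(best_pos, best_fit)
```

In an actual hyperparameter search, `toy_cv_error` would be replaced by a function that trains the classifier with the candidate hyperparameters and returns its cross-validation error, and each dimension of `bounds` would be the search range of one hyperparameter.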
Fresh fingermarks from 30 volunteers were analyzed using UPLC-MS/MS, and the relative amino acid content distribution is shown in Fig. 2. It was observed that there were considerable differences in relative concentrations between genders, consistent with previous research, where females exhibited higher amino acid levels. A notable exception is the higher mean relative concentration of histidine observed in male fingermarks. This arises because our analysis relies on the histidine-to-serine relative concentration rather than absolute histidine levels, so conclusions36 drawn from absolute concentrations do not directly apply. Furthermore, published data indicate that sex-dependent variability in serine far exceeds that in histidine,37–39 which naturally amplifies the His/Ser ratio in male fingermarks. The mean values and Mann–Whitney test results for male and female volunteers are presented in Fig. 3. Overall, the mean relative concentrations of 11 amino acids were higher in female donors. Statistically significant differences were observed between genders for leucine, valine, proline, and aspartic acid, with the absolute values of their effect sizes exceeding 0.4. These findings indicate that the actual differences in the statistically significant amino acids are substantial, thereby supporting the feasibility of using the relative concentrations of amino acids for gender differentiation.
Fig. 4 Comparison of relative amino acid concentrations between male and female donors at different exercise frequencies.
To further validate the conclusions, a significance analysis of gender differences across different exercise frequencies was conducted. It was found that in volunteers who exercised regularly (3–5 times per week), the Mann–Whitney p-value for aspartic acid was 0.03016, indicating a significant difference between genders. Due to the limited data in the low and high frequency groups, these groups were combined for further analysis. The test results revealed that, regardless of exercise frequency, leucine and valine continued to show significant gender differences, while aspartic acid had a p-value of 0.06292, approaching the level of significance. A 3D scatter plot using valine, leucine, and aspartic acid as the axes is shown in Fig. 5, illustrating that male and female volunteers can still be distinguished, even at different exercise frequencies. This suggests that leucine, valine, proline, and aspartic acid can reliably distinguish gender without being significantly influenced by exercise.
Fig. 6 presents a scatter plot of the 34 samples based on geographical origin and gender. It can be observed that donors from the same province tend to cluster; apart from a few outliers, the donors from Guangdong and Jiangsu exhibit distinct separation, whereas the distribution for Yunnan appears more random, possibly due to the relatively smaller sample size from that region. With regard to gender, it was noted that the dispersion among female samples is greater than that among males, with an overall trend of higher values in females.
Fig. 7 displays a volcano plot illustrating the statistical significance of differences among the three provinces. It was found that valine and phenylalanine exhibit significant differences, with their respective Cliff's Delta effect sizes being −0.567 and −0.414, corresponding to strong and moderate effects. This outcome is in agreement with the results obtained from the exercise component, thereby demonstrating that Val and Phe serve as robust biomarkers for gender differentiation.
Fig. 7 Volcano plot of gender mean values, significance levels, and effect sizes among volunteers from different provinces.
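The Mann–Whitney test and Cliff's delta reported in this section can be sketched with the standard library alone. This is a simplified illustration, not the analysis pipeline of the study: the p-value uses the normal approximation with continuity correction and no tie correction, the magnitude thresholds (0.147, 0.33, 0.474) are the commonly cited ones and are assumed here, the sign convention (first group minus second group) is an assumption, and the sample values are hypothetical.

```python
import math


def pairwise_counts(x, y):
    """Count pairs (a, b) with a > b and with a < b, for a in x and b in y."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return greater, less


def cliffs_delta(x, y):
    """Cliff's delta: positive when values in x tend to exceed values in y."""
    greater, less = pairwise_counts(x, y)
    return (greater - less) / (len(x) * len(y))


def mann_whitney_p(x, y):
    """Return (U, two-sided p) using the normal approximation (no tie correction)."""
    n, m = len(x), len(y)
    greater, less = pairwise_counts(x, y)
    ties = n * m - greater - less
    u = greater + 0.5 * ties
    mu = n * m / 2.0
    sigma = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = max(0.0, abs(u - mu) - 0.5) / sigma  # continuity correction
    return u, math.erfc(z / math.sqrt(2.0))


def effect_label(delta):
    """Magnitude label using commonly cited thresholds (assumed here)."""
    size = abs(delta)
    if size < 0.147:
        return "negligible"
    if size < 0.33:
        return "small"
    if size < 0.474:
        return "moderate"
    return "strong"


# Hypothetical Val/Ser ratios, for illustration only.
male = [0.07, 0.08, 0.09, 0.10, 0.08]
female = [0.10, 0.12, 0.11, 0.13, 0.09]

delta = cliffs_delta(male, female)
u_stat, p_value = mann_whitney_p(male, female)
print(delta, effect_label(delta), u_stat, p_value)
```

With this convention a negative delta means the ratio tends to be higher in the second group. For small samples an exact test, as provided by statistical libraries, is preferable to the normal approximation.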
Fig. 8 shows the changes in the average relative concentration of amino acids over time for donors of different genders. It can be observed that, as the deposition time increases, the relative concentrations of amino acids fluctuate to some extent, but no clear, consistent trend was identified overall. An interesting finding is that, in male donor fingermarks, the average relative concentrations of the amino acid pairs Ile and Leu, and Tyr and Glu, displayed similar variation trends (see Fig. 9), while this pattern was not observed in female fingermarks. This phenomenon may be attributed to the combined influence of hormonal levels and the skin microbiota, with the initial proportions of amino acids and their degradation mechanisms differing significantly between genders. For instance, isoleucine (Ile) and leucine (Leu), which are classified as branched-chain amino acids (BCAA), are known to have their metabolism strongly influenced by muscle metabolism and androgens (testosterone). Since testosterone levels are markedly higher in males than in females, a synergistic effect in the secretion of Ile and Leu in sweat is promoted. In contrast, owing to lower testosterone levels and correspondingly diminished metabolic pathway activity, such a pronounced synergistic effect is not observed in female donors.47 Moreover, the skin microecology has been shown to differ between males and females, with variations in the types and proportions of bacteria present on the skin.48 Certain bacteria are predisposed to degrade specific amino acids; for example, Propionibacterium is adept at utilizing BCAA (Ile and Leu), whereas species of the genus Staphylococcus may preferentially metabolize aromatic or basic amino acids (such as phenylalanine, Phe, and lysine, Lys).49,50
Consequently, the microbial communities in different genders may preferentially or collectively degrade particular amino acids, thereby resulting in similar trends of fluctuation in these amino acids within one gender.
Finally, we conducted significance tests on all fingermarks under various conditions (different exercise frequencies, different living areas, and different deposition times) to assess the stability and applicability of gender differentiation using relative amino acid concentrations under different scenarios. The results are shown in Fig. 10 and Table 2. Because multiple Mann–Whitney tests were performed (n = 15), the likelihood of false positives increases with the number of tests. Therefore, we applied a Bonferroni correction to the results. After the multiple testing correction, Phe, Ile, Leu, Val, Pro, Asn, Glu, His, and Asp still showed significant differences, indicating that these amino acids can reliably differentiate gender and are not easily affected by exercise frequency, region, or deposition time.
Fig. 10 Comparison of average relative amino acid concentrations between male and female donors across all deposition times.
Relative concentration (X/serine) | Female mean | Male mean | Cliff's delta | p-value (Bonferroni) |
---|---|---|---|---|
Phenylalanine | 0.080660502 | 0.046801813 | −0.277 | 0.000*** |
Isoleucine | 0.164017353 | 0.082057624 | −0.315 | 0.000*** |
Leucine | 0.113463605 | 0.066914135 | −0.367 | 0.000*** |
Methionine | 0.03329684 | 0.012090285 | −0.182 | 0.076 |
Tyrosine | 0.053646959 | 0.057853893 | 0.150 | 0.306 |
Valine | 0.115024349 | 0.085293282 | −0.192 | 0.046* |
Proline | 0.203226862 | 0.106727424 | −0.331 | 0.000*** |
Alanine | 0.369334242 | 0.354535313 | −0.119 | 0.991 |
Threonine | 0.235914575 | 0.214464642 | −0.037 | 1.000 |
Asparagine | 0.056770469 | 0.064150644 | 0.262 | 0.001** |
Glutamic acid | 0.098081301 | 0.096543829 | 0.194 | 0.042* |
Histidine | 0.492860995 | 0.515374196 | 0.208 | 0.020* |
Aspartic acid | 0.085651971 | 0.124888219 | 0.307 | 0.000*** |
Arginine | 0.218817188 | 0.134357395 | 0.006 | 1.000 |
Lysine | 0.179746002 | 0.107596428 | −0.091 | 1.000 |
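The Bonferroni adjustment applied to the 15 Mann–Whitney tests can be written in a few lines: each raw p-value is multiplied by the number of tests and capped at 1. The raw p-values below are hypothetical and serve only to show the arithmetic; they are not the unadjusted values behind Table 2.

```python
def bonferroni(p_values, n_tests=None):
    """Return Bonferroni-adjusted p-values: min(1, p * number of tests)."""
    n = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * n) for p in p_values]


# Hypothetical raw p-values for three of the 15 tests, for illustration only.
raw_p = {"AA1": 0.001, "AA2": 0.01, "AA3": 0.2}
adjusted = dict(zip(raw_p, bonferroni(list(raw_p.values()), n_tests=15)))

for name, p_adj in adjusted.items():
    flag = "significant" if p_adj < 0.05 else "not significant"
    print(name, round(p_adj, 4), flag)
```

In this example a raw p-value of 0.01 is no longer below 0.05 after correction (0.15), which shows how the adjustment guards against false positives at the cost of some statistical power.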
Figs. 11 and 12 present the radar charts and ROC curves illustrating the classification performance of the four models. It can be observed that models employing all 15 amino acids as input features yield higher accuracy and AUC values than those using only the 9 amino acids that exhibit statistically significant gender differences. Contrary to expectations, the inclusion of amino acids without significant gender differentiation did not compromise the models' accuracy or generalization capability. This phenomenon may be attributed to interactions among amino acid features; simply reducing the number of features may disrupt these synergistic relationships and thereby lead to information loss.
From the radar charts and ROC curves, it can be observed that the optimized models exhibit a substantial improvement in accuracy compared to the original models. The four unoptimized models demonstrated accuracies ranging from 60% to 76.28%, whereas the optimized models all achieved accuracies above 77.13%. Notably, the PSO-BP model attained the highest classification accuracy of 84.49%, with an associated F1-score of 0.8436, indicating relatively low false positive and false negative rates and strong adaptability to imbalanced datasets. Moreover, an AUC value of 0.88337 further confirms the robust generalization capability and stability of the optimized models. The optimized hyperparameters for the four models are detailed in Table 3.
Classification algorithm | Hyperparameter | Optimized parameters |
---|---|---|
SVM | Kernel scale | 1.4391 |
SVM | Box constraint level | 16.0555 |
KNN | Number of neighbors | 5 |
DT | Max depth | 74 |
BP | Number of hidden layers | 24 |
BP | Regularization strength | 0.001282 |
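For readers who want to reproduce the kind of evaluation metrics reported above (accuracy, F1-score, and AUC), the following sketch computes them for a binary classifier in plain Python. It assumes a single positive class and a binary F1; whether the study used binary, macro, or weighted averaging is not stated, and the labels and scores are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a positive sample outranks a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = one gender, 0 = the other), predictions, and scores.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1]
y_score = [0.9, 0.4, 0.3, 0.2, 0.8, 0.6]

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), roc_auc(y_true, y_score))
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and is closely related to the Mann–Whitney U statistic used elsewhere in this work.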
In the construction of classification models, four classic machine learning algorithms—SVM, Decision Tree, KNN, and BP neural network—were used for gender determination. After optimizing the hyperparameters of the models using the PSO algorithm, the BP model achieved the highest accuracy of 84.49%, showing the best performance. Model evaluations demonstrated that the optimized models had substantial improvements in accuracy, area under the ROC curve, and other metrics, further validating the effectiveness and applicability of using relative amino acid concentrations for gender determination.
Amino acids themselves have considerable potential for development, not only reflecting the body's metabolic state but also remaining stable in fingermarks for extended periods. Existing testing methods are relatively mature, and in the future, method migration could be considered, using techniques such as spectroscopy, mass spectrometry imaging, or sensors to further explore their application in population profiling. The conclusions from this study can also support the development of sensitive amino acid reagents, allowing researchers to more effectively develop targeted reagents to improve the accuracy of testing. Additionally, relative concentrations could be used to gain further insights into the health status, smoking habits, or drug use of individuals based on the amino acids in fingermarks.
In the future, we will expand the experimental sample size, reduce the time interval between samplings, and conduct a more scientifically comprehensive evaluation of exercise habits to avoid errors caused by small sample sizes.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4ay01954g
This journal is © The Royal Society of Chemistry 2025