Abstract:
This paper presents a new methodology for analyzing lip articulation during
fingerspelling. It aims to extract robust visual patterns that can overcome the inherent
ambiguity and variability of lip shape. The proposed approach is based on unsupervised
clustering of lip movement trajectories to identify consistent articulatory patterns across
different time profiles. Rather than relying on a single model, the methodology
explores varying cluster configurations, assesses their robustness, and analyzes in
detail the correspondence between individual alphabet
letters and specific clusters. In contrast to direct classification based on raw visual features,
this approach first evaluates the clustered representations through a model-based
assessment of their discriminative potential. This structured approach enhances the
interpretability and robustness of the extracted features, highlighting the importance
of lip dynamics as an auxiliary
modality in multimodal sign language recognition. The obtained results demonstrate that
trajectory clustering can serve as a practical method for generating features, enabling
more accurate and context-sensitive gesture interpretation.
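
As an illustration only, the pipeline summarized above (cluster lip trajectories, vary the number of clusters, and inspect how letters map to clusters) might be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: plain k-means stands in for the unspecified clustering method, cluster purity stands in for the unspecified correspondence measure, and the synthetic (width, height) lip trajectories, the letters "A", "O", "M", and the helper names `kmeans`, `assign`, and `letter_cluster_table` are all hypothetical.

```python
import numpy as np


def assign(X, centroids):
    """Return the index of the nearest centroid for every row of X."""
    sq_dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the paper's clustering)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(X, centroids), centroids


def letter_cluster_table(letters, labels, k):
    """Count how often each letter falls into each cluster."""
    alphabet = sorted(set(letters))
    table = np.zeros((len(alphabet), k), dtype=int)
    for letter, cluster in zip(letters, labels):
        table[alphabet.index(letter), cluster] += 1
    return alphabet, table


# Synthetic data (hypothetical): each trajectory is T frames of (width, height).
rng = np.random.default_rng(42)
T = 20
t = np.linspace(0.0, 1.0, T)
bump = np.sin(np.pi * t)
prototypes = {
    "A": np.stack([np.full(T, 0.5), bump], axis=1),
    "O": np.stack([0.3 * bump, 0.6 * bump], axis=1),
    "M": np.stack([np.full(T, 0.5), -0.2 * bump], axis=1),
}
letters, samples = [], []
for letter, proto in prototypes.items():
    for _ in range(30):
        samples.append(proto + rng.normal(scale=0.05, size=proto.shape))
        letters.append(letter)

# Flatten each (T, 2) trajectory into one feature vector of length 2 * T.
X = np.array(samples).reshape(len(samples), -1)

# Explore several cluster configurations and record letter-cluster purity.
results = {}
for k in (2, 3, 4, 5):
    labels, _ = kmeans(X, k)
    alphabet, table = letter_cluster_table(letters, labels, k)
    purity = table.max(axis=0).sum() / table.sum()
    results[k] = (table, purity)
    print(f"k={k}  purity={purity:.3f}")
```

Each row of a table in `results` corresponds to a letter and each column to a cluster, so a letter whose counts concentrate in one column is articulated consistently. On real data, the trajectories would first need to be aligned or resampled to a common length before flattening.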