Abstract
Existing cross-language sentence similarity methods suffer from poor accuracy, efficiency, and scalability, and Chinese-Thai cross-language sentence similarity in particular has received little study. This paper proposes a new method for it. First, a corpus of Chinese-Thai parallel sentences is preprocessed, a sentence embedding model is used to obtain the Chinese and Thai sentence embedding matrices, and the sentence embeddings are normalized. Then, a cross-language mapping model processes the sentence embeddings to obtain a transformation matrix, which is optimized under an orthogonality constraint. Finally, the Chinese sentence embeddings are mapped into the Thai sentence embedding space, and the Chinese-Thai cross-language sentence similarity is computed as the cosine of the angle between the mapped Chinese vector and the Thai vector. This provides a new approach to Chinese-Thai cross-language sentence similarity calculation. Experimental results show that the method achieves good accuracy.
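The pipeline described above (normalize, learn an orthogonal transformation, map, take the cosine) can be illustrated with a minimal NumPy sketch. The abstract does not say how the orthogonal optimization is carried out, so this example assumes the standard orthogonal Procrustes solution via SVD. The function names `normalize_rows`, `fit_orthogonal_map`, and `cross_language_similarity` are hypothetical, and the inputs are assumed to be precomputed embedding matrices whose rows are aligned parallel sentences.

```python
import numpy as np


def normalize_rows(m):
    """Scale each sentence embedding (row) to unit length."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    return m / np.clip(norms, 1e-12, None)  # guard against zero vectors


def fit_orthogonal_map(zh, th):
    """Return the orthogonal W minimizing ||zh @ W - th||_F.

    Assumption: orthogonal Procrustes solution, W = U @ Vt where
    U, S, Vt = svd(zh.T @ th). The source does not specify its optimizer.
    """
    u, _, vt = np.linalg.svd(zh.T @ th)
    return u @ vt


def cross_language_similarity(zh_vec, th_vec, w):
    """Map a Chinese embedding into the Thai space, then take the cosine."""
    mapped = zh_vec @ w
    denom = np.linalg.norm(mapped) * np.linalg.norm(th_vec)
    return float(mapped @ th_vec / denom) if denom > 0 else 0.0
```

In use, `zh` and `th` would be the normalized Chinese and Thai embedding matrices of the parallel corpus, with `w = fit_orthogonal_map(zh, th)` fitted once and then reused to score new sentence pairs. Because `w` is orthogonal, the mapping preserves vector lengths and angles within the Chinese space, which is one common motivation for the orthogonality constraint.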