In the era of linguistic big data, traditional data analysis methods cannot handle semi-structured or unstructured data such as text, yet the entire content of an equipment manufacturing corpus is text. The equipment manufacturing corpus is a linguistic information base for legal activities and equipment manufacturing research; its purpose is to support the study of equipment manufacturing and to collect equipment manufacturing cases. At present, the construction of legal databases in China remains incomplete, and many problems persist. This paper proposes a template-transformation method for automatically acquiring parallel corpora from the Internet, and verifies candidate bilingual parallel texts using the number of transformation patterns together with the retrieval and ranking of those patterns. By automatically acquiring a large number of parallel texts from the Internet, the resulting system can build a large-scale parallel corpus for the equipment manufacturing industry.
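As a rough illustration of the idea, one plausible reading of template transformation is that a language marker in a page's URL is rewritten to locate its translated counterpart, and that patterns are then ranked by how many pairs they yield. The sketch below assumes that reading. The pattern list, the function names, and the example.com URLs are hypothetical and are not taken from the system described here.

```python
from collections import Counter

# Hypothetical transformation patterns (assumption, not from the paper):
# each tuple is (source-language marker, target-language marker).
PATTERNS = [
    ("/en/", "/zh/"),
    ("_en.", "_zh."),
    ("lang=en", "lang=zh"),
]


def candidate_urls(url, patterns=PATTERNS):
    """Return (candidate_url, pattern) pairs made by rewriting one marker in url."""
    candidates = []
    for src, dst in patterns:
        if src in url:
            # Rewrite only the first occurrence of the marker.
            candidates.append((url.replace(src, dst, 1), (src, dst)))
    return candidates


def find_pairs(urls, patterns=PATTERNS):
    """Pair each URL with a transformed counterpart that is also in the crawled set."""
    known = set(urls)
    pairs = []
    for url in urls:
        for candidate, pattern in candidate_urls(url, patterns):
            if candidate in known:
                pairs.append((url, candidate, pattern))
    return pairs


def rank_patterns(pairs):
    """Order patterns by the number of URL pairs each produced, most productive first."""
    return Counter(pattern for _, _, pattern in pairs).most_common()


if __name__ == "__main__":
    crawled = [
        "http://example.com/en/a.html",
        "http://example.com/zh/a.html",
        "http://example.com/doc_en.pdf",
        "http://example.com/doc_zh.pdf",
        "http://example.com/en/c.html",  # no counterpart in the crawled set
    ]
    for source, target, pattern in find_pairs(crawled):
        print(source, "<->", target, "via", pattern)
    print(rank_patterns(find_pairs(crawled)))
```

Matching URLs only proposes candidate pairs. Verifying that two pages are actually translations of each other would still require a content-level check, which this sketch does not attempt.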