Abstract

The dataset is stored in CSV format and mainly describes restaurant information. It comprises 17457 key-value pair groups and 17246 human-written reference texts. Each meaning representation (MR) consists of 3 to 8 Chinese key-value pairs, such as name, food, or region together with their values, as shown in Table 3. Of these, 15568 instances are used for training, 1678 for validation, and the remaining 211 for testing. Each key-value pair group in the training and validation sets has multiple human-written reference texts, with the aim of providing references that are more natural, informative, and diverse than the MR itself. The parallel corpus of Chinese key-value pairs was constructed manually through a series of data processing steps: collection, cleaning, translation, screening, and sorting.

The dataset includes three data files: (1) trainset.csv, the training set, with 15568 instances; (2) devset.csv, the validation set, with 1678 instances; (3) testset.csv, the test set, with 211 instances. Each instance in the training and validation sets consists of a key-value pair group and a human reference text; instances in the test set contain only the key-value pair group.
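The structure described above can be sketched in Python as a small helper that groups reference texts by key-value pair group. This is an illustrative sketch only: the column names `mr` and `ref` and the bracketed sample rows are assumptions made for the example, not the documented layout of the files. The split sizes are taken from the abstract.

```python
import csv
import io
from collections import defaultdict

# Split sizes as stated in the abstract.
SPLIT_SIZES = {"trainset.csv": 15568, "devset.csv": 1678, "testset.csv": 211}


def group_references(csv_file, mr_column="mr", ref_column="ref"):
    """Map each key-value pair group (MR) to its list of reference texts.

    The column names are hypothetical. Rows with no reference column,
    as in the test set, produce an empty list for that MR.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        mr = row[mr_column]
        grouped[mr]  # register the MR even when it has no reference
        ref = row.get(ref_column)
        if ref:
            grouped[mr].append(ref)
    return dict(grouped)


# Made-up rows in an assumed format, standing in for trainset.csv.
train_sample = io.StringIO(
    "mr,ref\n"
    '"name[A], food[B]",First reference.\n'
    '"name[A], food[B]",Second reference.\n'
    '"name[C], area[D]",Third reference.\n'
)
train_refs = group_references(train_sample)

# Made-up row standing in for testset.csv, which has no reference text.
test_sample = io.StringIO('mr\n"name[A], food[B]"\n')
test_refs = group_references(test_sample)

# The stated totals are consistent with the split sizes.
total_instances = sum(SPLIT_SIZES.values())
total_references = SPLIT_SIZES["trainset.csv"] + SPLIT_SIZES["devset.csv"]
```

To run this on the real files, one would pass an open file handle (for example `open("trainset.csv", newline="", encoding="utf-8")`) and adjust the column names to match the actual header row.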
