Abstract

As an important step in a printed formula recognition system, formula extraction locates the formula fields in the layout images of printed documents, and its accuracy strongly influences the performance of formula recognition. However, errors in automatic formula extraction are inevitable because of the complexity of the formulas themselves and of the layouts in which they are situated. To address this problem, this paper presents a post-processing method that uses relevant layout knowledge to correct the errors in the output of a formula extraction algorithm. First, the geometrical features of the various layout fields are used to correct extraction errors. Then, syntax rules are applied to the boundary components of the different kinds of layout areas to determine which field each component should belong to. Finally, the formula area is adjusted according to this information.
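
The three steps can be illustrated with a minimal sketch. It is not the paper's implementation: the `Box` type, the `max_gap` threshold, the `MATH_SYMBOLS` set, and the single "contains a math symbol" rule are all assumptions introduced here, standing in for the geometrical features and syntax rules that the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a layout component (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int

    def union(self, other: "Box") -> "Box":
        return Box(min(self.x0, other.x0), min(self.y0, other.y0),
                   max(self.x1, other.x1), max(self.y1, other.y1))

    def horizontal_gap(self, other: "Box") -> int:
        # Distance between the boxes along x; 0 if they overlap horizontally.
        return max(0, max(self.x0, other.x0) - min(self.x1, other.x1))

    def overlaps_vertically(self, other: "Box") -> bool:
        return min(self.y1, other.y1) - max(self.y0, other.y0) > 0


# Assumed stand-in for the syntax rules: a component "looks mathematical"
# if it contains at least one of these symbols.
MATH_SYMBOLS = set("=+-*/^_()<>")


def looks_like_formula(text: str) -> bool:
    return any(ch in MATH_SYMBOLS for ch in text)


def adjust_formula_area(formula: Box, components, max_gap: int = 10) -> Box:
    """Extend the formula box with nearby boundary components that pass
    the (assumed) syntax check; leave all other components to the text field."""
    adjusted = formula
    for box, text in components:
        # Step 1: geometrical test against the originally extracted area.
        near = (formula.overlaps_vertically(box)
                and formula.horizontal_gap(box) <= max_gap)
        # Step 2: syntax test decides which field the component belongs to.
        if near and looks_like_formula(text):
            # Step 3: adjust the formula area.
            adjusted = adjusted.union(box)
    return adjusted
```

For example, given an extracted area `Box(100, 50, 200, 70)`, a neighbouring component `"= 0"` five pixels to its right would be merged into the formula area, while an equally close component `"where"` would stay with the surrounding text.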
