Abstract

Machine learning and demographic analysis are cornerstones for making community Question Answering (cQA) platforms more egalitarian, vibrant, and safe. For instance, the two work together to detect suspicious or malicious activity and to spark community members' interest in learning by exploring new topics. In this sense, both research fields play a vital role in reducing gender disparity across categories when promoting unresolved questions to potential answerers.

Current state-of-the-art artificial intelligence architectures, such as pre-trained transformers, optimize complex training objectives over millions of parameters as a means of inferring and encoding knowledge from massive corpora. Fine-tuning is the process that later transfers this encoded information to a downstream task (e.g., gender classification). Nevertheless, these pre-trained encoders also suffer from several disadvantages. For example, they are sensitive to irrelevant and misleading words, which leads to overfitting, usually on small datasets.

This work offers a fresh look at this kind of technique by introducing PTM-SFFS, a novel approach that effectively pairs frontier transformers with linguistic properties via the use of traditional classifiers. Based on a feature wrapper (SFFS), PTM-SFFS refines the scores produced by a fine-tuned model by searching for an array of mostly linguistic features with which to build a conventional statistical classifier (e.g., Bayes and MaxEnt). As a result, this new discriminant function enhances the overall prediction rate by optimizing the synergy between both sorts of strategies.

When applied to automatic gender recognition on cQA sites, PTM-SFFS increased the accuracy of seven fine-tuned state-of-the-art encoders by up to 10% (XLNet). Thanks to its interpretability, we discover that it capitalizes on dependency parsing and metadata to improve the transfer of lexicalized information to the target domain.
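The wrapper step described in the abstract can be illustrated with a minimal sketch, assuming SFFS denotes sequential floating forward selection. This is not the authors' implementation. The feature names, the `toy_accuracy` scorer, and every number in `TOY_ACCURACY` are hypothetical stand-ins for the validation accuracy of a conventional classifier trained on the fine-tuned model's score plus a candidate feature subset.

```python
from typing import Callable, FrozenSet, Sequence, Tuple

Subset = FrozenSet[str]


def sffs(candidates: Sequence[str],
         score: Callable[[Subset], float],
         max_size: int) -> Tuple[float, Subset]:
    """Sequential floating forward selection; a higher score is better."""
    selected: Subset = frozenset()
    # Best (score, subset) seen so far for each subset size.
    best = {0: (score(selected), selected)}
    while len(selected) < max_size:
        remaining = [f for f in candidates if f not in selected]
        if not remaining:
            break
        # Forward step: add the single feature that helps most.
        added = max(remaining, key=lambda f: score(selected | {f}))
        selected = selected | {added}
        current = score(selected)
        size = len(selected)
        if size not in best or current > best[size][0]:
            best[size] = (current, selected)
        # Floating step: drop features while doing so beats the best
        # subset previously recorded at the smaller size.
        while len(selected) > 2:
            dropped = max(sorted(selected),
                          key=lambda f: score(selected - {f}))
            reduced = selected - {dropped}
            reduced_score = score(reduced)
            if reduced_score > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (reduced_score, reduced)
            else:
                break
    return max(best.values(), key=lambda pair: pair[0])


# Hypothetical feature names and made-up accuracies, for illustration only.
# The empty subset stands for using the fine-tuned model's score alone.
CANDIDATES = ["pos_tags", "dependency_relations", "user_metadata"]
TOY_ACCURACY = {
    frozenset(): 0.70,
    frozenset({"pos_tags"}): 0.74,
    frozenset({"dependency_relations"}): 0.72,
    frozenset({"user_metadata"}): 0.71,
    frozenset({"pos_tags", "dependency_relations"}): 0.75,
    frozenset({"pos_tags", "user_metadata"}): 0.745,
    frozenset({"dependency_relations", "user_metadata"}): 0.78,
    frozenset(CANDIDATES): 0.77,
}


def toy_accuracy(subset: Subset) -> float:
    """Stand-in for training and validating a classifier on `subset`."""
    return TOY_ACCURACY[subset]


best_score, best_subset = sffs(CANDIDATES, toy_accuracy, max_size=3)
print(best_score, sorted(best_subset))
```

In this toy table, the floating step lets the search discard the feature it picked first once a better pair becomes reachable, which a purely greedy forward search would not do. In a real setting, the scorer would retrain and validate the classifier for each subset, so the number of calls to `score` dominates the cost.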

