The exponential increase in the annual volume of publications poses a significant challenge for assessing the disruptive potential of technologies in new papers. Prior approaches to identifying disruptive technologies rely on the accumulation of paper citations and are therefore limited in foresight and time-consuming. Moreover, the total citation count fails to capture the intricate citation network surrounding a focal paper. Consequently, we advocate using the disruption index rather than relying on citation counts. Specifically, we devise a novel neural network, called Soft Prompt-aware Shared BERT (SPS-BERT), to predict the potential technological disruption index of newly published papers. It incorporates separate soft prompts that enable BERT to examine comparative details between a paper's abstract and its references. Additionally, a tailored attention mechanism is employed to strengthen this semantic comparison. Based on the enhanced representation derived from BERT, we apply a linear layer to estimate the potential disruption index. Experimental results demonstrate that SPS-BERT outperforms existing state-of-the-art methods in predicting the five-year disruption index on the DBLP and PubMed datasets. Additionally, we evaluate our model on predicting the ten-year disruption index and five-year citation increments, demonstrating its robustness and scalability. Notably, our model's predictions of disruptive technologies, based on papers published in 2022, align with the expert assessments released by MIT, highlighting its practical significance. The code is available at https://github.com/ECNU-Text-Computing/SPS-BERT.
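A minimal sketch of the architecture described above, assuming a PyTorch and HuggingFace Transformers setup: separate learnable soft prompts are prepended to the abstract and reference inputs, both are encoded by a shared BERT, a cross-attention layer compares the two representations, and a linear head regresses the disruption index. The prompt length, attention configuration, and pooling choice here are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
from transformers import BertModel


class SPSBertSketch(nn.Module):
    """Illustrative sketch of a soft-prompt-aware shared BERT regressor."""

    def __init__(self, model_name="bert-base-uncased", prompt_len=10):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)  # shared encoder
        hidden = self.bert.config.hidden_size
        # Separate learnable soft prompts for the abstract and the references (assumed length).
        self.abs_prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
        self.ref_prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
        # Cross-attention to compare abstract and reference representations.
        self.cross_attn = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        self.regressor = nn.Linear(hidden, 1)  # predicts the disruption index

    def _encode(self, input_ids, attention_mask, prompt):
        # Prepend soft-prompt embeddings to the token embeddings, then run
        # the shared BERT encoder over the concatenated sequence.
        tok_emb = self.bert.embeddings.word_embeddings(input_ids)
        batch = input_ids.size(0)
        prompt_emb = prompt.unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prompt_emb, tok_emb], dim=1)
        prompt_mask = torch.ones(
            batch, prompt.size(0),
            dtype=attention_mask.dtype, device=attention_mask.device,
        )
        mask = torch.cat([prompt_mask, attention_mask], dim=1)
        out = self.bert(inputs_embeds=inputs_embeds, attention_mask=mask)
        return out.last_hidden_state

    def forward(self, abs_ids, abs_mask, ref_ids, ref_mask):
        abs_h = self._encode(abs_ids, abs_mask, self.abs_prompt)
        ref_h = self._encode(ref_ids, ref_mask, self.ref_prompt)
        # Let the abstract representation attend over the references to capture their contrast.
        fused, _ = self.cross_attn(abs_h, ref_h, ref_h)
        pooled = fused.mean(dim=1)  # simple mean pooling (an assumption)
        return self.regressor(pooled).squeeze(-1)
```

In this sketch, the predicted scalar per paper would be trained with a regression loss (e.g., mean squared error) against the observed five-year disruption index; the released repository should be consulted for the actual implementation details.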