Abstract In this paper, we first construct a locally convergent bioinformatics dataset based on gradient sampling and design an optimal data mining control model to improve the accuracy of feature mining on bioinformatics big data. The Compressive Tracking and Online Boosting algorithms are compared using mining error as the test index. We then propose a social media information dissemination algorithm suited to large-scale social network datasets, which takes the degree of each node as that node's full influence, and we compare the dissemination influence of the BP-IM, RAND, and MC-CELF algorithms. Finally, taking public health big data as the research object, we use least squares regression to analyze how the amount of public attention to bioinformatics scientific knowledge in different media affects scientific literacy. The results show a significant positive correlation between scientific literacy and willingness to engage in science participation behavior on social media, with a coefficient for the amount of public attention to scientific information of β = 0.225 (p < 0.01). This suggests that scientific literacy improves as more people take an interest in bioinformatics scientific knowledge on social media.
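To make the degree-based notion of influence concrete, the following minimal Python sketch ranks the nodes of an undirected graph by degree and selects the top k as seed nodes. It illustrates only the idea of treating a node's degree as its influence; it is not the proposed dissemination algorithm, nor BP-IM, RAND, or MC-CELF. The function names `degree_influence` and `top_k_seeds`, the toy edge list, and the tie-breaking rule are assumptions introduced here for illustration.

```python
from collections import defaultdict


def degree_influence(edges):
    """Return {node: degree} for an undirected edge list.

    Assumption: influence is approximated by degree alone.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def top_k_seeds(edges, k):
    """Select the k highest-degree nodes as seeds.

    Ties are broken by node label (an illustrative choice).
    """
    degree = degree_influence(edges)
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    return ranked[:k]


# Hypothetical toy network, not data from the paper.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
seeds = top_k_seeds(edges, 2)
print(seeds)
```

On a large social network the same ranking needs only one pass over the edge list followed by a sort, which is what makes degree a cheap stand-in for influence compared with simulation-based estimates.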