India harbours a huge diversity of plant and animal life across a variety of biomes. However, research, funding, and attention are distributed unevenly and focus largely on charismatic vertebrates. Invertebrates, despite their megadiversity, are generally overlooked, with some exceptions (for example, lepidopterans). One species-rich group, spiders, exemplifies this knowledge gap. More than 1,800 species from 63 families have been reported in the country (Mondal et al. 2020), although the true number may be much higher, in part owing to the need for taxonomic revisions. Several studies of spider systematics and biogeography have pointed out that spatial distribution, seasonality, and natural history data are lacking from India. This lack of foundational biodiversity information has limited opportunities to share knowledge between researchers and community scientists. Mining occurrences from scientific publications or databases (in some cases behind paywalls) may require specialized training. This restricted access prevents more people from using the abundance of information. Social media posts showcasing photographs of species have increased significantly, with many users sharing wildlife photographs on sites such as Instagram®, Facebook®, and Flickr®. These observations on social media are a potential source of primary biodiversity data when curated and validated (Barman and Barve 2022, Barman et al. 2022a). Such data can enhance biodiversity monitoring by expanding spatial and temporal coverage and involving a global network of citizen scientists in conservation research (Barve et al. 2023, Roy 2022, Kulkarni 2023). The India Biodiversity Portal (IBP, Vattakaven et al. 2016) has been cataloguing species diversity and spider data through citizen science. Meanwhile, major social media sites host many more species sightings than citizen science platforms.
SpiderIndia, a popular Facebook group, has collected over 20,000 observations from 8,500 spider enthusiasts. However, these data remain unorganised and inaccessible to academic researchers and the public. To address these problems, we implemented a systematic process to enhance the occurrence data on spiders from well-known social media platforms, such as the SpiderIndia project. The procedure involved retrieving the relevant data and ingesting it through a custom pipeline that parses and extracts scientific names and spatial and temporal data to generate occurrence records on the IBP. The data were then presented on a curation interface, enabling taxonomic experts to verify the records, and the verified records were finally published on the Global Biodiversity Information Facility (GBIF) (Global Biodiversity Information Facility 2022, Barman et al. 2022b) to improve the records of spider species occurrence in India. This project showcases the capacity of citizen science via social media to involve citizen scientists in generating extensive datasets that contribute significantly to scientific knowledge and improve our understanding of invertebrate biodiversity. The final dataset encompasses over 15,000 observations, providing valuable insights into spider diversity and distribution across India (Fig. 1). These data are publicly available on GBIF, facilitating further research on Indian spider populations.
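The parsing step described above can be illustrated with a minimal sketch. It is not the authors' pipeline, whose implementation is not described here. It assumes that a post is available as plain text, that dates appear in ISO format, and that coordinates appear as decimal degrees. The `KNOWN_GENERA` checklist, the `OccurrenceRecord` class, and the `parse_post` function are hypothetical names introduced only for illustration; the field names loosely follow Darwin Core terms.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical, tiny checklist of genera; a real pipeline would match
# candidate names against a full taxonomic backbone.
KNOWN_GENERA = {"Hersilia", "Argiope", "Nephila"}

# Assumed text patterns; real posts are far messier than these allow.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
LAT_LON = re.compile(r"(-?\d{1,2}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)")


@dataclass
class OccurrenceRecord:
    """One draft occurrence record awaiting expert verification."""
    scientific_name: Optional[str]
    event_date: Optional[str]
    decimal_latitude: Optional[float]
    decimal_longitude: Optional[float]
    verified: bool = False  # set to True only after expert curation


def parse_post(text: str) -> OccurrenceRecord:
    """Extract a name, date, and coordinates from one post's text."""
    # Accept the first "Genus species" candidate whose genus is known.
    name = None
    for match in BINOMIAL.finditer(text):
        if match.group(1) in KNOWN_GENERA:
            name = match.group(0)
            break

    # Keep the date only if it is a valid calendar date.
    event_date = None
    date_match = ISO_DATE.search(text)
    if date_match:
        try:
            event_date = date.fromisoformat(date_match.group(0)).isoformat()
        except ValueError:
            event_date = None

    # Keep the coordinates only if they fall within valid ranges.
    lat = lon = None
    coord_match = LAT_LON.search(text)
    if coord_match:
        lat_val = float(coord_match.group(1))
        lon_val = float(coord_match.group(2))
        if -90 <= lat_val <= 90 and -180 <= lon_val <= 180:
            lat, lon = lat_val, lon_val

    return OccurrenceRecord(name, event_date, lat, lon)


post = "Found this Hersilia savignyi on a tree trunk, 2021-08-14, at 18.5204, 73.8567"
record = parse_post(post)
print(record)
```

Every record starts with `verified=False`, mirroring the idea that automatically extracted fields are only drafts until a taxonomic expert confirms them on a curation interface.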