Abstract
In ethnographic research, data collected through surveys, interviews, or questionnaires in the fields of sociology and anthropology often appear in diverse forms and languages. Building a powerful database system to store and process such data, as well as making good and efficient queries, is very challenging. This paper extensively investigates modern database technology to find out what the best technologies to store these varied and heterogeneous datasets are. The study examines several database categories: traditional relational databases, the NoSQL family of key-value databases, graph databases, document databases, object-oriented databases and vector databases, crucial for the latest artificial intelligence solutions. The research proves that when it comes to field data, the NoSQL lineup is the most appropriate, especially document and graph databases. Simplicity and flexibility found in document databases and advanced ability to deal with complex queries and rich data relationships attainable with graph databases make these two types of NoSQL databases the ideal choice if a large amount of data has to be processed. Advancements in vector databases that embed custom metadata offer new possibilities for detailed analysis and retrieval. However, converting contents into vector data remains challenging, especially in regions with unique oral traditions and languages. Constructing such databases is labor-intensive and requires domain experts to define metadata and relationships, posing a significant burden for research teams with extensive data collections. To this end, this paper proposes using Generative AI (GenAI) to help in the data-transformation process, a recommendation that is supported by testing where GenAI has proven itself a strong supplement to document and graph databases. It also discusses two methods of vector database support that are currently viable, although each has drawbacks and benefits.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.