Curating and integrating data from multiple sources are bottlenecks in procuring robust training datasets for artificial intelligence (AI) models in healthcare. While numerous applications can process discrete types of clinical data, integrating heterogeneous data types remains time-consuming. There is therefore a need for more efficient retrieval and storage of curated patient data from dissimilar sources, such as biobanks, health records, and sensors. We describe a customizable, modular data retrieval application (RIL-workflow) that integrates clinical notes, images, and prescription data, and we show its feasibility when applied to research at our institution. It uses the workflow automation platform Camunda (Camunda Services GmbH, Berlin, Germany) to collect internal data from Fast Healthcare Interoperability Resources (FHIR) and Digital Imaging and Communications in Medicine (DICOM) sources. Through a web-based graphical user interface (GUI), the workflow runs tasks to completion according to their visual representation, retrieving and storing results for patients who meet study inclusion criteria while segregating errors for human review. We showcase RIL-workflow with its library of ready-to-use modules, which enables researchers to specify human input or automation at fixed steps. We validated the workflow by demonstrating its capability to aggregate and curate data from multiple sources, and to handle the associated errors, in order to generate a multimodal database for clinical AI research. Further, we solicited user feedback to highlight the pros and cons of RIL-workflow. The source code is available at github.com/magnooj/RIL-workflow.
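
The control flow described above (filter patients by inclusion criteria, run each retrieval module, store results, and set errors aside for human review) can be illustrated with a minimal Python sketch. It is not taken from the RIL-workflow source code and does not use Camunda, FHIR, or DICOM interfaces; the names `run_workflow`, `RunResult`, `fetch_notes`, and `fetch_images`, as well as the patient fields and the age-based criterion, are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical types for illustration only; not part of RIL-workflow.
Patient = Dict[str, Any]
Module = Callable[[Patient], Any]


@dataclass
class RunResult:
    # Retrieved data, keyed by patient id and then by module name.
    stored: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    # Failures set aside for human review instead of stopping the run.
    errors: List[Dict[str, str]] = field(default_factory=list)


def run_workflow(
    patients: List[Patient],
    meets_criteria: Callable[[Patient], bool],
    modules: Dict[str, Module],
) -> RunResult:
    """Run every module for each included patient and segregate errors."""
    result = RunResult()
    for patient in patients:
        if not meets_criteria(patient):
            continue  # patient does not meet the study inclusion criteria
        record: Dict[str, Any] = {}
        for name, module in modules.items():
            try:
                record[name] = module(patient)
            except Exception as exc:
                # Log the failure and carry on with the remaining modules.
                result.errors.append(
                    {"patient_id": patient["id"], "module": name, "error": str(exc)}
                )
        # Assumption: partial records are kept even when one module fails.
        result.stored[patient["id"]] = record
    return result


# Stand-in modules; real ones would query clinical data sources.
def fetch_notes(patient: Patient) -> str:
    return patient["notes"]


def fetch_images(patient: Patient) -> List[str]:
    if not patient["has_images"]:
        raise LookupError("no imaging studies found")
    return [f"{patient['id']}-study-1"]


demo_patients = [
    {"id": "p1", "age": 70, "notes": "note A", "has_images": True},
    {"id": "p2", "age": 30, "notes": "note B", "has_images": True},
    {"id": "p3", "age": 65, "notes": "note C", "has_images": False},
]

demo = run_workflow(
    demo_patients,
    meets_criteria=lambda p: p["age"] >= 60,  # assumed example criterion
    modules={"notes": fetch_notes, "images": fetch_images},
)
```

In this sketch a failing module does not abort the run: the patient's partial record is still stored and the failure is appended to `demo.errors`, which plays the role of the queue of cases awaiting human review.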