e14711 Background: Machine learning algorithms for prediction of adverse events (AEs) due to immune checkpoint inhibitors (ICIs) rely on clinically annotated data to serve as gold standards. Clinical trials are the best sources of such data, but audits to assess validity are rare, compromising future machine learning efforts. We reviewed AEs reported on the National Cancer Institute Molecular Analysis for Therapy Choice (NCI-MATCH) EAY-131 Arm Z1D. Patients matched to Arm Z1D due to MSI-high status were treated with nivolumab. Routine and serious AEs were reported to NCI. Attribution data for AEs in 29 patients with tumor genomics suitable for machine learning approaches were obtained and reviewed retrospectively. Methods: Documented AEs were sorted by grade via the Common Terminology Criteria for Adverse Events (CTCAE) system and categorized as related or unrelated to ICIs. Primary source data reported to NCI were reviewed. If any ambiguity was found, trial sites were queried to provide additional information and documentation. Decisions were made by an experienced NCI medical officer with the benefit of additional data and site responses. All serious AEs and a subset of ambiguous routine AEs were assessed to determine whether attribution conformed to the expected attribution per the nivolumab package insert. Uncertain attribution required additional queries, including email or phone correspondence. Grade 1 or 2 events were evaluated only to determine whether any attribution to ICIs was possible. If a higher grade event was attributed to ICIs, there was no further query for lower grade events in the same patient. Results: A total of 639 Grade 1, 152 Grade 2, 75 Grade 3, 10 Grade 4, and 5 Grade 5 events were reported among the 29 patients.
One Grade 5 event, 5 Grade 4 events, 13 Grade 3 events, 9 Grade 2 events, and 6 Grade 1 events were considered less likely to be attributed to ICIs compared with the initial sponsor or site investigator assessment of possibly, probably, or definitely related. Two Grade 4 events, 1 Grade 2 event, and 8 Grade 1 events were considered to be possibly, probably, or definitely due to ICIs compared with the initial sponsor or site assessment of less likely or unrelated. Site responsiveness to retrospective queries from the sponsor varied. Additional lab values, primary source documentation, and communication with site investigators were necessary to make modifications. A total of 11 AEs required additional queries to site investigators. Conclusions: Retrospective review of real-time site and sponsor attribution assessments led to substantial modification of the initial attribution to ICIs in these 29 patients on Arm Z1D of the NCI-MATCH trial. Machine learning approaches for prediction of AEs in patients treated with ICIs may require careful assessment of primary source data to develop adequate training sets, given the paucity of gold standard data available.
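The per-grade counts in the Results can be tallied with a short sketch like the one below. The totals and fractions it derives are not stated in the abstract; they are simple sums of the reported figures. The variable and function names are illustrative assumptions. Because only serious AEs and a subset of routine AEs were reviewed, dividing by all reported events likely understates the rate of change among the events actually reviewed.

```python
# Counts transcribed from the Results; all names here are illustrative.
reported = {1: 639, 2: 152, 3: 75, 4: 10, 5: 5}

# Events re-attributed as LESS likely related to ICIs than initially assessed.
downgraded = {1: 6, 2: 9, 3: 13, 4: 5, 5: 1}

# Events re-attributed as possibly/probably/definitely related to ICIs
# after an initial assessment of less likely or unrelated.
upgraded = {1: 8, 2: 1, 3: 0, 4: 2, 5: 0}


def summarize(reported, downgraded, upgraded):
    """Return (grade, reported, changed, fraction_changed) per CTCAE grade."""
    rows = []
    for grade in sorted(reported):
        changed = downgraded.get(grade, 0) + upgraded.get(grade, 0)
        # Denominator is ALL reported events of that grade (an assumption),
        # not only the events that were actually reviewed.
        rows.append((grade, reported[grade], changed, changed / reported[grade]))
    return rows


rows = summarize(reported, downgraded, upgraded)
total_reported = sum(reported.values())
total_downgraded = sum(downgraded.values())
total_upgraded = sum(upgraded.values())
total_changed = total_downgraded + total_upgraded

for grade, n, changed, frac in rows:
    print(f"Grade {grade}: {changed}/{n} attributions modified ({frac:.1%})")
print(f"All grades: {total_changed}/{total_reported} modified")
```

Under these assumptions the sketch gives 881 reported events, of which 45 (34 toward less likely related, 11 toward related) had their attribution modified, with the highest per-grade fraction at Grade 4 (7 of 10).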