Abstract

The race for the most efficient, accurate, and universal algorithm in scientific computing drives innovation. At the same time, this healthy competition is only beneficial if the research output is actually comparable to prior results. Fairly comparing algorithms can be a complex endeavor, as the implementation, configuration, compute environment, and test problems need to be well-defined. Due to the increase in computer-based experiments, new infrastructure for facilitating the exchange and comparison of new algorithms is also needed. To this end, we propose a benchmark framework as a set of generic specifications for comparing implementations of algorithms using test cases native to a community. Its value lies in its ability to fairly compare and validate existing methods for new applications, as well as to compare newly developed methods with existing ones. As a prototype for a more general framework, we have begun building a benchmark tool for the model order reduction (MOR) community. The data basis of the tool is the collection of the Model Order Reduction Wiki (MORWiki). The wiki features three main categories: benchmarks, methods, and software. An editorial board curates submissions and patrols edited entries. Data sets for linear and parametric-linear models are already well represented in the existing collection. Data sets for non-linear or procedural models, for which only evaluation data or codes/algorithmic descriptions (rather than equations) are available, are being added and extended. Properties and interesting characteristics used for benchmark selection and later assessments are recorded in the model metadata. Our tool, the Model Order Reduction Benchmarker (MORB), is under active development for linear time-invariant systems and solvers. An ontology (MORBO) and a knowledge graph are being developed in parallel. They catalog benchmark problem sets and their metadata and will also be integrated into the Mathematical Research Data Initiative (MaRDI) Portal to improve the findability of such data sets. MORB faces a number of technical and field-specific challenges, and we hope to recruit community input and feedback while presenting some initial results.