Abstract

As quantum computing technology matures and the number of qubits available on a quantum processing unit (QPU) increases, interest is growing in assessing the capabilities of quantum computing hardware in a scalable manner. One of the key capabilities for quantum computing is the generation of multipartite entangled states. In this study, aspects of benchmarking the entanglement generation capabilities of noisy intermediate-scale quantum (NISQ) devices are discussed, based on the preparation of graph states and the verification of entanglement in the prepared states. For this purpose, entanglement witnesses that are specifically suited to a scalable experiment design are used. These witnesses can detect A) bipartite entanglement and B) genuine multipartite entanglement in graph states using a constant number of only two measurement settings, provided that the prepared graph state is based on a two-colorable graph, e.g., a square grid graph or one of its subgraphs. With this approach, it is experimentally verified that a bipartite entangled state comprising all qubits can be prepared on a 127-qubit IBM Quantum superconducting QPU, and that genuine multipartite entanglement can be detected for states of up to 23 qubits when quantum readout error mitigation is applied.
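The role of two-colorability can be illustrated with a minimal sketch. It assumes the standard stabilizer description of graph states, in which each vertex v contributes a generator acting as X on v and Z on its neighbors. The function names (`grid_graph`, `two_color`, `measurement_settings`, `generator`, `covered_by`) are hypothetical and chosen for illustration only. The code does not reproduce the witnesses used in the study. It only checks that every generator of a square grid graph state can be read out from one of two local measurement settings.

```python
from collections import deque
from itertools import product


def grid_graph(rows, cols):
    """Return the adjacency dict of a rows x cols square grid graph."""
    adj = {(r, c): set() for r, c in product(range(rows), range(cols))}
    for (r, c) in adj:
        for dr, dc in ((0, 1), (1, 0)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].add(nb)
                adj[nb].add((r, c))
    return adj


def two_color(adj):
    """Two-color the graph by breadth-first search.

    Raises ValueError if the graph is not two-colorable (not bipartite).
    """
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in color:
                    color[w] = 1 - color[v]
                    queue.append(w)
                elif color[w] == color[v]:
                    raise ValueError("graph is not two-colorable")
    return color


def measurement_settings(color):
    """Setting k measures X on vertices of color k and Z on all others."""
    return [
        {v: ("X" if c == k else "Z") for v, c in color.items()}
        for k in (0, 1)
    ]


def generator(adj, v):
    """Stabilizer generator of vertex v: X on v, Z on each neighbor of v."""
    g = {v: "X"}
    g.update({w: "Z" for w in adj[v]})
    return g


def covered_by(gen, setting):
    """True if every Pauli in gen matches the local basis of the setting."""
    return all(setting[q] == p for q, p in gen.items())


adj = grid_graph(3, 4)
color = two_color(adj)
settings = measurement_settings(color)

# Each generator is compatible with the setting of its own vertex color,
# because all neighbors of a vertex carry the opposite color.
all_covered = all(
    covered_by(generator(adj, v), settings[color[v]]) for v in adj
)
print(len(settings), all_covered)
```

Because the number of settings stays at two regardless of the grid size, the measurement cost of this readout scheme does not grow with the number of qubits, which is the property that makes such witnesses attractive for a scalable benchmark.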