Chip multiprocessor (CMP) architectures are formed when multiple compute cores are integrated onto the same chip, creating a single, powerful computational entity. Nearly every major high-performance processor manufacturer now ships parts with at least two cores (dual-core) on the die, and their roadmaps are increasingly multicore, signaling that the era of big, monolithic uniprocessors has ended. This shift reflects the fact that ever-larger uniprocessors do not scale well in power/performance, area/performance, or design complexity/performance. Continued performance scaling of these processors will thus focus primarily on increasing multithreaded throughput. The rapid adoption of small-scale CMP platforms and the quest for high performance continue to accelerate the rate at which processor manufacturers consider adding more cores to the die. Over the last decade, both academia and industry have made significant progress in research and development on CMP architecture and design for client and server platforms. Yet, while we have successfully entered the era of CMP, a significant set of challenges and opportunities has yet to be investigated deeply. The broad research areas under investigation include CMP architecture alternatives (for core, cache, interconnect, and memory), CMP design and technologies (process implications, new technologies like 3D stacking, voltage/clock domain management, etc.), CMP performance evaluation (new simulation and modeling techniques, emerging applications, and execution environments like virtualization), and novel CMP architectures and use cases (asymmetric or heterogeneous architectures, accelerators, etc.). Many questions about CMP architectures remain unanswered. Below, we list a few of the most compelling ones.