The modern integrated circuit is one of the most complex products engineered to date, and it continues to grow in complexity over time. As a result, very large-scale integrated (VLSI) circuit design now involves massive design teams employing state-of-the-art computer-aided design (CAD) tools. One of the oldest, yet most important, CAD problems for VLSI circuits is physical design automation, where one needs to compute the best physical layout of millions to billions of circuit components on a tiny silicon surface (Lim in Practical problems in VLSI physical design automation, Springer, Dordrecht, 2008). The process of mapping an electronic design to a chip involves several physical design stages, one of which is clustering. Even for combinatorial circuits, there exist several models for the clustering problem. In particular, we consider the problem of clustering in combinatorial circuits for delay minimization, without permitting logic replication (CN). The problem of clustering for delay minimization when logic replication is allowed (CA) has been well studied and is known to be solvable in polynomial time (Lawler et al. in IEEE Trans Comput 18(1):47–57, 1969; Rajaraman and Wong, in: 30th ACM/IEEE design automation conference, pp 309–314, 1993). However, unbounded logic replication can be quite expensive, which makes CN an important problem in its own right. We show that selected variants of CN are NP-hard. We also obtain approximability and inapproximability results for some of these problems. A preliminary version of this paper appears in Donovan et al. (in: 9th International conference on combinatorial optimization and applications, COCOA 2015, Proceedings, pp 334–347, 2015).