Underwater sensor networks (UWSNs) are vulnerable to jamming attacks. Controllable reactive jamming is a realistic and particularly harmful kind of jamming attack. A reactive jammer controls the probability of jamming and the duration of the jam signal in order to keep the network highly vulnerable while keeping the probability of detection low. Existing work on reactive jamming detection focuses on terrestrial wireless sensor networks (TWSNs) and is limited in its ability to (a) detect jamming correctly, (b) distinguish between the corrupted and uncorrupted parts of a packet, and (c) adapt to a dynamic environment. In this paper, we develop a scheme, named CURD, for controllable reactive jamming detection, based on a non-parametric cumulative sum (CUSUM) test and weak estimation learning automata (WELA). In the proposed scheme, the CUSUM test quickly detects abrupt changes in bits without any a priori knowledge about the adversary, and the learning scheme helps in “absorbing” the impact of the dynamism in the environment. Unlike existing work on this issue, we introduce the concept of a partial-packet (PP), which allows a packet to be analyzed in fragments (i.e., PPs) so that corruption confined to a few bits can be identified. We develop a Markov chain-based analytical model for theoretical analysis of CURD under the assumption of a single controllable reactive jammer. We evaluate the performance of CURD through simulation studies in a UWSN environment. Results show that the proposed scheme accurately detects reactive jamming in UWSNs and outperforms the benchmark scheme considered in the study.
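
To illustrate the kind of change detection a non-parametric CUSUM test performs, the following is a minimal sketch of a generic one-sided CUSUM applied to per-partial-packet bit-error counts. It is not the CURD scheme itself: the function name `cusum_first_alarm`, the `drift` and `threshold` parameters, and the sample error counts are all hypothetical choices made for illustration, and the WELA-based adaptation is omitted.

```python
from typing import Optional, Sequence


def cusum_first_alarm(
    errors: Sequence[float], drift: float, threshold: float
) -> Optional[int]:
    """Return the index of the first partial-packet at which a one-sided
    CUSUM statistic exceeds `threshold`, or None if it never does.

    `errors` holds one observation per partial-packet (here, an assumed
    bit-error count). `drift` is subtracted from every observation so that
    the statistic stays near zero under normal conditions. No distribution
    is assumed for the observations, which is what makes the test
    non-parametric.
    """
    stat = 0.0
    for i, x in enumerate(errors):
        # Accumulate the excess over the drift, clamping at zero so that
        # long quiet periods do not mask a later abrupt increase.
        stat = max(0.0, stat + x - drift)
        if stat > threshold:
            return i
    return None


# Hypothetical bit-error counts per partial-packet; the jump at index 5
# stands in for the onset of a jam signal.
jammed = [0, 1, 0, 0, 1, 6, 7, 5]
clean = [0, 1, 0, 0, 1, 0, 1, 0]

print(cusum_first_alarm(jammed, drift=2.0, threshold=5.0))  # 6
print(cusum_first_alarm(clean, drift=2.0, threshold=5.0))   # None
```

In this sketch the alarm is raised one partial-packet after the change begins, because the statistic needs two elevated observations to cross the assumed threshold. A lower threshold would react sooner at the cost of more false alarms.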