Enzymatic molecular in situ self-assembly (E-MISA), which enables the synthesis of high-order nanostructures from synthetic small molecules inside a living subject, has emerged as a promising strategy for molecular imaging and theranostics. This strategy leverages the catalytic activity of an enzyme to trigger the conversion of a probe substrate and its assembly in situ, which prolongs retention and concentrates many probe molecules in the targeted cells or tissues. As a result, imaging signals or therapeutic functions can be enhanced in response to a specific enzyme. The E-MISA strategy has been successfully applied to the development of enzyme-activated smart molecular imaging and theranostic probes for in vivo applications. In this Perspective, we discuss the general principle of controlling the in situ self-assembly of synthetic small molecules with an enzyme and then survey its applications in the construction of "smart" imaging and theranostic probes against cancers and bacteria. Finally, we discuss the current challenges and perspectives in utilizing the E-MISA strategy for disease diagnosis and therapy, particularly for clinical translation.