Multi-instance imbalanced semantic segmentation (MISS) is a common problem in which larger foreground instances dominate smaller ones, so a model can miss small instances and still achieve a satisfactory Dice score. Detecting small instances is crucial for many applications. For example, in the diagnosis of coronary artery disease, it is essential to locate small-scale lesions. To address the challenge of MISS, we propose MISS-Net (MISS-Network), which accounts for both the distribution characteristics and the segmentation difficulty of instances. On one hand, because instances are spatially sparse, we propose an instance-dependent SparseViT. SparseViT learns unstructured sparse attention that captures instance-related patterns by computing content-aware attention scores and selecting the top-K indices. On the other hand, we propose a hard instance-aware loss that identifies ‘hard-to-segment’ instances through a dynamic instance weighting strategy. Combined with a global segmentation loss, it balances global-level and instance-level segmentation performance. We extensively evaluated MISS-Net on three challenging 3D semantic segmentation datasets: two coronary artery calcification (CAC) datasets and one liver tumor (LITS) dataset. These datasets contain a large number of instances whose volumes vary widely (e.g., in the LITS dataset, a patient has 12.39 instances on average, with instance volumes ranging from 0.04 cm³ to 987.66 cm³). The results demonstrate that MISS-Net significantly improves both instance-level and global-level segmentation performance. On the two CAC datasets, MISS-Net improves Dice by 3.6% and 6.4%, and instance F1 by 3.1% and 1.9%, respectively. On the LITS dataset, Dice and instance F1 improve by 1.6% and 3.8%, respectively.
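The imbalance described above can be made concrete with a toy calculation. The following is a minimal sketch, not part of MISS-Net: the voxel counts, the 1D voxel indexing, and the rule that an instance counts as detected when at least one predicted voxel overlaps it are all assumptions chosen for illustration, and the helper names `dice` and `instance_recall` are hypothetical.

```python
def dice(pred, truth):
    """Global Dice between two sets of voxel indices."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def instance_recall(pred, instances):
    """Fraction of ground-truth instances touched by at least one predicted voxel.

    Assumption: any overlap counts as a detection; a real evaluation
    would likely use a stricter matching rule.
    """
    hit = sum(1 for inst in instances if inst & pred)
    return hit / len(instances)


# Toy ground truth on a 1D voxel axis: one large instance and three small ones.
large = set(range(0, 1000))
small = [set(range(2000, 2005)), set(range(3000, 3005)), set(range(4000, 4005))]
instances = [large] + small
truth = set().union(*instances)

# A prediction that segments the large instance perfectly and misses every small one.
pred = set(large)

global_dice = dice(pred, truth)
recall = instance_recall(pred, instances)
print(f"global Dice = {global_dice:.3f}, instance recall = {recall:.2f}")
# prints: global Dice = 0.993, instance recall = 0.25
```

In this toy case the global Dice stays above 0.99 although three of the four instances are missed entirely, which is why an instance-level metric such as instance F1 is reported alongside Dice.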