Efficient extraction of features of interest remains one of the biggest challenges in the interpretation of cryo-electron tomograms. Various automated approaches have been proposed, many of which work well for high-contrast datasets where the features of interest can be easily detected and are clearly separated from one another. Our inner ear stereocilia cryo-electron tomographic datasets are characterized by a dense array of hexagonally packed actin filaments that are frequently cross-connected. These features make automated segmentation very challenging, a difficulty further aggravated by the high noise level of cryo-electron tomograms and the high complexity of the densely packed features. Using prior knowledge about the actin bundle organization, we placed layers of a highly simplified ball-and-stick actin model into the density map, first obtaining a global fit and then making regional and local adjustments to the model. We show that volumetric model building not only allows us to deal with the high complexity, but also provides precise measurements and statistics about the actin bundle. Volumetric models also serve as anchoring points for local segmentation, as in the case of the actin-actin cross connectors. Volumetric model building, particularly when further augmented by computer-based automated fitting approaches, can be a powerful alternative when conventional automated segmentation approaches are not successful.
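To make the idea of a global fit concrete, the following is a minimal two-dimensional sketch and not the procedure used in this work. It assumes a single cross-section through the bundle, reduces each filament to one point on a hexagonal lattice, and finds the in-plane rotation and shift that maximize the density sampled at the lattice points by exhaustive grid search. All function names (`hex_lattice`, `place`, `sample_bilinear`, `global_fit`) and all parameters are hypothetical and chosen only for illustration; regional and local adjustments are not covered.

```python
import numpy as np


def hex_lattice(spacing, n_rings):
    """Return (x, y) positions of a hexagonal lattice centered at the origin.

    Each point stands in for one filament axis seen in cross-section
    (an illustrative simplification, not the authors' model).
    """
    pts = []
    for q in range(-n_rings, n_rings + 1):
        for r in range(-n_rings, n_rings + 1):
            if abs(q + r) <= n_rings:  # keep a hexagonal patch of the lattice
                x = spacing * (q + r / 2.0)
                y = spacing * np.sqrt(3.0) / 2.0 * r
                pts.append((x, y))
    return np.array(pts)


def place(points, angle_deg, shift, center):
    """Rotate the model by angle_deg, then translate it by shift + center."""
    a = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(a), -np.sin(a)],
                    [np.sin(a), np.cos(a)]])
    return points @ rot.T + np.asarray(shift, float) + np.asarray(center, float)


def sample_bilinear(img, xy):
    """Bilinearly interpolate img (indexed [y, x]) at floating-point (x, y)."""
    x, y = xy[:, 0], xy[:, 1]
    x0 = np.clip(np.floor(x).astype(int), 0, img.shape[1] - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, img.shape[0] - 2)
    fx = np.clip(x - x0, 0.0, 1.0)
    fy = np.clip(y - y0, 0.0, 1.0)
    top = img[y0, x0] * (1 - fx) + img[y0, x0 + 1] * fx
    bottom = img[y0 + 1, x0] * (1 - fx) + img[y0 + 1, x0 + 1] * fx
    return top * (1 - fy) + bottom * fy


def global_fit(density, spacing, n_rings, angles, shifts, center):
    """Grid-search the rotation and shift that maximize summed density.

    Returns (best_score, best_angle, (best_dx, best_dy)).
    """
    model = hex_lattice(spacing, n_rings)
    best = (-np.inf, None, None)
    for angle in angles:
        for dx in shifts:
            for dy in shifts:
                pts = place(model, angle, (dx, dy), center)
                score = sample_bilinear(density, pts).sum()
                if score > best[0]:
                    best = (score, angle, (dx, dy))
    return best
```

Because a hexagonal lattice maps onto itself under a 60-degree rotation, the sketch only needs to search angles in the range 0 to 60 degrees. A call such as `global_fit(density, 12.0, 2, range(0, 60, 5), range(-4, 5), (48.0, 48.0))` would return the best score together with the corresponding angle and shift; an exhaustive search is used here only for transparency and would likely be replaced by a more efficient optimizer in practice.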