A key challenge in neuroscience is to understand the structural and functional relationships of the brain from high-dimensional, multimodal neuroimaging data. While conventional multivariate approaches often simplify statistical assumptions and estimate one-dimensional independent sources shared across modalities, the relationships between true latent sources are likely more complex - statistical dependence may exist within and between modalities, and span one or more dimensions. Here we present Multimodal Subspace Independent Vector Analysis (MSIVA), a methodology to capture both joint and unique vector sources from multiple data modalities by defining both cross-modal and unimodal subspaces with variable dimensions. In particular, MSIVA enables flexible estimation of varying-size independent subspaces within modalities and their one-to-one linkage to corresponding sub-spaces across modalities. As we demonstrate, a main benefit of MSIVA is the ability to capture subject-level variability at the voxel level within independent subspaces, contrasting with the rigidity of traditional methods that share the same independent components across subjects. We compared MSIVA to a unimodal initialization baseline and a multimodal initialization baseline, and evaluated all three approaches with five candidate subspace structures on both synthetic and neuroimaging datasets. We show that MSIVA successfully identified the ground-truth subspace structures in multiple synthetic datasets, while the multimodal baseline failed to detect high-dimensional subspaces. We then demonstrate that MSIVA better detected the latent subspace structure in two large multimodal neuroimaging datasets including structural MRI (sMRI) and functional MRI (fMRI), compared with the unimodal baseline. From subsequent subspace-specific canonical correlation analysis, brain-phenotype prediction, and voxelwise brain-age delta analysis, our findings suggest that the estimated sources from MSIVA with optimal subspace structure are strongly associated with various phenotype variables, including age, sex, schizophrenia, lifestyle factors, and cognitive functions. Further, we identified modality- and group-specific brain regions related to multiple phenotype measures such as age (e.g., cerebellum, precentral gyrus, and cingulate gyrus in sMRI; occipital lobe and superior frontal gyrus in fMRI), sex (e.g., cerebellum in sMRI, frontal lobe in fMRI, and precuneus in both sMRI and fMRI), schizophrenia (e.g., cerebellum, temporal pole, and frontal operculum cortex in sMRI; occipital pole, lingual gyrus, and precuneus in fMRI), shedding light on phenotypic and neuropsychiatric biomarkers of linked brain structure and function.