Target-Driven Semantic Navigation (TDSN) shows great potential for intelligent domestic assistants that support humans in daily activities. Although numerous methods achieve efficient TDSN in static environments, socially aware TDSN in dynamic, crowded scenarios remains challenging and has not been adequately investigated. The main challenges stem from complex human–robot interaction mechanisms and the exploitation of semantic relations, which require the robot to understand its surroundings and act with foresight. In this paper, we propose SemNav-HRO, a TDSN strategy based on Human–Robot–Object (HRO) ternary feature fusion. Specifically, we first propose a Deep Reinforcement Learning (DRL) based Dual-Channel Value Estimation Network (DCVEN) that integrates multi-granularity map features and social awareness to learn TDSN strategies for crowded scenes. We further relax the socially aware TDSN problem by eliminating the dependence on costly features (e.g., pedestrian speed) and introducing a pedestrian trajectory predictor. To train and evaluate crowded TDSN strategies, we construct a novel, semantically rich simulator with complex layouts based on realistic domestic scenes, rather than relying on the simplified simulation settings of prior work. Experimental results show that our method improves navigation success rates over the baselines by a relative 12.8%–25.5% on the MP3D dataset and 14.6%–19.2% on the Gibson dataset. Furthermore, we experimentally verify the promising generalization and interpretability of our method.
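
The success-rate gains reported above are relative rather than absolute, and the distinction can be made concrete with a minimal sketch. The function name `relative_improvement` and the success rates used below are hypothetical, chosen only for illustration; they are not results from this paper.

```python
def relative_improvement(baseline_sr: float, method_sr: float) -> float:
    """Gain of method_sr over baseline_sr, expressed as a fraction of the baseline."""
    if baseline_sr <= 0:
        raise ValueError("baseline success rate must be positive")
    return (method_sr - baseline_sr) / baseline_sr


# Hypothetical success rates (NOT taken from the paper's experiments).
baseline_sr = 0.50
method_sr = 0.60

# Absolute gain is the difference in percentage points;
# relative gain divides that difference by the baseline.
absolute_gain = method_sr - baseline_sr
relative_gain = relative_improvement(baseline_sr, method_sr)
print(f"absolute: {absolute_gain:.1%}, relative: {relative_gain:.1%}")
```

Under these assumed numbers, a 10-point absolute gain corresponds to a 20% relative gain, which is why relative figures should be read alongside the baseline success rates they are computed from.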