Abstract

During aircraft operations, pilots rely on human-machine interaction platforms to access essential information services. Developing a highly usable airborne assistant, however, requires support for two interaction modes, active-command and passive-response, together with three input modes: voice inputs, situation inputs, and plan inputs. This research presents the design of an aircraft human-machine interaction assistant (AHMIA), a multimodal data processing and application framework that supports human-to-machine interaction in a fully voice-controlled manner. For the voice mode, a fine-tuned FunASR model trained on private aeronautical datasets is employed to enable domain-specific aeronautical speech recognition. For the situation mode, a hierarchical situation-event extraction model is proposed to exploit high-level situational features. For the plan mode, a multi-formation double-code network plan diagram with a timeline is used to represent plan information effectively. Notably, to bridge the gap between human language and machine language, a hierarchical knowledge engine named process-event-condition-order-skill (PECOS) is introduced. PECOS provides three distinct products: the PECOS model, the PECOS state chart, and the PECOS knowledge description. Simulation results in an air confrontation scenario demonstrate that AHMIA supports both active-command and passive-response interactions with pilots, thereby enhancing the overall interaction capability.