In this paper, we go beyond the problem of recognizing video-based human interactive activities and propose a novel approach that enables deep understanding of complex person-person activities based on knowledge derived from human pose analysis. The joint coordinates of the interacting persons are first located by an efficient human pose estimation algorithm. Relation features, consisting of intra- and inter-person joint distance and angle features, are proposed to describe the relationships between the body components of each individual and between the two interacting participants in the spatio-temporal dimension. These features are then fed into a codebook construction process, in which two types of codewords are generated, corresponding to the distance and angle features. To explain the relationships between poses, a flexible four-layer hierarchical topic model is proposed based on the Pachinko Allocation Model. The model represents the full correlation among the relation features of body components as codewords, the interactive poselets as subtopics, and the interactive activities as super-topics. The proposed model further discriminates complex activities that exhibit similar postures. We validate our interaction recognition method on two practical data sets, the BIT-Interaction data set and the UT-Interaction data set. The experimental results demonstrate that the proposed approach outperforms recent interaction recognition approaches in terms of recognition accuracy.
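As an illustrative sketch only (not the authors' implementation), the intra- and inter-person joint distance and angle features mentioned above could be computed from 2-D joint coordinates along the following lines; the function name, the 2-D representation, and the joint ordering are all assumptions made for illustration:

```python
import numpy as np

def relation_features(pose_a, pose_b):
    """Sketch of intra-/inter-person joint distance and angle features.

    pose_a, pose_b: (J, 2) arrays of 2-D joint coordinates for the two
    interacting persons (J joints per person; ordering is an assumption).
    """
    def pairwise_distances(p, q):
        # Euclidean distance between every joint in p and every joint in q.
        return np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)

    def pairwise_angles(p, q):
        # Orientation of the vector from each joint in p to each joint in q.
        diff = q[None, :, :] - p[:, None, :]
        return np.arctan2(diff[..., 1], diff[..., 0])

    intra_dist = pairwise_distances(pose_a, pose_a)  # within one person
    inter_dist = pairwise_distances(pose_a, pose_b)  # between the two persons
    intra_ang = pairwise_angles(pose_a, pose_a)
    inter_ang = pairwise_angles(pose_a, pose_b)
    return intra_dist, inter_dist, intra_ang, inter_ang
```

Such per-frame features, stacked over time, would give the spatio-temporal descriptors that are subsequently quantized into the distance and angle codewords.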