Chinese medical machine reading comprehension question answering (cMed-MRCQA) is a critical component of intelligent question answering that targets the Chinese medical domain. Its purpose is to enable machines to analyze and understand a given text and question and then extract an accurate answer. Improving cMed-MRCQA performance requires a deep understanding of the context, inference of information that is only implicit in the text, and precise localization of the answer span. Answer spans have predominantly been defined in terms of linguistic units, most often sentences. However, sentence segmentation is unreliable to varying degrees across languages, which makes it difficult for a model to predict the answer span. To alleviate this issue, this paper presents a novel architecture, HCT, based on a Hierarchically Collaborative Transformer. Specifically, we present a hierarchical collaborative method that locates sentence boundaries and answer-span boundaries separately. First, we design a hierarchical encoding module to obtain the local semantic features of the corpus. Second, we propose a sentence-level self-attention module and a fused interaction-attention module to capture global information about the text. Finally, the model is trained with a combination of loss functions. Extensive experiments were conducted on the public dataset CMedMRC and the reconstructed dataset eMedicine to validate the effectiveness of the proposed method. The experimental results show that the proposed method outperforms state-of-the-art methods, achieving F1 scores of 90.4% on CMedMRC and 73.2% on eMedicine.
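
The abstract reports F1 scores but does not state how F1 is computed. As a minimal sketch, the following assumes the character-overlap F1 that is commonly used for Chinese span-extraction tasks; it is not necessarily the evaluation script used in this work, and the function name `char_f1` is hypothetical.

```python
from collections import Counter


def char_f1(prediction: str, reference: str) -> float:
    """Character-overlap F1 between a predicted and a reference answer span.

    Assumption: characters are the unit of comparison (Chinese text has no
    whitespace word boundaries) and whitespace is ignored.
    """
    pred_chars = [c for c in prediction if not c.isspace()]
    ref_chars = [c for c in reference if not c.isspace()]

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_chars or not ref_chars:
        return float(pred_chars == ref_chars)

    # Multiset intersection counts each shared character at most as often
    # as it appears in both spans.
    overlap = sum((Counter(pred_chars) & Counter(ref_chars)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_chars)
    recall = overlap / len(ref_chars)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A prediction that contains the reference plus one extra character.
    print(round(char_f1("高血压病", "高血压"), 4))
```

Under this assumption, a corpus-level score would be the mean of `char_f1` over all questions, taking the maximum over reference answers when a question has more than one.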