As modern deep neural networks (DNNs) grow ever larger and more expensive, DNN compression has become an active research direction. Among the various compression methods, tensor decomposition is arguably the most promising and low-cost, owing to its solid mathematical foundations and regular data structure. However, most existing tensor decompositions are not well suited to accelerating DNNs, because transpositions of tensor modes are required so that the input data can be contracted correctly with the decomposed factor tensors, and these transpositions inevitably incur extra memory and time costs on real systems. In this paper, we adopt the relatively novel Kronecker CANDECOMP/PARAFAC (KCP) tensor decomposition, whose factor tensors are fine-grained, and propose a transposition-free algorithm for computing the contractions between the input data and the neural weights in KCP format. A theoretical analysis of the computational complexity indicates that the proposed method is considerably more efficient than existing algorithms. We further prove that the training complexity of a KCP-DNN based on the proposed transposition-free algorithm is also lower than that of traditional approaches, and we provide a comprehensive comparison of space and computational complexity covering both the training and inference stages to demonstrate the superiority of our method. Since a series of related works focuses on recurrent neural networks (RNNs), we follow this practice and study the KCP-RNN for a comprehensive comparison with them; the experimental results show that our KCP-RNN with the transposition-free algorithm has systematic advantages in accuracy, space complexity, computational complexity, and practical running time. In addition, some advanced characteristics of KCP-DNNs, such as the collocation of ranks, are also discussed.
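To make the transposition issue concrete, the following is a minimal NumPy sketch, under assumed toy shapes and a rank-R Kronecker-of-factors weight layout, of how a single einsum contraction against small factor matrices can avoid explicit mode transpositions of the input; it illustrates only the general principle, not the KCP algorithm proposed in this paper, and the dimensions I1, I2, J1, J2 and rank R are assumptions for illustration.

```python
import numpy as np

# Minimal sketch (not the paper's exact KCP algorithm): a weight matrix of shape
# (I1*I2, J1*J2) is stored as a rank-R sum of Kronecker products of small factors,
#   W = sum_r kron(A[r], B[r]),  with A[r]: (I1, J1) and B[r]: (I2, J2).
# The contraction below is expressed as a single einsum, so no explicit mode
# transposition (np.transpose / permute) of the input is needed.

rng = np.random.default_rng(0)
I1, I2, J1, J2, R = 4, 5, 3, 6, 2           # assumed toy dimensions and rank
A = rng.standard_normal((R, I1, J1))        # fine-grained factor matrices
B = rng.standard_normal((R, I2, J2))
x = rng.standard_normal(I1 * I2)            # input vector

# Transposition-free contraction: reshape the input once and sum over the
# row modes (a, b) and the rank index (r) in a single einsum call.
y_factored = np.einsum('ab,rac,rbd->cd', x.reshape(I1, I2), A, B).reshape(-1)

# Reference computation with the explicitly reconstructed dense weight.
W = sum(np.kron(A[r], B[r]) for r in range(R))
y_dense = x @ W

assert np.allclose(y_factored, y_dense)
```

In this toy setting the factored contraction never materializes the full weight and never permutes the input; a realistic KCP layer would instead contract with the KCP factor tensors directly, which is the case the proposed algorithm addresses.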