In this paper, we address the problem of implementing a parallel interference cancellation scheme for multi-user detection (MUD) in an M-ary CCSK-modulated CDMA system. Complexity remains the main obstacle to the practical implementation of MUD, and it limits, if not the very possibility, then the feasibility of using high-order M-ary modulation formats in applications such as low-probability-of-intercept systems with a large spreading factor or myriads of small self-powered IoT devices. Designing MUD algorithms for M-ary systems further increases receiver complexity compared with binary modulation. The demodulator has a major impact on the complexity of a CDMA system. Cyclic code-shift keying (CCSK) is a modulation technique designed to reduce the complexity of M-ary signaling: each symbol is a circularly shifted version of a single code sequence. Assuming synchronization, the receiver cyclically correlates the received signal (the input signal plus noise) with the base sequence and estimates the position of the correlation peak. The preliminary stage of the proposed scheme is a conventional multi-channel CDMA receiver. The preliminary data estimates are multiplied by the user codes and amplitude estimates in the spreader and then fed to the adder to generate the estimates of the multiple-access interference (MAI). After the latter are subtracted from the group signal, the K-dimensional vector of cleaned user signals is passed to the matched filter bank to form the refined data estimates. At each subsequent stage, the output estimates of the previous stage are used as input data. The complexity gain is the ratio of the amount of computation required for M linear convolutions to that required for a single cyclic convolution, i.e., M/log2(M).
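The CCSK demodulation step described above can be illustrated with a minimal sketch. It assumes that the cyclic correlation is computed with a radix-2 fast Fourier transform, which is one common way to reach a cost of order M log2(M) and which requires M to be a power of 2. The function names, the random +/-1 base sequence, and the noise level are hypothetical choices for illustration and are not taken from the paper.

```python
import cmath
import random

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ccsk_modulate(base, symbol):
    """Map a symbol in 0..M-1 to the base sequence cyclically shifted by that many chips."""
    m = len(base)
    return [base[(i - symbol) % m] for i in range(m)]

def cyclic_correlate(received, base):
    """Return c[s] = sum_i received[i] * base[(i - s) % M] for a real base, via FFT."""
    m = len(base)
    spectrum = [r * b.conjugate() for r, b in zip(fft(received), fft(base))]
    return [v.real / m for v in fft(spectrum, inverse=True)]

def ccsk_demodulate(received, base):
    """Estimate the symbol as the position of the cyclic correlation peak."""
    corr = cyclic_correlate(received, base)
    return max(range(len(corr)), key=corr.__getitem__)

# Illustrative run: M = 64 (6 bits per symbol), assumed random +/-1 base sequence.
rng = random.Random(1)
base = [rng.choice((-1, 1)) for _ in range(64)]
sent = 37
noisy = [chip + rng.gauss(0.0, 1.0) for chip in ccsk_modulate(base, sent)]
print("sent:", sent, "detected:", ccsk_demodulate(noisy, base))
```

A single transform-based correlation replaces M separate correlators, which is where a saving of order M/log2(M) would come from under this assumption.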
Since the performance of the algorithm is sensitive to the reliability of the preliminary decisions, the spreading codes must satisfy strict selection criteria: a length that is an integer power of 2, a large family size, and good periodic autocorrelation and cross-correlation properties. To examine the proposed algorithm by computer simulation, we selected minimax periodic and odd-periodic complementary codes, whose properties are close to those of the codes used in the CCSK scheme implemented in the Link-16 tactical data networks JTIDS and MIDS. Our study shows that the gain over the conventional receiver grows with the SNR, reaching 15 dB at a BER of 10^-5. The system with complementary codes outperforms the system based on minimax PN codes, achieving the target bit error rate at a lower SNR.
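The multistage cancellation structure outlined in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the receiver evaluated in the study: the amplitudes are taken as known rather than estimated, the users are chip- and symbol-synchronous, the codes are random +/-1 sequences rather than the complementary or PN codes of the study, and the correlation is computed directly for brevity. All function names are hypothetical.

```python
import random

def shift(code, s):
    """Cyclically shift a code by s chips (the CCSK waveform for symbol s)."""
    m = len(code)
    return [code[(i - s) % m] for i in range(m)]

def detect(signal, code):
    """Matched filter for one user: pick the shift with the largest cyclic correlation."""
    m = len(code)
    corr = [sum(signal[i] * code[(i - s) % m] for i in range(m)) for s in range(m)]
    return max(range(m), key=corr.__getitem__)

def conventional_stage(r, codes):
    """Preliminary stage: each user is detected independently from the group signal."""
    return [detect(r, c) for c in codes]

def pic_stage(r, codes, amps, prev):
    """One cancellation stage driven by the previous stage's symbol estimates."""
    m = len(r)
    # Spreader: regenerate every user's signal from its previous decision.
    regen = [[a * x for x in shift(c, s)] for c, a, s in zip(codes, amps, prev)]
    # Adder: sum of all regenerated signals.
    total = [sum(col) for col in zip(*regen)]
    refined = []
    for k, code in enumerate(codes):
        # MAI estimate for user k is the sum of all other regenerated signals.
        cleaned = [r[i] - (total[i] - regen[k][i]) for i in range(m)]
        refined.append(detect(cleaned, code))
    return refined

def pic_receiver(r, codes, amps, stages=2):
    """Conventional stage followed by the given number of cancellation stages."""
    est = conventional_stage(r, codes)
    for _ in range(stages):
        est = pic_stage(r, codes, amps, est)
    return est

# Illustrative run with assumed parameters: K = 4 users, M = 64, unequal amplitudes.
rng = random.Random(7)
M, K = 64, 4
codes = [[rng.choice((-1, 1)) for _ in range(M)] for _ in range(K)]
amps = [1.0, 2.0, 2.0, 3.0]
sent = [rng.randrange(M) for _ in range(K)]
r = [sum(a * shift(c, s)[i] for c, a, s in zip(codes, amps, sent)) + rng.gauss(0.0, 1.0)
     for i in range(M)]
print("sent:        ", sent)
print("conventional:", conventional_stage(r, codes))
print("after PIC:   ", pic_receiver(r, codes, amps))
```

Because each stage regenerates the interference from the previous decisions, an erroneous preliminary decision adds interference instead of removing it, which is why the reliability of the preliminary stage matters.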