Edge Intelligence (EI) aims to address concerns such as the response latency arising from the conflict between the predominant Cloud-based deployment of computationally intensive AI applications and the costly uploading of explosively growing end-device data. Convolutional Neural Networks (CNNs), which lead the latest flourishing of AI, inevitably suffer from this conflict. Increasing EI-driven attempts have emerged to achieve fast, highly accurate CNN inference in the End-Edge-Cloud (EEC) collaborative computing paradigm; however, neither model compression approaches for on-device inference nor collaborative inference methods across devices can effectively achieve the trade-off between latency and accuracy of End-to-End (E2E) inference. In this article, we present CNNPC, which jointly partitions and compresses CNNs for fast inference with high accuracy in collaborative EEC systems. We implemented CNNPC (source code available at <uri>https://github.com/IoTDATALab/CNNPC</uri>) and evaluated its performance in extensive real-world EEC scenarios. Experimental results demonstrate that, compared with state-of-the-art single-end and collaborative approaches, collaborative inference based on CNNPC is up to <inline-formula><tex-math notation="LaTeX">$1.6\times$</tex-math></inline-formula> and <inline-formula><tex-math notation="LaTeX">$5.6\times$</tex-math></inline-formula> faster, respectively, without obvious accuracy loss, and requires as little as <inline-formula><tex-math notation="LaTeX">$4.30\%$</tex-math></inline-formula> and <inline-formula><tex-math notation="LaTeX">$6.48\%$</tex-math></inline-formula> of the communication overhead. Moreover, when determining the optimal strategy, CNNPC requires as few as <inline-formula><tex-math notation="LaTeX">$0.1\%$</tex-math></inline-formula> of the actual compression operations required by the traversal method, the only viable method that provides the theoretically optimal strategy.