Convolutional neural networks (CNNs) are a popular machine-learning architecture owing to their predictive power, notably in computer vision and medical image analysis. This predictive power requires extensive computation, which encourages model owners to host the prediction service on a cloud platform. This article proposes a CNN prediction scheme that preserves privacy in the outsourced setting, i.e., the model-hosting server cannot learn the query, the (intermediate) results, or the model. Similar to SecureML (S&P'17), a representative work that provides model privacy, we employ two non-colluding servers with secret sharing and triplet generation to minimize the use of heavyweight cryptography. We make the following optimizations for both overall latency and accuracy. 1) We adopt asynchronous computation and SIMD for offline triplet generation and parallelizable online computation. 2) Like MiniONN (CCS'17) and its improvement by the generic EzPC compiler (EuroS&P'19), we use a garbled circuit for the non-polynomial ReLU activation to keep the same accuracy as the underlying network (instead of approximating it, as in SecureML prediction). 3) For the pooling in CNNs, we employ (linear) average pooling, which achieves almost the same accuracy as the (non-linear, and hence less efficient) max pooling used by MiniONN and EzPC. Considering both offline and online costs, our experiments on the MNIST dataset show a latency reduction of $122\times$, $14.63\times$, and $36.69\times$ compared to SecureML, MiniONN, and EzPC, respectively, and a reduction in communication costs of $1.09\times$, $36.69\times$, and $31.32\times$, respectively.
On the CIFAR dataset, our scheme achieves a lower latency by $7.14\times$ and $3.48\times$, and lower communication costs by $13.88\times$ and $77.46\times$, when compared with MiniONN and EzPC, respectively.
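To make the core primitive concrete, the following is a minimal Python sketch, not the paper's implementation, of two-party additive secret sharing combined with Beaver triplet generation: the building block referenced above that lets two non-colluding servers multiply shared values without heavyweight cryptography. The ring modulus, the function names, and the trusted-dealer stand-in for offline triplet generation are all simplifying assumptions for illustration.

```python
import secrets

MOD = 2 ** 64  # illustrative ring Z_{2^64}; the actual scheme's parameters may differ

def share(x):
    """Split x into two additive shares: x = x0 + x1 (mod MOD)."""
    x0 = secrets.randbelow(MOD)
    return x0, (x - x0) % MOD

def reconstruct(x0, x1):
    """Recombine two additive shares into the plaintext value."""
    return (x0 + x1) % MOD

def beaver_triplet():
    """Offline phase (trusted-dealer stand-in for triplet generation):
    sample a, b, compute c = a*b, and share all three between the servers."""
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    c = (a * b) % MOD
    return share(a), share(b), share(c)

def shared_mul(x_sh, y_sh, triplet):
    """Online phase: the servers open the masked values e = x - a and
    f = y - b, then each derives a local share of x*y, since
    x*y = c + e*b + f*a + e*f."""
    (a0, a1), (b0, b1), (c0, c1) = triplet
    x0, x1 = x_sh
    y0, y1 = y_sh
    # In the real protocol each server sends its masked share to the other;
    # here both openings are computed directly for brevity.
    e = (x0 - a0 + x1 - a1) % MOD
    f = (y0 - b0 + y1 - b1) % MOD
    z0 = (c0 + e * b0 + f * a0 + e * f) % MOD  # server 0 adds the public e*f term
    z1 = (c1 + e * b1 + f * a1) % MOD
    return z0, z1

# Example: multiply 7 and 9 under sharing; neither server sees 7, 9, or 63.
x_sh, y_sh = share(7), share(9)
z_sh = shared_mul(x_sh, y_sh, beaver_triplet())
assert reconstruct(*z_sh) == 63
```

Because each triplet is independent of the query, this offline/online split is what allows triplet generation to be batched and parallelized (e.g., with asynchronous computation and SIMD, as in optimization 1 above), leaving only cheap additions and two openings per multiplication in the online phase.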