Convolutional Neural Networks (CNNs) are becoming popular in Internet-of-Things (IoT)-based object tracking applications such as autonomous driving, commercial surveillance, and intelligent traffic management. However, because embedded devices have limited processing power and network bandwidth is constrained, simultaneously guaranteeing fast object tracking, high accuracy, and low energy consumption remains a major challenge, one that makes IoT-based vision applications unreliable and unsustainable. To address this problem, this article proposes a collaborative edge-cloud architecture that uses the cloud to enhance object tracking performance. By properly offloading computations to the cloud and periodically checking the tracking status of edge devices through convolutional Siamese networks, our edge-cloud architecture enables interactive collaboration between edge devices and cloud servers so that tracking errors can be rectified quickly and accurately. Comprehensive experimental results on well-known video object tracking benchmarks show that our architecture not only significantly improves object tracking performance but also reduces the energy consumption of edge devices.