The rise of Big Data techniques has created a need for low-latency, real-time analysis of high-velocity continuous data streams. Several solutions, including Stream Processing Systems (SPSs), have been developed to enable real-time distributed stream processing. However, emerging application scenarios such as smart cities and wearable assistance involve highly variable data rates and continue to pose new challenges for established stream processing engines in maintaining cost-effective execution. To cater to such scenarios, many modern SPSs that leverage the Cloud environment have been proposed. The run-time scalability incorporated in these SPSs is still at an early stage and is based on fixed scale sizes. Moreover, these scaling approaches do not adequately consider either the structure of the hosted streaming applications or the characteristic features of the underlying Cloud environment. Achieving the true cost benefits of orchestrating streaming applications on the Cloud's pay-as-you-go model while maintaining the desired QoS requires that both of these issues be accounted for when making scaling decisions. This work presents StreamScale-H, a heterogeneity-aware, efficient auto-scaling strategy that addresses both issues. Simulation experiments on representative stream applications indicate that the proposed StreamScale-H auto-scaling algorithm performs much better than state-of-the-art algorithms.