Super-resolution of facial images, a.k.a. face hallucination, has been intensively studied in the past decades due to the growing demand for facial analysis in video surveillance, e.g., face detection, verification, and identification. However, the performance of most previous hallucination approaches drops dramatically when a very low-resolution tiny face is provided, owing to the challenging multimodality of the problem as well as the lack of an informative prior to serve as strong semantic guidance. Inspired by the latest progress in deep unsupervised learning, this paper focuses on tiny faces of size 16 × 16 pixels, hallucinating them to 8× upscaled versions by exploring the potential of Wasserstein generative adversarial networks (WGAN). Besides a pixel-wise L2 regularization term imposed on the generative model, we find that the advocated autoencoding generator with both residual and skip connections is a critical component for the WGAN to represent facial contours and semantic content with reasonable precision. With an additional Lipschitz penalty and architectural considerations for the WGAN critic, the proposed approach achieves state-of-the-art hallucination performance in terms of both visual perception and objective assessment. The cropped CelebA face dataset is primarily used to aid the tuning and analysis of the new method, termed tfh-WGAN. Experimental results demonstrate that the proposed approach not only achieves realistic hallucination of tiny faces, but also adapts well to variations in pose, expression, illumination, and occlusion.
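
As a rough sketch of the kind of objective the abstract describes (the paper's exact formulation may differ; the weights λ and η below are assumed hyperparameters), a WGAN critic loss with a Lipschitz (gradient) penalty and a generator loss combining the adversarial term with pixel-wise L2 regularization can be written as:

\[
\mathcal{L}_{D} \;=\; \mathbb{E}_{I_{LR}}\big[D(G(I_{LR}))\big] \;-\; \mathbb{E}_{I_{HR}}\big[D(I_{HR})\big] \;+\; \lambda\, \mathbb{E}_{\hat{x}}\Big[\big(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1\big)^{2}\Big],
\]
\[
\mathcal{L}_{G} \;=\; -\,\mathbb{E}_{I_{LR}}\big[D(G(I_{LR}))\big] \;+\; \eta\, \big\|G(I_{LR}) - I_{HR}\big\|_2^{2},
\]

where \(G(I_{LR})\) denotes the hallucinated face produced from the 16 × 16 input, \(I_{HR}\) the ground-truth high-resolution face, and \(\hat{x}\) points sampled between real and generated faces for the gradient penalty.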