In this paper, a fast inter-coding algorithm is proposed to reduce the computational load of HEVC encoders. The HEVC reference model (HM) performs a recursive depth-first search (DFS) over the coding quad-tree, using rate-distortion (RD) optimization to select the best coding unit (CU), prediction unit (PU), and transform unit (TU) partitions together with their associated coding modes. In the top-down pass of the DFS, the proposed algorithm evaluates the RD cost of the current CU only for its square-type PUs. In the bottom-up pass, if the current CU with its square-type PU has a lower RD cost than its sub-level CU partitions, the square-type partition of the current CU, along with its coding mode, is selected as the best partition. Otherwise, the non-square-type PUs of the current CU level are evaluated. If the sub-level partition is better than the CU with the non-square PUs, the sub-level partition is selected as the optimum; otherwise, the best non-square PU is selected as the best PU for the current level. Experimental results demonstrate that the proposed square-type-first inter-PU search reduces the average encoding time by 66.7 % with a 1–2 % BD loss relative to the HM reference software. In addition, the proposed algorithm yields an additional average time saving of 26.8–35.5 % over the three fast encoding algorithms adopted in HM.
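
The decision order of the square-type-first search described above can be illustrated with a minimal Python sketch. It is not the HM implementation: the function name `search_cu`, the cost callbacks `square_cost` and `nonsquare_cost`, and the decision tuples are hypothetical names introduced only for illustration. The sketch also assumes that the RD cost of a split is the plain sum of the four sub-CU costs (no split-flag bits), and that a CU of minimum size, which has no sub-level partition, simply compares its square and non-square PUs; the abstract does not specify either point.

```python
from typing import Callable, Tuple

# Hypothetical cost oracle: (x, y, size) -> (rd_cost, mode_label).
CostFn = Callable[[int, int, int], Tuple[float, str]]


def search_cu(x: int, y: int, size: int, min_size: int,
              square_cost: CostFn, nonsquare_cost: CostFn):
    """Return (rd_cost, decision) for the CU at (x, y) of the given size."""
    # Top-down pass: only square-type PUs are evaluated for the current CU.
    sq_cost, sq_mode = square_cost(x, y, size)

    if size <= min_size:
        # Assumption (not stated in the abstract): with no sub-level CUs,
        # the square and non-square PUs are compared directly.
        ns_cost, ns_mode = nonsquare_cost(x, y, size)
        if sq_cost <= ns_cost:
            return sq_cost, ("square", size, sq_mode)
        return ns_cost, ("nonsquare", size, ns_mode)

    # Recurse depth-first into the four sub-CUs of the quad-tree.
    half = size // 2
    sub_cost, sub_decisions = 0.0, []
    for dy in (0, half):
        for dx in (0, half):
            cost, decision = search_cu(x + dx, y + dy, half, min_size,
                                       square_cost, nonsquare_cost)
            sub_cost += cost  # assumption: split cost = sum of sub-CU costs
            sub_decisions.append(decision)

    # Bottom-up pass: if the square PU beats the split, stop here and
    # skip the non-square PUs of this CU entirely.
    if sq_cost < sub_cost:
        return sq_cost, ("square", size, sq_mode)

    # Otherwise evaluate the non-square PUs of the current CU level.
    ns_cost, ns_mode = nonsquare_cost(x, y, size)
    if sub_cost < ns_cost:
        return sub_cost, ("split", size, sub_decisions)
    return ns_cost, ("nonsquare", size, ns_mode)


if __name__ == "__main__":
    # Toy costs that depend only on the CU size (purely illustrative).
    toy_square = lambda x, y, s: ({16: 100.0, 8: 20.0}[s], "2Nx2N")
    toy_nonsquare = lambda x, y, s: ({16: 70.0, 8: 30.0}[s], "2NxN")
    print(search_cu(0, 0, 16, 8, toy_square, toy_nonsquare))
```

In this sketch the saving comes from the early return in the bottom-up pass: whenever the square-type PU already beats the split, `nonsquare_cost` is never called for that CU.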