Hash function algorithms are widely used to provide security services of integrity and authentication, being SHA-2 the latest set of hash algorithms standardized by the US Federal Government. The main computation block in SHA-2 algorithms is governed by a loop with high data dependence for which several implementation strategies are explored in this work as well as designs efficiently mapped to hardware architectures. Four new different hardware architectures are proposed to improve the performance of SHA-256 algorithms, reducing the critical path by reordering some operations required at each iteration of the algorithm and computing some values in advance, as possible as data dependence allows. The proposed designs were implemented and validated in the FPGA Virtex-2 XC2VP-7. The achieved results show a significant improvement on the performance of the SHA-256 algorithm compared to similar previously proposed approaches, obtaining a throughput of 909Mbps and an improved efficiency of 0.713Mbps/slice.