Massive multiple-input multiple-output (M-MIMO) is a key pillar of fifth-generation (5G) networks, in which a large number of antennas is deployed. It offers substantial gains to modern communication systems in data rate, spectral efficiency, number of users served simultaneously, energy efficiency, and quality of service (QoS). However, it requires advanced signal processing for data detection. The growing MIMO size leads to complicated scenarios, which makes detector design a difficult problem. The problem becomes even harder when high-order modulation schemes are used and more users are multiplexed. It is therefore impractical to employ the maximum likelihood (ML) detector despite its excellent performance. Linear detectors are relatively simple alternatives. Unfortunately, they still require an exact matrix inversion, which incurs significantly high complexity. Several iterative methods are therefore used to approximate or avoid the matrix inversion rather than compute it exactly. This paper studies the pros and cons of iterative matrix inversion methods, using the number of computations and the bit error rate (BER) to compare them. The comparison is conducted across several scenarios: different ratios between the number of base station (BS) antennas and user terminal (UT) antennas (β), different numbers of iterations (n), and different values of the relaxation parameter (ω). This paper also studies the impact of ω on the performance of the Richardson (RI) and successive over-relaxation (SOR) methods. Numerical results show that the conjugate gradient (CG) and optimized coordinate descent (OCD) methods exhibit the lowest complexity with acceptable performance. In addition, the Gauss-Seidel (GS) method outperforms all other detectors with a trivial increase in complexity. It is also observed that performance does not improve with every iteration.
It is also shown that ω plays a significant role in achieving satisfactory performance in both RI-based and SOR-based detectors. From an implementation point of view, detectors based on the RI, OCD, and CG methods achieved the highest hardware efficiency (HE), while the Jacobi (JA) based detector obtained the lowest HE. Recent research advances in detection methods are also presented as open research directions, together with the potential impact of linear detection methods in initialization and pre-processing.
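To illustrate how an iterative method can avoid the exact matrix inversion of a linear detector, the following is a minimal sketch of GS iterations applied to the regularized system A x = b, with A = H^H H + σ²I and b = H^H y, which is the usual MMSE formulation. The function names (`mmse_system`, `gauss_seidel`), the diagonal initial guess, and the nested-list data layout are assumptions made for illustration only; they are not taken from the paper.

```python
def mmse_system(H, y, noise_var):
    """Build A = H^H H + noise_var * I and b = H^H y from nested lists.

    H is a list of rows (BS antennas) by columns (UT antennas);
    entries may be float or complex. Names are illustrative only.
    """
    n_rx, n_tx = len(H), len(H[0])
    A = [[sum(H[r][i].conjugate() * H[r][j] for r in range(n_rx))
          + (noise_var if i == j else 0.0)
          for j in range(n_tx)] for i in range(n_tx)]
    b = [sum(H[r][i].conjugate() * y[r] for r in range(n_rx))
         for i in range(n_tx)]
    return A, b


def gauss_seidel(A, b, n_iter):
    """Approximate the solution of A x = b without forming the inverse of A."""
    k = len(b)
    # Diagonal initial guess (an assumption; other initializations are possible).
    x = [b[i] / A[i][i] for i in range(k)]
    for _ in range(n_iter):
        for i in range(k):
            # x[j] for j < i is already updated in this sweep; j > i is from the previous one.
            s = sum(A[i][j] * x[j] for j in range(k) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x


# Usage: 4 BS antennas, 2 UT antennas, noiseless toy channel.
H = [[1.0, 0.2], [0.1, 1.0], [0.3, -0.2], [-0.1, 0.4]]
s = [1.0, -1.0]
y = [sum(H[r][c] * s[c] for c in range(2)) for r in range(4)]
A, b = mmse_system(H, y, noise_var=0.0)
estimate = gauss_seidel(A, b, n_iter=10)
```

Each sweep costs on the order of K² multiplications for K users, compared with the roughly K³ cost of an exact inversion. In a full detector the estimate would still be mapped to the nearest constellation point, a step this sketch leaves out.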