Bit-parallel multiplication schemes exploit concurrency, which makes their execution faster. After an initial latency of $m$ clock cycles, such a scheme produces an output in every clock cycle. In this paper, a polynomial basis systolic multiplier over GF($2^m$) is designed specifically for odd $m$, for irreducible polynomials of the form $x^m + t_{m-1}x^{m-1} + \cdots + t_2x^2 + t_1x + t_0$. The multiplier takes $m$ clock cycles and is then used to design low-complexity bit-parallel architectures for general irreducible polynomials and for trinomials ($x^m + x^k + 1$). The area complexity of the proposed multiplier for general irreducible polynomials matches that of the best existing multiplier, while its time complexity is 17% lower owing to a considerable decrease in critical path delay. To the best of our knowledge, the area-delay product of the proposed multipliers is lower than that of the best multipliers available in the literature. FPGA implementation results also show that, for $m = 233$, the proposed multiplier has 59% less space complexity and 47% less time complexity than the best-reported multiplier.
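For readers unfamiliar with the underlying arithmetic, the following is a minimal software sketch of polynomial basis multiplication in GF($2^m$). It is a bit-serial reference model only, not the systolic or bit-parallel architecture proposed here. The names `gf2m_mul` and `trinomial` are hypothetical, and field elements are assumed to be encoded as Python integers whose bit $i$ holds the coefficient of $x^i$. The loop runs for $m$ iterations, each performing one multiply-by-$x$, one conditional reduction, and one conditional accumulation.

```python
def trinomial(m: int, k: int) -> int:
    """Bit mask of x^m + x^k + 1 (hypothetical helper; the caller must
    ensure the chosen (m, k) pair actually gives an irreducible trinomial)."""
    return (1 << m) | (1 << k) | 1


def gf2m_mul(a: int, b: int, m: int, poly: int) -> int:
    """Multiply a and b in GF(2^m) using the polynomial basis.

    a, b : field elements, bit i = coefficient of x^i, both below 2^m
    poly : irreducible polynomial of degree m, including the x^m term
    """
    if a >> m or b >> m:
        raise ValueError("operands must have degree less than m")
    result = 0
    # Scan the bits of b from most significant to least significant.
    for i in range(m - 1, -1, -1):
        result <<= 1                  # multiply the partial result by x
        if (result >> m) & 1:         # degree reached m: reduce modulo poly
            result ^= poly
        if (b >> i) & 1:              # add (XOR) a when coefficient b_i is 1
            result ^= a
    return result


# Small odd-m example: GF(2^3) with x^3 + x + 1.
# x * x^2 = x^3, which reduces to x + 1.
product = gf2m_mul(0b010, 0b100, 3, trinomial(3, 1))
print(bin(product))  # 0b11
```

The same function works for a general irreducible polynomial by passing its full bit mask as `poly`; only the reduction constant changes. In hardware, the per-iteration XOR and shift operations are what a systolic array distributes across processing elements, which this sequential sketch does not attempt to model.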