Trident: a scalable architecture for scalar, vector, and matrix operations

Mostafa I Soliman, Stanislav G Sedukhin

doi:10.1145/563952.563944

Abstract

Within a few years it will be possible to integrate a billion transistors on a single chip. At this integration level, we propose using a high-level ISA to express parallelism to the hardware instead of using a huge transistor budget to dynamically extract it. Since the fundamental data structures for a wide variety of applications are scalar, vector, and matrix, our proposed Trident processor extends the classical vector ISA with matrix operations. The Trident processor consists of a set of parallel vector pipelines (PVPs) combined with a fast in-order scalar core. The PVPs can access both vector and matrix register files to perform vector, matrix, and matrix-vector operations. One key point of our design is the exploitation of up to three levels of data parallelism. Another key point is the ring register files for storing vector and matrix data. The ring structure of the register files reduces the number and size of the address decoders, the number of ports, the area overhead caused by the address bus, and the number of registers attached to bit lines, and it also provides local communication between PVPs. Scaling the Trident processor does not require more fetch, decode, or issue bandwidth; it requires only replicating PVPs and increasing the register file size. Scientific, engineering, multimedia, and many other applications that are based on a mixture of scalar, vector, and matrix operations can be sped up on the Trident processor.
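
The abstract does not give implementation details for the ring register files, so the following is only a toy software sketch of the general idea, under stated assumptions: a register is modeled as a ring whose first few cells are wired to the pipelines, and the ring rotates so that each group of elements reaches the pipelines in turn, with no per-element address decoding. The names `RingRegister` and `vector_add`, the `lanes` parameter (standing in for the number of PVPs), and the rotate-by-lane-count policy are all hypothetical illustrations and are not taken from the paper.

```python
from collections import deque


class RingRegister:
    """Toy model of a ring-structured vector register (hypothetical)."""

    def __init__(self, values, lanes):
        if lanes <= 0 or len(values) % lanes != 0:
            raise ValueError("register length must be a multiple of the lane count")
        self.cells = deque(values)
        self.lanes = lanes

    def head(self):
        # Assumption: only the first `lanes` cells are connected to pipelines.
        return [self.cells[i] for i in range(self.lanes)]

    def write_head(self, values):
        # Results are written back into the cells currently at the head.
        for i, v in enumerate(values):
            self.cells[i] = v

    def rotate(self):
        # Shift the ring so the next group of elements reaches the lanes.
        self.cells.rotate(-self.lanes)


def vector_add(a, b, lanes):
    """Add two equal-length vectors, `lanes` elements per step."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    ra, rb = RingRegister(a, lanes), RingRegister(b, lanes)
    rd = RingRegister([0] * len(a), lanes)
    steps = len(a) // lanes
    for _ in range(steps):
        rd.write_head([x + y for x, y in zip(ra.head(), rb.head())])
        for reg in (ra, rb, rd):
            reg.rotate()
    # After a full revolution every ring is back in its original order.
    return list(rd.cells), steps
```

In this model, doubling `lanes` halves the number of steps for the same vector length without changing how the operation is issued, which is one way to picture the scaling claim in the abstract; the real hardware organization may differ.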

