Abstract

A novel, efficient bus architecture is presented together with an application. The bus architecture belongs to the slotted-ring class. A 32-bit data bus, a 14-bit address bus, and signalling lines span a maximum of sixteen processors configured in a ring. The bus information arriving at each processing element (PE) can be passed on without change, captured by the PE, and/or overwritten by the PE. The delay through each PE is 30 ns when using 1989 IC technology. With newer IC technology, and owing to the unique physical arrangement of the bus, the delay can be reduced to approximately 15 ns. Through the use of time-slot arrangements and/or signalling lines, data can reach any of the other processors in the system. Logically, each processor sees the memory of the others as part of a global write-only memory. The unique processor-internal hardware synchronization mechanism reduces the synchronization overhead. This paper presents implementation details of the hardware as well as an application of the test-bed multiprocessor to the iterative solution of dense linear equations.
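
The pass/capture/overwrite behaviour described above can be illustrated with a small software model. This is a minimal sketch, not the hardware design from the paper. The split of the 14-bit address into a 4-bit PE number and a 10-bit offset, the single `full` signalling bit, the rule that the destination PE frees a slot when it captures it, and all names (`Slot`, `PE`, `step`, `transit_ns`) are assumptions made for illustration only. The figures of sixteen PEs and 30 ns per PE come from the abstract.

```python
from dataclasses import dataclass, field

MAX_PES = 16        # abstract: at most sixteen processors on the ring
PE_DELAY_NS = 30    # abstract: delay through each PE with 1989 IC technology
OFFSET_BITS = 10    # assumption: 14-bit address = 4-bit PE number + 10-bit offset


@dataclass
class Slot:
    address: int = 0    # 14-bit address bus
    data: int = 0       # 32-bit data bus
    full: bool = False  # assumption: one signalling line marks a valid slot


@dataclass
class PE:
    pe_id: int
    memory: dict = field(default_factory=dict)   # offset -> captured word
    outbox: list = field(default_factory=list)   # pending (address, data) writes

    def write(self, dest_pe: int, offset: int, data: int) -> None:
        """Queue a write into another PE's window of the global memory."""
        address = (dest_pe << OFFSET_BITS) | offset
        self.outbox.append((address, data & 0xFFFFFFFF))

    def handle(self, slot: Slot) -> Slot:
        """Pass, capture, and/or overwrite the slot arriving at this PE."""
        if slot.full and (slot.address >> OFFSET_BITS) == self.pe_id:
            # Capture: the write lands in local memory and the slot is freed.
            self.memory[slot.address & ((1 << OFFSET_BITS) - 1)] = slot.data
            slot = Slot()
        if not slot.full and self.outbox:
            # Overwrite: reuse the empty slot for a pending write.
            address, data = self.outbox.pop(0)
            slot = Slot(address, data, True)
        return slot  # otherwise the slot is passed on without change


def step(pes, slots):
    """Advance every slot one hop around the ring and let each PE handle it."""
    arrived = [slots[-1]] + slots[:-1]
    return [pe.handle(slot) for pe, slot in zip(pes, arrived)]


def transit_ns(hops: int, per_pe_ns: int = PE_DELAY_NS) -> int:
    """Rough transit time, charging one PE delay per hop."""
    return hops * per_pe_ns


# Usage: PE 0 writes one word into PE 2's memory on a four-PE ring.
pes = [PE(i) for i in range(4)]
slots = [Slot() for _ in pes]
pes[0].write(dest_pe=2, offset=5, data=0xDEADBEEF)
for _ in range(3):
    slots = step(pes, slots)
```

In this model the word is injected by PE 0 on the first step, passed unchanged by PE 1 on the second, and captured by PE 2 on the third. Note that no PE ever reads remote memory, which is one way to picture the write-only global memory view. Charging one PE delay per hop, a transfer to the farthest PE on a sixteen-PE ring (15 hops) would take about 450 ns at 30 ns per PE, or about 225 ns at 15 ns; these are back-of-the-envelope bounds under the stated assumptions, not measurements from the paper.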
