Parallel Digital Signal Processing With the TMS320C40
This paper examines parallel processing using the Texas Instruments TMS320C40 floating-point processor. It demonstrates popular parallel architecture topologies such as hypercube, mesh, ring, and pyramid with the 'C40 and discusses the tradeoffs and performance of these 'C40-based architectures. This paper is divided into the following sections:
Overview
Explains why the 'C40 architecture is ideal for parallel processing and describes a VME-based 'C40 board.
Parallel Processing Topologies
Describes different parallel architectures such as hypercube, pyramid, mesh, ring, and tree. Also discusses the design of massively parallel systems using the principle of reconfigurability.
TMS320C40-Based AT/ISA and VME Architecture
Discusses the architecture of VME and AT/ISA TMS320C40-based boards that are expandable from two nodes capable of 100 MFLOPS (million floating-point operations per second) to hundreds of nodes.
System-Level Design Issues With the TMS320C40 Communication Ports
Discusses the 'C40 node reset and offers design suggestions.
Matrix Multiplication Application of Parallel Processing
Explains matrix multiplication.
Parallel Processing Architecture Topologies
Describes the hypercube topology and its properties, as well as mesh topologies.
Reconfigurable Massively Parallel Processors
Discusses software approaches and links.
Benchmarks, Analysis, and Data Decomposition
Explains benchmarking and evaluation of parallel processing systems, as well as algorithm efficiency and data decomposition strategies.
Conclusion
Definitions
References