
Parallel Digital Signal Processing With the TMS320C40

This paper examines parallel processing using the Texas Instruments TMS320C40 floating-point processor. It demonstrates popular parallel architecture topologies such as hypercube, mesh, ring, and pyramid with the 'C40 and discusses the tradeoffs and performance of these 'C40-based architectures. This paper is divided into the following sections:

Overview: Explains why the 'C40 architecture is ideal for parallel processing and describes a VME-based 'C40 board.
Parallel Processing Topologies: Describes different parallel architectures such as hypercube, pyramid, mesh, ring, and tree. Also discusses the design of massively parallel systems using the principle of reconfigurability.
TMS320C40-Based AT/ISA and VME Architecture: Discusses the architecture of VME and AT/ISA TMS320C40-based boards that are expandable from two nodes, capable of 100 MFLOPS (million floating-point operations per second), to hundreds of nodes.
System-Level Design Issues With the TMS320C40 Communication Ports: Discusses the 'C40 node reset and offers design suggestions.
Matrix Multiplication Application of Parallel Processing: Explains matrix multiplication.
Parallel Processing Architecture Topologies: Describes the hypercube and its properties, as well as mesh topologies.
Reconfigurable Massively Parallel Processors: Discusses software approaches and links.
Benchmarks, Analysis, and Data Decomposition: Explains benchmarking and evaluation of parallel processing systems, as well as algorithm efficiency and data decomposition strategies.
Conclusion
Definitions
References


