FQA about Xputers:
Frequently questioned Answers about Xputers
In the late eighties we had problems
in publishing about Xputers. Although many submissions have been
in 1989 we succeeded in publishing our first speed-up data - our
to questions about the usefulness of the Xputer paradigm. These answers
sometimes have been questioned. Some people did not believe in our
In almost ten years we have implemented three generations of Xputer architectures: the MoM-1, the MoM-2, and the MoM-3 (see also the section about sequencers). MoM stands for "Map-oriented Machine", and the MoM-1 is our first-generation Xputer architecture. Our first performance figures, obtained on the MoM-1, were published in 1989 and show the speed-up of a MoM-1 Xputer over a Motorola 68020 (see following table).
By the way, the complexity of this grid-based design rule check grows linearly with chip area. This is substantially better than conventional design rule check algorithms, which need complex divide-and-conquer strategies to reach even halfway reasonable computation speed. We have also investigated how the number of sequencers used simultaneously influences performance (see table no. 2). In a 10 by 10 matrix-vector multiplication, using two sequencers brought a speed-up of more than an order of magnitude compared to the version with only a single sequencer.
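To illustrate why a grid-based check scales linearly with area, the following minimal sketch slides a fixed-size window over a small bitmap and counts the window positions it visits. The 2x2 window, the corner-contact rule, and the names `scan_layout` and `layout` are assumptions made for illustration only; they are not the design rules or the scan pattern actually used on the MoM-1.

```python
def scan_layout(grid):
    """Slide a 2x2 window over a 0/1 grid and apply one illustrative rule.

    Returns the list of violating window positions and the number of
    window positions visited.
    """
    rows, cols = len(grid), len(grid[0])
    violations = []
    windows = 0
    for r in range(rows - 1):
        for c in range(cols - 1):
            windows += 1
            a, b = grid[r][c], grid[r][c + 1]
            d, e = grid[r + 1][c], grid[r + 1][c + 1]
            # Hypothetical rule: two cells that touch only at a corner.
            diagonal = a and e and not b and not d
            anti_diagonal = b and d and not a and not e
            if diagonal or anti_diagonal:
                violations.append((r, c))
    return violations, windows


# Hypothetical 4x4 layout bitmap (1 = material present).
layout = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
violations, windows = scan_layout(layout)
print(violations, windows)
```

Each window position is visited exactly once and costs a constant amount of work, so the number of rule evaluations is (rows - 1) * (cols - 1), which is proportional to the area of the grid.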
Most of the acceleration is achieved by computing 800 boolean equations in parallel within a single clock cycle. The optimized 68020 version of the algorithm skips certain boolean equation computations depending on the results of other boolean equations. Such an optimization does not make sense for the MoM version. That is why a comparison with the non-optimized 68020 version is fairer; there the acceleration factor is greater than fifteen thousand (see table above). Another source of speed-up is that the Xputer's data sequencer avoids address computation overhead. In the design rule check example this contributes more than an order of magnitude (see section on sequencers).
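The difference between the optimized and the non-optimized sequential versions can be sketched as follows. This is a minimal model, not the 68020 code: the grouping of equations under a "guard" equation, the function names, and the four-bit example window are all assumptions chosen for illustration. The sketch counts how many equations a sequential processor has to evaluate with and without skipping.

```python
def evaluate_all(groups, window):
    """Evaluate every equation, as a non-optimized sequential version would."""
    count = 0
    hits = []
    for guard, dependents in groups:
        guard_value = guard(window)
        count += 1
        for equation in dependents:
            value = equation(window)  # always evaluated, even if guard is False
            count += 1
            hits.append(guard_value and value)
    return hits, count


def evaluate_with_skips(groups, window):
    """Skip the dependent equations whenever their guard is False."""
    count = 0
    hits = []
    for guard, dependents in groups:
        count += 1
        if not guard(window):
            hits.extend([False] * len(dependents))
            continue
        for equation in dependents:
            count += 1
            hits.append(equation(window))
    return hits, count


# Hypothetical window contents and equation groups.
window = (1, 0, 0, 1)
groups = [
    (lambda w: w[0] == 1, [lambda w: w[1] == 0, lambda w: w[3] == 1]),
    (lambda w: w[2] == 1, [lambda w: w[0] == 1, lambda w: w[1] == 1]),
]

hits_all, count_all = evaluate_all(groups, window)
hits_skip, count_skip = evaluate_with_skips(groups, window)
print(count_all, count_skip)
```

Both functions produce the same results, but the skipping version evaluates fewer equations. On a sequential processor that saves time; on hardware that evaluates all equations in the same clock cycle, skipping saves nothing, which is the reason given above for comparing against the non-optimized version.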
A couple of years later we experimented with two versions of the MoM-3 architecture (the MoM-3 and the faster MoM-3NT, where NT stands for "newer technology"). Table 3 shows the results for the core of the JPEG algorithm (a multimedia application). Here, too, performance can be improved by using more sequencers.
The Ising model (chemistry, molecular biology)
is used for the analysis of phase transitions. It explains how short-range
interactions between the components of a large structure (e.g. molecules in
a crystal) give rise to long-range, correlated behaviour, and so helps to
predict the potential for a phase transition.
Using an Xputer as an accelerator co-processor means (1) software-to-hardware migration: the most frequently used loops and their expression bodies are moved onto the rALU. But that is not all. Only Xputers offer two further sources of speed-up: (2) almost complete avoidance of addressing overhead by migrating address computation into the sequencer(s) (not into the rALU!), and (3) run-time to compile-time migration, which is not possible on parallel von Neumann platforms. Parallelism in Xputers does not suffer from the fine-granularity switching explosion seen in parallel (von Neumann) computers, since most of the communication in Xputer systems is defined by the compilation method. Table 5 shows the run-time addressing overhead for von Neumann implementations of several algorithms. For example, the grid-based design rule check has a high addressing overhead: it spends 92% of its computation time on addressing. Migrating this addressing into a data sequencer on an Xputer yields a speed-up of more than ten.
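The relation between the 92% addressing overhead and a speed-up of more than ten can be checked with a short calculation. The sketch below assumes that the remaining computation time is unchanged and that only the addressing share is removed; the function name and the `removed_share` parameter are introduced here for illustration and do not come from the original measurements.

```python
def speedup_from_removed_overhead(overhead_fraction, removed_share=1.0):
    """Speed-up when `removed_share` of the overhead time disappears.

    overhead_fraction: share of total run time spent on the overhead (0..1)
    removed_share:     share of that overhead which is eliminated (0..1)
    """
    remaining = 1.0 - overhead_fraction * removed_share
    if remaining <= 0.0:
        raise ValueError("no run time would remain; check the inputs")
    return 1.0 / remaining


# 92% addressing overhead, removed completely: upper bound on this effect.
full = speedup_from_removed_overhead(0.92)

# Assumed case of "almost complete" avoidance: 98% of the overhead removed.
almost = speedup_from_removed_overhead(0.92, removed_share=0.98)

print(round(full, 2), round(almost, 2))
```

With the full 92% removed, only 8% of the original run time remains, so the bound is 1 / 0.08, about 12.5. Even if a small part of the addressing time were left over, as in the assumed 98% case, the factor stays above ten, which is consistent with the figure quoted above.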
 R.Hartenstein, A. Hirschbiel, M.Weber: Rekonfigurierbare ALU erlaubt Parallelisierung auf unterster Ebene; VMEbus, 1990
 R. Hartenstein, A. Hirschbiel, M.Weber: Xputers - An Open Family of Non von Neumann Architectures; Proc. of 11th ITG/GI-Conf. Architektur von Rechensystemen, VDE-Verlag, 1990
 R. Hartenstein, A. Hirschbiel, M. Weber: The Machine Paradigm of Xputers and its Application to Digital Signal Processing Acceleration; 1990 Int'l Conference on Parallel Processing, St. Charles, Illinois, 1990
 R. Hartenstein, A. Hirschbiel, M. Riedmüller, K. Schmidt, M. Weber: Automatic Synthesis of Cheap Hardware Accelerators for Signal Processing and Image Preprocessing; 12. DAGM-Symposium Mustererkennung (Pattern Recognition), Oberkochen-Aalen, Germany 1990
---- Best Paper and Best Presentation Award: DM 1000.-- Speaker: M. Weber ----
 R. Hartenstein, A. Hirschbiel, M. Weber: A Novel Paradigm of Parallel Computation and its Use to Implement Simple High Performance Hardware; InfoJapan'90 - Int'l Conf. commemorating 30th Anniversary Computer Society of Japan, Tokyo, Japan, 1990
[6b] R. Hartenstein, A. Hirschbiel, M. Weber: A Novel Paradigm of Parallel Computation and its Use to Implement Simple High Performance Hardware; Future Generation Computer Systems, no. 7, pp. 181-198 (North-Holland Publishing Co., 1991/92)
 R. Hartenstein, R. Kress, H. Reinig: A Dynamically Reconfigurable Wavefront Array Architecture for Evaluation of Expressions; Proc. ASAP'94 , Int'l Conf. on Application-Specific Array Processors, San Francisco, Aug. 1994, IEEE CS Press 1994
 R. Hartenstein, J. Becker, R. Kress, H. Reinig, K. Schmidt: A Reconfigurable Machine for Applications in Image and Video Compression; Int'l Conf. on Compression Technologies and Standards for Image and Video Compression, Amsterdam, Holland, March 1995
 Reiner W. Hartenstein, Rainer Kress, Helmut Reinig: A Reconfigurable Accelerator for 32-Bit Arithmetic; International Parallel Processing Symposium, Santa Barbara, USA, April 1995
 J. Becker, R. Hartenstein, R. Kress, H. Reinig: High-Performance Computing Using a Reconfigurable Accelerator; Proceedings of Workshop on High Performance Computing, Montreal, Canada, July 1995
 R. Hartenstein, R. Kress: A Scalable, Parallel, and Reconfigurable Datapath Architecture; Sixth International Symposium on IC Technology, Systems & Applications, ISIC'95, Singapore, Sept. 6-8, 1995
 R. Hartenstein (opening keynote): Custom Computing Machines - An Overview; Int'l Workshop on Design Methodologies for Microelectronics, Smolenice Castle, Slovakia, September 1995
 J. Becker, R. Hartenstein, R. Kress, H. Reinig: A Reconfigurable Parallel Architecture to Accelerate Scientific Computation; Proc. of Int. Conf. on High Performance Computing, New Delhi, India, Dec. 1995
 R. Hartenstein, J. Becker, R.Kress, H. Reinig: High-Performance Computing Using a Reconfigurable Accelerator; CPE Journal, Special Issue of Concurrency: Practice and Experience, John Wiley & Sons Ltd., 1996