Xputer Laboratory at Kaiserslautern University 
XC6k Example Applications: 
Generic 3x3 Filter 



 
A filter for image transformation is an operator that assigns a new value p0new to a pixel p0, depending on the value of p0 and the values of N of its neighbor pixels.

If the calculation of the new pixel value does not depend on the pixel's position, the filter is called homogeneous. The neighbor pixels are selected by moving a window over the complete image, from left to right and from top to bottom. All pixels covered by this window are used. In the case of a 3x3 filter, an array of 3 by 3 pixels is chosen with p0 as the upper left corner of the array (see figure). A linear filter operation is a linear function of all elements in the defined array.

 

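In the case of a generic 3x3 linear filter, p0new is calculated as the sum of the nine pixels pi in the window, each multiplied by a constant coefficient ki, divided by a divisor j.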

Applying this operator on an m*n-pixel image results in an image of size (m-2)*(n-2).
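The window scheme described above can be sketched in Python as a small reference model. This is only an illustration, not the VHDL or C code of the design: the function name `filter3x3`, the row-major ordering of the coefficients k0 to k8 over the window, and the use of floor division are assumptions.

```python
def filter3x3(image, k, j):
    """Apply a generic 3x3 linear filter to a greyscale image.

    image: list of m rows, each a list of n pixel values
    k:     nine coefficients k0..k8 (assumed row-major over the window)
    j:     divisor
    Returns an image with (m-2) rows and (n-2) columns.
    """
    m, n = len(image), len(image[0])
    out = []
    for r in range(m - 2):
        row = []
        for c in range(n - 2):
            # p0 is the upper left pixel of the window, as in the text.
            acc = 0
            for i in range(9):
                acc += k[i] * image[r + i // 3][c + i % 3]
            row.append(acc // j)  # floor division is an assumption
        out.append(row)
    return out


# Hypothetical 4x5 test image and an all-ones kernel with divisor 8.
img = [[8 * (r + c) for c in range(5)] for r in range(4)]
smoothed = filter3x3(img, [1] * 9, 8)
print(len(smoothed), len(smoothed[0]))  # 2 3
```

A kernel of `[1, 0, 0, 0, 0, 0, 0, 0, 0]` with divisor 1 simply copies the upper left pixel of each window, which is a quick way to check the indexing.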
Some examples illustrate the effects that can be obtained by such linear filter operations (see fig). All pixel values are assumed to be 8-bit greyscale values (0 to 255). The following example images and filter operations are taken from [Kop97].

In the current implementation of the filter, the coefficients ki are signed integers in the range from -16 to 15 (5 bit). The divisor j is a power of two in the range from 2^0 to 2^8.
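To make these ranges concrete, the following Python sketch checks a parameter set against them and works out how large the weighted sum can become for 8-bit pixels. The helper names are hypothetical, and the bit width is plain arithmetic on the stated ranges, not a statement about the datapath width actually used in the design.

```python
K_MIN, K_MAX = -16, 15       # signed 5-bit coefficient range stated in the text
SHIFT_MIN, SHIFT_MAX = 0, 8  # divisor j = 2**shift, i.e. 2^0 to 2^8


def check_parameters(k, shift):
    """Return True if nine coefficients and the shift fit the stated ranges."""
    return (
        len(k) == 9
        and all(K_MIN <= ki <= K_MAX for ki in k)
        and SHIFT_MIN <= shift <= SHIFT_MAX
    )


def accumulator_range(k, pixel_max=255):
    """Smallest and largest weighted sum for pixels in 0..pixel_max."""
    lowest = sum(ki * pixel_max for ki in k if ki < 0)
    highest = sum(ki * pixel_max for ki in k if ki > 0)
    return lowest, highest


def signed_bits(lowest, highest):
    """Bits of a two's-complement word that can hold lowest..highest."""
    neg = (-lowest - 1).bit_length() if lowest < 0 else 0
    return max(neg, highest.bit_length()) + 1


worst = accumulator_range([K_MIN] * 9)
print(worst)                # (-36720, 0)
print(signed_bits(*worst))  # 17
```

With all nine coefficients at -16 and all pixels at 255, the sum reaches -36720, so a signed accumulator would need at least 17 bits to cover every permitted kernel.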


Because the divisor is a power of two, the division can be done by a shift operation. This leads to the dataflow graph (see figure).
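The shift-based division can be illustrated in Python, where `>>` on integers is an arithmetic right shift. This is a sketch under assumptions: the source does not say how the hardware rounds negative sums or what it does with results outside 0 to 255, so the clamping helper below is hypothetical.

```python
def divide_by_shift(acc, shift):
    """Divide acc by 2**shift with an arithmetic right shift."""
    return acc >> shift


def clamp_to_pixel(value, lo=0, hi=255):
    """Hypothetical saturation step; the source does not specify overflow handling."""
    return max(lo, min(hi, value))


print(divide_by_shift(144, 3))   # 18, same as 144 // 8
print(divide_by_shift(-5, 1))    # -3: the shift rounds toward minus infinity
print(clamp_to_pixel(divide_by_shift(1020, 1)))  # 255
```

For non-negative sums the shift gives the same result as integer division by 2**shift. For negative sums it rounds down rather than toward zero, which matters for kernels with negative coefficients.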

 
 
The implementation illustrated below uses 2373 logic cells (about 58% of the XC6216 FPGA) and operates at a maximum clock rate of 25 MHz. The design transforms a 350x350 pixel image in about 10.5 ms, which corresponds to a throughput of about 11.67 MPixel/s. Higher device utilization could be achieved by increasing the precision of the constant coefficients. It may be possible to lay out this design with 6-bit or 7-bit constants, leading to a device utilization of 60% to 70%, but the maximum clock rate would drop below 20 MHz.
 
 
 
 
Performance Results

                   XC6216 Implementation           Pentium PC Implementation               Speed-up
Input Image        350x350 8-bit greyscale         350x350 8-bit greyscale
System             2373 logic cells (58%)          64 MB RAM, WinNT 4.0
Compiler           Own design flow, see [HHG98]    C program, Microsoft Visual Studio 97
Clock Speed        25 MHz                          100 MHz
Computation Time   10.5 ms                         280 ms                                  26.7
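The reported figures can be cross-checked with a few lines of Python. The input values are taken from the text and the table; the cycles-per-pixel figure is derived here and is not stated in the source. Throughput is computed over the 350x350 input pixels, which is an assumption about how the quoted MPixel/s value was obtained.

```python
width = height = 350     # input image size
time_fpga_s = 10.5e-3    # XC6216 computation time
time_pc_s = 280e-3       # Pentium PC computation time
clock_hz = 25e6          # XC6216 clock rate

pixels = width * height
throughput = pixels / time_fpga_s
cycles_per_pixel = clock_hz * time_fpga_s / pixels
speedup = time_pc_s / time_fpga_s

print(pixels)                               # 122500
print(f"{throughput / 1e6:.2f} MPixel/s")   # 11.67 MPixel/s
print(f"{cycles_per_pixel:.2f} cycles/pixel")  # 2.14 cycles/pixel
print(f"{speedup:.1f}x")                    # 26.7x
```

The result agrees with the quoted throughput up to rounding and reproduces the speed-up of 26.7.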
 

 

References

[Kop97]
Herbert Kopp: Bildverarbeitung interaktiv, Eine Einführung mit multimedialem Lernsystem auf CD-ROM; Teubner Verlag, Stuttgart, Germany, 1997
[Gil98]
Frank Gilbert: Development of a Design Flow and Implementation of Example Designs for the Xilinx XC6200 FPGA Series; Diploma Thesis, University of Kaiserslautern, Kaiserslautern, 1998.
[HHG98]
R. Hartenstein, M. Herz, F. Gilbert: Designing for Xilinx XC6200 FPGAs; to be published in the Proceedings of FPL98, Tallinn, Estonia, Aug. 31 - Sept. 3, 1998.

You can find more about this design in the diploma thesis of Frank Gilbert [Gil98] on the Download Page.
The VHDL sources and the C program can also be accessed there.


Computer Structures Group, University of Kaiserslautern, Department of Computer Science