#### PAPER Special Issue on High-Performance Analog Integrated Circuits

# An Hadamard Transform Chip Using the PWM Circuit Technique and Its Application to Image Processing

Kousuke KATAYAMA<sup> $\dagger a$ </sup>, Student Member, Atsushi IWATA<sup> $\dagger \dagger$ </sup>, Takashi MORIE<sup> $\dagger \dagger$ </sup>, and Makoto NAGATA<sup> $\dagger \dagger$ </sup>, Regular Members

**SUMMARY** A circuit that carries out an Hadamard transform of an input image using the pulse width modulation technique is proposed. The proposed circuit architecture realizes the function of an Hadamard transform with a full-size pixel image. A test chip that we designed and fabricated integrates  $64 \times 64$  pixels in a  $4.9 \text{ mm} \times 4.9 \text{ mm}$  area, with  $0.35 \,\mu\text{m}$  CMOS technology. The functional operation and linearity of this chip are measured. An image processing application utilizing this chip is demonstrated.

**key words:** Hadamard transform, image sensor, pulse width modulation technique, charge sharing

# 1. Introduction

Among the many kinds of orthogonal transforms, the discrete cosine transform (DCT) has been applied to the compression of natural images. The Hadamard transform, on the other hand, is used for image coding because it does not require real-number multiplication [1]. Because transforming a large image is costly in power dissipation, chip area and processing time, and makes accuracy hard to maintain, the image data are usually divided into small sub-blocks (e.g. $16 \times 16$) and then transformed. However, dividing an image into sub-blocks restricts the processing that can be applied to it. For example, the cutoff frequency of a spatial low-pass filter is limited by the sub-block size.

In order to overcome these restrictions, we propose a novel Hadamard transform circuit that can transform large images in their entirety. To address power dissipation and chip area, we adopt a pulse width modulation (PWM) technique, because it lowers the signal transition frequency compared with a digital circuit and because PWM circuit elements are compact [2]–[4]. To address accuracy, we adopt a charge-sharing scheme, which prevents corruption of the circuit during calculation [5]. To reduce processing time, parallel calculation using the PWM circuit is effective [6]. A digital circuit, such as one implementing a fast Hadamard transform, also achieves speed and accuracy. However, it would require each pixel to be converted into a digital value by an analog-to-digital converter (ADC). In the circuit we propose, by contrast, the Hadamard transform function can be applied to an on-chip image sensor, because the circuit deals with analog values directly.

The artificial retina chip realizes orthogonal transforms to detect features, but it can perform only one-dimensional transforms [7].

In this paper, we propose a circuit that performs an entire Hadamard transform. The proposed circuit not only compresses images but detects shapes and features, and it would be applicable to intelligent machine vision systems.

# 2. Hadamard Transform Circuit

#### 2.1 Hadamard Transform and Inverse Hadamard Transform

The Hadamard transform is achieved by calculating correlation values of an image matrix and Hadamard bases.

One-dimensional (1-D) Hadamard bases are obtained recursively by Eq. (1).

$$H_N = H_{N/2} \otimes H_2, \qquad H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \tag{1}$$

where N is the base size and $\otimes$ denotes the Kronecker product. When N = 8,

Each column is a 1-D Hadamard base, and the columns are rearranged in order of frequency.
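For N = 8, the recursion of Eq. (1) and the frequency ordering can be sketched numerically as follows. This is only an illustration: it assumes that "frequency" means the number of sign changes along a base, and the function names (`hadamard`, `sign_changes`, `ordered_bases`) are hypothetical, not taken from the paper.

```python
import numpy as np

H2 = np.array([[1, 1], [1, -1]])

def hadamard(n):
    """Build H_N by the recursion of Eq. (1); n must be a power of two, n >= 2."""
    h = H2
    while h.shape[0] < n:
        h = np.kron(h, H2)  # H_N = H_{N/2} (Kronecker product) H_2
    return h

def sign_changes(v):
    """Number of sign changes along a +/-1 vector (assumed meaning of 'frequency')."""
    return int(np.sum(v[:-1] != v[1:]))

def ordered_bases(n):
    """Columns of H_N rearranged by increasing number of sign changes."""
    h = hadamard(n)
    order = sorted(range(n), key=lambda c: sign_changes(h[:, c]))
    return h[:, order]

W = ordered_bases(8)
print(W)
```

Each column of `W` then has exactly as many sign changes as its (0-based) column index, and the columns remain mutually orthogonal.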

Manuscript received January 20, 2002.

Manuscript revised March 12, 2002.

<sup>†</sup>The author is with the Graduate School of Engineering, Hiroshima University, Higashi-Hiroshima-shi, 739-8527 Japan.

<sup>††</sup>The authors are with the Graduate School of Advanced Sciences of Matter, Hiroshima University, Higashi-Hiroshima-shi, 739-8527 Japan.

a) E-mail: kata@dsl.hiroshima-u.ac.jp

The two-dimensional (2-D) Hadamard bases are constructed from the 1-D Hadamard bases:

$$W_{ij} = W_i^T W_j \tag{4}$$

Correlation values $c_{ij}$ between the Hadamard bases $W_{ij}$ and an image matrix $F$ are obtained as follows:

$$c_{ij} = \frac{W_{ij} \oplus F}{N^2} \qquad \left(A \oplus B = \sum_{l,m} a_{lm} b_{lm}\right) \tag{5}$$

Each correlation value $c_{ij}$ is called an Hadamard coefficient. Similarly, the inverse Hadamard transform is calculated from the Hadamard coefficients $c_{ij}$ and the Hadamard bases $W_{ij}$:

$$F = \sum_{i,j} c_{ij} W_{ij} \tag{6}$$
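Equations (4) to (6) can be checked numerically with a short sketch. It assumes that the 1-D bases are stored as the rows of a matrix sorted by number of sign changes and that $W_{ij}$ is the outer product of bases $i$ and $j$; the helper names and the random test image are hypothetical.

```python
import numpy as np

def walsh_bases(n):
    """Rows are 1-D Hadamard bases sorted by number of sign changes (assumed order)."""
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.kron(h, np.array([[1, 1], [1, -1]]))
    changes = np.sum(h[:, :-1] != h[:, 1:], axis=1)
    return h[np.argsort(changes)]

def forward(f, w):
    """Hadamard coefficients c_ij of image f, following Eqs. (4) and (5)."""
    n = f.shape[0]
    c = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            w_ij = np.outer(w[i], w[j])          # 2-D base, Eq. (4)
            c[i, j] = np.sum(w_ij * f) / n**2    # correlation, Eq. (5)
    return c

def inverse(c, w):
    """Reconstruct the image from its coefficients, following Eq. (6)."""
    n = c.shape[0]
    f = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            f += c[i, j] * np.outer(w[i], w[j])
    return f

rng = np.random.default_rng(0)
F = rng.random((8, 8))        # hypothetical test image
W = walsh_bases(8)
C = forward(F, W)
print(np.allclose(inverse(C, W), F))
```

Because the $N^2$ bases are mutually orthogonal and each has squared norm $N^2$, the $1/N^2$ factor in Eq. (5) is exactly what makes Eq. (6) return the original image; the first coefficient is simply the image mean.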

Figure 1 shows the 2-D Hadamard bases when N = 8. In Fig. 1 and throughout this paper, the white area represents +1 and the black area represents -1. As the figure shows, the frequency increases from the upper left to the lower right. All elements of $W_{11}$ are +1.



**Fig. 1** 2-D Hadamard bases (N = 8).

Figure 2 shows examples of the Hadamard transform and of the inverse Hadamard transform. Figure 2(b) shows the Hadamard coefficients obtained from the original image (Fig. 2(a)) without dividing the image into sub-blocks. As shown in Fig. 2(b), the power is concentrated in the upper-left part. Using only the coefficients in this upper-left part, the image can be reconstructed (Figs. 2(c)–(f)).
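Reconstruction from only the upper-left coefficients can be illustrated with a minimal sketch. It assumes frequency-ordered bases and uses a synthetic smooth image, not the image of Fig. 2; `walsh_matrix` and `reconstruct_low` are hypothetical names.

```python
import numpy as np

def walsh_matrix(n):
    """Rows are 1-D Hadamard bases sorted by number of sign changes (assumed order)."""
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.kron(h, np.array([[1, 1], [1, -1]]))
    changes = np.sum(h[:, :-1] != h[:, 1:], axis=1)
    return h[np.argsort(changes)]

def reconstruct_low(f, keep):
    """Transform f, keep only the upper-left keep x keep coefficients, then invert."""
    n = f.shape[0]
    w = walsh_matrix(n)
    c = w @ f @ w.T / n**2      # all coefficients of Eq. (5) in matrix form
    c[keep:, :] = 0.0           # discard high-frequency rows
    c[:, keep:] = 0.0           # discard high-frequency columns
    return w.T @ c @ w          # Eq. (6) in matrix form

# Hypothetical smooth test image.
y, x = np.mgrid[0:32, 0:32]
image = np.exp(-((x - 12.0) ** 2 + (y - 20.0) ** 2) / 60.0)

errors = []
for keep in (4, 8, 16, 32):
    rec = reconstruct_low(image, keep)
    errors.append(float(np.sqrt(np.mean((rec - image) ** 2))))
    print(keep, errors[-1])
```

Since the bases are orthogonal, keeping a larger upper-left block can never increase the reconstruction error, and keeping all coefficients reproduces the image.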

#### 2.2 Pulse Width Modulation Technique

The analog-digital merged circuit architecture using PWM approaches is applied to a basic processing circuit.

**Fig. 2** Hadamard transform and inverse Hadamard transform.

**Fig. 3** Pulse width modulation technique elements.

Figure 3(a) shows an optical-to-PWM (O-to-PWM) converter. The capacitor $C_p$, which works as an analog memory for non-destructive readout, is first charged to the threshold of the comparator (Comp). The photo detector (PD) works as a current source whose current is proportional to pixel intensity, and it discharges the capacitor $C_p$. The voltage of node $C_p$ is then raised by applying a ramped voltage. The comparator compares this voltage with its own threshold and outputs a PWM signal.

Figure 3(b) shows an integrator with a sign bit. The sign bit selects which of the two switches is closed by the PWM signal. The circuit thus integrates the PWM input with the polarity given by the sign bit.

Figure 3(c) shows an average calculator. Each capacitor $C_{Ii}$ integrates the current $I_i$ while the charge/share control signal (C/S) is "C." While C/S is "S," all of the capacitors are connected to a common line and share their charges. This circuit calculates the average voltage $V_{out}$ biased by a reference voltage $V_r$.
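The charge and share phases can be modeled behaviorally as below. This is an idealized sketch, assuming equal capacitors, ideal switches and no parasitic capacitance; the function name and all component values are hypothetical and are not taken from the chip.

```python
import math

def share_average(currents, widths, signs, cap, v_ref):
    """Common-line voltage after charge sharing among equal capacitors."""
    # Charge phase: each capacitor starts at v_ref and integrates +/- I * t.
    voltages = [v_ref + s * i * t / cap
                for i, t, s in zip(currents, widths, signs)]
    # Share phase: equal capacitors on one line settle to the mean voltage.
    return sum(voltages) / len(voltages)

# Hypothetical values: 1 uA sources, 1 pF capacitors, 1.5 V reference.
v_out = share_average(
    currents=[1e-6] * 4,
    widths=[1.0e-6, 1.0e-6, 0.5e-6, 0.5e-6],
    signs=[+1, -1, +1, -1],
    cap=1e-12,
    v_ref=1.5,
)
print(v_out)
```

With these inputs the positive and negative contributions cancel, so the shared voltage returns to the reference level.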

#### 2.3 Hadamard Transform Circuit

We propose a Hadamard transform circuit.

Figure 4 is a block diagram of our Hadamard transform circuit. This circuit is composed of two Hadamard transform base generators, an array of cells and an ADC. The Hadamard base generators provide cells with 1-D Hadamard bases from row and column. All cells connect to a common line. The ADC converts the voltage of the common line to digital data.

**Fig. 4** Block diagram of Hadamard transform circuit.

Figure 5(a) shows an Hadamard transform base generator for N = 8. Given the base number as a Gray code G, this generator produces the 1-D Hadamard base W. The schematic extends readily to other base sizes.
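The link between a Gray-coded base number and a frequency-ordered base can be illustrated in software. The sketch below does not reproduce the gate-level generator of Fig. 5(a); it relies on the known relation that the base with $k$ sign changes is the natural-order Hadamard row whose index is the bit-reversed Gray code of $k$. All function names are hypothetical.

```python
def gray(k):
    """Gray code of the integer k."""
    return k ^ (k >> 1)

def bit_reverse(v, bits):
    """Reverse the lowest `bits` bits of v."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (v & 1)
        v >>= 1
    return r

def base_from_gray(g, bits):
    """1-D base of length 2**bits selected by the Gray-coded base number g."""
    mask = bit_reverse(g, bits)
    # Element x is (-1)**popcount(mask & x), i.e. a natural-order Hadamard row.
    return [1 - 2 * (bin(mask & x).count("1") & 1) for x in range(1 << bits)]

def sign_changes(v):
    """Number of sign changes along the base."""
    return sum(1 for a, b in zip(v, v[1:]) if a != b)

for k in range(8):
    w = base_from_gray(gray(k), 3)
    print(k, format(gray(k), "03b"), w, sign_changes(w))
```

For N = 8 each base number $k$ yields a base with exactly $k$ sign changes, which matches ordering the bases by frequency.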

Figure 5(b) is a circuit schematic of the cell. The cell is composed of the PWM processing circuits described in Sect. 2.2. An Ex-NOR gate calculates a 2-D Hadamard base element from the 1-D Hadamard base elements supplied from the row and column (cf. Eq. (4)). The 2-D Hadamard base element, which takes a bipolar state of +1 or -1, works as a sign bit. The O-to-PWM converter outputs the PWM signal, which corresponds linearly to pixel intensity. The capacitor $C_{Iij}$ integrates the charge depending on the PWM signal and the sign bit. All charges are shared by closing all switches $(S_{ij})$ and are averaged instantaneously.
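The role of the Ex-NOR gate and of the signed integration can be sketched as follows. The sketch assumes that logic 1 encodes +1 and logic 0 encodes -1, and it normalizes the shared result to a mean of signed pulse widths; the function names and the example widths are hypothetical.

```python
def xnor(a, b):
    """Ex-NOR of two logic bits (0 or 1)."""
    return 1 - (a ^ b)

def to_bipolar(bit):
    """Assumed encoding: logic 1 -> +1, logic 0 -> -1."""
    return 2 * bit - 1

def coefficient(widths, row_bits, col_bits):
    """Mean of signed pulse widths over the array: one charge/share cycle."""
    total = 0.0
    count = 0
    for l, row in enumerate(widths):
        for m, width in enumerate(row):
            sign = to_bipolar(xnor(row_bits[l], col_bits[m]))  # 2-D base element
            total += sign * width
            count += 1
    return total / count

# Ex-NOR on the bit encoding equals multiplication on the bipolar values.
for a in (0, 1):
    for b in (0, 1):
        assert to_bipolar(xnor(a, b)) == to_bipolar(a) * to_bipolar(b)

widths = [[1.0, 1.0], [0.5, 0.5]]   # hypothetical pulse widths in microseconds
print(coefficient(widths, [1, 1], [1, 1]))
print(coefficient(widths, [1, 1], [1, 0]))
```

Under this encoding a single gate per cell is enough to form the product of the row and column base elements, which is why no multiplier is needed.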

The circuits shown in Figs. 4 and 5 also realize an inverse Hadamard transform (cf. Eq. (6)). Instead of the O-to-PWM output, every cell is simultaneously provided with a PWM signal corresponding to an Hadamard coefficient $c_{ij}$. Each capacitor $C_{Iij}$ integrates the coefficient multiplied by its 2-D Hadamard base element. By providing all coefficients $c_{ij}$ sequentially, the capacitors $C_{Iij}$ perform the summation and reconstruct the pixels. The reconstructed pixel values are read out by closing the switches $S_{ij}$ sequentially.

#### 2.4 Circuit Simulation of Basic Operation

(b) Circuit schematic of cell

**Fig. 5** Hadamard transform circuit elements.

Figure 6 shows simulated waveforms of $V_{ij}$ with $2 \times 2$ cells. The input PWM pulse widths and 2-D Hadamard base elements were set to $(1 \,\mu s, +1)$, $(1 \,\mu s, -1)$, $(0.5 \,\mu s, +1)$ and $(0.5 \,\mu s, -1)$. The charging period runs from 0 s to $1 \,\mu s$, and the sharing period follows after $1 \,\mu s$. This simulation verified the charging and sharing operations.

**Fig. 6** SPICE simulation results (charging and sharing).

**Fig. 7** Comparison between calculation and SPICE simulation.

For SPICE simulation, the standard Lena image with a reduced resolution of $32 \times 32$ is applied to the Hadamard transform circuit. As the Hadamard base is changed sequentially in the order shown in Fig. 7(a), the circuit outputs the coefficient corresponding to each base. Figure 7(b) shows the coefficient values obtained by numerical calculation (Calc.) and by SPICE simulation (SPICE). Accuracy is estimated as the difference between the scaled Calc. voltage and the SPICE voltage, divided by the full-scale voltage. The two are in good agreement, with an accuracy of more than 7 bits.
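The accuracy estimate described above can be expressed as a small helper. This is a sketch under the assumption that "an accuracy of $b$ bits" means that the worst-case error is at most $2^{-b}$ of full scale; the function name and the sample values are hypothetical and are not simulation data from the paper.

```python
import math

def accuracy_bits(calc, spice, full_scale):
    """Equivalent bits from the worst-case error relative to full scale."""
    worst = max(abs(c - s) for c, s in zip(calc, spice))
    if worst == 0:
        return math.inf
    return -math.log2(worst / full_scale)

# Hypothetical voltages: the largest deviation is 1/256 of a 1 V full scale.
calc = [0.0, 0.5, 1.0]
spice = [0.0, 0.5 + 1 / 256, 1.0]
print(accuracy_bits(calc, spice, full_scale=1.0))
```

Under this definition, an accuracy of more than 7 bits corresponds to a worst-case error below $1/128$ of the full-scale voltage.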

# 3. Test Chip Design and Evaluation

#### 3.1 Chip Design

We designed a test chip to confirm that the Hadamard transform circuit would perform in an actual chip.

The Hadamard transform circuit contains PDs and capacitors. The PD needs an area of at least $200 \,\mu\text{m}^2$ for light sensitivity (maximum current is $0.05 \,\text{nA}$). The sample capacitor $C_p$ needs at least $0.3 \,\text{pF}$ to tolerate current leakage and noise. The share capacitor $C_I$ needs $0.8 \,\text{pF}$ so that the parasitic capacitance of the common line, estimated to be about 14 pF, can be neglected. Capacitors $C_p$ and $C_I$ require chip areas of $160 \,\mu\text{m}^2$ and $420 \,\mu\text{m}^2$, respectively. Based on these area estimates, $64 \times 64$ cells can be arrayed on a $4.5 \,\text{mm} \times 4.5 \,\text{mm}$ chip. (During charge sharing, the total share capacitance is 250 times larger than the parasitic capacitance.)

This test chip consisted of  $64 \times 64$  cells (CELL), two base generators with a shift register (DFF) to control the charge sharing signal, and an output buffer (FOL).

The estimated power consumption of this chip using SPICE is 178 mW at a 3.3 V supply. Specifications are shown in Table 1.

| Item                   | Block | Value                                             |
|------------------------|-------|---------------------------------------------------|
| Technology             |       | $0.35\,\mu\mathrm{m}$ CMOS, 3Al, 2Poly-Si         |
| Power Supply           |       | $3.3\,\mathrm{V}$                                 |
| Power Dissipation      |       | $178\,\mathrm{mW}$                                |
| Fill Factor            |       | 6%                                                |
| Number of Pixels       |       | $64 \times 64$                                    |
| Chip Size              | CELL  | $55\,\mu\mathrm{m} \times 55\,\mu\mathrm{m}$      |
| Chip Size              | DFF   | $55\,\mu\mathrm{m} \times 66.6\,\mu\mathrm{m}$    |
| Chip Size              | FOL   | $321\,\mu\mathrm{m} \times 120.25\,\mu\mathrm{m}$ |
| Chip Size              | ALL   | $4.5\,\mathrm{mm} \times 4.5\,\mathrm{mm}$        |
| Number of Transistors  | CELL  | 48 tr.                                            |
| Number of Transistors  | DFF   | 46 tr.                                            |
| Number of Transistors  | FOL   | 200 tr.                                           |
| Number of Transistors  | ALL   | 202696 tr.                                        |
| PWM Signal Accuracy    |       | 7 bits                                            |
| Readout Time           |       | $4.096\,\mathrm{ms}$                              |
| Dynamic Range          |       | 33.3 dB                                           |

**Table 1** Specifications.

To broaden the range of applications, the resolution must be about $256 \times 256$ pixels or higher. To realize such a high resolution, we have to reduce the pixel size, which depends on the photo detector, the sample capacitor and the share capacitor. Accuracy is determined by the ratio of the parasitic capacitance on the common line used for charge sharing to the sum of the share capacitors. If we use a $0.35 \,\mu\text{m}$ CMOS process, the estimated parasitic capacitance is $36 \,\text{pF}$ and the total share capacitance is almost $0.013 \,\mu\text{F}$ ($256 \times 256 \times 0.2 \,\text{pF}$). The ratio of the share capacitance to the parasitic capacitance is then 360. The parasitic capacitance decreases with the minimum feature size. Therefore, we can reduce the size of the share capacitor as we reduce the minimum feature size. This shows that the circuit architecture is suitable for a high-resolution Hadamard transform chip.
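The capacitance ratios quoted in this section can be rechecked with a few lines of arithmetic. The sketch below uses only the rounded values stated in the text (0.8 pF and about 14 pF for the $64 \times 64$ chip, 0.2 pF and 36 pF for the $256 \times 256$ estimate); the function name is hypothetical.

```python
def share_to_parasitic(pixels_per_side, c_share_pf, c_parasitic_pf):
    """Total share capacitance in pF and its ratio to the parasitic capacitance."""
    total_pf = pixels_per_side ** 2 * c_share_pf
    return total_pf, total_pf / c_parasitic_pf

total_64, ratio_64 = share_to_parasitic(64, 0.8, 14.0)
total_256, ratio_256 = share_to_parasitic(256, 0.2, 36.0)
print(round(total_64, 1), round(ratio_64))
print(round(total_256, 1), round(ratio_256))
```

With these rounded inputs the $64 \times 64$ case gives a ratio of roughly 234, the same order as the factor of 250 quoted for the test chip, and the $256 \times 256$ case gives roughly 364, in line with the stated ratio of 360.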

Figure 8 shows the CMOS implementation of a cell. For implementation, several additional signals have been introduced. PRST resets $C_p$ to the threshold of the comparator. SEL selects either the external (PWMIN) or the internal PWM signal. PWMIN is used for the inverse Hadamard operation. SROW and SCOL are connected to the shift registers located along each row and column, and they select which capacitor $C_I$ takes part in sharing. SROW and SCOL are used when reading $C_I$ directly and when making unconventional bases. CRST resets $C_I$ and the common line to MID. Figure 8 also gives the meaning of each signal.

A top view of the layout is shown in Fig. 9. The regular geometric pattern is the CELL array, which occupies most of the chip area. Two DFF arrays are located along the upper and left edges. The FOL is located at the bottom left. The ADC is located outside this test chip.

#### 3.2 Evaluation

| Signal | Meaning                           | Signal | Meaning                            |
|--------|-----------------------------------|--------|------------------------------------|
| RMP    | RaMP Input                        | CLK    | CLocK for comparator               |
| PWMST  | PWM STart                         | PWMIN  | external PWM INput                 |
| SEL    | SELect external / internal PWM    | HROW   | Hadamard base element from ROW     |
| HCOL   | Hadamard base element from COLumn | SHR    | SHaRe signal                       |
| VBIAS  | Vdd side BIAS for current source  | GBIAS  | Gnd side BIAS for current source   |
| SROW   | Shift register input from ROW     | SCOL   | Shift register input from COLumn   |
| MID    | MIDdle voltage supply             | CRST   | Charge share capacitor $C_I$ reset |

**Fig. 9** Layout of Hadamard transform chip.

In order to evaluate the function of this Hadamard transform chip, a measurement method was devised. The image applied to the chip is fixed as shown in Fig. 10(c), and the Hadamard bases are changed sequentially as shown in Fig. 10(a).

Figure 10(b) shows the outputs as the Hadamard bases change with time. In this measurement, about 30% of the light power passes through the black area. Under this condition, the output voltages of the coefficients agree with a numerical simulation. If these outputs are treated as Hadamard coefficients, the image is reconstructed (Fig. 10(d)) by the inverse Hadamard transform (cf. Eq. (6)).

The reconstructed image is quite similar to the input image. Thus we can confirm the basic operation of the Hadamard transform.

In order to evaluate the linearity of the Hadamard transform chip, the measurement method shown in Fig. 11 was devised. The Hadamard base is selected such that the left side is +1 and the right side is -1, as shown in Fig. 11. When optical power is applied to a +1 pixel, the cell outputs a positive value. Likewise, when optical power is applied to a -1 pixel, the cell outputs a negative value. On the whole, the chip output depends on the difference between the number of lit +1 pixels and the number of lit -1 pixels. Thirty-three images were prepared from transparent sheets partly painted opaque black. The No. 0 image provides maximum light to the -1 side and leads to the minimum output. The No. 32 image provides maximum light to the +1 side and leads to the maximum output. The other images were prepared so that the numbers of lit -1 and +1 cells change gradually.

Figure 12 shows the output for each image and the deviation of each output from an ideal line.
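A linearity error of this kind can be computed as sketched below. The paper does not state how the ideal line is defined, so the sketch assumes the straight line through the first and last outputs and normalizes by the output span; the function name and the sample values are hypothetical, not measured data.

```python
import math

def linearity_error(outputs):
    """Largest deviation from the end-point line, as a fraction of the output span."""
    last = len(outputs) - 1
    lo, hi = outputs[0], outputs[-1]
    span = hi - lo
    worst = max(abs(v - (lo + span * k / last)) for k, v in enumerate(outputs))
    return worst / abs(span)

# Hypothetical outputs for three images; the middle one deviates by 0.1.
print(linearity_error([0.0, 1.1, 2.0]))
```

In the measurement described above, `outputs` would hold the 33 chip outputs for images No. 0 to No. 32.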

**Fig. 10** Coefficients and inverse Hadamard transform result.

**Fig. 11** Linearity measurement method.

From these measurements, the maximum linearity error is about 1.5%. This error is caused by the manual alignment of the images and by the lack of uniformity in both the input light and the edges.

# 4. Application

IEICE TRANS. ELECTRON., VOL.E85-C, NO.8 AUGUST 2002

**Fig. 12** Linearity evaluation.

We propose a novel technique to calculate the center of gravity using the Hadamard transform chip. For example, we can obtain the y coordinate of the center of gravity, $g_y$, by performing the process shown in Fig. 13(a), using only six coefficients ($64 = 2^6$) as shown in Eq. (7). In addition, the divisions (/2, /4, ...) are realized easily using a shift register. This calculation drastically reduces the cost compared with the conventional calculation, and the advantage grows as the number of pixels increases. Figure 13(b) shows the results of numerical simulations.

$$g_{y} = \begin{cases} \dfrac{\sum_{k=1}^{\log_{2} n} c_{1,2^{k}} \times 2^{-(k-1)}}{c_{11}} & (\text{Proposed}) \\[2ex] \dfrac{\sum_{j=1}^{n} j \times \sum_{i=1}^{n} f(i,j)}{\sum_{i=1}^{n} \sum_{j=1}^{n} f(i,j)} & (\text{Conventional}) \end{cases} \tag{7}$$
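The relation between the two forms of Eq. (7) can be explored numerically. The sketch below rests on assumptions that are not spelled out in the text: the bases are frequency ordered, so that the coefficient with second index $2^k$ (1-based) uses the 1-D base with $2^k - 1$ sign changes, and the second index runs along the coordinate of interest. Under these assumptions the weighted coefficient sum is an affine function of the conventional centroid; the offset and scale used below follow from this sketch, not from the paper, and the function names are hypothetical.

```python
import numpy as np

def centroid_from_coefficients(f):
    """Upper form of Eq. (7): weighted sum of log2(n) coefficients over c_11."""
    n = f.shape[1]
    n_bits = n.bit_length() - 1            # log2(n) for a power-of-two size
    m = np.arange(n)
    profile = f.sum(axis=0)                # first index is constant when i = 1
    c11 = profile.sum()                    # the 1/n^2 factor cancels in the ratio
    s = 0.0
    for k in range(1, n_bits + 1):
        # Base with 2^k - 1 sign changes: sign given by the k-th most significant bit.
        r_k = 1 - 2 * ((m >> (n_bits - k)) & 1)
        s += (profile @ r_k) * 2.0 ** (-(k - 1))
    return s / c11

def centroid_conventional(f):
    """Lower form of Eq. (7), with a 0-based coordinate."""
    m = np.arange(f.shape[1])
    return (f.sum(axis=0) @ m) / f.sum()

rng = np.random.default_rng(1)
f = rng.random((64, 64))                   # hypothetical test image
n = 64
g_prop = centroid_from_coefficients(f)
g_conv = centroid_conventional(f)
# Affine mapping derived under the assumptions stated above.
print(g_conv, (n / 4) * ((2 - 2 / n) - g_prop))
```

The two printed values agree, which illustrates why six coefficients suffice for a 64-pixel axis: each of these bases reads out one bit of the pixel coordinate, and the power-of-two weights reassemble the coordinate from its bits.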

In this section, we exploited the feature that the orthogonal transform is not restricted to a sub-block size. This feature is readily applicable to the segmented and normalized images found in the area of pattern recognition.

# 5. Conclusion

We proposed a novel Hadamard transform circuit using the PWM technique and operating on the principle of charge sharing. We confirmed through SPICE simulations that this circuit achieves high accuracy and high speed. Based on area estimation, we designed and fabricated a test chip containing $64 \times 64$ cells that receive pixels, together with peripheral circuits. We tested the chip through an actual application of image reconstruction and confirmed that it works. We evaluated the linearity of the chip by gradually changing an image, and estimated that its linearity error is about 1.5%. We demonstrated that the chip drastically reduces the cost of calculating the center of gravity.

(b) Simulation results

**Fig. 13** Center of gravity.

We are exploring effective applications of this chip to machine vision systems, and we intend to investigate how effective the chip is in actual applications by integrating it into a complete system.

### Acknowledgments

This work was supported by a Grant-in-aid for Scientific Research from the Ministry of Education, Science and Culture of Japan.

This work was also supported in part by Grants-in-aid for the Core Research for Evolutional Science and Technology (CREST) from the Japan Science and Technology Corporation (JST).

The VLSI chip in this study was fabricated in the chip fabrication program of the VLSI Design and Education Center (VDEC), the University of Tokyo, in collaboration with the Rohm Corporation and the Toppan Printing Corporation.

#### References

- [1] L.M. Po and C.K. Chan, "Directionally classified image vector quantization using Walsh Hadamard subspace distortion," Electron. Lett., vol.27, no.21, pp.1964–1967, Oct. 1991.
- [2] A. Iwata and M. Nagata, "A concept of analog-digital merged circuit architecture for future VLSI's," IEICE Trans. Fundamentals, vol.E79-A, no.2, pp.145–157, Feb. 1996.
- [3] T. Morie, J. Funakoshi, M. Nagata, and A. Iwata, "An analog-digital merged neural circuit using pulse width modulation technique," IEICE Trans. Fundamentals, vol.E82-A, no.2, pp.356–363, Feb. 1999.
- [4] A. Iwata, T. Morie, and M. Nagata, "Merged analog-digital circuits using pulse modulation for intelligent SoC applications," IEICE Trans. Fundamentals, vol.E84-A, no.2, pp.486–496, Feb. 2001.
- [5] K. Katayama, M. Nagata, T. Morie, and A. Iwata, "A high-resolution Hadamard transform circuit using pulse width modulation technique," Ext. Abstracts of Int. Conf. on Solid State Devices and Materials, pp.366–367, Sendai, Aug. 2000.
- [6] M. Nagata, M. Homma, N. Takeda, T. Morie, and A. Iwata, "A smart CMOS imager with pixel level PWM signal processing," Symposium on VLSI Circuits Digest of Technical Papers, pp.141–144, Kyoto, June 1999.
- [7] E. Funatsu, Y. Nitta, Y. Miyake, T. Toyoda, J. Ohta, and K. Kyuma, "An artificial retina chip with current-mode focal plane image processing functions," IEEE Trans. Electron Devices, vol.44, no.10, pp.1777–1782, Oct. 1997.



**Makoto Nagata** received the B.S. and M.S. degrees in physics from Gakushuin University, Tokyo, Japan, in 1991 and 1993, respectively, and the Ph.D. degree in electrical engineering from Hiroshima University, Japan, in 2001. From 1994 to 1996 he was with the Research Center for Integrated Systems, Hiroshima University, where he was involved in the development of OEIC design and fabrication techniques. He has been a Research Associate with the Integrated Systems Laboratory, Hiroshima University, where he has focused on mixed signal LSI design techniques, especially in the area of signal integrity and test issues, and analog signal processing integrated circuit design. Dr. Nagata is a member of the IEEE and has been a program committee member for the Symposium on VLSI Circuits.



**Kousuke Katayama** received the B.S. and M.E. degrees in electronics engineering from Hosei University, Tokyo, Japan, in 1997 and 1999, respectively. He is currently working towards the Ph.D. degree in electronics engineering. His main research interests include hardware implementation of brain-inspired models.



**Atsushi Iwata** received the B.E., M.S. and Ph.D. degrees in electronics engineering from Nagoya University, Nagoya, Japan, in 1968, 1970, and 1994, respectively. From 1970 to 1993, he was with the Electrical Communications Laboratories, Nippon Telegraph and Telephone Corporation. Since 1994 he has been a professor of Electrical Engineering at Hiroshima University. His research is in the field of integrated circuit design, where his interests include circuit architecture and design techniques for analog-to-digital and digital-to-analog converters, digital signal processors, ultra high-speed telecommunication IC's, and large-scale neural network implementations. He received an Outstanding Panelist Award at the 1990 International Solid-State Circuits Conference. Dr. Iwata is a member of the Institute of Electrical and Electronics Engineers.



**Takashi Morie** received the B.S. and M.S. degrees in physics from Osaka University, Osaka, Japan, and the Dr.Eng. degree from Hokkaido University, Sapporo, Japan, in 1979, 1981 and 1996, respectively. From 1981 to 1997, he was a member of the Research Staff at Nippon Telegraph and Telephone Corporation (NTT). Since 1997 he has been Associate Professor at the Faculty of Engineering, Hiroshima University, Higashi-Hiroshima, Japan. His main interest is in the area of VLSI implementation of neural networks, mixed/merged analog-digital circuits, and new functional devices. Dr. Morie is a member of the Japan Society of Applied Physics and the Japanese Neural Network Society.