Amplitude and Time Measurement ASIC with Analog Derandomization

Paul O’Connor, Gianluigi De Geronimo, and Anand Kandasamy

Abstract—We describe a new ASIC for accurate and efficient processing of high-rate pulse signals from highly segmented detectors. In contrast to conventional approaches, this circuit affords a dramatic reduction in data volume through the use of analog techniques (precision peak detectors and time-to-amplitude converters) together with fast arbitration and sequencing logic to concentrate the data before digitization. In operation the circuit functions like a data-driven analog first-in, first-out (FIFO) memory between the preamplifiers and the ADC. Peak amplitudes of pulses arriving at any one of the 32 inputs are sampled, stored, and queued for readout and digitization through a single output port. Hit timing, pulse risetime, and channel address are also available at the output.

Prototype chips have been fabricated in 0.35 micron CMOS and tested. First results indicate proper functionality for pulses down to 30 ns peaking time and input rates up to 1.6 MHz/channel. Amplitude accuracy of the peak detect and hold circuit is 0.3% (absolute). TAC accuracy is within 0.3% of full scale. Power consumption is less than 2 mW/channel. Compared with conventional techniques such as track-and-hold and analog memory, this new ASIC will enable efficient pulse height measurement at 20 to 300 times higher rates.

I. INTRODUCTION

The move to high channel count systems has been stimulated by the development of monolithic circuits that have dramatically reduced the per-channel cost of the front-end electronics. The most flexible approach to collect and process the data from these detectors is to digitize the signals immediately after the preamplifiers and to perform all subsequent operations (pedestal/gain correction, peak extraction, sparsification, and buffering) in the digital domain. This method requires a high rate ADC for each channel, generates a large volume of digital data, and is generally unsuitable for low cost, low power systems. The alternative is to concentrate the data in analog form in front of the ADC. Common techniques include sample/hold or analog pipelines followed by analog multiplexing. The drawbacks are the requirement of an external trigger to define the time of occurrence of the event, the lack of sparsification, and the low speed of analog multiplexing. In the case of the sample/hold, deadtime during readout also limits the efficiency of data collection.

We have developed a data-driven multichannel readout ASIC that combines the functions of:

- trigger
- peak detect and hold
- time-to-amplitude converter
- analog FIFO
- buffer manager
- multiplexer.

The sampling and buffering are accomplished using fast and accurate peak detector (PD) circuits in conjunction with time-to-amplitude converters (TAC) to provide pulse amplitude and timing information. The data concentration ratio is 32:1.

II. ASIC ARCHITECTURE

A simplified schematic that illustrates the architecture is shown in Figure 1. The ASIC consists of:

- inputs for 32 shaped, positive unipolar pulses with minimum peaking time of 30 ns;
- 32 comparators with common threshold;
- array of 8 offset-free, two-phase peak detectors [1] with associated TACs;
- 32-to-8 crosspoint switch that can route any input channel to any PD/TAC;
- fast, nonblocking arbitration logic to control the crosspoint switch;
- output multiplexer and sequencing logic to control the readout of the PD/TACs.

The 32 comparators monitor the inputs for activity. When an input goes above threshold, the arbitration logic sets up a connection between that channel and the next available peak detector. The connection is maintained until the peak is found. Then the peak amplitude is stored on the PD hold capacitor until the external ADC becomes available to digitize that sample. At that time the sequencing logic presents the amplitude and time (analog) samples from the PD/TAC to the ADC along with the address of the channel where the hit occurred. Incoming pulses are not blocked as long as there is at least one PD free.

The elements of the PD/TAC array are available to any input channel; thus, resources can be allocated to highly active channels as necessary.
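The allocation scheme described above can be illustrated with a small behavioural model. This is only a sketch under simplifying assumptions: the class and method names are hypothetical, the time needed to find the peak is ignored, and a peak detector is treated as busy from the moment the comparator fires until its sample is read out.

```python
from collections import deque


class PeakDetectorPool:
    """Behavioural toy model of the shared PD/TAC array (not the circuit itself)."""

    def __init__(self, n_pd=8):
        self.free = n_pd        # peak detectors currently unallocated
        self.queue = deque()    # held samples, oldest first

    def hit(self, channel, amplitude):
        """A comparator fired on `channel`; returns False if the pulse is blocked."""
        if self.free == 0:
            return False        # every PD is busy, so the pulse is lost
        self.free -= 1
        self.queue.append((channel, amplitude))
        return True

    def read(self):
        """Serve one read request; returns (channel, amplitude) or None if empty."""
        if not self.queue:
            return None
        sample = self.queue.popleft()
        self.free += 1          # the PD is reset and returned to the pool
        return sample
```

Because any peak detector can serve any channel, two consecutive hits on the same channel occupy two different PDs, which is what lets a busy channel borrow capacity from quiet ones.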

Manuscript received November 13, 2002. This work was supported in part by the U.S. Department of Energy under Grant No. DE-AC02-98CH10886.

The authors are with Brookhaven National Laboratory, Upton, NY 11973, USA. Correspondence should be addressed to P. O’Connor, Instrumentation Division Bldg. 535B, Brookhaven National Laboratory, e-mail: poc@bnl.gov.

Traditional FIFO control lines (FULL, EMPTY, DATA_VALID, READ_REQUEST) are available for operating the ASIC in continuously clocked, polled, interrupt-driven, or token-passing mode. In addition, an SPI-compatible interface allows serial configuration of TAC gain and mode, arbitration locking, channel exclusion, powerdown, and analog monitor functions.

Fig. 1: Block diagram of the Amplitude and Time measurement ASIC showing the routing of the input signals to the PD/TAC arrays.

III. CIRCUIT DESIGN

A. Peak detector

The main building block of the system is a novel two-phase peak detector (PD) that circumvents the major sources of error of conventional configurations [1]. An analysis of the limitations of the classical CMOS PD was given in [2]. In summary, the two-phase approach eliminates the errors caused by amplifier offset by reconfiguring the amplifier as a unity-gain follower of the voltage on the hold capacitor, while also providing strong drive capability. By canceling amplifier offset, the two-phase PD can also be designed for rail-to-rail sensing and driving. Switching between tracking and holding phases is automatic and provides a time marker with characteristics similar to those of a zero-crossing discriminator.

In the present design, the speed and accuracy of the two-phase PD were improved to allow it to process pulses with peaking times down to 30 ns. A comparison of the simulated absolute error in peak height for the original [1] and new designs is shown in Figure 2.

B. Time-to-amplitude converter

Two time-to-amplitude converters (TACs) are associated with each PD. The TAC is based on a capacitor charged by a constant current source. Full scale can be varied from 0.5 to 64 µs by means of an on-board DAC. The main TAC has two programmable measurement modes (Fig. 3): in “risetime” mode it measures the time from comparator firing until a peak is detected, while in “time of occurrence” mode the TAC is started when a peak is detected and stopped by the readout request, allowing measurement of the hit time relative to a (known) system clock.
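The two measurement modes can be sketched as an ideal linear ramp, V = I·t/C, clipped at the programmed full scale. The function names and the 1.0 V full-scale output are assumptions made for illustration; the text specifies only the 0.5 to 64 µs range of full-scale times.

```python
def tac_output(interval_s, full_scale_s, v_full_scale=1.0):
    """Ideal TAC: a linear ramp V = I*t/C, clipped at full scale.

    v_full_scale is an assumed output range, not a value from the paper.
    """
    interval_s = min(max(interval_s, 0.0), full_scale_s)
    return v_full_scale * interval_s / full_scale_s


def risetime_mode(t_comparator, t_peak, full_scale_s):
    """Ramp runs from comparator firing until the peak is detected."""
    return tac_output(t_peak - t_comparator, full_scale_s)


def occurrence_mode(t_peak, t_read_request, full_scale_s):
    """Ramp starts at the peak and is stopped by the readout request."""
    return tac_output(t_read_request - t_peak, full_scale_s)


def hit_time(v_tac, t_read_request, full_scale_s, v_full_scale=1.0):
    """Recover the peak time from an occurrence-mode sample."""
    return t_read_request - v_tac / v_full_scale * full_scale_s
```

In time of occurrence mode the DAQ knows when it issued the read request, so subtracting the measured interval from that time gives the hit time on the system clock.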

The secondary TAC is a time-out control, which resets the PD after a known delay. This feature prevents PDs from being locked out of the arbitration pool in case of an anomalous comparator firing. It can also be used to reject pulses based on a risetime constraint, or as a latency delay in systems with a delayed global trigger.

Fig. 2: Pulse height accuracy for original and new PD designs (simulated). (a) Peaking time = 500 ns; (b) peaking time = 50 ns

Fig. 3: TAC operation: (A) risetime mode; (B) event timing mode.

C. Arbitration logic

The arbitration logic and crosspoint switch are arranged in a matrix of 32 rows and 8 columns. The arbitration logic is asynchronous and responds to three types of event: comparator firing, the PEAK_FOUND signal from the PD, and READ_REQUEST from the external data acquisition system (DAQ). A fast wired-OR allows the logic to discriminate between comparator events from different channels occurring about 3 ns apart. If more than one event arrives within 3 ns, priority is given to the event on the lowest-numbered channel and the events on the higher-numbered channels are rejected. This preferential treatment introduces a small bias in favor of the lower-numbered channels, but the bias amounts to less than 0.04% at the highest expected count rate.
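The priority rule can be sketched in a few lines. The grouping used here (events closer than 3 ns to the first event of a group are treated as simultaneous) is an assumption about how the resolution window behaves, and the function names are hypothetical; the sketch also assumes a free peak detector is always available.

```python
def arbitrate(events, window_ns=3.0):
    """Split (time_ns, channel) events into accepted and rejected lists."""
    accepted, rejected = [], []
    group = []
    for event in sorted(events):
        # Start a new group once the resolution window has elapsed.
        if group and event[0] - group[0][0] >= window_ns:
            _resolve(group, accepted, rejected)
            group = []
        group.append(event)
    if group:
        _resolve(group, accepted, rejected)
    return accepted, rejected


def _resolve(group, accepted, rejected):
    """Within one unresolved group the lowest channel number wins."""
    winner = min(group, key=lambda e: e[1])
    accepted.append(winner)
    rejected.extend(e for e in group if e is not winner)
```

For example, hits on channels 7 and 2 arriving 1 ns apart resolve in favor of channel 2, while a hit on channel 5 arriving 10 ns later is handled on its own.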

In response to a READ_REQUEST rising edge, the amplitude (analog), time (analog), and channel address (digital) are presented to the DAQ. If more than one pulse is being buffered, the pulses are read out in the same sequence in which they were recorded, i.e., in first-in, first-out order. When READ_REQUEST goes low, the peak detector that was read out is reset and made available for newly arriving data. While READ_REQUEST is low the outputs are tri-stated, so several ASICs can be bussed together to expand the number of channels processed by a single ADC. The ASIC also has FULL, EMPTY, and DATA_VALID logic outputs. FULL indicates that all eight peak detectors have captured pulses since the last READ_REQUEST and that any pulses arriving after the eighth were not processed. In multi-chip applications the EMPTY flags can be programmed to be asynchronous. In this mode the chips’ EMPTY flags can interrupt a central controller, which can then be directed to collect data from the chip requesting service.
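A controller for several bussed chips might look roughly like the following. This is a sketch of one possible polling policy (round-robin over chips whose EMPTY flag is low), not the scheme used by the authors; the function name, the chip identifiers, and the representation of each chip as a queue of held samples are all assumptions.

```python
from collections import deque


def poll_chips(buffers):
    """Drain several chip buffers onto one shared ADC bus, round-robin.

    `buffers` maps a hypothetical chip id to a deque of (channel, amplitude)
    samples held by that chip, oldest first.
    """
    readout = []
    while any(buffers.values()):            # some chip still holds data
        for chip_id, fifo in buffers.items():
            if fifo:                        # EMPTY flag is low for this chip
                channel, amplitude = fifo.popleft()   # one READ_REQUEST pulse
                readout.append((chip_id, channel, amplitude))
    return readout
```

Only one chip drives the bus at a time, which corresponds to the tri-stated outputs of the chips that are not being read.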

IV. ASIC SIMULATION

We performed 500,000-event Monte Carlo simulations of the ASIC with randomly arriving pulses of 50 ns peaking time to determine efficiency. The inefficiency (fraction of events blocked) was recorded as a function of the ratio of the readout rate to the average input event rate. Figure 4(a) shows the blocking probability for 4- and 8-PD arrays. In Fig. 4(b) the inefficiency is given for an 8-PD array as a function of input rate, when the READ_REQUEST frequency was fixed at 1.5 and 2.0 times the average event rate. For 8 peak detectors, the efficiency is excellent up to rates in excess of 1 MHz/channel. The inefficiency rises at high rates when the peaking time becomes an appreciable fraction of the average inter-arrival time.
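A much-simplified version of such a simulation is sketched below. It is not the authors' simulation: it assumes Poisson arrivals, strictly periodic read requests, and zero peaking time, so it reproduces only the dependence on buffer depth and readout ratio, not the rise in inefficiency at high rates. The function name and parameters are hypothetical.

```python
import random


def blocking_probability(n_pd, read_ratio, n_events=100_000, seed=1):
    """Fraction of Poisson arrivals that find all `n_pd` peak detectors busy.

    Time is in units of the mean inter-arrival time; read requests are
    periodic with frequency `read_ratio` times the mean event rate.
    """
    rng = random.Random(seed)
    read_period = 1.0 / read_ratio
    t_arrival, t_read = 0.0, read_period
    occupied = blocked = 0
    for _ in range(n_events):
        t_arrival += rng.expovariate(1.0)
        # Serve every read request that falls before this arrival.
        while t_read <= t_arrival:
            if occupied:
                occupied -= 1
            t_read += read_period
        if occupied == n_pd:
            blocked += 1        # no free PD: the pulse is lost
        else:
            occupied += 1
    return blocked / n_events
```

Calling `blocking_probability(4, 1.5)` and `blocking_probability(8, 1.5)` with the same seed compares the two buffer depths on an identical arrival sequence.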

Fig. 5 shows a 40 μs portion of a full transistor-level (BSIM3v3.1) SPICE simulation of the circuit with pulses on 16 inputs. To simulate the signals expected in a spectroscopy experiment, the arrival times of the pulses are Poisson distributed with a mean rate of 100 kHz per channel. Amplitudes of the pulses are random, and peaking times of 50 ns to 1 μs are used. The circuit responds to read requests by outputting the peak sample from each channel that was hit. Address and time of the corresponding hit are also output.

V. FIRST EXPERIMENTAL RESULTS

The chip was fabricated during summer 2002 in a 0.35 μm DP-4M CMOS process. The die measures 3.2 × 3.2 mm and is pad-limited. Power consumption is less than 2 mW/channel.

We made preliminary measurements on the first prototype ASIC. The results reported here were made using 8-bit sampling oscilloscopes and have not been corrected for other systematic inaccuracies. In addition, testing was hampered by apparent metallization problems with this run, which resulted in low yield and poor functionality for most chips.

The amplitude accuracy and uniformity of the eight peak detectors were measured with a burst of eight Gaussian-shaped pulses on the first input channel, followed by a burst of eight READ_REQUESTs. The r.m.s. error in peak height was 0.27%, and the uniformity among the eight PDs was also within 0.3%.

Fig. 6 shows the inputs and primary outputs of the ASIC in response to a series of pulses on the first channel. Fig. 6(a) shows the ASIC inputs and Fig. 6(b) shows the outputs from the amplitude and timing channels (time of occurrence mode). In Fig. 7 the data from Fig. 6(b) have been used to calculate the peak positions (amplitude and time), shown as points superimposed on the actual waveform data. Accurate reconstruction of the pulses is seen. The disagreement in amplitude between the actual and reconstructed pulse heights of the third pulse in Fig. 7 is due to slow settling of the PD_OUT signal for this setup.

We measured the output of the TAC in time of occurrence mode as the delay between pulse arrival and READ_REQUEST was varied. The results are shown in Fig. 8(a), where the TAC output is plotted against event delay for four different settings of the TAC gain. Fig. 8(b) shows the TAC error as a percent of full scale range for the four settings. The r.m.s. error is below 0.3% in all cases. The TAC also operated correctly in risetime mode.
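One common way to express such an error figure is the r.m.s. deviation of the measured points from a best-fit straight line, normalized to the fitted output span. The sketch below follows that convention as an assumption; the text does not state how the error in Fig. 8(b) was computed, and the function name is hypothetical.

```python
import math


def tac_rms_error_percent(delays, outputs):
    """R.m.s. deviation from the best-fit line, as a percent of output span."""
    n = len(delays)
    mean_x = sum(delays) / n
    mean_y = sum(outputs) / n
    sxx = sum((x - mean_x) ** 2 for x in delays)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(delays, outputs))
    slope = sxy / sxx                       # least-squares TAC gain
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in zip(delays, outputs)]
    rms = math.sqrt(sum(r * r for r in residuals) / n)
    full_scale = abs(slope) * (max(delays) - min(delays))
    return 100.0 * rms / full_scale
```

Applying the function separately to the data for each gain setting gives one error figure per setting, which is the form in which the result above is quoted.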

![Figure 6](image.png)

**Figure 6:** (a) ASIC inputs; (b) PD_OUT and TD_OUT outputs (offset for clarity). Time detector in time of occurrence mode. Vertical scale: Volts. Horizontal scale: time in microseconds.

![Figure 7](image.png)

**Figure 7:** Reconstructed pulse peaks calculated from PD_OUT and TD_OUT data shown in Fig. 6 (points) superimposed on actual waveform (solid line).

![Figure 8](image.png)

**Figure 8:** (a) TAC output vs. pulse-to-READ_REQUEST delay for four settings of TAC gain; (b) TAC error.

Pulses with peaking times as short as 30 ns were correctly processed by the ASIC at repetition rates up to 1.6 MHz (single channel). The arbitration operated properly with 500 ns wide pulses on two channels arriving within 40 ns of each other.

VI. COMPARISON WITH CONVENTIONAL READOUTS

Let us contrast the efficiency of the peak-detect/derandomizer (PDD) with the alternative approaches to multichannel data acquisition.

A. Direct digitization

Direct digitization with free-running ADCs [3,4] (waveform recording) is impractical in systems with high channel count because of the cost and power dissipation of the ADCs and the huge volume of data generated. If a trigger is available, the length of the waveform record can be reduced; but a means must be found to adjust the timing of the trigger to capture the peak of the pulse.

B. Sample-and-hold with analog multiplexing

The most common data-concentrating readout, the sample and hold with analog multiplexing (SH/AM) [5,6], suffers from long deadtime since it can buffer only one event. Amplifier settling time limits the maximum rate of the analog multiplexer, and so there is a tradeoff between accuracy, multiplexing ratio, and deadtime. The other inefficiency in the SH/AM stems from the need to read out unoccupied channels, i.e., there is no sparsification until after the analog multiplexer. For a 32-channel chip with 10 MHz multiplexing, the maximum input event rate for 99% efficiency is less than 1.5 kHz per channel. Because the readout is unsparsified, up to 97% (31/32 channels unoccupied) of the digitized data would need to be eliminated in a fast digital processor.

C. Analog pipeline memory

Analog pipeline memory with analog multiplexing (AP/AM) [7-9] can increase the rate capability compared with the SH/AM approach by providing on-chip buffering, but the control logic to allow simultaneous read and write of the pipeline is complex. This architecture does not overcome the bottleneck of the analog multiplexer, which still limits the rate. For the same channel count and analog multiplexing speed, the AP/AM can handle event rates about 6-8 times higher than the SH/AM if sufficient buffer cells are provided. Post-multiplexing data sparsification is still needed.

Inherently, neither the SH/AM nor the AP/AM circuitry generates its own sampling trigger. The SH/AM must get a HOLD strobe from either on- or off-chip, and timing errors in the strobe introduce inaccuracy in the pulse amplitude measurement because the peak is not sampled precisely. In the AP/AM the usual approach is to sample continuously, read out several samples, and interpolate the peak. This can improve accuracy at the expense of high-bandwidth sampling and more digitization.

D. Single peak detector per channel

In this form of readout, there is a single peak detector per channel followed by an analog multiplexer [10]. Unlike the SH/AM, this architecture is self-triggered, and provision may be made for sparse readout [11,12]. As in the SH/AM, however, the main performance constraint is deadtime due to the lack of a derandomizing buffer. This architecture is therefore limited to low-rate applications. Its power consumption is also high because the number of PDs equals the number of analog channels.

E. Peak detector/derandomizer

The PDD ASIC described here can operate at rates up to 300 kHz per channel with 99% efficiency. This rate capability exceeds that of SH/AM and AP/AM readouts of similar channel count by factors of about 200 and 30, respectively. The increased efficiency is a result of three factors:

- the shared 8-event, on-chip derandomizing buffer;
- self-sparsification which reduces the number of samples to be read out by a factor of 32; and
- reduction of the multiplexing ratio by a factor of 4, i.e. 32 channels are read out by multiplexing the outputs of 8 PD cells.

In addition, the chip generates accurate timing information without the need for an external trigger.
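The quoted factors follow from the rates given in this section, as the short check below shows. It uses only numbers stated in the text; because the SH/AM figure is an upper bound (less than 1.5 kHz), the resulting ratios should be read as approximate. The variable names are illustrative.

```python
pdd_rate_khz = 300.0       # PDD rate at 99% efficiency (from the text)
sham_rate_khz = 1.5        # SH/AM upper bound at 99% efficiency (from the text)
apam_gain = (6.0, 8.0)     # AP/AM improvement over SH/AM (from the text)

# PDD advantage over SH/AM, and over AP/AM for both ends of the 6-8x range.
ratio_vs_sham = pdd_rate_khz / sham_rate_khz
ratio_vs_apam = tuple(pdd_rate_khz / (sham_rate_khz * g) for g in apam_gain)

mux_reduction = 32 // 8            # 32 channels read through 8 PD cells
unoccupied_fraction = 31 / 32      # SH/AM samples discarded for a single hit

print(ratio_vs_sham, ratio_vs_apam, mux_reduction, unoccupied_fraction)
```

The AP/AM ratio spans roughly 25 to 33 across the stated 6 to 8 times range, consistent with the figure of about 30.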

VII. ACKNOWLEDGMENT

The authors thank Veljko Radeka of BNL and Joe Grosholz and Fred Ferraro of eV Products for helpful discussions on design issues. Sachin Junnarkar of BNL designed the SPI interface for the chip.

VIII. REFERENCES