Considerations for reading out a large pixel array

Serial readout is the simplest to interconnect. A shift rate of about 10MHz should be possible without using too much power. At 10MHz a single serial line can only read out 1000 bits of data within the allowed 100usec.

For a sparse readout, the data words need to contain tags giving the location of the pixel within the array. For a 640 by 640 array it would take 19 bits of location plus 8 bits of pulse height, or 27 bits/pixel. If all the data is read sequentially, about 37 pixels (1000 bits / 27 bits per pixel) could be read out within 100usec. This falls considerably short of the 640 pixel goal.
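The arithmetic above can be checked with a short sketch. The function and constant names are illustrative only; the 10MHz shift rate, 100usec window and 8-bit pulse height are the figures quoted in the text.

```python
SHIFT_RATE_HZ = 10_000_000   # shift clock guess from the text
WINDOW_USEC = 100            # allowed readout time
PULSE_HEIGHT_BITS = 8        # pulse-height bits per pixel


def tag_bits(n_pixels: int) -> int:
    """Bits needed to give each of n_pixels a unique location tag."""
    return max(1, (n_pixels - 1).bit_length())


def pixels_per_window(n_pixels: int) -> int:
    """Pixels one serial line can shift out within the readout window."""
    bits_available = SHIFT_RATE_HZ * WINDOW_USEC // 1_000_000
    word_bits = tag_bits(n_pixels) + PULSE_HEIGHT_BITS
    return bits_available // word_bits


full_array = 640 * 640
print(tag_bits(full_array), pixels_per_window(full_array))
```

For the full 640 by 640 array this gives 19 tag bits and 37 pixels per 100usec.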

The full array can be grouped into smaller arrays that require fewer tag bits. The system then reads all groups simultaneously and maintains group tags. This combination serial/parallel readout would allow more pixels to be read within 100usec as long as the useful pixels occur in different readout groups. If each readout group is made up of spatially interleaved pixels, the useful pixels of an event would tend to fall in different readout groups.

Any interleaved arrangement other than grouping by fiber planes may not be desirable from a packaging and interconnect standpoint. If each ASIC handles 64 pixels, the straightforward packaging would assign each ASIC 64 spatially adjacent fibers.

If the event pixels were spread evenly over all groups, the throughput would increase by a factor equal to the number of groups. A more realistic expectation is that only about the square root of the total number of groups would see event pixels. Going from 37 to 640 pixels per 100usec requires a speedup of about 17, so about 300 groups (17 squared) are needed to reach the 640 pixel/100usec goal. This puts the group size at roughly 1000 pixels.
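A minimal sketch of this estimate follows, assuming the square-root model stated above and the 37 pixels/100usec single-line figure. The function name is hypothetical.

```python
import math


def groups_needed(goal_pixels: int, pixels_per_line: int) -> int:
    """Groups required if only sqrt(groups) of them see event pixels."""
    speedup = goal_pixels / pixels_per_line   # required throughput gain
    return math.ceil(speedup ** 2)            # sqrt(groups) must reach speedup


n_groups = groups_needed(goal_pixels=640, pixels_per_line=37)
group_size = (640 * 640) // n_groups
print(n_groups, group_size)
```

This gives 300 groups of about 1365 pixels each, the same order as the roughly 1000 pixel group size quoted above. The sketch ignores the small gain from the shorter tags that a smaller group would need.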

There are other practical reasons to break the readout into smaller groups. The task of determining which pixels need to be read out also takes some time and this time increases for larger groups.

An example readout scheme for a group of 1024 pixels

A group of 16 chips was chosen fairly arbitrarily. This is a small enough group that the logic signal drive capabilities should be no problem. With 16 chips only 4 tag bits are needed, which together with 8 bits of pulse height gives a nice even 12-bit word. This scheme requires only a minimal amount of readout logic on the ASIC but may not be fast enough to maintain high event rates.

A system-wide trigger causes all pixels to hold their analog values in their sample-and-hold circuits. The pixels that are above threshold are converted and read out in sequence.

Each 64 pixel ASIC contains an analog multiplexer and an analog to digital converter (see 64 pixel ASIC concept).

The system cycles through each of the 64 multiplexer channels in turn. For each channel the system shifts out data until all 16 chips are empty. The chips share a single serial line. Each chip enables its data onto this line only when its held analog value is above threshold and all chips before it in the chain (that had data) have been read out. (See readout timing.)
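The order in which words appear on the shared serial line can be sketched as below. This is a behavioral model only, not the ASIC logic. It assumes the held values are already available as 8-bit numbers (in the scheme above the threshold test is made on the analog value before conversion), and the word layout, with the 4-bit chip tag in the upper bits, is a hypothetical choice. The mux channel is not part of the word because the system controller knows which channel it is currently cycling through.

```python
N_CHIPS = 16      # chips sharing one serial line
N_CHANNELS = 64   # multiplexer channels per chip


def read_out_group(held, threshold):
    """Return (channel, 12-bit word) pairs in the order they are shifted out.

    held[chip][channel] is the held value of one pixel (0..255).
    """
    words = []
    for channel in range(N_CHANNELS):          # controller steps the mux
        for chip in range(N_CHIPS):            # arbitration follows chain order
            value = held[chip][channel]
            if value > threshold:              # only hit pixels drive the line
                word = (chip << 8) | (value & 0xFF)   # 4-bit tag + 8-bit height
                words.append((channel, word))
    return words


# Example: three pixels above a threshold of 40.
held = [[0] * N_CHANNELS for _ in range(N_CHIPS)]
held[3][5] = 200
held[1][5] = 90
held[0][2] = 50
print(read_out_group(held, threshold=40))
```

In the example, channel 2 is read first, then on channel 5 chip 1 is read before chip 3 because it comes earlier in the chain.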

Actual throughput for a scheme like this depends on how fast the ASIC is. Rough guesses are that the multiplexer can switch to a new channel and settle in 500nsec. Propagation delay through the chain for the readout arbitration would be 10nsec/chip, or 160nsec for 16 chips. This gives 660nsec/chan * 64 chans = 42usec of overhead, which leaves 58usec for shifting data. At 12 bits/sample and a 10MHz shift clock (1.2usec/sample) this gives 48 samples max within the allotted 100usec.
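The timing budget can be recomputed for other guesses with a sketch like the following. The default values are the rough guesses from the text; the function and parameter names are illustrative.

```python
def chain_budget(window_ns=100_000, mux_settle_ns=500, arb_ns_per_chip=10,
                 n_chips=16, n_channels=64, word_bits=12, shift_mhz=10):
    """Return (overhead in nsec, max samples) for one readout chain."""
    per_channel_ns = mux_settle_ns + arb_ns_per_chip * n_chips   # 660 nsec
    overhead_ns = per_channel_ns * n_channels                    # all 64 channels
    ns_per_sample = word_bits * 1000 // shift_mhz                # 1200 nsec
    max_samples = (window_ns - overhead_ns) // ns_per_sample
    return overhead_ns, max_samples


print(chain_budget())
```

With the defaults this gives 42240nsec of overhead and 48 samples, matching the figures above.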

Separate chains can be read out simultaneously, so overall system throughput would be higher than this by some factor depending on how well the event pixels are distributed to different readout chains. For a 640 by 640 array there would be 400 groups of 1024 pixels. An event could be expected to track through about 20 groups (the square root of 400), giving a possible readout of 20 * 48 = 960 pixels/100usec.
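As a quick check of the system estimate, the sketch below applies the same square-root assumption to the number of chains. The names are hypothetical, and the 48 samples per chain is the figure derived above.

```python
import math


def system_throughput(array_side=640, group_pixels=1024, samples_per_chain=48):
    """Return (total groups, groups expected to see hits, pixels per window)."""
    n_groups = (array_side * array_side) // group_pixels
    active_groups = math.isqrt(n_groups)       # square-root expectation
    return n_groups, active_groups, active_groups * samples_per_chain


print(system_throughput())
```

This reproduces the 400 groups, 20 active groups and 960 pixels/100usec quoted above.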

The wire-OR event signal shown on the chip drawing isn't actually used for this example, but it could be used to group chains together without increasing the arbitration delay. This signal can be used to quickly determine if any event pixels are present in the entire chain (for the current mux chan). However, unless the number of tag bits is increased, the system controller would have to use some other means to determine which chain the data belongs to.

Other readout schemes

The cycling through 64 channels could be moved on-chip. The ASIC would have to include a more complex sequencer as well as a 64 word memory. The mux and ADC would have to be faster to complete a full 64 conversions within 100usec. Only channels that were above threshold would be stored. Six bits of mux channel ID would be stored with each sample.
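A minimal behavioral sketch of such an on-chip sequencer is shown below. It assumes 8-bit samples as in the earlier scheme, which with the 6-bit channel ID gives a 14-bit stored word. The bit layout and function names are hypothetical, and the threshold test is modeled on digitized values for simplicity.

```python
N_CHANNELS = 64
VALUE_BITS = 8    # pulse height, as in the earlier scheme


def sequence_chip(samples, threshold):
    """Scan all 64 channels and store only those above threshold."""
    if len(samples) != N_CHANNELS:
        raise ValueError("expected one sample per mux channel")
    memory = []                                 # at most 64 words
    for channel, value in enumerate(samples):
        if value > threshold:
            memory.append((channel << VALUE_BITS) | (value & 0xFF))
    return memory


def unpack(word):
    """Split a stored word back into (channel ID, pulse height)."""
    return word >> VALUE_BITS, word & 0xFF


samples = [0] * N_CHANNELS
samples[2] = 10
samples[63] = 255
stored = sequence_chip(samples, threshold=5)
print([unpack(w) for w in stored])
```

If every one of the 64 channels had to be converted, the 100usec limit would leave about 1.56usec per conversion.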

Having a buffer memory on-chip would allow the next event to be sampled while the current one is being read out. This improves the average deadtime, but the max throughput may in fact be worse because all conversions now have to complete before readout begins.

System throughput would still be limited by the shift clock rate, with improvements possible via the same sort of parallel readout of pixel groupings. Because the readout path on the ASIC would be all-digital FIFO circuitry, it might be able to clock at rates significantly higher than the 10MHz guess.