
ACE16K: The Third Generation of Mixed-Signal SIMD-CNN ACE Chips Toward VSoCs

Author: Rodríguez Vázquez, Ángel Benito; Liñán Cembrano, Gustavo; Carranza González, Luis; Roca Moreno, Elisenda; Carmona Galán, Ricardo; Jiménez Garrido, Francisco José; Domínguez Castro, Rafael; Espejo Meana, Servando Carlos
Publisher: Institute of Electrical and Electronics Engineers
Year: 2004
DOI: 10.1109/TCSI.2004.827621
Source: https://idus.us.es/bitstreams/b9de94e8-4db2-4f4a-9799-1b9b00905d5e/download
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 51, NO. 5, MAY 2004 851
ACE16k: The Third Generation of Mixed-Signal SIMD-CNN ACE Chips Toward VSoCs
Angel Rodríguez-Vázquez, Fellow, IEEE, Gustavo Liñán-Cembrano, L. Carranza, Elisenda Roca-Moreno, Ricardo Carmona-Galán, Member, IEEE, Francisco Jiménez-Garrido, Rafael Domínguez-Castro, and Servando Espejo Meana
Abstract—Today, with 0.18-μm technologies mature and stable enough for mixed-signal design with a large variety of CMOS compatible optical sensors available and with 0.09-μm technologies knocking at the door of designers, we can face the design of integrated systems, instead of just integrated circuits. In fact, significant progress has been made in the last few years toward the realization of vision systems on chips (VSoCs). Such VSoCs are eventually targeted to integrate within a semiconductor substrate the functions of optical sensing, image processing in space and time, high-level processing, and the control of actuators. The consecutive generations of ACE chips define a roadmap toward flexible VSoCs. These chips consist of arrays of mixed-signal processing elements (PEs) which operate in accordance with single instruction multiple data (SIMD) computing architectures and exhibit the functional features of CNN Universal Machines. They have been conceived to cover the early stages of the visual processing path in a fully-parallel manner, and hence more efficiently than DSP-based systems. Across the different generations, different improvements and modifications have been made looking to converge with the newest discoveries of neurobiologists regarding the behavior of natural retinas. This paper presents considerations pertaining to the design of a member of the third generation of ACE chips, namely to the so-called ACE16k chip. This chip, designed in a 0.35-μm standard CMOS technology, contains about 3.75 million transistors and exhibits peak computing figures of 330 GOPS, 3.6 GOPS/mm², and 82.5 GOPS/W. Each PE in the array contains a reconfigurable computing kernel capable of calculating linear convolutions on 3 × 3 neighborhoods in less than 1.5 μs, imagewise Boolean combinations in less than 200 ns, imagewise arithmetic operations in about 5 μs, and CNN-like temporal evolutions with a time constant of about 0.5 μs. Unfortunately, the many ideas underlying the design of this chip cannot be covered in a single paper; hence, this paper is focused on, first, placing the ACE16k in the ACE chip roadmap and, then, discussing the most significant modifications of ACE16k versus its predecessors in the family.
Index Terms—Analog programmable very large-scale integration (VLSI), early vision chips, silicon retinas.
I. INTRODUCTION
VISION involves extremely complex computational tasks [1]–[8]. So complex that, despite its huge set of applications and potential uses, no artificial vision system has been able to reach the level of efficiency of natural vision systems up to date. Indeed, performances of currently available artificial vision systems are far below those of the smallest insect, despite the usage of the most sophisticated latest generation computing devices. Is this paradox due to a lack of industrial or commercial interest? Clearly not, since the number of applications of artificial vision systems is enormous. What, then, can be the reason underlying the gap between natural and artificial vision systems?

Manuscript received July 29, 2003; revised January 8, 2004. This work was supported in part by LOCUST under Project IST2001-38097, in part by VISTA under Grant TIC2003-09817-C02-01, and in part by ONR-NICOP under Grant N000140210884. This paper was recommended by Guest Editor B. Shi.
The authors are with the Institute of Microelectronics of Seville, Centro Nacional de Microelectrónica (IMSE-CNM), Universidad de Sevilla, 41012 Seville, Spain (e-mail: [email protected]).
Digital Object Identifier 10.1109/TCSI.2004.827621

Probably, the reason is that conventional signal processing architectures are not the best suited for vision. In these architectures, there exists a clear separation between signal acquisition and signal processing, with the role of analog processing being restrained to the front-end functions, namely transduction, signal conditioning and data encoding. The problem is that images contain a huge amount of data, many of them redundant, i.e., not carrying any information. Hence, does it make any sense to consume resources in handling, i.e., converting and processing, these data? Nature gives us some guesses about that. In natural vision systems, the front-end device, the retina, does not only acquire but also pre-processes the visual information [9], [10], such that the amount of data transmitted through the optic nerve to the brain gets compressed by a factor around 150.¹
A similar compression of information occurs in any vision processing chain. As the signal climbs through consecutive levels in the processing path, its dimensionality shrinks whereas its abstraction increases. Thus, although using serial digital signal processing is advisable at the upper levels of the hierarchy, it might not be so adequate for early processing. Operating with images at the bottom level of the processing hierarchy implies intensive memory accesses and poses important constraints on the bandwidth of the communications between memory and processor. Also, having a chip to sense the visual information (imager) and another one to process it (processor) requires high-speed data conversions and transferences to achieve large frame rates. Using the conventional Imager-Memory-DSP architecture it is possible to reach 30 FPS, even for large resolution images. However, high-speed industrial applications requiring ultrafast frame rates² might turn unfeasible.
¹ The human eye contains about 150 million photoreceptors whilst the optic nerve contains about 1 million fibers.
² In the order of 1000 FPS.

ACE chips render ultrafast operation feasible by using massively parallel analog processing at the early stages, as natural retinas do. Some reasons supporting this choice are [11]–[13] as follows.
1) The accuracy required for early processing is moderate to even low. Actually, the perceptual quality of the images does not drop significantly in the presence of perturbations (noise, spatial variances, nonlinearities, etc.), even if these perturbations are as large as 5% of the full scale.³
2) The speed versus power efficiency of moderate-to-low resolution analog circuits is much larger than that of digital counterparts. This is relevant since very high speed is needed to achieve high frame rates for moderately large images.
3) The area efficiency of analog circuits for moderate-to-low resolution applications is better than that of digital counterparts.
The chip described in this paper represents the third generation of ACE chips and has been designed to overcome some limitations of its predecessors, particularly those of the so-called ACE4k chip [5]. Major improvements of ACE16k include the following.
• Incorporation of digital buses for grayscale data: ACE16k embeds per-column data converters (arranged in analog-to-digital (A/D) and digital-to-analog (D/A) re-configurable pairs) for fully digital interfacing.
• Exact control of the timing of input/output (I/O) access: To that purpose, ACE16k does not include the possibility of individual cell selection; instead it incorporates an autonomous addressing scheme. Also, it employs a hand-shaking protocol to eliminate timing constraints.
• Better internal organization of the processing cells: ACE16k incorporates the so-called ACE-BUS to allow any functional block within the cell to communicate with any other.
• Use of nonconventional logic blocks: Particularly, the four local logic memories (LLMs) of ACE4k have been replaced by local analog memories (LAMs), and the local logic unit (LLU) has been designed to operate within reduced analog-compatible voltage ranges, instead of within complete digital voltage ones. Also, dynamic, instead of static, digital memories are used to store template masks. Finally, dedicated logic inverters with peak current limitation have been used instead of conventional ones.
• Improvement of the optical interface: ACE16k incorporates a re-configurable optical input module with the following features:
  • User-defined photo-sensing device: The user can select among a P-Diffusion/N-Well photo-diode, an N-Well/P-Substrate photo-diode or a P-N-P vertical photo-transistor.
  • User-defined sensing scheme: The user is allowed to select between normal linear integration modes or logarithmic compression sensing.
• Incorporation of an address event detection scheme: to simplify the extraction of information from black and white (B/W) images. The associated circuitry provides addresses (instead of images) corresponding to array locations where activity is detected. This scheme also embeds the functionality of the global gates: no address is provided if no active cells exist.
• Improved power consumption management: ACE16k has four times more cells than ACE4k, and much larger functional capabilities. However, it switches idle blocks off and uses scaled-down logic levels to keep the power consumption moderate, at less than 180 μW per cell.

³ The exact number is obviously application dependent.
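
The address event idea in the list above can be illustrated with a short behavioral sketch in Python. It is not the chip's circuitry: the function names, the row-major scan order, and the (row, column) address format are assumptions made only for illustration.

```python
from typing import List, Tuple


def active_addresses(bw_image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (row, column) addresses of the active cells of a B/W image.

    Hypothetical model: a cell counts as active when its pixel is nonzero,
    and addresses are reported in row-major order.
    """
    return [
        (r, c)
        for r, row in enumerate(bw_image)
        for c, pixel in enumerate(row)
        if pixel
    ]


def any_active(bw_image: List[List[int]]) -> bool:
    """Global-gate style query: True only if at least one cell is active."""
    return len(active_addresses(bw_image)) > 0
```

An all-zero image yields an empty address list, which mirrors the statement that no address is provided if no active cells exist.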
ACE16k has been designed in a digital CMOS 0.35-μm 5M-1P technology and contains more than 3.75 million transistors, 85% of them working in analog mode. It can reach peak computing figures⁴ of 330 GOPS, 3.6 GOPS/mm², and 82.5 GOPS/W. It provides and accepts 8-bit digitized images through a 32-bit data bus which works at 120 Mbytes/s.
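
As a rough consistency check on the figures quoted above, the following Python sketch derives the silicon area and power implied by the three peak figures, plus an upper bound on the image transfer rate over the data bus. Several inputs are assumptions rather than statements from the paper: a square 128 × 128 array (the text only gives 128 converters, one per column), 1 Mbyte taken as 10⁶ bytes, and the idea that the three peak figures refer to the same operating point.

```python
PEAK_GOPS = 330.0          # peak computing figure quoted in the text
GOPS_PER_MM2 = 3.6         # quoted area efficiency
GOPS_PER_W = 82.5          # quoted power efficiency
BUS_BYTES_PER_S = 120e6    # 120 Mbytes/s, assuming 1 Mbyte = 10**6 bytes
POWER_PER_CELL_W = 180e-6  # quoted upper bound per cell

ROWS = 128     # assumption: square array
COLUMNS = 128  # one data converter per column

# Area and power implied by dividing the peak figure by each efficiency.
implied_area_mm2 = PEAK_GOPS / GOPS_PER_MM2
implied_power_w = PEAK_GOPS / GOPS_PER_W

# Upper bound on array power from the per-cell figure.
array_power_bound_w = ROWS * COLUMNS * POWER_PER_CELL_W

# One 8-bit pixel per cell gives the bytes per frame and a bus-limited rate.
frame_bytes = ROWS * COLUMNS
bus_limited_fps = BUS_BYTES_PER_S / frame_bytes

print(f"implied area  : {implied_area_mm2:.1f} mm^2")
print(f"implied power : {implied_power_w:.1f} W")
print(f"array bound   : {array_power_bound_w:.2f} W")
print(f"bus-limited   : {bus_limited_fps:.0f} frames/s")
```

Under these assumptions the bus alone would allow several thousand frames per second, consistent with the ultrafast frame rates targeted in the Introduction.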
II. ACE16k IN ROAD MAP OF ACE MIXED-SIGNAL
VISION CHIPS
ACE chips consist of an array of identical processing elements (PE) which execute the same instructions at the same time. Instructions are executed on data which are locally defined, i.e., at the PE level, while the sequence of instructions is controlled and timed by a digital controller which is shared by all the PEs. Typically, for implementation purposes, communications between PEs are restricted to the nearest neighbors. However, despite such an architectural limitation, ACE chips are able to implement most early-vision processing tasks [4]–[6], [13]. Adding the capability of sensing the visual information in a one-by-one pixel-to-PE correspondence makes these systems very well suited to implement the front-end stage of VSoCs. Obviously, processing images whose resolution is larger than the array size (necessarily limited due to the incorporation of programmable processing circuitry at pixel level) requires windowing and time multiplexing.
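
The windowing mentioned above can be sketched in Python as a tiling of a large image into array-sized windows that are then processed one after another. This is a minimal sketch: the function name `windows`, the default 128 × 128 array size, and the non-overlapping tiling are assumptions. A real scheme would probably overlap adjacent windows by at least one pixel so that 3 × 3 neighborhood operations remain valid at window borders.

```python
from typing import Iterator, Tuple


def windows(
    height: int, width: int, array_rows: int = 128, array_cols: int = 128
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (top, left, rows, cols) tiles that cover a height x width image.

    Tiles are visited in row-major order; tiles on the right and bottom
    edges may be smaller than the array.
    """
    for top in range(0, height, array_rows):
        for left in range(0, width, array_cols):
            rows = min(array_rows, height - top)
            cols = min(array_cols, width - left)
            yield (top, left, rows, cols)


# Time multiplexing: each tile would be loaded, processed and read out in turn.
tiles = list(windows(256, 300))
```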
Regarding ACE chip architectures, different questions arise, which relate to:
1) functions to be incorporated within the PE;
2) complexity of the control unit;
3) interfacing with other hardware and/or equipment.
The answers to these questions are largely dependent on the intended application. However, due to size, design complexity, and fabrication costs of these chips, the design of special purpose devices is only advisable if a market niche absorbing mass production is ensured. Otherwise, the architecture of the PE must be flexible enough to guarantee the execution of the largest possible amount of vision algorithms under real-life illumination conditions. Thus, taking into account that most early vision processes consist of the application of convolution masks, and the combination (either by Boolean operations in the case of B/W images, or by a local analog arithmetic operator) of their results in a bifurcated-flow algorithm, the following operators should be included at the PE level:
1) multipliers and adders; for the convolution operation;
2) analog registers; to allow for the storage of previous results at the local level;
3) arithmetic operator and/or binary operator; to combine previously obtained results;
4) local masks; to allow for the conditional execution of certain operations at PE level depending on some locally defined value;
5) wide dynamic range optical input; to permit the light-sensing capability, and, hence, to avoid the bottleneck existing in data transmission from the sensory to the processing plane in conventional nonmassively parallel solutions.

⁴ These data correspond to experimental results.

Fig. 1. Conceptual architecture of ACE16k.

To cope with the objective of covering the largest possible set of applications, all functions above must be programmable, including reliable setting of analog parameters, reconfiguration of topologies and control of internal data-flows. Regarding the control unit, its roles are:
1) controlling the sequence of operations to be executed on the array;
2) storing the machine code of the algorithms to be implemented;
3) storing the data which define the internal analog parameters of the array;
4) interfacing the external world using standard protocols;
5) performing high-level signal processing tasks.
Based on, first, the convenience of making the interfacing completely standard, and, second, the necessity to guarantee robustness in the control of the analog parameters, the control unit should be fully digital, with the obvious exception of the blocks which interchange information (both data and commands) with the array.
ACE chips have been designed with these guidelines in mind. Specifically, this is the case of ACE16k [6] whose conceptual architecture is depicted in Fig. 1. As already mentioned, ACE16k represents the third generation of ACE chips. Fig. 2 depicts the evolution of these chips, where a bifurcation appears at the time when ACE16k was released. Such bifurcation is related to the different nature of the behaviors addressed by instances belonging to each of the branches. On the one hand, ACEXX chips are basically conceived to perform spatial image processing on temporal image flows. On the other hand, CACEXX chips are designed to emulate the spatial-temporal dynamic evolutions observed in mammalian retinas [14].
Table I summarizes some main features of the three different generations of ACEXX chips. It highlights a continuous improvement across time. ACE400, the first member of this family, was designed in 1996 using a standard 0.8-μm technology [4]. It was conceived to operate only on B/W image flows, and included reduced programming capability. Special attention was paid to the optical interface in order to achieve high speed capturing through the incorporation of Darlington-based photocurrent amplification.
Four years later, in 2000, the ACE4k chip was released [5]. Together with an increase by a factor of ten in spatial resolution, this chip incorporated much larger programming capabilities. Despite the increased complexity and its capability to handle grayscale images, this chip featured significantly larger PE density and lower power consumption while basically keeping the time constant unaltered. These ameliorations were basically the consequence of major architectural and circuital improvements, and only marginally due to the scaling down of the fabrication technology from 0.8 μm to 0.5 μm.
By the end of 2002, the first version of the ACE16k chip was made available from the foundry [6]. Improvements of ACE16k versus ACE4k have already been mentioned in the Introduction and are summarized in Table II. Details about the architectural and circuital tricks employed to achieve such significant enhancements can be found in [13]. In Section III, we basically discuss the modifications affecting the PE itself. Below, we give some hints regarding the programming memory and the I/O interface, whose circuit level details are presented in [6].
Regarding the programming memory of ACE16k, it is similar to that of ACE4k. However, three main differences exist.
• The instruction memory has been arranged into two blocks with 64 words of 32 bits each. This division aims to separate addresses from definition of operations, something like defining operations and operators separately. Thus, and thanks to the use of separate control buses, ACE16k has a programming memory of 64 × 64 words of 32 bits, instead of simply 64 words of 48 bits as with ACE4k. Such an increase in the memory gives the user the possibility of programming and testing more complex algorithms.
• ACE16k uses a memory control circuitry which includes a voltage-controlled oscillator to generate all the timing signals required for memory management.
• Finally, ACE16k uses hand-shaking protocols, instead of strobing signals, to control the access to the programming memory. This greatly simplifies control.
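
One possible reading of the memory figures in the first item above is sketched below in Python. The interpretation is an assumption, not something the text states explicitly: the two 64-word blocks are taken to be addressed independently, so that any word of one block can be paired with any word of the other, which gives 64 × 64 selectable pairs from only 2 × 64 stored words. The function name `instruction_pair` is hypothetical.

```python
from typing import List, Tuple

BLOCKS = 2
WORDS_PER_BLOCK = 64
BITS_PER_WORD = 32

# Physical storage versus the number of selectable word pairs (assumed model).
ace16k_storage_bits = BLOCKS * WORDS_PER_BLOCK * BITS_PER_WORD
ace16k_word_pairs = WORDS_PER_BLOCK ** BLOCKS

# ACE4k figures quoted in the text: 64 words of 48 bits.
ace4k_storage_bits = 64 * 48
ace4k_instructions = 64


def instruction_pair(
    block_a: List[int], block_b: List[int], i: int, j: int
) -> Tuple[int, int]:
    """Pair word i of one block with word j of the other (assumed addressing)."""
    if len(block_a) != WORDS_PER_BLOCK or len(block_b) != WORDS_PER_BLOCK:
        raise ValueError("each block must hold exactly 64 words")
    return block_a[i], block_b[j]
```

Under this reading the storage grows from 3072 to 4096 bits, while the number of distinct word pairs that can be issued grows from 64 to 4096.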
A related major modification of ACE16k consists of the incorporation of self-calibration stages for the analog buffers which drive weights and analog references to the cell array. Although ACE16k uses the same distributed buffer strategy as ACE4k, the topology of the buffer includes extra circuitry for calibration purposes. Fig. 3 shows a simplified block diagram of the weight generation circuitry in ACE16k, including the RAM block in which coefficients are digitally stored, the 8-bit D/A converter (DAC), the two-level buffer structure, and the calibration circuitry.

Fig. 2. Historical roadmap of ACE chips.

TABLE I
OVERVIEW OF ACEXX CHIP FEATURES

Regarding I/O, ACE16k incorporates a fully digital port for image transferences. Fig. 4 shows a simplified block diagram of the I/O block in ACE16k. It includes a bank of 128 8-bit A/D converters (ADCs) and DACs. Since the data bus is 32 bits wide, each word transmitted to/from the chip contains information about four adjacent cells (same row, consecutive columns). Then, and by just looking at the way of writing/reading images, the array can be divided into 32 identical blocks of four adjacent columns.
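
The packing of four adjacent 8-bit pixels into one 32-bit bus word can be sketched as follows. The byte order within the word (lowest column in the least significant byte) is an assumption, since the text does not specify it, and the function names are hypothetical.

```python
from typing import List


def pack_row(pixels: List[int]) -> List[int]:
    """Pack a row of 8-bit pixels into 32-bit words, four pixels per word."""
    if len(pixels) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    words = []
    for start in range(0, len(pixels), 4):
        word = 0
        for k, pixel in enumerate(pixels[start:start + 4]):
            if not 0 <= pixel <= 255:
                raise ValueError("pixels must fit in 8 bits")
            word |= pixel << (8 * k)  # assumed order: column k in byte k
        words.append(word)
    return words


def unpack_row(words: List[int]) -> List[int]:
    """Inverse of pack_row: recover the 8-bit pixels from 32-bit words."""
    return [(word >> (8 * k)) & 0xFF for word in words for k in range(4)]


row = list(range(128))  # one 128-pixel row, matching the 128 converters
row_words = pack_row(row)
```

A 128-pixel row therefore needs 32 bus words, one per block of four adjacent columns.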
Data transference uses a two-stage pipeline architecture. In the input mode, data are sent to an input register of 8 × 128 bits (see Fig. 4). Once filled, this register is transmitted in parallel to an internal 8 × 128 register whose outputs (in blocks of 8) are permanently connected to a bank of 128 DACs which operate in parallel. At the same time, the external register is again being filled with the information about the next row to be written, thus avoiding idle periods. At the end of the conversion, the first module of a double bank of 2 × 128 sample-and-hold (S&H) circuits acquires the converted data and sends it to the selected row of cells. While the first module of the bank of S&H sends the analog value to the array, the second module in this bank is capturing the next row of data which is being converted by the DACs.

TABLE II
COMPARING ACE16k VERSUS ACE4k

Fig. 3. Distributed buffers in ACE16k.

During an output process, the first row is acquired by the first module of the S&H bank. In the next step, these data are held and converted while the second module of the S&H bank captures the second row. At the end of this step, the digital information (the result of the conversion of the first row) is sent to the external register where it is ready to be downloaded during the third step. In the third step, the content of the first row is read at the output of the external register, the content of the second row is being converted and the third row is being captured by the first module of the S&H bank.
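
The overlap of capture, conversion and readout during an output process can be tabulated with a small Python sketch. It is a behavioral schedule only; the zero-based row and module indices, the dictionary keys, and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def output_schedule(n_rows: int) -> List[Dict[str, Optional[int]]]:
    """List, for each pipeline step, which row is in which stage.

    Rows and S&H modules are zero-based: row 0 is the first row and
    module 0 is the first module of the double S&H bank.
    """
    steps = []
    for step in range(n_rows + 2):  # two extra steps drain the pipeline
        capture = step if step < n_rows else None
        convert = step - 1 if 0 <= step - 1 < n_rows else None
        read = step - 2 if 0 <= step - 2 < n_rows else None
        # The two S&H modules alternate from one captured row to the next.
        sh_module = None if capture is None else capture % 2
        steps.append(
            {"capture": capture, "sh_module": sh_module,
             "convert": convert, "read": read}
        )
    return steps


for entry in output_schedule(4):
    print(entry)
```

In the third step of this schedule the first row is being read, the second converted, and the third captured by the first S&H module, as described in the text.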

Fig. 4. I/O block diagram.
III. ACE16K VERSUS ACE4K: MODIFICATIONS IN PE
A. ACE-BUS
The PE of ACE4k is designed to allow direct communication only between closely related blocks. However, ACE16k uses a new communication scheme, the ACE-BUS. The ACE-BUS is basically a node of the processing unit (PU), where every functional block connects its input and output ports. Communications between blocks always happen in the same way: first, one functional block (the data source) is configured in output mode, while a second one (the destination) is configured in input mode. This greatly simplifies the definition of operations and data movements, and allows for rapid checking of conflictive switch configurations.
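
The kind of conflict check that a single shared node makes possible can be sketched in Python. This is a hedged illustration, not the chip's control logic: the mode strings, the block names used in the example, and the choice to allow several destinations at once are all assumptions.

```python
from typing import Dict, List, Tuple


def resolve_transfer(config: Dict[str, str]) -> Tuple[str, List[str]]:
    """Validate a bus configuration and return (source, destinations).

    `config` maps a block name to "output", "input" or "idle". A valid
    transfer has exactly one block driving the bus and at least one
    block reading from it.
    """
    sources = sorted(b for b, mode in config.items() if mode == "output")
    destinations = sorted(b for b, mode in config.items() if mode == "input")
    if len(sources) > 1:
        raise ValueError(f"conflict: several blocks drive the bus: {sources}")
    if not sources:
        raise ValueError("no block is configured in output mode")
    if not destinations:
        raise ValueError("no block is configured in input mode")
    return sources[0], destinations


# Hypothetical block names, used only for this example.
transfer = resolve_transfer({"LAM1": "output", "LLU": "input", "SENSOR": "idle"})
```

Because every block attaches to the same node, checking a configuration reduces to counting drivers, which is one way to read the remark about rapid checking of conflictive switch configurations.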
Fig. 5 shows the block diagram of the PU in ACE16k. Synapses take their inputs from the ACE-BUS. They can be initialized by using either the LAM module contents, the result of LLU operations, the result of an optical acquisition, or the result of a passive diffusion realized by using one resistive grid embedded in the chip. The analog processing core steers the processed input current (the input current after eliminating all the offset contributions) to the ACE-BUS; this current can then be routed either to the state capacitor or to any of the LAM modules.
B. Image-Processing Kernel
The synaptic analog multipliers are designed by using the same one-transistor technique as in ACE4k [13]. They are driven by voltages at both the signal and the scaling input and deliver a current at the output. The bank of multipliers, depicted at the conceptual level in Fig. 6, is driven by three different pixel values, , and so that the current which flows into the PE is
(1)
where the operator denotes the convolution product of the template and the pixel value matrix, and is the offset term generated by the one-transistor multipliers. This offset term is eliminated by using a high-accuracy current memory [13], [15]. Fig. 7 shows a conceptual schematic of the PE input block including the S3I current memory used for offset cancellation, based on [15]. The resulting current
(2)
is either steered to the ACE-BUS, or to the input of a capacitive-input current comparator [16] whose output is connected to the ACE-BUS through an analog switch. Then, two situations may occur.
• A voltage codifying the sign of (i.e., the sign of the outcome of the convolution operation) is delivered to the ACE-BUS
(3)
In this case, the output is a B/W pixel value.
• The analog current is routed to one of the capacitors associated to the pixels and the output is a grayscale pixel value.
In any case, the specific pixel capacitor to which the output of the input block is routed is selected by the user through the activation of some bits in the digital instruction. By so doing, the evolution of the PU is described by a state equation whose actual expression depends on the selected integrating capacitor. Therefore, different kinds of processing kernels are available.
• Consider, for instance, that you want to execute a Sobel operator [8]. The convolution matrix is then defined in ; the image is loaded into ; the following values are set: , and ; and the signal current is routed to . Hence, the equivalent state equation obtained for each PU is
(4)
whose steady state is , as corresponds to the desired convolution output.
• Consider now that the capacitor which receives the input current is . Then, the cells are dynamically coupled and CNN spatio-temporal operations are realized.
• Consider finally that the current is routed to ; that all but the central entries of matrix are null; and that this central entry is . The steady-state solution is then
(5)
which corresponds to the realization of grayscale imagewise arithmetic operations.
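
The two kinds of output described above, a grayscale convolution result and a B/W value carrying its sign, can be illustrated numerically with a small pure-Python sketch. It models only the arithmetic, not the analog dynamics or the state equations. The horizontal Sobel template, the zero-valued boundary, the correlation-style indexing, and the function names are assumptions made for this example.

```python
from typing import List

Image = List[List[float]]

# Horizontal-gradient Sobel template (one common convention).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def conv3x3(image: Image, template: List[List[float]]) -> Image:
    """Weighted sum over each 3 x 3 neighborhood (zero outside the image).

    Indexing is correlation style: template[1][1] weights the center pixel.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += template[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc
    return out


def sign_image(gray: Image) -> List[List[int]]:
    """B/W output: 1 where the convolution result is positive, else 0."""
    return [[1 if value > 0 else 0 for value in row] for row in gray]


# A 4 x 4 test image with a vertical edge between columns 1 and 2.
edge = [[0, 0, 1, 1] for _ in range(4)]
gray_out = conv3x3(edge, SOBEL_X)
bw_out = sign_image(gray_out)
```

On the chip every PE evaluates its own neighborhood at the same time, whereas this sketch visits the pixels sequentially.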
Although ACE16k also uses one-transistor synapses and, due to the similarities between the electrical parameters of 0.5-μm and 0.35-μm technologies, the same voltage ranges as in ACE4k, the aspect of synapse transistors has been reduced from 2/20 to 1/20. This keeps the voltage drop across the metal line which drives the weight of the cell. After some calculations, the following expression can be obtained:
(6)
where denotes the aspect of synapse transistors and denotes the width of the metal layers driving the weights. Since, and , (6) becomes
(7)
Since is almost invariant from technology to technology (in the ideal case, both scale as the technology scaling factor does), the aspect ratio of the synapse transistor in ACE16k must be reduced by a factor of four in order to keep the same voltage drop as in ACE4k. However, because the number of multipliers is two times smaller in ACE16k, the aspect ratio is reduced only by a factor of two. The reason for reducing the width is that it does not practically affect the time constant. The counterpart is a degradation of matching which is attenuated by hardware.

Fig. 5. Functional block diagram of the ACE16k PE.
Fig. 6. Bank of multipliers in ACE16k.

C. Increasing the Cell Density
Lastly, the PE size is determined by the lines which carry the weights and control signals: their number, their width and the minimum separation between them. Obviously, having five metal layers (0.35-μm technology) instead of three (0.5-μm technology) gives some room for decreasing the cell size. However, the following hold.
• The top metal layer, metal 5, should be used only for power supply and ground. On the one hand, this layer has the maximum separation between adjacent lines. On the other hand, it has the greatest conductivity and hence the maximum current driving capability.
• ACE16k has a much larger number of PE-embedded functions than ACE4k (50 versus 35). Obviously this increases the number of control lines.
To meet the target of having cell densities larger than 150 cells/mm², ACE16k employs an interaction pattern among cells different from that of ACE4k. As Fig. 6 shows, each PE contains 12 analog multipliers. Eight of them connect the cell to its neighbors; the other four provide additional inputs to the processing block. The multipliers marked with a star in Fig. 6 are double; they consist of the parallel aggregation of two multipliers. The purpose of this "double strength" is to increase the robustness in certain operations. From [17], it can be seen that in most cases, the central element of the template matrices is larger than the noncentral elements. At the electrical level it means that the corresponding multipliers have to be driven by quite different voltages, thus increasing mismatch-induced errors. By increasing the strength of central multipliers, the difference between weight voltages is reduced and, consequently, the overall robustness increases.
D. Digital Modules
The PE of ACE4k embeds conventional digital circuitry. This is not convenient because of the following.
• Level adapters are needed to transform the logic voltage levels, corresponding to full-scale swings, into levels compatible with the electrical operation of the PE analog circuitry.
• Protective measures must be taken to attenuate the impact of the large-power switching noise on the analog circuitry [18]. Last, this means greater area and penalizes cell density.

Fig. 7. Simplified schematic of the PE input block.
Fig. 8. LLU in ACE16k.

In the case of ACE16k, different measures have been taken to overcome these drawbacks, as follows.
• The four LLMs of ACE4k have been replaced by LAMs. On the one hand, this eliminates the digital switching noise introduced by the LLMs. On the other hand, the impact on the silicon area is not very large because the readout amplifier is shared with the other LAMs. Finally, voltage level adapters between the LLMs and multipliers are no longer needed. In addition to that, having eight instead of four LAM modules increases significantly the algorithmic capabilities of the chip.
• The LLU has been conceived to operate as an independent module which gets its inputs from the ACE-BUS and which also drives its output to the same ACE-BUS. This means a significant difference as compared to ACE4k. There, the LLU was intrinsically related to the LLM since its inputs were always taken from two fixed LLMs. In addition, although the LLU works as an intrinsically logic device, its inputs and outputs are provided via the ACE-BUS and have hence analog voltage levels.
Fig. 8 shows the LLU in ACE16k. Its two inputs, OP0 and OP1, are acquired from the ACE-BUS by using instruction bits WOP0 and WOP1, while the result of the LLU operation is written to the ACE-BUS when the bit RLLU is activated. Logic inverters in the LLU (as well as any other inverter in the cell) are not conventional CMOS inverters but current-peak limited inverters. They have been designed using an NMOS transistor connected to a PMOS resistive load as depicted in Fig. 9. The resistive load is biased by a common biasing circuitry, shared by all the inverters in the cell. It establishes the quiescent point of the inverter around the middle of the voltage range of pixels.

Fig. 9. Inverters and biasing circuitry in the cell in ACE16k.
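
The LLU access sequence can be mimicked with a small behavioral model in Python. The signal names OP0, OP1, WOP0, WOP1 and RLLU come from the text; everything else is an assumption made for illustration, including the normalized 0 V to 1 V pixel range, the mid-range decision threshold, and the use of an arbitrary two-input Boolean function as the LLU operation.

```python
from typing import Callable, Optional


class LLUModel:
    """Behavioral sketch of a logic unit fed with analog bus voltages."""

    def __init__(self, v_min: float = 0.0, v_max: float = 1.0) -> None:
        self.v_min = v_min
        self.v_max = v_max
        # Assumed decision level: middle of the pixel voltage range.
        self.threshold = (v_min + v_max) / 2.0
        self.op0: Optional[bool] = None
        self.op1: Optional[bool] = None

    def write(self, bus_voltage: float, wop0: bool = False,
              wop1: bool = False) -> None:
        """Latch the bus voltage into OP0 and/or OP1 as a logic value."""
        if wop0:
            self.op0 = bus_voltage > self.threshold
        if wop1:
            self.op1 = bus_voltage > self.threshold

    def read(self, operation: Callable[[bool, bool], bool],
             rllu: bool = True) -> Optional[float]:
        """Return the voltage driven onto the bus, or None if RLLU is off."""
        if not rllu:
            return None
        if self.op0 is None or self.op1 is None:
            raise RuntimeError("both operands must be written first")
        return self.v_max if operation(self.op0, self.op1) else self.v_min


llu = LLUModel()
llu.write(0.9, wop0=True)   # OP0 latched as logic 1
llu.write(0.2, wop1=True)   # OP1 latched as logic 0
and_out = llu.read(lambda a, b: a and b)
or_out = llu.read(lambda a, b: a or b)
```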
E. Multimode Optical Sensor
Light sensing in ACE4k is realized by a parasitic diffusion-to-substrate diode of the LAM access switches. Thus, sensitivity is rather low, and cross-talk among the LAM modules arises. ACE16k incorporates a multimode optical sensor which has been conceived to be flexible enough to operate under very different illumination conditions. Fig. 10 shows its conceptual schematic, including three main blocks.
• The first one, a tri-state readout buffer, controls the communications between the sensor and other blocks in the PU. Sensor accesses are controlled by the global programming signal ROPT.
• The second one is devoted to transforming the photo-generated charges into a voltage. The user has the possibility of selecting the photo-transduction mechanism by means of signals LOG1, LOG2, and PCH.
• The third block includes the optical sensor itself and two configuration switches used to select one out of the three available photo-sensors. The selection of the sensor is carried out by signals DW and WS.
The optical sensor can be configured to operate in three different linear integration modes [Fig. 11(a)–(c)] and three different logarithmic compression modes [Fig. 11(d)–(f)]. In the integration modes, the sensing procedure is always carried out in the same way. First of all, are turned off by making . Afterwards, switch precharges the internal node to a user definable voltage VPCH. Finally, switch is turned off and the photo-generated current charges or discharges (depending on the selected photosensor) the pixel capacitor . Further details about the ACE16k sensor operation can be found elsewhere [13], [19].

Fig. 10. Multimode photosensor in ACE16k. (a) Equivalent schematic. (b) Cross section.
Fig. 11. Available configurations of the optical sensor of ACE16k.
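
The linear integration mode can be approximated by an ideal capacitor model: after the precharge to VPCH, a constant photocurrent changes the node voltage linearly with the integration time. The Python sketch below is a first-order idealization with hypothetical component values; it ignores leakage, saturation at the supply rails, and any nonlinearity of the real sensor.

```python
def integrate_pixel(
    v_pch: float,
    photocurrent: float,
    capacitance: float,
    t_int: float,
    discharging: bool = True,
) -> float:
    """Ideal pixel voltage after integrating a constant photocurrent.

    v_pch        precharge voltage in volts (VPCH in the text)
    photocurrent photo-generated current in amperes (assumed constant)
    capacitance  pixel capacitance in farads (hypothetical value)
    t_int        integration time in seconds
    discharging  True if the selected photosensor discharges the node
    """
    delta_v = photocurrent * t_int / capacitance
    return v_pch - delta_v if discharging else v_pch + delta_v


# Hypothetical numbers: 1 pA on 100 fF for 50 ms, starting from 2.0 V.
v_end = integrate_pixel(2.0, 1e-12, 100e-15, 0.05)
```

With these example values the node moves by 0.5 V, so brighter pixels (larger photocurrent) or longer integration times produce a proportionally larger swing.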
Fig. 12 shows the global block diagram of the ACE16k-PU. There, the different building blocks can be identified. Control and configuration signals from the programming memory are in bold. A detailed description can be found in [13].
F. Cell Layout and Metal Distribution
The layout of the PU in ACE16k differs from that in ACE4k in various points.
• Metal 1 and metal 2 are used for internal routing, instead of just metal 1. This helps to increase cell density.
• As already mentioned, the last metal layer, metal 5, is employed for power and ground distribution. Therefore, power and ground lines can be as wide as almost half the cell height. This increases the quality of these signals: better uniformity across the array, less noise, lower probability of error during the fabrication, etc.