Fast parallel prefix logic circuits for n2n round-robin arbitration

Name: Fast parallel prefix logic circuits for n2n round-robin arbitration
Author: Uğurdağ, Hasan Fatih; Baskirt, O.

Publication Date	2012-08
Publisher	Elsevier
Subject	Circuits for networking, Computer arithmetic, Logic synthesis, Priority encoder, Timing optimization
Type	Periodical
Language	English
Digital	Yes
Manuscript	No
Library	Özyeğin Üniversitesi
Inventory Number	0026-2692
Record Number	f4495428-0593-4c8d-a719-7631dc7b9a78
Location	Electrical & Electronics Engineering
Date	2012-08
Notes	Due to copyright restrictions, access to the full text of this article is available only via subscription.
Abstract	An n2n round-robin arbiter (RRA) searches its n inputs for a 1, starting from the highest-priority input. It picks the first 1 and outputs its index in one-hot encoding. An RRA aims to be fair to its inputs and maintains fairness by rotating the input priorities, i.e., the last arbitrated input becomes the lowest-priority input. Arbiters are used to multiplex the usage of shared resources among requestors, as well as in dispatch logic where the purpose is load balancing among multiple resources. Today, arbiters have hundreds of ports and usually need to run at very high clock speeds. This article presents a new gate-level RRA circuit called Thermo Coded-Parallel Prefix Arbiter (TC-PPA) that scales to any number of requestors. It uses parallel prefix network topologies (borrowed from fast carry-lookahead adders) to generate a thermometer-coded pointer, thus greatly reducing the critical path. Code generators were written not only for TC-PPA but also for the 5 highly competitive circuits in the literature (9 including their variants), and a rich set of timing/area results was obtained using a standard-cell-based logic synthesis flow with a novel iterative strategy based on binary search. Synthesis runs include results with and without wire-load. Results show that for 54 or more ports (except 256), TC-PPA offers the best timing (lowest latency) as well as competitive area. Contributions also include transaction-level simulations showing that when pipelining is used to boost clock rate, latency and input FIFO sizes are adversely affected; hence pipelining cannot be indiscriminately exploited to trim the clock period.
DOI	10.1016/j.mejo.2012.04.005
Volume	43
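
The round-robin behaviour described in the abstract (find the first 1 starting from the highest-priority input, grant it in one-hot form, then make the granted input the lowest priority) can be illustrated with a small behavioural model. This is only a sketch in Python and not the gate-level TC-PPA circuit from the article: the function names `first_one` and `rr_arbitrate` are hypothetical, and the thermometer convention used here (bit i of the pointer is 1 when input i lies at or after the highest-priority position) is an assumption chosen for illustration. The sequential loop stands in for what a parallel prefix network would compute in hardware.

```python
def first_one(bits):
    """Return a one-hot list marking the lowest-index 1 in bits (all zeros if none)."""
    grant = [0] * len(bits)
    for i, b in enumerate(bits):
        if b:
            grant[i] = 1
            break
    return grant


def rr_arbitrate(requests, thermo):
    """Perform one arbitration step of a behavioural round-robin arbiter.

    requests: list of 0/1 request bits, one per input.
    thermo:   assumed thermometer-coded pointer; thermo[i] == 1 means input i
              is at or after the current highest-priority position.
    Returns (grant, next_thermo), where grant is one-hot (or all zeros).
    """
    # Requests at or after the pointer take precedence over the wrapped-around ones.
    masked = [r & t for r, t in zip(requests, thermo)]
    grant = first_one(masked) if any(masked) else first_one(requests)

    if not any(grant):
        # No request this cycle: keep the priorities unchanged.
        return grant, thermo

    # The granted input becomes lowest priority: only inputs after it stay masked in.
    g = grant.index(1)
    next_thermo = [1 if i > g else 0 for i in range(len(requests))]
    return grant, next_thermo
```

With four inputs, requests held at `[1, 0, 1, 1]`, and an initial pointer of `[1, 1, 1, 1]`, repeated calls grant inputs 0, 2, 3 and then 0 again. When the last input is granted the pointer becomes all zeros, so the next step falls back to the unmasked search and priority wraps around to input 0.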
