

| e-ISSN: 2278 – 8875, p-ISSN: 2320 – 3765| <u>www.ijareeie.com</u> | Impact Factor: 7.122|

# ||Volume 9, Issue 5, May 2020||

# Area Efficient Virtual Channel Router Architecture for Scalable Network on Chips

B. Harika<sup>1</sup>, Dr. M. Sunil Prakash<sup>2</sup>

M.Tech Student, Department of ECE, MVGR College of Engineering, Vizianagaram, Andhra Pradesh, India<sup>1</sup>

Head of the Department, Department of ECE, MVGR College of Engineering, Vizianagaram, Andhra Pradesh, India<sup>2</sup>

**ABSTRACT:** Networks-on-Chip (NoCs) have emerged as the standard communication fabric for connecting cores and memory modules on a chip. Current multi-core chips consist of hundreds of cores, and future projections call for thousands. However, NoCs today consume a large portion (approximately 10%-36%) of the entire chip's power budget, and the problem will be further exacerbated by the continued scaling of transistor feature size. This calls for innovative static power reduction techniques in future NoC designs. Among the NoC components (crossbars, arbiters, buffers, and links), the buffers were the largest leakage power consumers in the experiments, dissipating approximately 64% of the whole power budget. The buffers are therefore natural candidates for leakage power optimization, since even at high loads about 85% of the buffers remain idle. The buffers' dynamic power consumption is also high, and it increases rapidly as packet throughput increases. In this project, we propose a novel Shared Buffer Virtual Channel Router with Easy Pass Switch architecture for NoCs, which reduces static power using power gating and bypass routing, and reduces dynamic power using shared buffers, while increasing overall performance.

KEYWORDS: Network-on-Chips, Power Gating, Energy Efficient

# I. INTRODUCTION

Network-on-chip (NoC) architectures are essential modules for both general-purpose chip multiprocessors (CMPs) and application-specific systems-on-chip (SoCs). As the number of processors on a single chip and the computing complexity increase, the interconnection and communication method between the processors has become an essential factor in chip-multiprocessor performance. Improving performance requires more effective interconnection and communication among the processors, rather than depending only on processing speed. The system must meet the communication demands and characteristics of all kinds of processors, and should provide good data transmission performance under constraints such as chip area, power consumption, and data bandwidth. On-chip communication therefore faces higher demands: high speed, high bandwidth, and high throughput, together with small area and low power consumption. Traditional chip-multiprocessor interconnection architectures, such as the on-chip bus and the crossbar, cannot satisfy these requirements because of limitations in reusability, flexibility, and scalability. A more effective interconnection technology is needed.

Similarly, as the number of cores in the processor increases, the interconnection network between the cores plays a vital role in the overall performance of chip multiprocessors (CMPs). Shared buses have usually been used as the communication medium for CMPs with a small number of cores. As transistor scaling raises clock frequencies while chip size stays the same, the number of clock cycles needed to cross the chip from one end to the other increases. Bus speed therefore falls behind the core clock frequency, because core delay depends on local wires, which scale with the transistors. As simple shared buses are unsuitable for future CMPs with hundreds to thousands of cores, Networks-on-Chip (NoCs) have emerged as a solution for designing the interconnection fabric in CMPs.




Figure 1: Typical NoC structure in a mesh topology

# II. NoC ROUTER ARCHITECTURE

A typical NoC architecture consists of multiple segments of wires and routers, as shown in Figure 1. In a tiled, city-block style of NoC layout, the wires and routers are arranged much like the street grid of a city, while the clients (e.g., logic or processor cores) are placed on the city blocks separated by the wires. A network interface (NI) module transforms data packets generated by the client logic (processor cores) into fixed-length flow-control digits (flits). The flits of a data packet consist of a header (or head) flit, a tail flit, and a number of body flits in between. This sequence of flits is routed toward the intended destination hop by hop, from one router to its neighboring router.
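The packet-to-flit conversion performed by the NI can be sketched as follows. This is only an illustrative model, not the NI implemented in this work: the `Flit` type, the `packetize` function, the 4-byte flit payload, and the zero padding are all assumptions, and the routing information that a real head flit carries is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flit:
    kind: str       # "head", "body", or "tail"
    payload: bytes  # fixed-length payload, zero-padded if needed


def packetize(data: bytes, flit_bytes: int = 4) -> List[Flit]:
    """Split a packet into one head flit, optional body flits, and one tail flit."""
    if flit_bytes <= 0:
        raise ValueError("flit_bytes must be positive")
    # Cut the packet into fixed-size chunks and pad the last chunk with zeros.
    chunks = [data[i:i + flit_bytes].ljust(flit_bytes, b"\x00")
              for i in range(0, len(data), flit_bytes)]
    # Assumed convention: every packet has at least a head and a tail flit.
    while len(chunks) < 2:
        chunks.append(bytes(flit_bytes))
    kinds = ["head"] + ["body"] * (len(chunks) - 2) + ["tail"]
    return [Flit(kind, chunk) for kind, chunk in zip(kinds, chunks)]


if __name__ == "__main__":
    for flit in packetize(b"0123456789"):
        print(flit.kind, flit.payload)
```

With a 10-byte packet and 4-byte flits, the sketch yields one head, one body, and one tail flit, the last one padded to the fixed length.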



Figure 2: Typical NoC router architecture

In general, each router has five input ports and five output ports, corresponding to the north, east, south, and west directions as well as the local processing element (PE). Each port connects to a port on the neighboring router via a set of physical interconnect wires (channels). The router's function is to route flits entering from each input port to an appropriate output port and on toward their final destinations. To realize this function, a router is equipped with an input buffer for each input port, a $5 \times 5$ crossbar switch to redirect traffic to the desired output port, and the control logic necessary to ensure correct routing.
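As a rough behavioral illustration of the $5 \times 5$ crossbar described above, the sketch below connects input ports to output ports according to a selection map that would be produced by the control logic. The function name `crossbar`, the port labels, and the rule that one input drives at most one output per cycle (unicast) are assumptions made for this example.

```python
from typing import Dict, Optional

PORTS = ("N", "E", "S", "W", "L")  # north, east, south, west, local PE


def crossbar(inputs: Dict[str, Optional[int]],
             select: Dict[str, str]) -> Dict[str, Optional[int]]:
    """Forward flits from input ports to output ports.

    inputs: input port -> flit value (or None if the port is idle)
    select: output port -> input port chosen by the control logic
    """
    sources = list(select.values())
    # Assumed unicast rule: an input may drive only one output per cycle.
    if len(sources) != len(set(sources)):
        raise ValueError("an input port may drive only one output per cycle")
    outputs: Dict[str, Optional[int]] = {port: None for port in PORTS}
    for out_port, in_port in select.items():
        if out_port not in outputs or in_port not in PORTS:
            raise ValueError("unknown port name")
        outputs[out_port] = inputs.get(in_port)
    return outputs


if __name__ == "__main__":
    print(crossbar({"N": 0xA1, "W": 0x78}, {"S": "N", "L": "W"}))
```

Because each output is keyed once in `select`, two inputs can never be connected to the same output in this model; resolving such conflicts is the job of the arbitration logic that builds the map.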

Networks-on-Chip (NoC) architectures are becoming the effective fabric for both general-purpose chip multiprocessors and application-specific systems-on-chip designs. In the design of NoCs, high throughput and low latency are both important design parameters, and the router microarchitecture plays a vital role in achieving these performance goals. High-throughput routers allow a NoC to satisfy the communication needs of multicore applications; alternatively, the higher achievable throughput can be traded for power savings by using fewer resources to attain a target bandwidth. Ultimately, a router's role lies in the efficient multiplexing of packets over the network links.

# III. PROPOSED ARCHITECTURE

We propose a Shared Buffer Virtual Channel (VC) Router with an Easy Pass switch as a solution to the problems stated above. The Easy Pass (EZ-Pass) router remedies the large wake-up latency overheads while providing significant static power savings. It consists of a conventional router, used in high-traffic mode, and an EZ-Pass switch that handles sporadic and low traffic. This allows incoming flits to be routed without fully waking up the powered-off router. The EZ-Pass switch provides a bypass route and consists of single-flit latches, multiplexers (MUXs), and demultiplexers (DEMUXs). For example, when the router is powered off, an incoming flit is buffered in the single-flit latch. The EZ-Pass control logic then uses an arbitration scheme to route the flit to the NI instead of the conventional router. The NI processes the incoming flit and switches it to the designated output port. The NI also records the VC information for later use by the flow control policy.
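The bypass path can be pictured with a small behavioral model. This is a sketch under stated assumptions rather than the implemented hardware: the class `EZPassSwitch`, the `(output port, payload)` flit layout, and the round-robin arbitration are hypothetical, since the text does not specify which arbitration scheme the EZ-Pass control logic uses.

```python
from typing import Dict, Optional, Tuple

PORTS = ("N", "S", "E", "W", "L")
Flit = Tuple[str, int]  # (designated output port, payload); hypothetical layout


class EZPassSwitch:
    """Bypass path used while the conventional router is powered off."""

    def __init__(self) -> None:
        # One single-flit latch per input port.
        self.latch: Dict[str, Optional[Flit]] = {p: None for p in PORTS}
        self.next_idx = 0  # round-robin pointer (assumed arbitration scheme)

    def accept(self, in_port: str, flit: Flit) -> bool:
        """Latch an incoming flit; return False if the latch is occupied."""
        if self.latch[in_port] is not None:
            return False
        self.latch[in_port] = flit
        return True

    def step(self) -> Optional[Flit]:
        """One bypass cycle: pick one latched flit and hand it to the NI,
        which switches it to the designated output port (flit[0])."""
        for offset in range(len(PORTS)):
            idx = (self.next_idx + offset) % len(PORTS)
            port = PORTS[idx]
            flit = self.latch[port]
            if flit is not None:
                self.latch[port] = None
                self.next_idx = (idx + 1) % len(PORTS)
                return flit
        return None


if __name__ == "__main__":
    sw = EZPassSwitch()
    sw.accept("N", ("S", 0xA1))
    sw.accept("W", ("L", 0x78))
    print(sw.step(), sw.step(), sw.step())
```

Since each latch holds a single flit, a second flit arriving on the same port before the first is drained is refused in this model, which is one reason the bypass path suits only sporadic, low traffic.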

When the traffic load becomes heavy, the router uses **shared buffers** to reduce packet stall times at the input ports, and can therefore achieve higher throughput than a full-crossbar VC router. At low load, packets can bypass the shared queues, so latency is low, similar to that of a wormhole (WH) router. The EZ-Pass switch helps reduce static power, while the shared buffer helps reduce dynamic power and delivers higher performance than conventional routers with pipelined routing.

A packet from an input queue arbitrates simultaneously for both the shared queues and an output port. If it wins the output port, it is forwarded to the downstream router in the next cycle. Otherwise, the corresponding output port is congested, and the packet can be buffered in the shared queues. Intuitively, at low load the network has low latency because packets frequently bypass the shared queues, while at heavy load the shared queues temporarily store packets, reducing their stall times at the input ports and improving network throughput.
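The arbitration outcome for one cycle can be sketched as below. The function `arbitrate`, the request tuple layout, and the rule that requests are served in list order are assumptions for illustration; draining of packets already held in the shared queues is deliberately left out.

```python
from typing import Dict, List, Tuple

Request = Tuple[str, str, int]  # (input port, requested output port, packet id)


def arbitrate(requests: List[Request],
              shared_free: int) -> Tuple[Dict[str, int], List[int], List[int]]:
    """Resolve one cycle of requests.

    Returns (forwarded, buffered, stalled):
      forwarded: output port -> packet id that won the port
      buffered:  packet ids moved into the shared queues
      stalled:   packet ids left waiting at their input ports
    """
    forwarded: Dict[str, int] = {}
    buffered: List[int] = []
    stalled: List[int] = []
    for _in_port, out_port, pkt in requests:
        if out_port not in forwarded:
            # Output port is free: the packet bypasses the shared queues.
            forwarded[out_port] = pkt
        elif shared_free > 0:
            # Output port is congested: park the packet in a shared queue slot.
            buffered.append(pkt)
            shared_free -= 1
        else:
            # No shared slot left: the packet stalls at its input port.
            stalled.append(pkt)
    return forwarded, buffered, stalled


if __name__ == "__main__":
    print(arbitrate([("N", "E", 1), ("W", "S", 2)], shared_free=2))
    print(arbitrate([("N", "E", 1), ("S", "E", 2), ("W", "E", 3)], shared_free=1))
```

In the first call no output is contended, so both packets are forwarded directly; in the second, three packets contend for the east port, so one is forwarded, one is buffered in the single free shared slot, and one stalls.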




Figure 3: Proposed EZ Pass Shared Buffer Router Architecture




# IV. SIMULATION RESULTS

#### Simulation Waveforms of Proposed EZ Pass Router

- When the router is powered off and traffic is low, the EZ-Pass switch handles the flits. In this waveform, the flits are bypassed to their respective output ports through the NI according to the input requests; the blue line indicates that the router is powered off.

[Waveform capture for testbench `tb_sharedEZ` with `Router_EN` = 1'h0. Input flits: `north_data` = 8'ha1, `south_data` = 8'hff, `east_data` = 8'h56, `west_data` = 8'h78, `local_data` = 8'h9e. Time axis in ns (40 ns to 120 ns shown).]

Figure 4: Simulation Waveforms of Proposed EZ Pass Router

#### Shared Buffer VC Router with Low Network Traffic

- The VC router operates under both low and heavy network traffic; waveforms for both scenarios are shown below.
- Under low traffic, flits are routed according to the request signals.
- Under heavy traffic, flits are routed according to both the request signals and the port priority. We assume the port priority order N, S, E, W, L (low → high).
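The assumed port priority can be expressed as a simple fixed-priority arbiter. The sketch below is only a behavioral illustration; the function names `grant` and `serve_order` are hypothetical and do not correspond to signals in the simulated design.

```python
from typing import List, Optional, Set

# Port priority from lowest to highest, as assumed in the text.
PRIORITY = ("N", "S", "E", "W", "L")


def grant(requests: Set[str]) -> Optional[str]:
    """Return the highest-priority requesting port, or None if idle."""
    for port in reversed(PRIORITY):
        if port in requests:
            return port
    return None


def serve_order(requests: Set[str]) -> List[str]:
    """Order in which simultaneous requests are served, highest priority first."""
    return [port for port in reversed(PRIORITY) if port in requests]


if __name__ == "__main__":
    print(grant({"N", "E"}))
    print(serve_order({"N", "W", "S"}))
```

Under this ordering the local port always wins when it requests, and the north port is served only when no other port is requesting.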



Figure 5: Shared Buffer VC Router with Low Network Traffic




#### Shared Buffer VC Router with heavy Network Traffic



Figure 6: Shared Buffer VC Router with heavy Network Traffic

# V. UTILIZATION AND POWER ANALYSIS

| Resource | Utilization | Available | Utilization % |
|----------|-------------|-----------|---------------|
| FF       | 349         | 866400    | 0.04          |
| LUT      | 115         | 433200    | 0.03          |
| I/O      | 103         | 1000      | 10.30         |
| BUFG     | 2           | 32        | 6.25          |

Table 1: Area Utilization for Shared Buffer VC Router
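The utilization percentages in Table 1 follow directly from the used and available resource counts. The short check below recomputes them; the dictionary and function names are chosen for this example only.

```python
# (used, available) resource counts taken from Table 1.
RESOURCES = {
    "FF": (349, 866400),
    "LUT": (115, 433200),
    "I/O": (103, 1000),
    "BUFG": (2, 32),
}


def utilization_percent(used: int, available: int) -> float:
    """Utilization as a percentage, rounded to two decimal places."""
    return round(100.0 * used / available, 2)


if __name__ == "__main__":
    for name, (used, available) in RESOURCES.items():
        print(f"{name}: {utilization_percent(used, available):.2f} %")
```

The recomputed values agree with the reported column: 0.04 % for flip-flops, 0.03 % for LUTs, 10.30 % for I/O, and 6.25 % for BUFG.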

| Parameter                          | Value           |
|------------------------------------|-----------------|
| Total On-Chip Power                | 7.145 W         |
| Junction Temperature               | 31.1°C          |
| Thermal Margin                     | 53.9°C (61.5 W) |
| Power supplied to off-chip devices | 0 W             |

Table 2: Power Values for Shared Buffer VC Router

# VI. FUTURE SCOPE

Future Networks-on-Chip will be much more complex, and the cores in CMPs will demand even more communication bandwidth. Packet-switched networks are expected to play a major role in addressing the design complexity and throughput problems of future systems-on-chip. Hence, we need to focus on the development of self-repairable, fault-resilient, and fully adaptive routing architectures. The 3D NoC is also considered a promising alternative for next-generation nano-scale system design, and the proposed EZ-Pass VC router architecture can be improved and adapted for 3D NoCs and 3D MPSoC chips. Since our VC router uses several buffers in each port for routing, the buffer sharing logic offers further scope for reducing the dynamic power of the VC router. An error correction mechanism can also be added inside the Network Interface (NI) unit as an additional feature.




# VII. CONCLUSION

The proposed router architecture is deadlock-free and can support any network topology in NoCs. The EZ-Pass router offers low latency and high throughput, is energy efficient, and is suitable for the emerging massive NoC chips in the current IP industry. Unlike previously proposed NoC routers, the proposed architecture provides a simple bypass routing mechanism that routes messages during low traffic without completely waking up the powered-off router. This mechanism improves both power savings and network latency. Our results indicate that overall network latency and static power can be reduced significantly. The proposed EZ-Pass router architecture makes on-chip communication faster and provides higher communication bandwidth. With the VC router available for higher traffic, it also supports concurrent execution of processing elements and adds modularity to the network. The proposed architecture makes effective use of the Network Interface (NI) unit, which acts as an intermediary between the routers and the processing elements. The NI is responsible for generating, transmitting, and receiving data packets among IP cores, and at the same time serves as the channel for the bypass switch during low traffic.
