Massively parallel computing on many-core Networks-on-Chip (NoC) is attracting growing attention as a way to accelerate performance. Simultaneously deploying multiple applications on an NoC is one feasible way to achieve this. In this paper, we propose a multi-phase multi-application mapping approach for NoC design. The approach begins with a rectangle analysis, which yields several candidate regions for an application. All tasks of the application are then mapped into these candidate regions using a genetic algorithm, and the region that exhibits the strongest performance is selected. Once the packeted regions for each application have been identified, a B*-tree-based simulated annealing algorithm generates the optimal placement of the multi-application mapping regions. The experimental results show that the proposed approach achieves a considerable reduction in network power consumption (up to 23.45%) and latency (up to 24.42%) for a given set of applications.
Fen Ge; Chenchen Cui; Fang Zhou; Ning Wu. A Multi-Phase Based Multi-Application Mapping Approach for Many-Core Networks-on-Chip. Micromachines 2021, 12 (6), 613.
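The genetic-algorithm mapping step described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the cost model (traffic volume times Manhattan hop count on a rectangular mesh region), the permutation encoding, the order crossover, and the names `hop_cost`, `order_crossover`, and `ga_map` are all hypothetical.

```python
import random

def hop_cost(mapping, traffic, width):
    """Sum of traffic volume times Manhattan hop distance inside a mesh region."""
    total = 0
    for (src, dst), volume in traffic.items():
        sx, sy = mapping[src] % width, mapping[src] // width
        dx, dy = mapping[dst] % width, mapping[dst] // width
        total += volume * (abs(sx - dx) + abs(sy - dy))
    return total

def order_crossover(a, b, rng):
    """Keep a slice of parent a in place; fill the rest in the order seen in b."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    middle = a[i:j]
    rest = [tile for tile in b if tile not in middle]
    return rest[:i] + middle + rest[i:]

def ga_map(traffic, n_tasks, width, height, generations=200, pop_size=30, seed=0):
    """Return (mapping, cost), where mapping[task] is a tile index in the region."""
    rng = random.Random(seed)
    tiles = list(range(width * height))
    # Each individual is a permutation of tiles; task t sits on individual[t].
    population = [rng.sample(tiles, len(tiles)) for _ in range(pop_size)]

    def fitness(individual):
        return hop_cost(individual, traffic, width)

    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            child = order_crossover(p1, p2, rng)
            if rng.random() < 0.3:  # swap mutation (rate is an arbitrary choice)
                x, y = rng.sample(range(len(child)), 2)
                child[x], child[y] = child[y], child[x]
            children.append(child)
        population = survivors + children
    best = min(population, key=fitness)
    return best[:n_tasks], fitness(best)

# Hypothetical 4-task pipeline mapped into a 2x2 candidate region.
chain_traffic = {(0, 1): 10, (1, 2): 10, (2, 3): 10}
chain_mapping, chain_cost = ga_map(chain_traffic, n_tasks=4, width=2, height=2)
```

Running the same search once per candidate region and keeping the region with the lowest cost would mirror the selection step the abstract describes, under the same assumed cost model.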
As a typical artificial intelligence algorithm, the convolutional neural network (CNN) is widely used in Internet of Things (IoT) systems. To improve the computing ability of an IoT CPU, this paper designs a reconfigurable CNN-accelerated coprocessor based on the RISC-V instruction set. The interconnection structure of the acceleration chain from prior work is optimized, and the accelerator is connected to the RISC-V CPU core as a coprocessor. The corresponding coprocessor instructions are designed and an instruction compiling environment is established. The coprocessor instructions are called through inline assembly in C, coprocessor acceleration library functions are built, and common algorithms in IoT systems are implemented on the coprocessor. Finally, the resource consumption and performance of the coprocessor are evaluated on a Xilinx FPGA. The evaluation results show that the reconfigurable CNN-accelerated coprocessor consumes only 8534 LUTs, accounting for 47.6% of the total SoC system. Functions such as convolution and pooling require fewer instruction cycles with the designed coprocessor instructions than with the standard instruction set, and convolution is accelerated by a factor of 6.27 relative to the standard instruction set.
Ning Wu; Tao Jiang; Lei Zhang; Fang Zhou; Fen Ge. A Reconfigurable Convolutional Neural Network-Accelerated Coprocessor Based on RISC-V Instruction Set. Electronics 2020, 9 (6), 1005.
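For readers unfamiliar with the two operations the abstract above names, a plain software reference may help. This is a minimal sketch of what convolution and pooling compute, not the coprocessor's instruction-level behaviour; the function names `conv2d` and `max_pool2x2`, the stride-1 unpadded convolution, and the 2x2 pooling window are assumptions chosen for illustration.

```python
def conv2d(image, kernel):
    """Stride-1, unpadded 2D convolution in the CNN sense (no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; odd trailing rows/columns are dropped."""
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]) - 1, 2)
        ]
        for r in range(0, len(fmap) - 1, 2)
    ]

# Hypothetical 4x4 input and a 2x2 kernel that sums the main diagonal.
sample_image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
diag_kernel = [[1, 0], [0, 1]]
feature_map = conv2d(sample_image, diag_kernel)
pooled = max_pool2x2(sample_image)
```

Each output element of the convolution costs one multiply-accumulate per kernel entry, which is the inner loop a dedicated instruction would replace.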
Optical network-on-chip is considered a promising technology for solving the low-bandwidth and high-latency problems of traditional interconnection networks. However, because optical devices inevitably leak, optical signals pick up crosstalk noise during transmission. In this paper, a heuristic fusion mapping algorithm, PSO_SA, is proposed for crosstalk optimization. First, an initial optimal mapping is obtained by particle swarm optimization; the mapping scheme is then moved out of local optima by combining it with a simulated annealing algorithm. The experimental results show that PSO_SA optimizes crosstalk better than the genetic algorithm (GA) on 263dec, Wavelet, DVOPD, and other applications, with a maximum improvement of 28.7%.
Xinhao Shi; Ning Wu; Fen Ge; Fang Zhou; Muhammad Rehan Yahya. Optimizing Crosstalk in Optical NoC through Heuristic Fusion Mapping. Electronics 2020, 9 (6), 1.
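The second phase described in the abstract above, escaping a local optimum with simulated annealing, can be sketched as follows. This is a generic swap-move annealer under stated assumptions, not the PSO_SA algorithm itself: the geometric cooling schedule, the parameter values, the name `anneal`, and the toy `adjacency_penalty` cost (a stand-in for a real crosstalk model) are all hypothetical.

```python
import math
import random

def anneal(initial, cost, steps=5000, t_start=5.0, t_end=0.01, seed=0):
    """Refine a task-to-tile mapping with swap moves and Metropolis acceptance."""
    rng = random.Random(seed)
    current = list(initial)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]
        delta = cost(current) - current_cost
        # Always accept improvements; accept worse moves with prob. exp(-delta/T).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost += delta
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
    return best, best_cost

# Toy stand-in cost: tiles lie on a line, and each listed task pair
# "interferes" when its two tasks are placed on neighbouring tiles.
INTERFERING = [(0, 1), (1, 2), (2, 3), (3, 4)]

def adjacency_penalty(mapping):
    return sum(1 for a, b in INTERFERING if abs(mapping[a] - mapping[b]) == 1)

start_mapping = [0, 1, 2, 3, 4]  # e.g. the output of an earlier search phase
refined, refined_cost = anneal(start_mapping, adjacency_penalty)
```

Accepting some cost-increasing swaps while the temperature is high is what lets the search leave a local optimum that a purely greedy refinement would stay in.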
To protect cryptographic circuits against physical attacks, researchers have proposed a variety of mature and effective countermeasures. However, most of these defensive techniques target a single, specific attack, so they struggle to thwart combined attacks, such as combined power and fault attacks. In this paper, we propose a dual complementary infection countermeasure for Advanced Encryption Standard (AES) cryptographic circuits to defend against both power and fault attacks. Starting from the target AES circuit, we first design and construct a dual complementary AES circuit to defend against power attacks, which balances the power consumption when processing different data. In addition, to defend against fault attacks, we design an improved random infection mechanism within the dual complementary AES circuit to diffuse the effect of injected faults. Experimental results show that the proposed countermeasure thwarts both power and fault attacks effectively. Compared with AES circuits that can defend against only a single attack, our circuit greatly increases security at the cost of an extra 83.1% area overhead and a 2.1% impact on the maximum working frequency.
Jinbao Zhang; Ning Wu; Fang Zhou; Fen Ge; Xiaoqiang Zhang. Securing the AES Cryptographic Circuit Against Both Power and Fault Attacks. Journal of Electrical Engineering & Technology 2019, 14 (5), 2171-2180.
In recent 3D Chip-Multiprocessors (CMPs), a hybrid cache architecture of SRAM and Non-Volatile Memory (NVM) is generally used to exploit the high density and low leakage power of NVM and the low write overhead of SRAM. The conventional access policy does not consider the hybrid cache and cannot make good use of the characteristics of both NVM and SRAM technology. This paper proposes a Cache Fill and Migration policy (CFM) for multi-level hybrid caches. In CFM, data access is optimized in three aspects: cache fill, cache eviction, and dirty data migration. CFM reduces unnecessary cache fills and write operations to NVM, and optimizes victim cache line selection during cache eviction. Experimental results show that CFM improves performance by 24.1% and reduces power consumption by 18% compared with the conventional writeback access policy.
Fen Ge; Lei Wang; Ning Wu; Fang Zhou. A Cache Fill and Migration Policy for STT-RAM-Based Multi-Level Hybrid Cache in 3D CMPs. Electronics 2019, 8 (6), 639.
As a classical artificial intelligence algorithm, the convolutional neural network (CNN) plays an important role in image recognition and classification and is gradually being applied in Internet of Things (IoT) systems. A compact CNN accelerator for the IoT endpoint System-on-Chip (SoC) is proposed in this paper to meet the needs of CNN computations. Based on an analysis of the CNN structure, basic functional modules such as the convolution circuit and the pooling circuit are designed with a low data bandwidth and a smaller area, and an accelerator is constructed from four acceleration chains. After the acceleration unit design is completed, a Cortex-M3 is used to construct a verification SoC, and the verification platform is implemented on an FPGA to evaluate the resource consumption and performance of the CNN accelerator. The CNN accelerator achieves a throughput of 6.54 GOPS (giga operations per second) while consuming 4901 LUTs and without using any hardware multipliers. The comparison shows that the proposed compact accelerator gives the Cortex-M3-based SoC a CNN computational power two times higher than that of a quad-core Cortex-A7 SoC and 67% of that of an eight-core Cortex-A53 SoC.
Fen Ge; Ning Wu; Hao Xiao; Yuanyuan Zhang; Fang Zhou. Compact Convolutional Neural Network Accelerator for IoT Endpoint SoC. Electronics 2019, 8 (5), 497.
NoC architecture is increasingly applied to complex SoC chips, and how to map a specific application efficiently onto the NoC infrastructure is an important and pressing research topic. At the same time, testing the IP cores embedded in an NoC poses many challenges. This paper proposes a sectional NoC mapping algorithm optimized for NoC IP core testing. In conjunction with the pre-designed test structure, sectional NoC mapping first applies the Partition Algorithm to arrange IP cores into parallel testing groups so as to minimize testing time. It then applies a genetic algorithm for NoC mapping based on the traffic information between IP cores. Experimental results on the ITC'02 benchmark circuits show that the mapping cost decreases by 24.5% on average compared with random mapping and that the testing time is reduced by 12.67% on average, which demonstrates the effectiveness of the sectional NoC mapping scheme.
Zhang Ying; Wu Ning; Ge Fen. Sectional NoC Mapping Scheme Optimized for Testing Time. Transactions on Engineering Technologies 2015, 301-314.
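The abstract above does not spell out its Partition Algorithm, so the following is only a stand-in that shows the underlying idea of grouping cores for parallel testing. It uses the well-known longest-processing-time-first greedy heuristic; the function name `partition_for_parallel_test`, the core names, and the test times are hypothetical.

```python
import heapq

def partition_for_parallel_test(test_times, n_groups):
    """Greedy LPT: give each core, longest test first, to the least-loaded group.

    Groups are tested in parallel, so the overall testing time is the load of
    the busiest group. Returns (groups, overall_time).
    """
    heap = [(0, g) for g in range(n_groups)]  # (accumulated test time, group id)
    groups = [[] for _ in range(n_groups)]
    for core, time in sorted(test_times.items(), key=lambda kv: -kv[1]):
        load, g = heapq.heappop(heap)         # least-loaded group so far
        groups[g].append(core)
        heapq.heappush(heap, (load + time, g))
    return groups, max(load for load, _ in heap)

# Hypothetical per-core test times in arbitrary units.
core_times = {"c0": 7, "c1": 5, "c2": 4, "c3": 3, "c4": 3, "c5": 2}
test_groups, overall_time = partition_for_parallel_test(core_times, n_groups=2)
```

LPT is a heuristic and is not guaranteed to find the minimum overall time in general; it is used here only because it is short and easy to check by hand.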
Zhang Ying; Wu Ning; Ge Fen; Chen Xin. Core Test Wrapper Design for Unicast and Multicast NOC Testing. Information Technology Journal 2013, 12 (24), 8242-8248.
Ge Fen; Feng Gui; Yu Shuang; Wu Ning. Power- and Thermal-aware Mapping for 3D Network-on-chip. Information Technology Journal 2013, 12 (23), 7297-7304.
Network-on-Chip (NoC) has been proposed as a new solution to the global communication problem of complex Systems-on-Chip (SoC). However, testing and verifying an NoC presents many difficulties. We propose a novel co-design of test architecture and data transfer schemes for 2D-Mesh topology NoC to improve the parallelism of test packet transmission. The testing efficiencies of different structures and transfer modes are evaluated under a coverage-driven, hierarchical NoC testbench based on the VMM verification methodology and the SystemVerilog language. The evaluation of testing cost, testing time, and hardware overhead shows that shortening the transmission path and testing in parallel effectively decrease power consumption and testing time. Furthermore, one of these test structures is shown to be the optimal scheme.
Ying Zhang; Ning Wu; Fen Ge. The Co-Design of Test Structure and Test Data Transfer Mode for 2D-Mesh NoC. Lecture Notes in Electrical Engineering 2012, 171-184.
A clustering-based topology generation approach is proposed to construct Network on Chip (NoC) topologies for given applications. The approach consists of four phases and constructs an irregular NoC topology under design constraints, according to the communication requirements of the given application and the characteristics of the router architectures. Specifically, a recursion-based link construction algorithm embedded in the topology generation is proposed to construct links between routers. An evaluation on various multimedia benchmark applications confirms the efficiency of the proposed approach. Experimental results show that the approach saves 61.5% of power consumption on average compared with a regular Mesh topology. A significant improvement in network resource usage is also achieved. Moreover, the approach performs well on two multimedia applications compared with existing algorithms.
Fen Ge; Ning Wu. Power-Aware Topology Generation Based on Clustering for Application-Specific Network on Chip. Lecture Notes in Electrical Engineering 2012, 135-149.