This blog provides information about VLSI for VLSI students

Sunday, 12 July 2015

A 16-bit RISC Processor Design for Convolution Application




Implementation of a 16-bit RISC Processor for Convolution Application

B. Rajesh Kumar1, Ravisaketh2 and Santha Kumar3

1,2 M.Tech Student, Dept. of ECE, Audisankara College of Engineering & Technology, Jawaharlal Nehru University, Guduru, Nellore, Andhra Pradesh, India.
3 Dept. of ECE, Audisankara College of Engineering & Technology, Guduru, A.P., India.



Abstract

RISC architecture is used across a wide range of platforms, from cellular phones to supercomputers. In this paper, a 16-bit RISC processor is designed that uses a minimum of functional units without compromising performance. The design is based on an architectural modification of the incrementer circuit used in the program counter. A low-power, area-efficient carry select adder and a high-speed, low-power modified Wallace tree multiplier have been designed to improve the performance of the ALU in the RISC processor. The RISC processor has been realized in Verilog HDL. The individual modules are designed and tested at each level and finally integrated in the top-level module. The individual modules and the top-level module are simulated using Xilinx ISE 14.2. Synthesis, power estimation and area estimation are done using Cadence. The power consumption obtained is 1174 nW and the area is 15041 nm². These figures are set against a reference RISC processor that uses a regular carry select adder and Wallace tree multiplier.

Keywords: RISC, low power, modified Wallace tree multiplier, carry select adder.

1. Introduction

The trend in the recent past shows RISC processors clearly outperforming the earlier CISC processor architectures. The reasons are advantages such as a simple, flexible and fixed instruction format and the ability to support single-cycle operation, meaning that the instruction is fetched from the instruction memory at the maximum speed of the memory. RISC processors in general are designed to achieve this by pipelining, where clock cycles may stall because of a wrong instruction fetch when jump-type instructions are encountered. This reduces the efficiency of the processor. This paper describes a RISC architecture in which single-cycle operation is obtained without using a pipelined design [1][2].

The development of CMOS technology provides very high density and high performance integrated circuits. The performance provided by existing devices has created a never-ending demand for increasingly better performing devices. This trend predicts the use of a whole RISC processor as a basic device by the year 2020. However, as IC density increases, power consumption becomes a major issue along with the complexity of the circuits. Hence, it becomes necessary to implement less complex, low-power processor designs [3].

The program counter is one of the most complex building blocks of the processor design. It performs mainly two operations, namely incrementing and loading. To address this complexity, the present work establishes a novel design of an incrementer structure. The second part of this work concentrates on reducing the complexity of the ALU by optimizing the design of the arithmetic circuits.

In this work, we have designed and developed a 16-bit single-cycle RISC processor. To improve performance, the incrementer circuit has been modified and a low-power, area-efficient carry select adder and a modified Wallace tree multiplier have been designed. The modified structures have been integrated into the design and the performance has been validated [5],[8]. This paper focuses mainly on the ALU design.

Section II presents the design of the RISC CPU. Section III presents the implementation of the low-power, area-efficient carry select adder circuit and the modified Wallace tree multiplier. Section IV gives the ASIC implementation results and analysis.


Section V concludes.

Fig. 1: Proposed Block Diagram of RISC processor.

2. Design of 16-bit RISC CPU

2.1 Architecture

The proposed RISC CPU is a single-cycle processor with a uniform 16-bit instruction format. It has a load/store architecture, in which operations are performed only on registers and not on memory locations. It follows the classical von Neumann architecture with just one common memory bus for both instructions and data. The instruction set consists of load, store and HALT type instructions. The HALT instruction acts as a border line between the instruction memory and the data memory. Each register is 16 bits wide.

2.2 Program Counter

The Program Counter (PC) is a 16-bit latch that holds the memory address of the location from which the processor will fetch the next machine language instruction. The proposed PC is the largest sub-block and is second only to the control unit in complexity.

2.3 Arithmetic and Logic Unit

The arithmetic and logic unit (ALU) performs arithmetic and logic operations. It also performs bit operations such as rotate and shift by a defined number of bit positions. The proposed ALU contains three sub-modules: arithmetic, logic and shift. The arithmetic unit executes addition operations and sets the Sign flag and Zero flag according to the result. To reduce the complexity of the adder circuits used in the arithmetic unit of the RISC CPU, a very fast, low-power carry select adder circuit has been introduced. The ALU also contains a modified Wallace tree multiplier, which uses compressor circuits to achieve low power and improved speed of operation. The multiplier is designed to execute in a single cycle. Hence, it satisfies the requirement of the RISC design to execute single-cycle instructions.

3. ALU Design

The ALU mainly implements LHI, LLI, XOR, left shift, right shift, add, multiply and halt. Its complexity is reduced by optimizing the design of the arithmetic circuits. Previous works in the literature focus on energy-efficient arithmetic circuits. To increase the operating speed and power efficiency of the processor, we propose a low-power carry select adder and a modified Wallace tree multiplier for the ALU.

3.1 Low-power, area-efficient carry select adder circuit

In a ripple-carry adder (RCA), the carry has to travel through every full adder block. Speed can be improved by duplicating hardware, exploiting the fact that the carry-in can only be either 0 or 1. The method is based on the conditional sum adder and extended to a carry select adder (CSLA): with two RCAs, each computing the sum for one polarity of the carry-in, the final sum is obtained with a 2x1 multiplexer that uses the actual carry-in as the select signal.

The basic idea of this work is to use a Binary to Excess-1 Converter (BEC) instead of the RCA with cin=1 within the regular CSLA, in order to achieve lower area and power consumption. The main advantage of the BEC logic is that it needs fewer logic gates than the n-bit full adder (FA) structure. To replace the n-bit RCA, an (n+1)-bit BEC is required [5].

Fig. 2: Low-power, area-efficient carry select adder circuit.

3.2 Modified Wallace tree multiplier

A Wallace tree multiplier is used in the ALU to perform multiplication. It was chosen because it is built from full adders, as is the carry select adder above: the full adders used in the CSLA can be reused by the Wallace tree multiplier to perform the multiplication, which in turn reduces power [7][6].

Fig. 3: Modified Wallace tree Multiplier.
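The BEC substitution described in Section 3.1 can be illustrated with a small bit-level model. The following is only a behavioural sketch in Python, not the authors' Verilog. The function names, the equal 4-bit group width and the use of a plain RCA for the least significant group are assumptions made for illustration.

```python
# Behavioural sketch of a carry select adder whose cin=1 path is a BEC.
# All names and the 4-bit grouping are illustrative assumptions.

def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(a_bits, b_bits, cin):
    """Add two LSB-first bit lists; returns (sum_bits, carry_out)."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

def bec(bits):
    """Binary to Excess-1 Converter: LSB-first bits of (input + 1).

    Each output bit is the input bit XORed with the AND of all lower bits,
    so no full adders are needed.
    """
    out, lower_all_ones = [], 1
    for b in bits:
        out.append(b ^ lower_all_ones)
        lower_all_ones &= b
    return out

def csla_group(a_bits, b_bits, cin):
    """One n-bit group: a single RCA (cin=0) plus an (n+1)-bit BEC."""
    s0, c0 = ripple_carry_add(a_bits, b_bits, 0)  # result for cin = 0
    inc = bec(s0 + [c0])                          # result for cin = 1
    s1, c1 = inc[:-1], inc[-1]
    return (s1, c1) if cin else (s0, c0)          # 2x1 multiplexer

def csla_add(a, b, width=16, group=4):
    """Add two unsigned integers; returns (sum mod 2**width, carry_out)."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result, carry = [], 0
    for lo in range(0, width, group):
        ga, gb = a_bits[lo:lo + group], b_bits[lo:lo + group]
        if lo == 0:
            s, carry = ripple_carry_add(ga, gb, carry)  # first group: plain RCA
        else:
            s, carry = csla_group(ga, gb, carry)
        result.extend(s)
    return sum(bit << i for i, bit in enumerate(result)), carry
```

For example, `csla_add(0xFFFF, 0x0001)` returns `(0, 1)`: the 16-bit sum wraps to zero and the carry-out is set. In each group the BEC input is the n sum bits plus the carry-out of the cin=0 RCA, which is why an (n+1)-bit BEC replaces the n-bit RCA.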

3.3 Register File

The register file consists of 8 general-purpose registers of 16 bits each. These registers are used during the execution of arithmetic and data-centric instructions. The register file is fully visible to the programmer, and each register can be addressed as both source and destination using a 3-bit identifier in the range 000 to 111. The load instruction loads values into the registers, and the store instruction writes the values back to memory so that the processed outputs can be obtained from the processor.

3.4 Instruction Decoder Unit (IDU)

The Program Counter output is the input to the IDU. The purpose of the IDU is to load and store data depending on the "WRENA" (write enable) signal, and to generate the opcode that is given to the ALU for its operations. For load and store operations the instruction fields are:

00000    Des addr    Data in

During ALU operations the fields are as given below:

Opcode    S addr    Des addr    00000

4. ASIC Implementation and Results

The RISC processor has been realized in Verilog HDL. The individual modules are designed and tested at each level and finally integrated in the top-level module. The individual modules and the top-level module are simulated using Xilinx ISE 14.2. Synthesis, power estimation and area estimation are done using Cadence.

Fig. 5: RISC Simulation output.

5. Conclusion

The design of a single-cycle 16-bit RISC processor has been presented. Low-power adder and multiplier structures have been employed in the RISC architecture. The processor has been designed to execute according to user requirements.



LOGIC GATES

LOGIC GATES: