|Mohammed Altaf Ahmed||Department of Computer Engineering, College of Computer Engineering & Sciences, Prince Sattam Bin Abdulaziz University, Alkharj 11942, Saudi Arabia|
|Abdullah Aljumah||Department of Computer Engineering, College of Computer Engineering & Sciences, Prince Sattam Bin Abdulaziz University, Alkharj 11942, Saudi Arabia|
|M Gulam Ahmad||Department of Computer Engineering, College of Computer Engineering & Sciences, Prince Sattam Bin Abdulaziz University, Alkharj 11942, Saudi Arabia|
In this paper, we
propose the design and implementation of a Direct Memory Access Controller (DMAC)
as part of a System on a Chip (SoC). The main purpose of the DMAC is to exchange
large volumes of data between the memory and peripherals at high speed within
the SoC. The proposed DMAC works on
Advanced Microcontroller Bus Architecture (AMBA) specifications. Internally,
these specifications define two buses, Advanced High-performance Bus (AHB) and Advanced
Peripheral Bus (APB). The Direct Memory Access (DMA) controller functions as
the bridge between AHB and APB and allows them to work in parallel. It works
either in buffer or non-buffer data transfer mode, according to the peripheral
speed. The transfer is synchronized through an asynchronous FIFO. Fast data reads can be
achieved by using an AMBA-based DMA controller alongside a processor in the SoC, so
the DMAC sustains a high volume of data transfer. Hence, the
proposed DMAC is a better option in terms of both data volume and
timing. We conclude that this AMBA-based DMA controller resolves the
issues of high-speed and high-volume data transfer. A performance comparison is
made with ARM-based platforms, such as the Cortex-A8 and ZC702, and a design comparison is
also made with the Xilinx DMA. The proposed DMAC is found to be the more appropriate choice.
AMBA-based DMA; Data transfer rate; DMA; DMA Controller; FPGA; SoC
The purpose of DMA is to reduce the load on the processor. As the term indicates, it accesses memory directly on behalf of peripheral devices. When DMA is used alongside a processor, memory accesses for peripheral transfers are performed by the DMA controller instead of the processor, so peripheral devices can reach the memory without depending on the processor. Hence, the processor can execute other tasks concurrently while the DMA is accessing the memory, and the overall performance of the system is boosted (Aljumah & Ahmed, 2016). DMA appears to be an easy concept, but integrating it with other hardware subsystems is cumbersome. DMA is used in many vital applications, such as network cards, graphics cards and disk drive controllers.
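The division of labour described above can be illustrated with a minimal behavioural sketch. It is not the controller proposed in this paper: the `DmaChannel` class, its register names and the `run_transfer` helper are hypothetical, and the model assumes one byte moves per bus cycle. It only shows that the processor programs the channel once and is then free for other work until the transfer completes.

```python
from dataclasses import dataclass


@dataclass
class DmaChannel:
    """Hypothetical single-channel DMA model (all names are illustrative)."""
    src: int = 0        # current source address in memory
    dst: int = 0        # current destination address in memory
    count: int = 0      # bytes still to transfer
    busy: bool = False  # True while a transfer is in progress

    def program(self, src: int, dst: int, count: int) -> None:
        # The CPU writes the channel registers once and starts the transfer.
        self.src, self.dst, self.count = src, dst, count
        self.busy = count > 0

    def step(self, memory: bytearray) -> None:
        # One bus cycle: move one byte without involving the CPU.
        if not self.busy:
            return
        memory[self.dst] = memory[self.src]
        self.src += 1
        self.dst += 1
        self.count -= 1
        if self.count == 0:
            self.busy = False  # a real controller would signal completion here


def run_transfer(memory: bytearray, src: int, dst: int, count: int) -> int:
    """Program the channel, then count the work units the CPU completes meanwhile."""
    dma = DmaChannel()
    dma.program(src, dst, count)
    cpu_work_units = 0
    while dma.busy:
        dma.step(memory)     # the DMA moves data ...
        cpu_work_units += 1  # ... while the CPU does unrelated work
    return cpu_work_units
```

In this toy model the processor's cost is the single `program` call; without DMA it would spend every one of those cycles copying bytes itself.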
In computer systems, DMA plays a significant role in accessing the memory and is a vital part of an embedded system. It also plays a crucial role in SoC systems, providing fairly good speed for transferring data to externally connected peripheral devices. The performance of DMA improves when it works with a bus architecture, as documented in the literature on bus-based DMA designs. Intel designed the first DMA controller, known as the 8237 IC. In 1981, IBM used the 8237 for the first time in its products.
It used the Industry Standard Architecture (ISA) bus to improve its performance and was designed to transfer data between the system memory and peripherals (Zayati et al., 2012; Oded, 2012). The DMA design had four channels and transferred 1.6 megabytes of data every second. Each channel could address 64 kilobytes of memory and could transfer up to 64 KB of data with a single programming instruction (Barry, 1997). Initially, the system bus and the ISA bus were identical. As clones of the IBM AT ran their CPUs at higher frequencies than the ISA expansion bus, the two buses were separated by an ISA bridge (Hou, 2013).
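The figures quoted above can be related with a short worked calculation. The sketch assumes that the 64 KB limit comes from 16-bit address and count registers per channel and that "megabytes" means 10^6 bytes. Both are assumptions made for illustration, and the function names are hypothetical.

```python
ADDRESS_BITS = 16                     # assumption: 16-bit address/count registers
BYTES_PER_BLOCK = 2 ** ADDRESS_BITS   # 65,536 bytes = 64 KB per programming
RATE_BYTES_PER_S = 1.6e6              # 1.6 megabytes per second, as quoted above


def block_transfer_time(num_bytes: int = BYTES_PER_BLOCK,
                        rate: float = RATE_BYTES_PER_S) -> float:
    """Seconds needed to move num_bytes at the quoted peak rate."""
    return num_bytes / rate


def blocks_needed(total_bytes: int, block: int = BYTES_PER_BLOCK) -> int:
    """How many times a channel must be reprogrammed to move total_bytes."""
    return -(-total_bytes // block)   # ceiling division
```

Under these assumptions one full 64 KB block takes about 41 ms, and moving 1 MiB requires the channel to be programmed 16 times.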
In 1992, a new bus architecture, the Peripheral Component Interconnect (PCI), was introduced. Communication between the PCI and ISA buses took place through the board. Subsequently, a PCI-to-ISA adapter was recommended (Jinbiao, 2013); for this reason, the basic architectural design contained an adapter block of logic. The hardware block therefore includes a PCI bus interface circuit, an ISA bus interface circuit and an I/O finding module that locates the peripherals (Hou, 2013). The PCI bus works on a master principle: only one master has full control of the bus at a time, and multiple devices can share the bus only through arbitration. The bus architecture was enhanced when packet switching in full-duplex mode was used to interface multiple devices and the system memory; this was termed PCI Express (PCIe) (Li et al., 2009; Anand, 2013; Shengwei, 2016). Its x1 link consisted of a pair of channels, one for transmitting and one for receiving. Therefore, bandwidth was doubled compared to the previous architecture.
Apart from these buses for DMA operation, embedded products employ a bus architecture specific to SoCs, known as Advanced Microcontroller Bus Architecture (AMBA), which is a registered trademark (ARM, 2017) of Advanced RISC (Reduced Instruction Set Computer) Machines (ARM Ltd). By the end of 1997, the first native AMBA interfaces with cache memory cores had been introduced. AMBA is an on-chip interconnect specification used for managing and connecting the various functional blocks within the SoC. It supports various controllers, processors, multiprocessor systems and peripherals, and is an open industry standard. Two types of bus system are defined in the AMBA specification, namely AHB and APB. Nowadays, AMBA is widely used in Application-Specific Integrated Circuit (ASIC)-based and SoC-based modern mobile devices. Such products use separate state machines for transmission and reception, achieving a moderate data transfer rate (Berawi, 2013; Ahmed et al., 2015; Ejidokun et al., 2018).
In this study, we designed a direct memory access controller (DMAC) for embedded system-based products, working on the advanced microcontroller bus architecture (AMBA). DMA performance and data transfer speed for large volumes of data improve when a bus architecture such as AMBA is used. We therefore first improved the performance of the DMA controller and then used this DMA engine in the embedded system to improve the performance of the system and of the processor. The performance characteristics of an embedded processor with a DMA controller are consistently better than those of one without the controller, as presented in an article from Cypress Semiconductor (Gupta & Natarajan, 2010).
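The abstract states that the DMAC bridges the AHB and APB and works in buffer or non-buffer mode according to the peripheral speed, with a FIFO between the two sides. The following behavioural sketch illustrates why buffering helps when the peripheral side is slower. It is not the authors' hardware design: the `simulate` function, the single shared tick counter, the one-word-per-tick AHB rate and the use of `fifo_depth=1` as a stand-in for non-buffer mode are all assumptions, and clock-domain crossing in a real asynchronous FIFO is not modelled.

```python
from collections import deque


def simulate(words: int, fifo_depth: int, apb_period: int) -> int:
    """Return the number of ticks on which the fast (AHB) side is stalled.

    Assumptions (not from the paper): the AHB side can push one word per
    tick, the APB side pops one word every `apb_period` ticks, and
    fifo_depth=1 approximates non-buffer mode.
    """
    fifo = deque()
    pushed = popped = stalls = tick = 0
    while popped < words:
        # Slow side: drain one word every apb_period ticks.
        if fifo and tick % apb_period == 0:
            fifo.popleft()
            popped += 1
        # Fast side: push when there is room, otherwise stall.
        if pushed < words:
            if len(fifo) < fifo_depth:
                fifo.append(pushed)
                pushed += 1
            else:
                stalls += 1
        tick += 1
    return stalls
```

When both sides run at the same rate (`apb_period=1`), a depth of one already avoids stalls, which corresponds to the case where buffering is unnecessary. When the peripheral is slower, a deeper FIFO absorbs more of the burst and the fast side stalls less often.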
The proposed DMA engine was found to have superior speed and data transfer rates compared to the existing DMA controller used in Xilinx Embedded FPGA designs (Faezeh & Mohammad, 2017) and the Xilinx LogiCORE IP XPS Central DMA Controller (Xilinx, 2010). A comparison is made in the discussion in section 4, which also explains how the proposed DMAC is better at transferring data than the existing ARM Cortex-A8 processor and the ZC702 platform. The proposed DMAC achieves a design frequency of 478 MHz while occupying only 167 LUTs in the FPGA Spartan 3 device. This is better than the existing DMA shown in Table 4 and than the Xilinx DMA and the Cortex-A8 processor. Therefore, the proposed design provides rich features while keeping the gate count low. When integrated with the SoC, this DMA controller can transfer data while keeping the processor lightly loaded. The design is implemented in the Field Programmable Gate Array (FPGA) Virtex family.
The paper is organized as follows: section 2 describes the methodology, while section 3 presents the performance and results. Section 4 provides the analysis and discussion and suggests the scope for further research work. Section 5 is the conclusion, followed by the list of references.
Based on the observed performance characteristics of the proposed AMBA-based DMAC, we can state that it is a good alternative for SoC design. The volume of data transfer and timing are critical issues. This architecture improves the characteristics of data transfer, and the DMAC addresses both issues, as highlighted by a comparison of the two cases. As these illustrate, the suggested DMAC proves superior in transferring data at high speed, for example in multimedia transfer operations. Future developments of this work could include extension to other peripherals and the generation of a test bench in an advanced verification language to stimulate the various connected peripheral modules.
This project was supported by the Deanship of Scientific Research, Prince Sattam bin Abdulaziz University, under research project No. 2017/01/7713.
Ahmed, M.A., Rani, E.D., Syed, A.S., 2015. FPGA Based High Speed Memory Bist Controller for Embedded Applications. Indian Journal of Science and Technology, Volume 8(33), pp. 1–8
Aljumah, A., Ahmed, M.A., 2015. Design of High Speed Data Transfer Direct Memory Access Controller for System on Chip Based Embedded Products. Journal of Applied Sciences, Volume 15(3), pp. 576–581
Aljumah, A., Ahmed, M.A., 2016. AMBA Based Advanced DMA Controller for SoC. International Journal of Advanced Computer Science and Applications, Volume 7(3), pp. 188–193
Anand, S., 2013. Implementing a PCI-Express AMBA interface Controller on Spartan 6 FPGA. Master’s Thesis, Integrated Electronic System Design, Chalmers University of Technology Sweden
ARM, 1999. AMBA Specification 2.0. Available Online at http://www-micro.deis.unibo.it/~magagni/amba99.pdf, Accessed on December 17, 2016
ARM, 2019. ARM Developer the Products Category Processors Cortex-A. Available Online at https://developer.arm.com/products/processors/cortex-a/cortex-a8, Accessed on December 17, 2016
ARM, 2017. AMBA Trademark License. Available Online at http://arm.com/about/trademarks/arm-trademark-list/AMBA-trademark.php, Accessed on October 8, 2017
Barry, B., 1997. The Intel Microprocessors Brey Architecture: Programming and Interfacing, Prentice-Hall International. Inc. Fourth Edition 1997, pp. 469
Berawi, M.A., 2013. Modeling and Simulation in Engineering Design and Technology: Improving Project/Product Performance. International Journal of Technology, Volume 4(2), pp. 100–101
Cadence, 2018. Design Reuse by Cadence, Architecture and Implementation of the ARM Cortex-A8 Microprocessor. Available Online at https://www.design-reuse.com/articles/11580/architecture-and-implementation-of-the-arm-cortex-a8-microprocessor.html, Accessed on March 3, 2019
Ejidokun, T.O., Yesufu, T.K., Ayodele, K.P., Ogunseye, A.A., 2018. Implementation of an On-board Embedded System for Monitoring Drowsiness in Automobile Drivers. International Journal of Technology, Volume 9(4), pp. 819–827
Faezeh, S., Mohammad S., 2017. Area and Performance Evaluation of Central DMA Controller in Xilinx Embedded FPGA Designs. In: Iranian Conference on Electrical Engineering (ICEE), pp. 546–550
Flynn, D., 1997. ARM, AMBA: Enabling Reusable on Chip Designs. IEEE Micro, Volume 17(4), pp. 20–27
Gupta, S., Natarajan, L., 2010. Optimizing Embedded Applications using DMA Cypress Semiconductor Corp. In: EE Times Design (http://www.eetimes.com), pp. 1–6
Hou, J., 2013. Study on PCI Bus and ISA Bus Conversion Design. International Journal of Digital Content Technology and its Applications (JDCTA), Volume 7(4), pp. 443–453
Jinbiao, H., 2013. Study on PCI Bus and ISA Bus Conversion Design. International Journal of Digital Content Technology and its Applications (JDCTA), Volume 7(4), pp. 443–453
Li, Bo., Peng, Yu., Liu, Da-T., Peng, Xi-Y., 2009. A High Speed DMA Transaction Method for PCI Express Devices. Journal of Electronic Science and Technology of China, Volume 7(4), pp. 380–84
Oded, M., 2012. Optimizing DMA Data Transfers for Embedded Multi-Cores. Master’s Thesis, Graduate Program, Université Grenoble Alpes, Grenoble, France
Roberts-Hoffman, K., Hegde, P., 2009. ARM Cortex-A8 vs. Intel Atom: Architectural and Benchmark Comparisons. University of Texas at Dallas EE6304 Computer Architecture Course Project – Fall. Available Online at http://caxapa.ru/thumbs/229665/armcortexa8vsintelatomarchitecturalandbe.pdf, Accessed on March 3, 2019
Shengwei, M., 2016. Design of a PCIe Interface Card Control Software. In: WDF IEEE Xplore: Instrumentation and Measurement, Computer, Communication and Control (IMCCC), Fifth International Conference, pp. 767–770
Sinha, R., Roop, P., Sinha, S.B., 2014. Correct-by Construction Approaches for SoC Design, the AMBA SoC Platform. First Edition, Springer book, pp. 11–23
Xilinx, 2010. LogiCORE IP Processor Local Bus (PLB), Product Specification, Xilinx DS531, September 21, 2010. Available Online at https://www.xilinx.com/support/documentation/ip_documentation/plb_v46.pdf, Accessed on November 2, 2018
Xilinx, 2017. Xilinx Products Boards and Kits-SoC Evaluation Kit ZC702, Zynq-7000. Available Online at http://www.xilinx.com/products/boards-and-kits/ek-z7-zc702-g.html#hardware, Accessed on January 16, 2019
Zayati, A., Biennier, F., Badr, M.M.Y., 2012. Towards Lean Service Bus Architecture for Industrial Integration Infrastructure and Pull Manufacturing Strategies. Journal of Intelligent Manufacturing, Volume 23(1), pp. 125–139