The range of topics covers theory, architecture, algorithms, design systems, and applications that demonstrate the benefits of reconfigurable computing:
  • Theory - Synthesis, Mapping, Parallelization, Partitioning...
  • Software - CAD Systems and Languages, Compilers, Operating Systems...
  • Hardware - Adaptive and Dynamic Hardware, Heterogeneous and Reconfigurable Architectures...
  • Applications - HPC, Mobile Computing, Automotive Industry, Space and Military, Smart Cameras...
Additional Info
  • Publisher: Laxmi Publications
  • Language: English
  • Chapter 1

    Hardware Parallel Decoder of Compressed HTTP Traffic on Service-oriented Router Price 9.00  |  9 Rewards Points

    This paper proposes a parallel GZIP decoder architecture that includes a multiple context manager for decompressing network streams directly on a router. On the Internet, some HTTP packet streams are encoded by GZIP. Moreover, Internet content is often divided into smaller packets and transmitted without regard to the original order of the packets. The previously proposed Service-oriented Router for content-based packet stream processing needs to decode GZIP data in order to analyze packet payloads. The proposed GZIP decoder is implemented in hardware in order to process the data of multiple network data streams quickly and concurrently using context switching. The GZIP decoding hardware logic is simulated in Verilog-HDL. When one dictionary generation module and eight decoding modules are implemented on an FPGA, the throughput is 0.71 Gbps. When this architecture is synthesized for an ASIC, the throughput reaches 10.41 Gbps and the circuit area is 0.14 mm².
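
    The context-switching idea can be illustrated in software. The following is only a minimal sketch, not the hardware design from the paper: it assumes that chunks of each stream arrive in order (reordering is not modeled), and the names ContextManager, interleave, and decode_interleaved are hypothetical. One decoder state is kept per stream, and the decoder switches context whenever a packet from a different stream arrives.

```python
import gzip
import zlib


class ContextManager:
    """Keeps one decoder state (context) per stream, keyed by a stream id."""

    def __init__(self):
        self._contexts = {}

    def feed(self, stream_id, chunk):
        # Create the context lazily the first time a stream is seen.
        if stream_id not in self._contexts:
            # wbits = MAX_WBITS | 16 makes zlib expect a gzip header and trailer.
            self._contexts[stream_id] = zlib.decompressobj(zlib.MAX_WBITS | 16)
        # Resume decoding exactly where this stream left off.
        return self._contexts[stream_id].decompress(chunk)


def interleave(streams, size):
    """Yield (stream_id, chunk) pairs round-robin, imitating interleaved packets."""
    offsets = {sid: 0 for sid in streams}
    pending = True
    while pending:
        pending = False
        for sid, data in streams.items():
            start = offsets[sid]
            if start < len(data):
                yield sid, data[start:start + size]
                offsets[sid] = start + size
                pending = True


def decode_interleaved(packets):
    """Decode interleaved packets and return the payload of each stream."""
    manager = ContextManager()
    output = {}
    for sid, chunk in packets:
        output[sid] = output.get(sid, b"") + manager.feed(sid, chunk)
    return output


if __name__ == "__main__":
    page_a = b"GET /index.html " * 50
    page_b = b"<html>hello</html>" * 80
    packets = interleave({"a": gzip.compress(page_a), "b": gzip.compress(page_b)}, 16)
    decoded = decode_interleaved(packets)
    print(decoded["a"] == page_a, decoded["b"] == page_b)
```

    Keeping the per-stream state in a dictionary mirrors, very loosely, the role the abstract assigns to the multiple context manager: decoding one stream can be suspended and resumed without losing its dictionary window.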

  • Chapter 2

    Simplifying Microblaze to Hermes NoC Communication through Generic Wrapper Price 9.00  |  9 Rewards Points

    In this paper, an easy microprocessor-to-NoC connection strategy based on a hardware wrapper design is proposed. The implemented wrapper simplifies the connection between a network-on-chip infrastructure and several MicroBlaze soft-core processors. The proposed strategy improves the design process of a parallel computing environment. The wrapper development process, synthesis results, and functionality tests are shown and analyzed.

  • Chapter 3

    An Area-Efficient Asynchronous FPGA Architecture for Handshake-Component-Based Design Price 9.00  |  9 Rewards Points

    This paper presents an area-efficient FPGA architecture for handshake-component-based design. Handshake-component-based design is suitable for large-scale, complex asynchronous circuits because of its understandability. However, the conventional FPGA architecture for handshake-component-based design is not area-efficient because of its complex logic blocks. This paper proposes an area-efficient FPGA architecture that combines complex logic blocks (LBs) and simple LBs. Complex LBs implement the handshake components that form the data path controller, and simple LBs implement the handshake components that form the data path. The FPGA based on the proposed architecture is implemented in a 65 nm process. Its evaluation results show that the proposed FPGA can implement asynchronous circuits efficiently.

  • Chapter 4

    Implementing Alamouti's 2x1 Transmit Diversity on Software Defined Radios Price 9.00  |  9 Rewards Points

    The premise of this project is to provide a proof of concept of Alamouti’s celebrated 2x1 transmit diversity scheme with the aid of software-defined radios. We aim to reproduce Alamouti’s results in an environment that behaves as a frequency-selective, slow-fading channel. The software-defined radios provide a remote RF front end for the experiment; however, the actual encoding and combining are done natively in Simulink on external host machines. Index Terms: Alamouti, DBPSK, MathWorks Simulink, Matched Filter, PN Sequences, Space-Time Encoding, Software Defined Radios, USRP2.
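
    For readers unfamiliar with the scheme, the encoding and combining steps can be sketched in a few lines. This is an illustration under simplifying assumptions that differ from the experiment described above: a flat channel that stays constant over two symbol periods, no noise, perfect channel knowledge at the receiver, and plain BPSK symbols. The function names are hypothetical and the code is unrelated to the authors' Simulink model.

```python
def alamouti_encode(s1, s2):
    """Return the symbols sent by (antenna 1, antenna 2) in two time slots."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]


def channel(slots, h1, h2):
    """Single receive antenna: each slot sees the sum of both faded signals."""
    return [h1 * a1 + h2 * a2 for a1, a2 in slots]


def alamouti_combine(r1, r2, h1, h2):
    """Recover estimates of s1 and s2 from the two received samples."""
    s1_hat = h1.conjugate() * r1 + h2 * r2.conjugate()
    s2_hat = h2.conjugate() * r1 - h1 * r2.conjugate()
    # Both estimates are scaled by the same diversity gain |h1|^2 + |h2|^2.
    gain = abs(h1) ** 2 + abs(h2) ** 2
    return s1_hat / gain, s2_hat / gain


if __name__ == "__main__":
    h1, h2 = 0.8 + 0.3j, -0.2 + 0.9j   # assumed channel coefficients
    s1, s2 = 1 + 0j, -1 + 0j           # assumed BPSK symbols
    r1, r2 = channel(alamouti_encode(s1, s2), h1, h2)
    print(alamouti_combine(r1, r2, h1, h2))
```

    In this noiseless setting the cross terms cancel in the combiner, so each estimate depends only on its own symbol scaled by the combined channel gain.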

  • Chapter 5

    Heuristicly Driven Task Agglomeration in Limited Resource Partially-Reconfigurable Systems Price 9.00  |  9 Rewards Points

    This paper introduces a method for enhancing the run-time performance of a dynamic partially reconfigurable system. The technique is applied to fully deterministic task systems that are large in comparison to the resources of the target reconfigurable device. Performance improvements are realized by increasing the granularity of the task system at compile time in a manner that reduces the number of context switches required at run time, thereby decreasing the system execution time. Two algorithms are proposed to implement this technique. Both methods are implemented using simulation, and their performance is compared to a sophisticated heuristic scheduler, which reveals a significant improvement in performance.

  • Chapter 6

    An Automatic Design and Implementation Framework for Reconfigurable Logic IP Core Price 9.00  |  9 Rewards Points

    Conventional full-custom reconfigurable logic device design and implementation are time-consuming processes. In this research, we propose a design framework that improves FPGA IP core design efficiency by linking the academic FPGA design flow with commercial VLSI CAD tools based on the synthesizable method. A novel FPGA routing tool, EasyRouter, is developed in this framework. By using simple templates, EasyRouter can automatically generate the HDL code and the configuration bitstream for an FPGA. With this design flow, accurate physical information can be reported when a new FPGA architecture is evaluated with reliable commercial VLSI CAD tools. For FPGA architectures that cannot be easily implemented with the present VLSI process, EasyRouter provides a fast performance analysis flow, which improves delay accuracy by a factor of 5.1 over VPR on average.

  • Chapter 7

    Types, Signatures, Interfaces, and Components in NOOP: The Core of an Adaptive Run-time Price 9.00  |  9 Rewards Points

    Python is a dynamic language well suited to building a runtime that provides adaptive support to distributed applications. NOOP introduces a type language and a way to apply typing to functions (and methods). This type system is described in the first part of this paper. The second part uses this type system to create interfaces and a software component model. Finally, the paper discusses how NOOP can provide adaptive support to distributed applications.
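
    To give a rough idea of what applying typing to functions can look like in Python, here is a minimal sketch. It is not NOOP's actual API, which the abstract does not show: the signature decorator and its returns parameter are hypothetical, and the check is a plain isinstance test at call time.

```python
import functools


def signature(*arg_types, returns=object):
    """Attach a checked signature to a function (hypothetical helper)."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            if len(args) != len(arg_types):
                raise TypeError(f"{func.__name__} expects {len(arg_types)} arguments")
            for value, expected in zip(args, arg_types):
                if not isinstance(value, expected):
                    raise TypeError(f"{value!r} is not of type {expected.__name__}")
            result = func(*args)
            if not isinstance(result, returns):
                raise TypeError(f"result {result!r} is not of type {returns.__name__}")
            return result

        # Keep the signature inspectable, e.g. for matching against an interface.
        wrapper.signature = (arg_types, returns)
        return wrapper

    return decorate


@signature(int, int, returns=int)
def add(a, b):
    return a + b


if __name__ == "__main__":
    print(add(2, 3))
    print(add.signature)
```

    Storing the signature on the function object is one simple way a runtime could later compare functions against an interface description.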

  • Chapter 8

    Heterogeneous Multicore Platform with Accelerator Templates and Its Implementation on an FPGA with Hard-core CPUs Price 9.00  |  9 Rewards Points

    Heterogeneous multi-core architectures with CPUs and accelerators attract considerable attention since they can achieve power-efficient computing in various areas, from low-power embedded processing to high-performance computing. Since the optimal architecture differs from application to application, finding the most suitable accelerator is very important. In this paper, we propose an FPGA-based heterogeneous multi-core platform with custom accelerator templates. Accelerator templates can be reused after being optimized for different applications. According to the evaluation, the proposed platform gives performance comparable to industrial heterogeneous multicore processors at around 1 W of power.

  • Chapter 9

    On-demand Fault Scrubbing Using Adaptive Modular Redundancy Price 9.00  |  9 Rewards Points

    We present an architectural framework for N-Modular Redundant (NMR) systems exploiting the dynamic partial reconfiguration capability of FPGAs. Partial reconfiguration is used to dynamically construct the throughput datapath under failure conditions. The throughput datapath utilizes only one instance of a Functional Element (FE) while the other instances undergo evaluation by being subjected to the same actual inputs to the system. A software-based process is shown to be sufficient to periodically monitor the health of the active and standby FEs, thus avoiding a hardware voter in the datapath. The defective behavior of an active FE triggers the reconfiguration process and consequently a healthy element is introduced into the datapath. Meanwhile, sustainability is increased by refurbishing faulty FEs using Genetic Algorithms (GAs) to circumvent aging or radiation-induced hard faults. Furthermore, the configuration bitstreams are protected in the flash memory using Reed-Solomon codes to provide multi-bit block correction. Together, this hybrid of adaptive modular redundancy and online error correction is shown to provide fault coverage at very low latency overhead.

  • Chapter 10

    Reducing Floating-Point Error Based on Residue-Preservation and Its Evaluation on an FPGA Price 9.00  |  9 Rewards Points

    Although scientific computing is gaining much attention, calculations using computers are always associated with arithmetic errors. Since computers have limited hardware resources, rounding is necessary. In iterative computations, the rounding errors accumulate and propagate through the whole computation domain, so that the final results can be completely wrong. In this paper, we propose a floating-point error reduction method and its hardware architecture for addition. The proposed method is based on preserving the residue caused by rounding and reusing the preserved value in the next iteration. The evaluation shows that the proposed method gives almost the same accuracy as conventional double-precision floating-point computation. Moreover, the proposed method is 24% more area-efficient than a conventional double-precision adder.
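
    The idea of preserving the rounding residue and reusing it in the next iteration has a well-known software counterpart in compensated (Kahan) summation. The sketch below uses that technique as a stand-in and should not be read as the hardware method proposed in the paper. It emulates single precision by rounding every intermediate result to float32 with the struct module; the helper names f32, naive_sum, and residue_sum are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_sum(values):
    """Accumulate in float32, discarding the rounding error of every addition."""
    total = 0.0
    for v in values:
        total = f32(total + f32(v))
    return total


def residue_sum(values):
    """Accumulate in float32, carrying the rounding residue into the next step."""
    total = 0.0
    residue = 0.0
    for v in values:
        y = f32(f32(v) - residue)          # re-inject the residue kept so far
        t = f32(total + y)                 # rounded addition
        residue = f32(f32(t - total) - y)  # what the rounding just added or lost
        total = t
    return total


if __name__ == "__main__":
    values = [0.1] * 100_000
    reference = f32(0.1) * len(values)     # evaluated in double precision
    print("naive error:  ", abs(naive_sum(values) - reference))
    print("residue error:", abs(residue_sum(values) - reference))
```

    Running the script compares both accumulators against a double-precision reference; the residue-carrying version should stay far closer to it than the naive one.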

  • Chapter 11

    A Novel Parallel Computing Approach for Motion Estimation Based on Particle Swarm Optimization Price 9.00  |  9 Rewards Points

    Even though the area of video compression has existed for many decades, programming a coding algorithm is still a challenging problem. The actual bottleneck is to provide compressed video in real time to communication systems. All those constraints have to be met while keeping a good tradeoff between visual quality and compression rate. In this context, Motion Estimation (ME) is known to be a key operation. On the other hand, in the hardware industry, there is great emphasis on High Performance Computing (HPC), which is characterized by a shift to multi- and many-core systems. The programming community has to embrace the new parallelism in order to take advantage of the performance gains offered by the new technology. In this research work, we introduce a novel ME scheme with a high level of data parallelism. It is capable of performing motion search for all the blocks of the frame in parallel using a modified Particle Swarm Optimization (PSO). This scheme can be implemented on Nvidia’s massively parallel Graphics Processing Units (GPUs) to yield tremendous speedup as compared to existing techniques.

  • Chapter 12

    Addressing the Challenges of Hardware Assurance in Reconfigurable Systems Price 9.00  |  9 Rewards Points

    Despite the numerous advantages of nanometer technologies, the increase in complexity also introduces a viable vector for attacking an integrated circuit (IC): a hardware attack, also known as a hardware Trojan. Since such an attack is implemented within the hardware of a design, it is generally undetectable to any software operating on this circuitry. To make matters worse, a hardware attack could be introduced at almost any point in a design’s development cycle, be it through third-party intellectual property (IP) licensed for a design, or through unknown modifications made during the fabrication process. This malicious hardware could act as a kill-switch for a vital device, or as a data-leak for sensitive information. Activation would occur at some predetermined time or by a trigger from a malicious agent. An effective method is required to find such unexpected functionality. This paper describes several key challenges to be addressed in order to provide hardware assurance for trustworthy systems. We examine the platform of field programmable gate arrays (FPGAs) both for their potential vulnerability to threats within third-party IP as well as their capability to accelerate the testing of those modules.
