ENERGY-EFFICIENT MULTICORE SOCS/NOCS WITH MACHINE LEARNING

Current Research Members: Prof. Ben Abdallah Abderazek; TBD


Research on high-performance embedded and general-purpose computer systems is expanding beyond the usual focus on performance to include other quantitative and qualitative criteria. The quantitative criteria include energy and power consumption; the qualitative criteria mainly include portability, reliability, and fault tolerance. Since no single architecture can fulfill all of these requirements, future systems will require some form of adaptivity: they should monitor themselves, analyze their own behavior, and adapt to different execution environments. On the other hand, multicore processing is a compelling route to power reduction in both embedded and general-purpose computing. By splitting a set of tasks among multiple cores, the operating frequency required of each core can be reduced, which in turn allows the supply voltage of each core to be lowered. Because dynamic power is proportional to the frequency and to the square of the voltage, a sizable power saving is possible even though more cores are running.
Current design methods lean toward mixed hardware/software (HW/SW) co-design, targeting multicore SoCs for application-specific domains. To decide on the lowest-cost mix of cores, designers must iteratively map the device’s functionality to a particular HW/SW partition and target architecture. In addition, connecting the heterogeneous cores requires a high-performance communication architecture and efficient communication protocols, such as a hierarchical bus, point-to-point connections, or the more recent interconnection paradigm, the network on chip (NoC). We are currently researching algorithms and hardware for power-efficient multicore SoCs/NoCs empowered by innovative machine learning algorithms. Our target applications range from embedded SoCs to tiny IoT devices.
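
As a rough illustration of the power argument above, the following Python sketch compares one core with two cores running at half the frequency. The capacitance, voltages, and frequencies are assumed values chosen only for illustration, not measurements, and the model ignores static (leakage) power and any parallelization overhead.

```python
def dynamic_power(c_eff, voltage, frequency):
    """Dynamic (switching) power in watts: P = C_eff * V^2 * f."""
    return c_eff * voltage ** 2 * frequency


# Assumed, illustrative operating points (not measured data).
C_EFF = 1e-9  # effective switched capacitance per core, in farads

# One core at 1 GHz and 1.0 V.
single = dynamic_power(C_EFF, voltage=1.0, frequency=1.0e9)

# Two cores, each at 0.5 GHz; the lower frequency is assumed to permit 0.8 V.
dual = 2 * dynamic_power(C_EFF, voltage=0.8, frequency=0.5e9)

# Fraction of dynamic power saved by the two-core configuration.
saving = 1.0 - dual / single

print(f"single core: {single:.2f} W, two cores: {dual:.2f} W, saving: {saving:.0%}")
```

Under these assumptions the two-core configuration draws about 0.64 W against 1 W for the single core, a reduction of roughly 36%. The actual gain depends on how far the voltage can really be lowered at the reduced frequency.
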


SP4: Energy-efficient Multicore SoC Design with Machine Learning 


SP3: Dependable Real-Time Multicore System-on-Chip for Elderly Health Monitoring (BANSMOM)

Recent technological advances in wireless networking are expected to fundamentally change the way elderly health care services are practiced. Traditionally, embedded personal medical monitoring systems have been used only to collect data. Data processing and analysis are performed off-line, which makes such devices impractical for continual monitoring and early detection of medical disorders. The goal of this project is to research a smart, dependable embedded system that monitors elderly health remotely and in real time. In particular, we investigate an extreme area in the design space of networked embedded objects: the low-energy, real-time domain. Issues related to the design, implementation, and deployment of such systems are also studied.

  • Achraf Ben Ahmed, “Interactive Real-time Interface for Smart Remote Health Monitoring and Analysis”, Master’s Thesis, Graduate School of Computer Science and Engineering, The University of Aizu, Feb. 2013. [Thesis.pdf] [Slides.pdf]

SP2: Queue Computer 

This project investigates a novel low-power, high-performance parallel processor based on the Queue computation model, where Queue programs are generated by traversing a given data flow graph in level order. The Queue processor uses a circular queue register to manipulate operands and results, and it exploits parallelism dynamically with little effort compared with conventional architectures. The absence of false dependencies allows programs to expose maximum parallelism, which the queue processor can execute without complex and power-hungry hardware such as register renaming and large instruction windows. This parallelism allows queue processors to speed up the execution of applications. We are researching and developing a complete tool-chain for this promising computing model, consisting of a compiler, an assembler, functional and cycle-accurate simulators, and a hardware design.
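
The idea of generating a queue program by level-order traversal can be sketched in a few lines of Python. This is a toy model under stated assumptions, not our compiler or simulator: it handles only expression trees (no shared subexpressions), uses a `deque` in place of the circular queue register, and the names `level_order_program`, `run`, and the `ld` mnemonic are hypothetical.

```python
from collections import deque
import operator

# Binary operators supported by this toy queue machine (assumed set).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def level_order_program(tree):
    """Generate a queue program from an expression tree.

    A tree is either a leaf (a number) or a tuple (op, left, right).
    Nodes are emitted level by level, deepest level first, left to right.
    """
    levels = []
    frontier = [tree]
    while frontier:
        levels.append(frontier)
        children = []
        for node in frontier:
            if isinstance(node, tuple):
                children.extend(node[1:])
        frontier = children

    program = []
    for level in reversed(levels):
        for node in level:
            if isinstance(node, tuple):
                program.append((node[0],))   # operator: no operand names needed
            else:
                program.append(("ld", node))  # leaf: load a value into the queue
    return program


def run(program):
    """Execute a queue program: read operands at the head, write results at the tail."""
    queue = deque()
    for instr in program:
        if instr[0] == "ld":
            queue.append(instr[1])
        else:
            left = queue.popleft()
            right = queue.popleft()
            queue.append(OPS[instr[0]](left, right))
    if len(queue) != 1:
        raise ValueError("malformed queue program")
    return queue[0]


# (1 + 2) * (7 - 3)
expr = ("*", ("+", 1, 2), ("-", 7, 3))
prog = level_order_program(expr)
result = run(prog)
```

For this expression the generated program is four loads followed by `+`, `-`, and `*`. The `+` and `-` instructions name no registers and do not depend on each other, which is the kind of parallelism the queue model exposes without register renaming.
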

  • Hiroki Hoshino, Development of Parallel Queue Processor Architecture and its Integrated Development Environment, Master’s Thesis, The University of Aizu, Feb. 2011. [slides]
  • Masashi Masuda, Produced Order Queue Compiler Design, Master’s Thesis, The University of Aizu, Feb. 2011. [slides]
  • A. Ben Abdallah, M. Masuda, A. Canedo, K. Kuroda, Natural Instruction Level Parallelism-aware Compiler for High-Performance Processor Architecture, The Journal of Supercomputing, Volume 57, Number 3, pp. 314-338, Sept. 2011. [DOI]
  • A. Canedo, A. Ben Abdallah, and M. Sowa, Efficient Compilation for Queue Size-Constrained Queue Processors, The Journal of Parallel Computing, Vol.35, pp. 213-225, 2009.
  • A. Canedo, A. Ben Abdallah, and M. Sowa, “Compiler Support for Code Size Reduction using a Queue-based Processor,” Transactions on High-Performance Embedded Architectures and Compilers, Vol. 2, Issue 4, pp. 269-285, 2009.
  • A. Ben Abdallah, A. Canedo, T. Yoshinaga, and M. Sowa, The QC-2 Parallel Queue Processor Architecture, Journal of Parallel and Distributed Computing, Vol. 68, No. 2, pp. 235-245, 2008.
  • A. Canedo, A. Ben Abdallah, and M. Sowa, “A New Code Generation Algorithm for 2-offset Producer Order Queue Computation Model”, Journal of Computer Languages, Systems & Structures, Vol. 34, Issue 4, pp. 184-194, 2007.
  • A. Ben Abdallah, Sotaro Kawata, and M. Sowa, “Design and Architecture for an Embedded 32-bit QueueCore”, Journal of Embedded Computing, Special Issue in embedded single-chip multicore architectures, Vol. 2, No. 2, pp. 191-205, 2006.
  • A. Ben Abdallah, T. Yoshinaga, and M. Sowa, “High-Level Modeling and FPGA Prototyping of Produced Order Parallel Queue Processor Core,” Journal of Supercomputing, Vol. 38, Number 1, pp. 3-15, 2006.
  • M. Masuda, A. Ben Abdallah, A. Canedo, “Software and Hardware Design Issues for Low-Complexity High-Performance Processor Architecture”, The 38th International Conference on Parallel Processing Workshops, pp. 558-565, 2009.
  • M. Masuda, A. Canedo, A. Ben Abdallah, “Efficient Code Generation Algorithm for Natural Instruction Level Parallelism-aware Queue Architecture,” The 19th Intelligent System Symposium (FAN 2009), pp. 308-313, Sep. 2009. (Best Presentation Award)
  • H. Hoshino, A. Ben Abdallah, and K. Kuroda, “Advanced Optimization and Design Issues of a 32-bit Embedded Processor Based on Produced Order Queue Computation Model”, IEEE/IFIP Int’l Conf. on Embedded and Ubiquitous Computing (EUC2008), pp. 16-22, Dec. 2008.
  • A. Canedo, A. Ben Abdallah, and M. Sowa, “Quantitative Evaluation of Common Subexpression Elimination on Queue Machines,” Proc. IEEE Int’l Sym. on Parallel Architectures, Algorithms, and Networks (I-SPAN 2008), pp.25-30. 2008.
  • A. Canedo, A. Ben Abdallah, and M. Sowa, “Queue Register File Optimization Algorithm for QueueCore Processor,” Proc. IEEE 19th International Symposium on Computer Architecture and High-Performance Computing (SBAC-PAD 2007), pp. 169-176, 2007.
  • A. Ben Abdallah, Mudar Sarem, and M. Sowa, “Dynamic Fast Issue Mechanism (DFI) for Dynamic Scheduled Processors”, IEICE Transactions on Fundamentals of Electronics, Communications, and Computer Science, Vol. E83-A, No. 12, pp. 2417-2425, Dec. 2000.

SP1: Self-Adaptive Processor (SAP)

We are investigating novel concepts of run-time and resource-aware programming, as well as a self-adaptive architecture that evaluates its global behavior and changes it when better functionality or performance is possible. The challenge is often to identify how to change specific behaviors to achieve the desired improvement. This research proposes a novel computing model and architecture based on self-adaptation, which aims to provide scalability, high resource utilization, and high performance by dynamically synthesizing the right composition of resources based on a program’s temporal performance demands. A given program is considered as a set of small objects competing with other programs running on the same many-core platform. In this computing model, a program can dynamically find and occupy new space and spread or migrate its computation (at various levels of parallelism) to appropriate neighboring cores. Using our dynamic temporal allocation technique, a program can also de-allocate resources according to its new performance requirements.

  • Taichi Maekawa, Design and Evaluation of Dual Mode Processor Architecture, Master’s Thesis, The University of Aizu, Feb. 2011.
  • Taichi Maekawa, Abderazek Ben Abdallah, Kenichi Kuroda, Single Instruction Dual-Execution Model Processor Architecture, IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, Shanghai, pp.30-36, Dec. 2008.
  • Mushfiq Akanda, A. Ben Abdallah, and M. Sowa, “Dual-Execution Mode Processor Architecture for Embedded Applications”, Journal of Mobile Multimedia, Vol. 3, No. 4, pp. 347-370, 2007.
  • Mushfiq Akanda, Abderazek Ben Abdallah, Masahiro Sowa, Dual-Execution Mode Processor Architecture, Journal of Supercomputing, Vol. 44, No. 2, pp. 103-125, 2008.
  • Abderazek Ben Abdallah, Soshi Shigeta, Tsutomu Yoshinaga, Masahiro Sowa, On the Design of Register-Queue Based Processor Architecture (FaRM-rq), Lecture Notes in Computer Science, Springer-Verlag, vol. 2745, pp. 248-262, July 2003.
  • Abderazek Ben Abdallah, Soshi Shigeta, Tsutomu Yoshinaga, Masahiro Sowa, Reduced Bit-Width Instruction Set Architecture for Q-mode Execution in Hybrid Processor Architecture (FaRM-rq), IPSJ SIG TR, pp. 19-23, June 2003.
  • Abderazek Ben Abdallah, Soshi Shigeta, Tsutomu Yoshinaga, Masahiro Sowa, Complexity Analysis of a Functional Assignment Register Microprocessor, Proc. of the Int. Workshop on Modern Science and Technology, IWMST02, pp.116-123, Sep. 2002.
  • Abderazek Ben Abdallah, “Dynamic Instructions Issue Algorithm and a Queue Execution Model Towards the Design of Hybrid Processor Architecture”, Ph.D. thesis, Graduate School of Information Systems, the Univ. of Electro-Communications, March 2002.
  • Abderazek Ben Abdallah, Kirilka Nikolova, Masahiro Sowa, FARM-Queue Mode: On a Practical Queue Execution Model, Proceedings of the Int. Conf. on Circuits and Systems, Computers and Communications, Tokushima, Japan, pp.939-944, July 2001.
  • Abderazek Ben Abdallah, Mudar Sarem, Masahiro Sowa, Dynamic Fast Issue Mechanism (DFI) for Dynamic Scheduled Processors, IEICE Transactions on Fundamentals of Electronics, Communications, and Computer Science, Vol. E83-A, No. 12, pp. 2417-2425, Dec. 2000.
  • Abderazek Ben Abdallah, Kirilka Nikolova, Tsutomu Yoshinaga, Masahiro Sowa, FARM Queue Mode: On a Practical Queue Execution Model (QEM), TIWSS’2001, October 2001.
  • Abderazek Ben Abdallah, Kirilka Nikolova, Masahiro Sowa, FARM-Queue Execution Model: Towards an Alternative Computing Paradigm, Proceedings of IPSJ Symposium, Yokohama pp.99-100, March 2000
  • Abderazek Ben Abdallah, Mudar Sarem, Masahiro Sowa, Acyclic DFG on a Queue Machine, JSPP2000, Tokyo, Japan, pp.119-120, 2000.
  • Abderazek Ben Abdallah, Mudar Sarem, Masahiro Sowa, Instruction Scheduling System for Superscalar Processors, JSPP2000, Tokyo, Japan, pp.161, Apr. 2000.
  • Abderazek Ben Abdallah, Masahiro Sowa, DRA: Dynamic Register Allocator Mechanism For FaRM Microprocessor, The 3rd International Workshop on Advanced Parallel Processing Technologies, Changsha, PRC, pp.131-136, Oct. 1999.

Permanent link to this article: https://adaptive.u-aizu.ac.jp/?page_id=5006