Akihito Kajikawa
CENTER:SIZE(25){COLOR(blue){Evaluation of Error Detection Mechanism for 3D-OASIS-Network-on-Chip System}}
----
-[[OASIS-VP]]
-[[Members-Internal]]
----
CENTER:&ref(R2Rcrc.png,,30%);
#br
CENTER:&ref(3Dpack.jpeg,,50%);
CENTER:COLOR(green){3D-OASIS-NoC flit format}
----
*Contents [#re59951d]
#CONTENTS
**Background [#nec7d6fe]
Over the past decade, 3D Network-on-Chips (3D-NoCs) have demonstrated their advantages over 2D-NoC systems. At the same time, concerns about their reliability have grown because of the different kinds of faults that these systems may encounter. Therefore, a 3D-NoC must tolerate both permanent failures and run-time malfunctions. To achieve this goal, a fault-detection scheme is necessary to discover a fault before it propagates through the entire system and causes its collapse.
Previously, the 3D-Fault-Tolerant-OASIS (3D-FTO) system was designed. 3D-FTO can recover from a large number of faults occurring in the links, input buffers, and crossbar. However, this system has no fault-detection mechanism, and fault diagnosis relies on assuming that faults are present during a certain period of time. This makes fault recovery less efficient and diminishes the reliability of the system.
**Problems and Motivation [#necfd6fe]
In our [[previous research>http://web-ext.u-aizu.ac.jp/~benab/publications/theses/Ishi-BS-2015/GT2015_MitsunariIshii_Thesis.pdf]], a Network Interface (NI) that enhances the reliability of the 3D-FTO system was presented. In the proposed NI, an end-to-end error-detection mechanism based on the Cyclic Redundancy Check (CRC) was implemented. However, this mechanism suffers from several problems:
-- End-to-end detection can detect that a fault occurred, but it cannot localize it. Therefore, the fault-diagnosis phase cannot be carried out.
-- When a fault is detected, the transmitter NI resends the flits along the same path that was already found to be faulty. The resent flits therefore cross the same faulty path, creating significant latency and power overheads. Furthermore, a flit might never reach its destination, especially when the detected fault is permanent and cannot be recovered.
-- When a fault occurs in the first hops, it is detected only at the receiving NI. This increases end-to-end latency, since the flit has to travel many hops to the destination and then be resent, even when the fault is in the first hop.
-- The previously proposed scheme can detect the presence of a fault, but it cannot recover from it.
-- The verification used test benches with a small random-number-generator application, not real processors running real applications.
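The localization problem in the list above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the NI or router implementation: the CRC-8 polynomial (0x07), the function names `crc8` and `send`, and the single-bit fault model are all chosen here for illustration. The model compares what an end-to-end check at the receiver reports with what a per-hop check would report.

```python
from typing import Optional, Tuple


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8 (x^8 + x^2 + x + 1, initial value 0, MSB first)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def send(payload: bytes, hops: int,
         faulty_link: Optional[int]) -> Tuple[bool, Optional[int]]:
    """Move a payload across `hops` links; link `faulty_link` flips one bit.

    Returns (end_to_end_detected, first_hop_that_detects).
    Both the link numbering and the fault model are hypothetical.
    """
    check = crc8(payload)          # computed once by the sending NI
    data = bytearray(payload)
    first_detecting_hop = None
    for hop in range(hops):
        if hop == faulty_link:
            data[0] ^= 0x01        # inject a single-bit fault on this link
        # Hop-by-hop variant: the router after each link re-checks the CRC.
        if first_detecting_hop is None and crc8(bytes(data)) != check:
            first_detecting_hop = hop
    # End-to-end variant: only the receiving NI checks, after the last hop.
    end_to_end_detected = crc8(bytes(data)) != check
    return end_to_end_detected, first_detecting_hop


print(send(b"\x12\x34", hops=5, faulty_link=1))     # (True, 1)
print(send(b"\x12\x34", hops=5, faulty_link=None))  # (False, None)
```

In the first call the end-to-end check only reports that something went wrong after all five hops, while the per-hop check identifies link 1 as soon as the flit crosses it.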
**Research goal[#mec7d6fe]
The main goal of this research is to design and implement a fault detection and correction scheme for 3D-Fault-Tolerant-OASIS (3D-FTO). The scheme is based on Error-Detection Codes (EDC) and Error-Correction Codes (ECC). It should detect any kind of error or malfunction and communicate with the different modules of 3D-FTO so that recovery is quick and performance degrades gracefully and as little as possible. In addition, the proposed scheme is implemented and tested on a 2x2x2 network using MIPS cores attached to 3D-FTO and running real applications.
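To make the difference between detection and correction concrete, here is a minimal sketch of one well-known ECC, the Hamming(7,4) code. The page does not say which code the scheme uses, so the choice of Hamming(7,4) and the function names `hamming74_encode` and `hamming74_decode` are assumptions made for illustration only. The syndrome computed by the decoder is the position of a single flipped bit, which is what allows the bit to be repaired without retransmission.

```python
from typing import List, Sequence, Tuple


def hamming74_encode(nibble: int) -> List[int]:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]   # d[0] is the LSB
    code = [0] * 8                              # index 0 is unused
    code[3], code[5], code[6], code[7] = d      # data positions
    code[1] = code[3] ^ code[5] ^ code[7]       # parity over 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]       # parity over 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]       # parity over 4, 5, 6, 7
    return code[1:]


def hamming74_decode(bits: Sequence[int]) -> Tuple[int, int]:
    """Return (corrected nibble, syndrome); syndrome 0 means no error seen."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos     # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1     # a single-bit error sits at this position
    d = [code[3], code[5], code[6], code[7]]
    nibble = sum(bit << i for i, bit in enumerate(d))
    return nibble, syndrome


word = hamming74_encode(0b1011)
word[4] ^= 1                    # flip the bit at position 5
print(hamming74_decode(word))   # (11, 5)
```

This code corrects any single-bit error per codeword; two flipped bits in the same codeword would be miscorrected, which is one reason a detection code is often combined with a correction code.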
***RPS [#e3620466]
- June 4th, 2015: [[Efficient Error Detection Mechanism for 3D-OASIS-Network-on-Chip System>https://onedrive.live.com/redir?resid=593c49e4ef2f5b29!112&authkey=!AE_U3sYRp1U7SLM&ithint=file%2cpdf]]
- July 2nd, 2015: [[Efficient Error Detection Mechanism for 3D-OASIS-Network-on-Chip System>https://onedrive.live.com/redir?resid=593C49E4EF2F5B29!123&authkey=!AEXLZCbf_1Mdcqs&ithint=file%2cpdf]]
***RPR [#g6eea464]
**Research plan[#lec7d6fe]
***Step 1 [#y4f17ca9]
- Understand and run Mr. Ishii's system: [[slides.pdf>http://web-ext.u-aizu.ac.jp/~benab/publications/theses/Ishi-BS-2015/GT2015_Ishii_Final_Slide.pdf]], [[Thesis.pdf>http://web-ext.u-aizu.ac.jp/~benab/publications/theses/Ishi-BS-2015/GT2015_Ishii_Thesis_Final.pdf]]; [[Technical Report>http://web-ext.u-aizu.ac.jp/~benab/publications/theses/Ishi-BS-2015/MitsunariIshii_TR2014.pdf]]
- Understand these presentations:
--What is OASIS NoC? [[Slides 1>http://webfs-int.u-aizu.ac.jp/~benab/publications/theses/Mori-MS-11/m5141120_2011_MS_slides.pdf]]; [[Slides 2>http://web-ext.u-aizu.ac.jp/~benab/classes/aco/lectures/noc-invited/invited_05_6_2014.pdf]]
***Step 2 [#g7e0becf]
-Understand [[OASIS 3D-Router Verilog HDL Source Code>http://aslweb.u-aizu.ac.jp/aslint/index.php?3D-ONoC-Verilog]]
- Run these two tutorials on your own machine COLOR(red){(This step is optional. Note that your target device is FPGA. Our final target technology is ASIC)}.
--[[OASIS 3D Router Design Tutorial>http://web-ext.u-aizu.ac.jp/~benab/publications/treport/OASIS_Router_PhysicalDesign_technical_report_2014.pdf]]
--[[OASIS 3D Fault Tolerant Router Hardware Physical Design with TSV>http://web-ext.u-aizu.ac.jp/~benab/publications/treport/OASIS-3DFTRV1_Design_Tutorial_05282015.pdf]]
***Step 3 [#p220cb46]
- Investigate the different approaches to fault detection in NoC systems.
- Modify the flit format to host the additional code portion for fault detection and correction.
- Make the necessary modifications to the remaining modules of the router.
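The flit-format change in this step can be sketched in software as appending a check field to the payload. The field widths below (a 32-bit payload and a 16-bit check field) and the names `pack_flit` and `unpack_flit` are hypothetical and do not reflect the actual 3D-OASIS-NoC flit layout. The check value is computed with Python's standard `binascii.crc_hqx` (CRC-CCITT), used here only as a convenient stand-in for whatever code is finally selected.

```python
import binascii
from typing import Tuple

PAYLOAD_BITS = 32   # hypothetical payload width
CHECK_BITS = 16     # hypothetical check-field width (CRC-CCITT)


def _check(payload: int) -> int:
    """CRC-CCITT over the payload bytes, big-endian, initial value 0."""
    return binascii.crc_hqx(payload.to_bytes(PAYLOAD_BITS // 8, "big"), 0)


def pack_flit(payload: int) -> int:
    """Append the check field below the payload: [payload | check]."""
    if not 0 <= payload < (1 << PAYLOAD_BITS):
        raise ValueError("payload does not fit in the payload field")
    return (payload << CHECK_BITS) | _check(payload)


def unpack_flit(flit: int) -> Tuple[int, bool]:
    """Split a flit and report whether the check field matches the payload."""
    payload = flit >> CHECK_BITS
    check = flit & ((1 << CHECK_BITS) - 1)
    return payload, check == _check(payload)


flit = pack_flit(0xDEADBEEF)
print(unpack_flit(flit))             # (3735928559, True)
print(unpack_flit(flit ^ (1 << 20))) # payload bit flipped: check fails
```

Every module that stores or forwards a flit has to carry the wider word, which is why the remaining router modules need matching modifications.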
***Step 4 [#v5e30d43]
- Evaluate the performance of the scheme (area, power, latency, ...).
- If time permits, design a full 2x2x2 NoC system (optional).
***Step 5 [#b827e5ed]
- Thesis writing.
**References [#q0c7d0a8]
-Mitsunari Ishii, COLOR(blue){Design and Evaluation of Efficient Error Detection Mechanism for OASIS 3D-NoC}, Bachelor Thesis, School of Computer Science and Engineering, The University of Aizu, March 2015. [[slides.pdf>http://web-ext.u-aizu.ac.jp/~benab/publications/theses/Ishi-BS-2015/GT2015_Ishii_Final_Slide.pdf]], [[Thesis.pdf>http://web-ext.u-aizu.ac.jp/~benab/publications/theses/Ishi-BS-2015/GT2015_MitsunariIshii_Thesis.pdf]], [[Technical Report.pdf>http://web-ext.u-aizu.ac.jp/~benab/publications/treport/MitsunariIshii_TR2015.pdf]]
-[[Previous thesis topics (Previous GT)>http://aslweb.u-aizu.ac.jp/aslint/index.php?Theses]]
-[[CRC_study_in_Hardware_In_Japanese.pdf (A Study of Optimal Hardware/Software Partitioning for the CRC32 Cyclic Redundancy Check)>http://web-ext.u-aizu.ac.jp/~benab//research/references/error-correction/5.pdf]]
-[[A CRC Verilog description module for a hard real time communication protocols in a control distributed systems.pdf>http://web-ext.u-aizu.ac.jp/~benab//research/references/error-correction/1.pdf]]
-OKADA Network Interface: http://aslweb.u-aizu.ac.jp/aslint/index.php?Theses#sf0a04d4
-COLOR(red){Run this Tutorial on your machine.}
[[OASIS 3D-Router Hardware Physical Design>http://web-ext.u-aizu.ac.jp/~benab/publications/treport/OASIS_Router_PhysicalDesign_technical_report_2014.pdf]], COLOR(olive){Technical Report, Adaptive Systems Laboratory, Division of Computer Engineering, School of Computer Science and Engineering, University of Aizu}, July 8, 2014.
-Akram Ben Ahmed, Mitsuhiro Nakamura, Abderazek Ben Abdallah, [[OASIS 3D Fault Tolerant Router Hardware Physical Design with TSV>http://web-ext.u-aizu.ac.jp/~benab/publications/treport/OASIS-3DFTRV1_Design_Tutorial_05282015.pdf]], COLOR(olive){Technical Report, Adaptive Systems Laboratory, Division of Computer Engineering, School of Computer Science and Engineering, University of Aizu}, May 28, 2015.
-Akram Ben Ahmed, Abderazek Ben Abdallah, [[OASIS 3D Router Design Tutorial>http://web-ext.u-aizu.ac.jp/~benab/publications/treport/OASIS_Router_PhysicalDesign_technical_report_2014.pdf]], COLOR(olive){Technical Report, Adaptive Systems Laboratory, Division of Computer Engineering, School of Computer Science and Engineering, University of Aizu, July 8, 2014.}
**OASIS NoC [#g05fde5d]
-Verilog Source Code: http://aslweb.u-aizu.ac.jp/aslint/index.php?3D-ONoC-Verilog
**Backup[#w4f759e1]
-[[prev_ishii>Akihito Kajikawa/prev_ishii]]
**Others [#k7a47b2a]
-[[Analysis of Error Recovery Schemes for Networks-on-Chips>https://drive.google.com/file/d/0B2HMlO4p7SuwTGJodnYwc1puZDg/view?usp=sharing]]
----
Updates:
-July 2, 2015: Added RPR/RPS presentations