StreamBox-TZ: Secure Stream Analytics at the Edge with TrustZone

by Sihyeong Park on 2019-10-08 16:13:29

Date: 2019. 10. 10 (Thu) 19:30
Location: EB5. 507
Presenter: Sihyeong Park
Title: StreamBox-TZ: Secure Stream Analytics at the Edge with TrustZone
Author: Heejin Park and Shuang Zhai, Purdue ECE; Long Lu, Northeastern University; Felix Xiaozhu Lin, Purdue ECE
Abstract: While it is compelling to process large streams of IoT data on the cloud edge, doing so exposes the data to a sophisticated, vulnerable software stack on the edge and hence to security threats. To this end, we advocate isolating the data and its computations in a trusted execution environment (TEE) on the edge, shielding them from the remaining edge software stack, which we deem untrusted. This approach faces two major challenges: (1) executing high-throughput, low-delay stream analytics in a single TEE, which is constrained by a low trusted computing base (TCB) and limited physical memory; (2) verifying the execution of stream analytics, as the execution involves untrusted software components on the edge. In …
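The excerpt does not show StreamBox-TZ's actual interface, but the pattern it builds on (untrusted normal-world code handing data into a TrustZone TEE and getting only results back) can be sketched with the standard GlobalPlatform TEE Client API used by OP-TEE. The UUID, command ID, and function below are hypothetical placeholders, not the paper's code:

/* Hypothetical normal-world client handing a batch of records to a
 * trusted application (TA) for in-TEE stream analytics. Uses the
 * standard GlobalPlatform TEE Client API (as implemented by OP-TEE);
 * the UUID and command ID are placeholders, not StreamBox-TZ's. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <tee_client_api.h>

#define TA_ANALYTICS_UUID \
    { 0x00000000, 0x0000, 0x0000, \
      { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 } } /* placeholder */
#define CMD_PROCESS_BATCH 0  /* hypothetical command ID */

int process_batch_in_tee(const void *records, size_t len,
                         void *result, size_t result_len)
{
    TEEC_Context ctx;
    TEEC_Session sess;
    TEEC_Operation op;
    TEEC_UUID uuid = TA_ANALYTICS_UUID;
    uint32_t origin;

    if (TEEC_InitializeContext(NULL, &ctx) != TEEC_SUCCESS)
        return -1;
    if (TEEC_OpenSession(&ctx, &sess, &uuid, TEEC_LOGIN_PUBLIC,
                         NULL, NULL, &origin) != TEEC_SUCCESS) {
        TEEC_FinalizeContext(&ctx);
        return -1;
    }

    memset(&op, 0, sizeof(op));
    op.paramTypes = TEEC_PARAM_TYPES(TEEC_MEMREF_TEMP_INPUT,
                                     TEEC_MEMREF_TEMP_OUTPUT,
                                     TEEC_NONE, TEEC_NONE);
    op.params[0].tmpref.buffer = (void *)records;
    op.params[0].tmpref.size = len;
    op.params[1].tmpref.buffer = result;
    op.params[1].tmpref.size = result_len;

    /* World switch: the batch crosses into the TEE, is processed by the
     * trusted analytics engine, and only the result comes back out. */
    TEEC_Result res = TEEC_InvokeCommand(&sess, CMD_PROCESS_BATCH, &op, &origin);

    TEEC_CloseSession(&sess);
    TEEC_FinalizeContext(&ctx);
    return res == TEEC_SUCCESS ? 0 : -1;
}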


Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis

by Jinse Kwon on 2019-09-19 16:06:25

Date: 2019. 09. 04 (Wed) 13:30
Location: EB5. 533
Presenter: Jinse Kwon
Title: Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Author: Tal Ben-Nun, Torsten Hoefler (ETH Zurich, Zurich, Switzerland)
Abstract: Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge, and techniques range from distributed algorithms to low-level circuit design. In this survey, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. We present trends in DNN architectures and the resulting implications on parallelization strategies. We then review and model the different types of concurrency in DNNs: from the single operator, through parallelism in network inference and training, to distributed deep learning. We discuss asynchronous stochastic optimization, distributed system …
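As one concrete instance of the concurrency types the survey models, here is a toy of synchronous data-parallel training in C (our illustration, not the authors' code): each worker computes a gradient over its own shard of the batch, the gradients are averaged as an allreduce would, and one shared update is applied to a one-parameter least-squares model.

/* Toy synchronous data-parallel SGD on the 1-parameter model y = w * x:
 * each "worker" computes a local gradient over its shard, gradients are
 * averaged (the allreduce step), and a single shared update follows.
 * Illustrative only; model, sizes, and names are ours. */
#include <stdio.h>

#define N_WORKERS 4
#define SHARD 8          /* samples per worker */
#define STEPS 200
#define LR 0.5

int main(void)
{
    /* Synthetic data drawn from y = 3x, sharded across workers. */
    double x[N_WORKERS][SHARD], y[N_WORKERS][SHARD];
    for (int w = 0; w < N_WORKERS; w++)
        for (int i = 0; i < SHARD; i++) {
            x[w][i] = (double)(w * SHARD + i) / (N_WORKERS * SHARD);
            y[w][i] = 3.0 * x[w][i];
        }

    double weight = 0.0;  /* shared model parameter */
    for (int step = 0; step < STEPS; step++) {
        double grad_sum = 0.0;
        /* Each worker's mean gradient of 0.5*(w*x - y)^2 on its shard. */
        for (int w = 0; w < N_WORKERS; w++) {
            double g = 0.0;
            for (int i = 0; i < SHARD; i++)
                g += (weight * x[w][i] - y[w][i]) * x[w][i];
            grad_sum += g / SHARD;
        }
        /* "Allreduce": average worker gradients, then one shared update. */
        weight -= LR * (grad_sum / N_WORKERS);
    }
    printf("learned w = %f (true w = 3.0)\n", weight);
    return 0;
}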


Software fault injection testing of the embedded software of a satellite launch vehicle

by Hyeoksoo Jang on 2019-09-19 15:41:10

Date: 2019. 08. 07 (Wed) 13:00
Location: EB5. 533
Presenter: Hyeoksoo Jang
Title: Software fault injection testing of the embedded software of a satellite launch vehicle
Author: Anil Abraham Samuel, Jayalal N., Valsa B., Ignatious C.A., and John P. Zachariah
Abstract: The software performing navigation, guidance, control, and mission-sequencing functionalities embedded in the flight computer system (FCS) of a satellite launch vehicle must be highly dependable. The presence of faults in the embedded flight software affects its dependability and may even jeopardize the entire mission, resulting in a huge loss to the space agency concerned. Many techniques are available to achieve high dependability, and they can be classified under fault avoidance, fault removal, and fault tolerance. In the FCS of the Indian Space Research Organization's (ISRO's) satellite launch vehicles, all of the above means to achieve dependability are adopted. Fault avoidance and removal …
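As a hedged illustration of the basic mechanism behind software fault injection (a fault-free golden run, a campaign of single-bit flips, and outcome classification), here is a self-contained toy in C. The target computation and the classification labels are ours, not ISRO's test harness:

/* Minimal software-implemented fault injection (SWIFI) sketch: run the
 * computation once fault-free to get a golden result, then inject one
 * single-bit flip per experiment and classify each outcome. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Computation under test: count input bytes at or above a threshold. */
static int count_high(const uint8_t *buf, size_t n)
{
    int c = 0;
    for (size_t i = 0; i < n; i++)
        if (buf[i] >= 128) c++;
    return c;
}

int main(void)
{
    uint8_t data[64], faulty[64];
    for (int i = 0; i < 64; i++) data[i] = (uint8_t)(i * 4);

    int golden = count_high(data, sizeof data);  /* fault-free reference */

    /* Exhaustive single-bit campaign over the input state; real campaigns
     * also randomize injection time and target memory/registers. */
    int masked = 0, corrupted = 0;
    for (size_t byte = 0; byte < sizeof data; byte++)
        for (int bit = 0; bit < 8; bit++) {
            memcpy(faulty, data, sizeof data);
            faulty[byte] ^= (uint8_t)(1 << bit);  /* inject one bit flip */
            if (count_high(faulty, sizeof faulty) == golden)
                masked++;      /* fault did not reach the output */
            else
                corrupted++;   /* silent data corruption */
        }
    printf("injections: %d masked, %d corrupted (golden = %d)\n",
           masked, corrupted, golden);
    return 0;
}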


XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

by Donghee Ha on 2019-07-22 16:09:47

Date: 2019. 07. 24 (Wed) 13:00
Location: EB5. 533
Presenter: Donghee Ha
Title: XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
Author: Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi
Abstract: We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks and XNOR-Networks. In Binary-Weight-Networks, the filters are approximated with binary values, resulting in 32x memory savings. In XNOR-Networks, both the filters and the input to convolutional layers are binary. XNOR-Networks approximate convolutions using primarily binary operations. This results in 58x faster convolutional operations and 32x memory savings. XNOR-Nets offer the possibility of running state-of-the-art networks on CPUs (rather than GPUs) in real time. Our binary networks are simple, accurate, efficient, and work on challenging visual tasks. We evaluate our approach on the ImageNet classification task. The classification …
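The speedup rests on replacing multiply-accumulates with bitwise operations. Here is a minimal sketch of that core trick (ours, not the paper's code), assuming values already constrained to {-1, +1} and using the GCC/Clang popcount builtin; the per-filter scaling factor alpha from the paper is omitted:

/* Core XNOR-Net trick: with weights and activations in {-1, +1} packed
 * one sign bit per value, a 32-element dot product collapses to one XOR
 * plus a popcount, replacing 32 multiply-adds. */
#include <stdio.h>
#include <stdint.h>

/* Pack 32 floats into one word: bit i = 1 if x[i] >= 0 (i.e. +1). */
static uint32_t pack_signs(const float *x)
{
    uint32_t w = 0;
    for (int i = 0; i < 32; i++)
        if (x[i] >= 0.0f) w |= (uint32_t)1 << i;
    return w;
}

/* Matching signs contribute +1, differing signs -1, so with d differing
 * bits the dot product is (32 - d) - d = 32 - 2 * popcount(a ^ b). */
static int bdot32(uint32_t a, uint32_t b)
{
    return 32 - 2 * __builtin_popcount(a ^ b);  /* GCC/Clang builtin */
}

int main(void)
{
    float x[32], w[32];
    for (int i = 0; i < 32; i++) {
        x[i] = (i % 3 == 0) ? -1.0f : 1.0f;
        w[i] = (i % 2 == 0) ? 1.0f : -1.0f;
    }
    /* Reference dot product in floating point for comparison. */
    float ref = 0.0f;
    for (int i = 0; i < 32; i++) ref += x[i] * w[i];

    int fast = bdot32(pack_signs(x), pack_signs(w));
    printf("float dot = %.0f, xnor dot = %d\n", ref, fast);
    return 0;
}

A real binary convolution kernel applies the same identity across bit-packed input channels, which is where the memory savings and the CPU-friendly throughput come from.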


Application Memory Isolation on Ultra-Low-Power MCUs

by Sihyeong Park on 2019-07-08 09:57:20

Date: 2019. 07. 10 (Wed) 13:00
Location: EB5. 533
Presenter: Sihyeong Park
Title: Application Memory Isolation on Ultra-Low-Power MCUs
Author: Taylor Hardin, Dartmouth College; Ryan Scott, Clemson University; Patrick Proctor, Dartmouth College; Josiah Hester, Northwestern University; Jacob Sorber, Clemson University; David Kotz, Dartmouth College
Abstract: The proliferation of applications that handle sensitive user data on wearable platforms generates a critical need for embedded systems that offer strong security without sacrificing flexibility and long battery life. To secure sensitive information, such as health data, ultra-low-power wearables must isolate applications from each other and protect the underlying system from errant or malicious application code. These platforms typically use microcontrollers that lack sophisticated Memory Management Units (MMUs). Some include a Memory Protection Unit (MPU), but current MPUs are inadequate to the task, leading platform …
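The MPU the abstract refers to is driven through memory-mapped registers on ARMv7-M cores. Below is a generic bare-metal Cortex-M sketch (ours, not the paper's design) that confines an unprivileged application to a single 4 KiB RAM region; the base address is a placeholder, and a real setup would also map the app's code region in flash:

/* Illustrative ARMv7-M MPU setup: give unprivileged application code
 * read/write access to one 4 KiB RAM region so a stray pointer faults
 * instead of corrupting the rest of the system. Register addresses are
 * architectural (ARMv7-M); the region base is a placeholder. */
#include <stdint.h>

#define MPU_CTRL (*(volatile uint32_t *)0xE000ED94u)
#define MPU_RNR  (*(volatile uint32_t *)0xE000ED98u)
#define MPU_RBAR (*(volatile uint32_t *)0xE000ED9Cu)
#define MPU_RASR (*(volatile uint32_t *)0xE000EDA0u)

#define APP_RAM_BASE 0x20004000u  /* placeholder: app's private 4 KiB */

static void mpu_protect_app(void)
{
    /* Region 0: 4 KiB at APP_RAM_BASE, read/write at any privilege,
     * execute-never. The SIZE field n encodes 2^(n+1) bytes, so 4 KiB
     * needs n = 11. */
    MPU_RNR  = 0;
    MPU_RBAR = APP_RAM_BASE;            /* base must be size-aligned */
    MPU_RASR = (1u  << 28)   /* XN: no execution from app data RAM */
             | (3u  << 24)   /* AP = 0b011: full read/write access */
             | (11u << 1)    /* SIZE: 2^12 = 4 KiB */
             | 1u;           /* region enable */

    /* Enable the MPU; PRIVDEFENA keeps the default memory map for
     * privileged (kernel) code while unprivileged code sees region 0
     * only. */
    MPU_CTRL = (1u << 2) | 1u;
    __asm__ volatile ("dsb\n\tisb");  /* ensure the new map takes effect */
}

With only 8 or 16 such regions, fixed power-of-two sizing, and alignment constraints, region slots run out quickly as applications multiply, which is one concrete way current MPUs prove "inadequate to the task."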


PipeDream: Fast and Efficient Pipeline Parallel DNN Training

by Jinse Kwon on 2019-06-12 11:18:17

Date: 2019. 06. 26 (Wed) 13:00
Location: EB5. 533
Presenter: Jinse Kwon
Title: PipeDream: Fast and Efficient Pipeline Parallel DNN Training
Author: Aaron Harlap, Deepak Narayanan, Amar Phanishayee, Vivek Seshadri, Nikhil Devanur, Greg Ganger, Phil Gibbons (Microsoft Research, Carnegie Mellon University, Stanford University)
Abstract: PipeDream is a Deep Neural Network (DNN) training system for GPUs that parallelizes computation by pipelining execution across multiple machines. Its pipeline-parallel computing model avoids the slowdowns faced by data-parallel training when large models and/or limited network bandwidth induce high communication-to-computation ratios. PipeDream reduces communication by up to 95% for large DNNs relative to data-parallel training, and allows perfect overlap of communication and computation. PipeDream keeps all available GPUs productive by systematically partitioning DNN …
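To see how pipelining keeps every GPU busy, here is a toy timeline in C (our illustration, not PipeDream's actual scheduler): stage s processes minibatch t - s at timestep t, so after a short fill phase all stages work concurrently. Backward passes and PipeDream's 1F1B weight-version bookkeeping are omitted:

/* Toy pipeline-parallel timeline: the model is split into STAGES parts,
 * one per GPU; minibatches flow through so that in steady state every
 * stage is busy each timestep. Forward passes only; illustrative. */
#include <stdio.h>

#define STAGES 4        /* model partitioned across 4 GPUs */
#define MINIBATCHES 8

int main(void)
{
    for (int t = 0; t < STAGES + MINIBATCHES - 1; t++) {
        printf("t=%2d |", t);
        for (int s = 0; s < STAGES; s++) {
            int mb = t - s;  /* minibatch in flight at stage s */
            if (mb >= 0 && mb < MINIBATCHES)
                printf(" GPU%d: mb%-2d", s, mb);
            else
                printf(" GPU%d: --  ", s);  /* fill/drain bubble */
        }
        printf("\n");
    }
    return 0;
}

Only activations and gradients at stage boundaries cross machines, rather than every parameter every step, which is the source of the communication reduction the abstract cites.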


How to do good research?

by Daeyoung Song on 2019-06-11 00:00:00

Date: 2019. 06. 13 (Thu)
Location: EB5. 607
Presenter: Hyungshin Kim
Title: How to do good research?


[Xenomai] An Overview of the Real-Time Framework for Linux

by Hyeoksoo Jang on 2019-06-04 10:46:08

Date: 2019. 06. 05 (Wed)
Location: EB5. 533
Presenter: Hyeoksoo Jang
Title: An Overview of the Real-Time Framework for Linux


How to give a good presentation?

by Daeyoung Song on 2019-05-28 14:16:51

Date: 2019. 06. 04 (Tue)
Location: EB5. 410
Presenter: Hyungshin Kim
Title: How to give a good presentation?
We will spend time studying presentation skills.


A new golden age for computer architecture

by Daeyoung Song on 2019-05-27 21:01:32

Date: 2019. 05. 30 (Thu)
Location: EB5. 607
Presenter: Yeji Son
Title: A new golden age for computer architecture: Future Opportunities in Computer Architecture


Improving WCET via TLB Lockdown in Multicore Systems

by Daeyoung Song on 2019-05-27 20:51:05

Date: 2019. 05. 30 (Thu)
Location: EB5. 533
Presenter: Daeyoung Song
Title: Improving WCET via TLB Lockdown in Multicore Systems


Optimizing Non-Square Matrix Multiplication in an OpenCL Linear Algebra Library on Mobile GPUs

by Byungkyo Jung on 2019-05-23 11:34:20

Date: 2019. 05. 23 (Thu)
Location: EB5. 533
Presenter: Byeongkyo Cheong
Title: Optimizing Non-Square Matrix Multiplication in an OpenCL Linear Algebra Library on Mobile GPUs


How can good research be done?

by Daeyoung Song on 2019-05-17 11:18:24

Date: 2019. 05. 23 (Thu)
Location: EB5. 410
Presenter: Hyungshin Kim
Title: How can good research be done?
We will spend time studying research skills.


How to write a paper in the systems field

by Daeyoung Song on 2019-05-15 17:42:39

Date: 2019. 05. 16 (Thu)
Location: EB5. 607
Presenter: Hyungshin Kim
Title: How to write a paper in the systems field
We will spend time studying how to write systems papers.


GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

by Donghee Ha on 2019-05-14 11:29:54

Date: 2019. 05. 16 (Thu)
Location: EB5. 533
Presenter: Donghee Ha
Title: GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
Author: Yanping Huang, Youlong Cheng, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Zhifeng Chen
Abstract: GPipe is a scalable pipeline parallelism library that enables learning of giant deep neural networks. It partitions network layers across accelerators and pipelines execution to achieve high hardware utilization. It leverages recomputation to minimize activation memory usage. For example, using partitions over 8 accelerators, it is able to train networks that are 25x larger, demonstrating its scalability. It also guarantees that the computed gradients remain consistent regardless of the number of partitions. It achieves an almost linear speedup without any changes in the model parameters: when using 4x more accelerators, training the same model is up to 3.5x faster. We train a 557 …
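The "almost linear" (rather than perfectly linear) speedup comes from the pipeline bubble during fill and drain. A back-of-envelope sketch of our own, not from the paper: with K partitions and a batch split into M micro-batches, one step occupies M + K - 1 stage-times instead of M, so utilization is M / (M + K - 1), which is why GPipe splits each mini-batch finely:

/* Pipeline-bubble arithmetic for GPipe-style micro-batching: more
 * micro-batches per mini-batch shrink the fill/drain bubble. K and M
 * here are example values, not the paper's configurations. */
#include <stdio.h>

int main(void)
{
    const int K = 8;  /* partitions / accelerators */
    for (int M = 1; M <= 64; M *= 2) {
        double util = (double)M / (M + K - 1);
        printf("M=%2d micro-batches: utilization %3.0f%%, bubble %3.0f%%\n",
               M, 100.0 * util, 100.0 * (1.0 - util));
    }
    return 0;
}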
