DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices

by Byungkyo Jung on 2018-11-27 20:17:14

Date : 2018. 12. 4 (Tue) 10:00 Location : EB5. 533 Presenter : Byeongkyo Cheong Title : DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices Author : Nicholas D. Lane, Sourav Bhattacharya, Petko Georgiev (Bell Labs, University of Cambridge, University of Bologna) Abstract : Breakthroughs from the field of deep learning are radically changing how sensor data are interpreted to extract the high-level information needed by mobile apps. It is critical that the gains in inference accuracy that deep models afford become embedded in future generations of mobile apps. In this work, we present the design and implementation of DeepX, a software accelerator for deep learning execution. DeepX significantly lowers the device resources (viz. memory, computation, energy) required by deep learning that currently act as a severe bottleneck to mobile adoption. The foundation of DeepX is a pair of resource control algorithms, designed ... Continue reading →

25 Views

DeepMon: Mobile GPU-based Deep Learning Framework for Continuous Vision Applications

by Jinse Kwon on 2018-11-23 18:13:31

Date : 2018. 11. 23 (Tue) 10:00 Location : EB5. 533 Presenter : Jinse Kwon Title : DeepMon: Mobile GPU-based Deep Learning Framework for Continuous Vision Applications Author : Loc N. Huynh, Youngki Lee, Rajesh Krishna Balan (Singapore Management University) Abstract : The rapid emergence of head-mounted devices such as the Microsoft Holo-lens enables a wide variety of continuous vision applications. Such applications often adopt deep-learning algorithms such as CNN and RNN to extract rich contextual information from the first-person-view video streams. Despite the high accuracy, use of deep learning algorithms in mobile devices raises critical challenges, i.e., high processing latency and power consumption. In this paper, we propose DeepMon, a mobile deep learning inference system to run a variety of deep learning inferences purely on a mobile device in a fast and energy-efficient manner. For this, we designed a suite of ... Continue reading →

29 Views

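Scheduling Performance Analysis According to Task Configuration in Real-Time Operating Systems

by Hyeoksoo Jang on 2018-10-25 17:56:38

Date : 2018.11.06 (Tue) 10:00 Location : EB5. 533 Presenter : Hyeoksoo Jang Title : Scheduling Performance Analysis According to Task Configuration in Real-Time Operating Systems Abstract : Continue reading →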

39 Views

Accelerating Grayscale Image Colorization Using OpenCL on Embedded Devices

by Donghee Ha on 2018-10-25 17:53:49

Date : 2018.10.23 Location : EB5. 533 Presenter : Donghee Ha Title : Accelerating Grayscale Image Colorization Using OpenCL on Embedded Devices Abstract : Continue reading →

38 Views

The design and implementation of microdrivers

by Sihyeong Park on 2018-08-27 10:30:57

Date : 2018. 09. 13 (Thu) 14:00 Location : EB5. 533 Presenter : Sihyeong Park Title : The design and implementation of microdrivers (ASPLOS XIII: Proceedings of the 13th international conference on Architectural support for programming languages and operating systems) Author : Vinod Ganapathy (Rutgers University, Piscataway, NJ), Matthew J. Renzelmann (University of Wisconsin-Madison, Madison, WI), Arini Balakrishnan (Sun Microsystems, Santa Clara, CA), Michael M. Swift (University of Wisconsin-Madison, Madison, WI), Somesh Jha (University of Wisconsin-Madison, Madison, WI) Abstract : Device drivers commonly execute in the kernel to achieve high performance and easy access to kernel services. However, this comes at the price of decreased reliability and increased programming difficulty. Driver programmers are unable to use user-mode development tools and must instead use cumbersome kernel tools. Faults in kernel drivers can cause the entire ... Continue reading →

202 Views

Situating Wearables: Smartwatch Use in Context

by Jinyoung Choi on 2018-08-06 11:00:58

Date : 2018. 08. 08 (Wed) 13:00 Location : EB5. 533 Presenter : Jinyoung Choi Title : Situating Wearables: Smartwatch Use in Context Author : Donald McMillan, Barry Brown, Airi Lampinen, Moira McGregor, Eve Hoggan, Stefania Pizza (The University of Stockholm at Kista, Sweden) Abstract : Drawing on 168 hours of video recordings of smartwatch use, this paper studies how context influences smartwatch use. We explore the effects of the presence of others, activity, location and time of day on 1,009 instances of use. Watch interaction is significantly shorter when in conversation than when alone. Activity also influences watch use with significantly longer use while eating than when socialising or performing domestic tasks. One surprising finding is that length of use is similar at home and work. We note that usage peaks around lunchtime, with an average of 5.3 watch uses per hour throughout a day. We supplement these findings with ... Continue reading →

263 Views

CLBlast: A Tuned OpenCL BLAS Library

by Byungkyo Jung on 2018-07-20 17:28:45

Date : 2018.7.25 Location : EB5. 533 Presenter : Byeongkyo Cheong Title : CLBlast: A Tuned OpenCL BLAS Library Author : Cedric Nugteren Abstract : This work introduces CLBlast, an open-source BLAS library providing optimized OpenCL routines to accelerate dense linear algebra for a wide variety of devices. It is targeted at machine learning and HPC applications and thus provides a fast matrix-multiplication routine (GEMM) to accelerate the core of many applications (e.g. deep learning, iterative solvers, astrophysics, computational fluid dynamics, quantum chemistry). CLBlast has five main advantages over other OpenCL BLAS libraries: 1) it is optimized for and tested on a large variety of OpenCL devices including less commonly used devices such as embedded and low-power GPUs, 2) it can be explicitly tuned for specific problem-sizes on specific hardware platforms, 3) it can perform operations in half-precision floating-point (FP16), saving bandwidth, time and energy, ... Continue reading →

226 Views

System-Call-Level Core Affinity for Improving Network Performance

by Daeyoung Song on 2018-07-16 10:27:16

Date : 2018.7.18 (Wed) 13:00 Location : EB5. 533 Presenter : Daeyoung Song Title : System-Call-Level Core Affinity for Improving Network Performance Author : 엄준용, 조중연, 진현욱 Abstract : Existing operating systems experience scalability issues as the number of cores increases. Network I/O performance on manycore systems is limited mainly by cache consistency costs and locking overheads. Existing approaches to this issue include new microkernel-like operating systems or modifications of existing kernels; however, these solutions are not fully application transparent. In this study, we propose a library that improves network performance by separating the system call context from the user context and by applying core affinity without any kernel or application modifications. Experimental results show that our implementation can improve the network throughput of Apache by up to 30%. Continue reading →

110 Views

Design and Implementation of a Linux-Based Message Processor That Minimizes the Response-Time Delay of Non-Real-Time Messages in Multicore Environments

by Hyeoksoo Jang on 2018-06-27 15:08:03

Date : 2018. 07. 11 (Wed) 13:00 Location : EB5. 533 Presenter : Hyeoksoo Jang Title : Design and Implementation of a Linux-Based Message Processor That Minimizes the Response-Time Delay of Non-Real-Time Messages in Multicore Environments Author : 왕상호, 박영훈, 박성용, 김승춘, 김철회, 김상준, 진 철 Link : http://www.dbpia.co.kr/Journal/PDFViewNew?d=NODE07111226&prevPathCode= Continue reading →

168 Views

S3DNN: Supervised Streaming and Scheduling for GPU-Accelerated Real-Time DNN Workloads

by Jinse Kwon on 2018-06-15 11:55:01

Date : 2018. 06. 20 (Wed) 13:00 Location : EB5. 533 Presenter : Jinse Kwon Title : S3DNN: Supervised Streaming and Scheduling for GPU-Accelerated Real-Time DNN Workloads Author : Husheng Zhou, Soroush Bateni, Cong Liu (The University of Texas at Dallas) Abstract : Deep Neural Networks (DNNs) are being widely applied in many advanced embedded systems that require autonomous decision making, e.g., autonomous driving and robotics. To handle resource-demanding DNN workloads, graphics processing units (GPUs) have been used as the main acceleration engine. Although much research has been conducted to algorithmically optimize the efficiency of applying DNN to applications such as object recognition, limited attention has been given to optimizing the execution of GPU accelerated DNN workloads at the system level. In this paper, we propose S3DNN, a system solution that optimizes the execution of DNN workloads on GPU in a ... Continue reading →

142 Views

Rehearsal: Reducing Convolutional Neural Network for Object Detection on Embedded Devices

by Do Trung Hai on 2018-05-29 15:09:39

Date : 2018. 05. 30 (Wed) 13:00 Location : EB5. 533 Presenter : Trunghai Do Abstract : Convolutional neural networks (CNNs) have become the center of many computer vision solutions for a variety of tasks. However, deep neural networks demand large amounts of memory and computation, which makes them difficult to deploy effectively on limited hardware resources such as embedded systems. Furthermore, to deploy models to devices and update them regularly, the model size needs to be small. In this thesis, we propose DroidDet, a small fully convolutional neural network. DroidDet adopts the You Only Look Once (YOLO) object detection algorithm for the ARM Mali-T628 MP6 GPU of the ODROID-XU4. To build DroidDet, we not only replace all the fully connected layers that act as detection layers in YOLO with convolutional layers but also rearrange some of the convolutional layers to reduce the model size and ... Continue reading →

147 Views

Analysis of Problems in Linux Kernel Shared Memory Management in Manycore Environments

by Byungkyo Jung on 2018-05-17 20:49:05

Date : 2018.5.23 (Wed) 13:00 Location : EB5. 533 Presenter : Byeongkyo Cheong Title : Analysis of Problems in Linux Kernel Shared Memory Management in Manycore Environments Author : 서동주, 경주현, 임성수 Continue reading →

120 Views

HeartChat: Heart Rate Augmented Mobile Messaging to Support Empathy and Awareness

by Jinyoung Choi on 2018-05-11 16:24:57

Date : 2018. 5. 16 (Wed) 13:00 Location : EB5. 533 Presenter : Jinyoung Choi Title : HeartChat: Heart Rate Augmented Mobile Messaging to Support Empathy and Awareness (Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, May 06-11, 2017, ISBN : 978-1-4503-4655-9, doi : 10.1145/3025453.3025758) Author : Mariam Hassib, Daniel Buschek, Paweł W. Woźniak, Florian Alt (LMU Munich - Ubiquitous Interactive Systems Group; University of Stuttgart - VIS, Stuttgart, Germany) Abstract : Textual communication via mobile phones suffers from a lack of context and emotional awareness. We present a mobile chat application, HeartChat, which integrates heart rate as a cue to increase awareness and empathy. Through a literature review and a focus group, we identified design dimensions important for heart rate augmented chats. We created three concepts showing heart rate per message, in real-time, or sending it explicitly. ... Continue reading →

153 Views

Improving Linux Schedulability Analysis to Support Real-Time Task Groups and Deadline Tasks Simultaneously

by Daeyoung Song on 2018-04-18 16:13:32

Date : 2018.4.25 (Wed) 13:00 Location : EB5. 533 Presenter : Daeyoung Song Title : Improving Linux Schedulability Analysis to Support Real-Time Task Groups and Deadline Tasks Simultaneously Author : 임인구, 진현욱, 이상헌 Abstract : Linux is a general-purpose operating system that supports several schedulers, allowing different schedulers to coexist. In addition, Linux uses the Control Group (cgroup) to reserve CPU resources for task groups that follow the real-time (SCHED_FIFO, SCHED_RR) and non-real-time (SCHED_NORMAL) scheduler policies, except for the deadline scheduler (SCHED_DEADLINE). The cgroup performs the schedulability analysis to guarantee the reserved CPU resource as much as possible. However, the current implementation of the schedulability analysis does not distinguish between deadline tasks and real-time tasks. Therefore, if deadline tasks and real-time task groups coexist, there are cases where the resource reservation for the real-time task group is rejected. In this paper, we analyze the ... Continue reading →

117 Views

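Development of a Partitioning-Based High-Reliability Real-Time Operating System (RTWORKS) and Its Application to Weapon Systems

by Hyeoksoo Jang on 2018-04-05 16:56:11

Date : 2018.4.18 (Wed) 13:00 Location : EB5. 533 Presenter : Hyeoksoo Jang Title : Development of a Partitioning-Based High-Reliability Real-Time Operating System (RTWORKS) and Its Application to Weapon Systems Author : 손동환, 이화영, 임동혁 Continue reading →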

139 Views