

Last update : 20-08-30 11:43

2020-11-13 11:20-11:40 [A6-3] Unmanned Autonomous Navigation II

End-to-End Autonomous Driving Based on Reinforcement Learning with a Variational Autoencoder, and Its Performance Verification
정승환, 조상재, 백동희, 공승현*

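There are two broad approaches to training and operating autonomous vehicles. One builds perception, decision, and control algorithms on top of cameras, LiDAR, GPS, and high-definition maps. The other is end-to-end driving, which learns the optimal input-to-output relationship by itself, without a human designing the modules directly. End-to-end learning currently shows remarkable performance in many fields and already exceeds human performance in some. Provided that enough data can be collected, end-to-end driving is expected to learn on its own, find an optimized combination of its inputs, and outperform human-defined modular approaches. This paper introduces a technique that uses a variational autoencoder (VAE) to compress high-dimensional images from a monocular camera, which are hard to handle directly, into a disentangled low-dimensional space that is easy to handle. It then develops an end-to-end autonomous driving algorithm based on reinforcement learning in a continuous space, using the resulting low-dimensional latent variable as the state, and verifies its performance in a real road environment.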

Reinforcement Learning Based End-To-End Self-driving Using Variational AutoEncoder

Seung-Hwan Chung, Sangjae Cho, Donghee Paek, Seung-Hyun Kong*

There are two main approaches to training and operating autonomous vehicles. The first drives with separate perception, decision, and control modules built on cameras, LiDAR, GPS, and high-precision maps. The second is end-to-end driving, which replaces the hand-designed modules with a single learned model and so avoids human inductive bias. End-to-end learning methods currently show remarkable performance in many areas, and in some they already surpass humans. Assuming enough data can be collected, the end-to-end driving method is expected to outperform the human-designed modular approach, because it learns an optimized combination of its inputs from the data. In this paper, we use a variational autoencoder (VAE) to compress a high-dimensional image from a monocular camera into a low-dimensional vector that is easy to handle. We then develop an end-to-end autonomous driving algorithm based on reinforcement learning in a continuous space, using the latent variable as the state value, and verify its performance in an actual road environment.
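
The pipeline described in the abstract (image, then VAE latent, then a continuous-action policy) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the image size, latent size, two-dimensional action (for example steering and throttle), and the random linear maps standing in for trained networks are all hypothetical, and the abstract does not say which reinforcement learning algorithm or network architecture was used.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are assumptions for illustration only.
IMG_SHAPE = (64, 64)     # one grayscale camera frame
LATENT_DIM = 16          # size of the compressed state
ACTION_DIM = 2           # e.g. steering and throttle

# Random linear maps stand in for a trained encoder and policy.
IMG_DIM = IMG_SHAPE[0] * IMG_SHAPE[1]
W_mu = rng.normal(0.0, 0.01, (LATENT_DIM, IMG_DIM))
W_logvar = rng.normal(0.0, 0.01, (LATENT_DIM, IMG_DIM))
W_pi = rng.normal(0.0, 0.1, (ACTION_DIM, LATENT_DIM))


def encode(image):
    """Map an image to the mean and log-variance of q(z | x)."""
    x = image.reshape(-1)
    return W_mu @ x, W_logvar @ x


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), the regularizer in the VAE loss."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)


def policy(state, rng, log_std=-1.0):
    """Gaussian policy over continuous actions, clipped to [-1, 1]."""
    mean = np.tanh(W_pi @ state)
    noise = np.exp(log_std) * rng.standard_normal(mean.shape)
    return np.clip(mean + noise, -1.0, 1.0)


frame = rng.random(IMG_SHAPE)        # placeholder for a camera image
mu, logvar = encode(frame)
z = reparameterize(mu, logvar, rng)  # latent variable used as the RL state
action = policy(z, rng)
kl = kl_to_standard_normal(mu, logvar)
```

The point of the design is that the policy only ever sees the low-dimensional `z`, so the reinforcement learning problem is posed over `LATENT_DIM` numbers rather than over raw pixels.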

Speaker
정승환
Korea Advanced Institute of Science and Technology (KAIST)