RStudio for Keras and GPU

- April 30, 2019

This post is for students who run into problems while installing Keras for R. If you have an NVIDIA GPU but are stuck on errors, the following tips may help with a few of the issues that come up during installation. The sample code was written on Windows 10, and the explanation assumes a Windows user.

Anaconda must be installed before you install Keras for R. In my case, the NVIDIA driver and the CUDA toolkit were both already installed on the computer.

First, install keras from RStudio in the usual way.

install.packages("keras")
library(keras)
install_keras(method='conda', tensorflow='gpu')

Now open the Anaconda console and enter the following command.

activate r-tensorflow

The conda environment now switches to r-tensorflow. Let's install tensorflow-gpu from the anaconda channel.

conda install -c anaconda tensorflow-gpu

Next, change the numpy version.

pip uninstall numpy
pip install --upgrade numpy==1.16.1

The installation is complete.
Go back to RStudio and run the sample code.

library(keras)
use_condaenv("r-tensorflow")

max_features <- 20000
batch_size <- 32

# Cut texts after this number of words (among top max_features most common words)
maxlen <- 80

cat('Loading data...\n')
imdb <- dataset_imdb(num_words = max_features)

You may see a warning that Python cannot be found. It can be ignored.
Next, extract the data for training and testing.

x_train <- imdb$train$x
y_train <- imdb$train$y
x_test <- imdb$test$x
y_test <- imdb$test$y

cat(length(x_train), 'train sequences\n')
cat(length(x_test), 'test sequences\n')

cat('Pad sequences (samples x time)\n')
x_train <- pad_sequences(x_train, maxlen = maxlen)
x_test <- pad_sequences(x_test, maxlen = maxlen)
cat('x_train shape:', dim(x_train), '\n')
cat('x_test shape:', dim(x_test), '\n')

cat('Build model...\n')
model <- keras_model_sequential()
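To make the padding step above more concrete, here is a minimal base-R sketch of what pad_sequences() does with maxlen. It assumes the default behavior (zeros added on the left, long sequences cut from the front), and pad_pre is a hypothetical helper written only for illustration, not part of keras.

```r
# Base-R sketch of left-padding and front-truncation.
# Assumption: this mirrors the default settings of pad_sequences().
# pad_pre is a hypothetical helper, not a keras function.
pad_pre <- function(seqs, maxlen, value = 0L) {
  out <- matrix(value, nrow = length(seqs), ncol = maxlen)
  for (i in seq_along(seqs)) {
    s <- seqs[[i]]
    n <- length(s)
    if (n == 0) next
    if (n > maxlen) {
      s <- s[(n - maxlen + 1):n]  # keep only the last maxlen tokens
      n <- maxlen
    }
    out[i, (maxlen - n + 1):maxlen] <- s  # right-align, padding on the left
  }
  out
}

# Toy input: one short and one long sequence of word indices
toy <- list(c(5L, 8L), c(1L, 2L, 3L, 4L, 5L, 6L))
padded <- pad_pre(toy, maxlen = 4)
print(padded)
```

Every row ends up with exactly maxlen columns, which is why dim(x_train) reports a regular matrix after padding.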

Let's set the output vector of the word embedding to 128 dimensions and the input dimension to the number of unique words.

model %>%
  layer_embedding(input_dim = max_features, output_dim = 128) %>%
  layer_lstm(units = 64, dropout = 0.2, recurrent_dropout = 0.2) %>%
  layer_dense(units = 1, activation = 'sigmoid')
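As a rough sanity check on the size of this model, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard LSTM parameterization (four gates, each with input weights, recurrent weights, and a bias). The variable names are only for illustration.

```r
# Hand calculation of the trainable parameters in the three layers.
# Assumption: standard LSTM layout with 4 gates and one bias vector per gate.
max_features <- 20000  # vocabulary size (input_dim of the embedding)
embed_dim    <- 128    # output_dim of the embedding
lstm_units   <- 64

embedding_params <- max_features * embed_dim
lstm_params      <- 4 * (lstm_units * (embed_dim + lstm_units) + lstm_units)
dense_params     <- lstm_units * 1 + 1  # weights plus one bias

total_params <- embedding_params + lstm_params + dense_params

cat("embedding:", format(embedding_params, big.mark = ","), "\n")
cat("lstm:     ", format(lstm_params, big.mark = ","), "\n")
cat("dense:    ", format(dense_params, big.mark = ","), "\n")
cat("total:    ", format(total_params, big.mark = ","), "\n")
```

Almost all of the weights sit in the embedding layer, so lowering max_features is the quickest way to shrink the model.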

The goal of this deep learning task is binary classification of reviews as negative or positive, so we use binary crossentropy as the loss function. The optimizer is adam, as always.

# Try using different optimizers and different optimizer configs
model %>% compile(
  loss = 'binary_crossentropy',
  optimizer = 'adam',
  metrics = c('accuracy')
)

cat('Train...\n')
model %>% fit(
  x_train, y_train,
  batch_size = batch_size,
  epochs = 15,
  validation_data = list(x_test, y_test)
)

scores <- model %>% evaluate(
  x_test, y_test,
  batch_size = batch_size
)

cat('Test score:', scores[[1]], '\n')
cat('Test accuracy:', scores[[2]], '\n')
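The two numbers that evaluate() reports here are the loss and the accuracy. As an illustration of what they measure, the following base-R sketch computes both on a few made-up predictions. The helper names, the clipping constant eps, and the 0.5 threshold are assumptions for this example, not values read from keras.

```r
# Binary cross-entropy: average negative log-likelihood of the true labels.
# eps clips probabilities away from 0 and 1 so log() stays finite (assumed value).
binary_crossentropy <- function(y_true, y_prob, eps = 1e-7) {
  p <- pmin(pmax(y_prob, eps), 1 - eps)
  -mean(y_true * log(p) + (1 - y_true) * log(1 - p))
}

# Accuracy: share of predictions on the correct side of the threshold.
binary_accuracy <- function(y_true, y_prob, threshold = 0.5) {
  mean((y_prob > threshold) == (y_true == 1))
}

# Made-up labels and predicted probabilities, for illustration only
y_true <- c(1, 0, 1, 0)
y_prob <- c(0.9, 0.2, 0.4, 0.6)

loss <- binary_crossentropy(y_true, y_prob)
acc  <- binary_accuracy(y_true, y_prob)
cat("loss:", loss, "\n")
cat("accuracy:", acc, "\n")
```

A confident wrong prediction raises the loss much more than it lowers the accuracy, which is why the two numbers can move differently during training.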

Training is running on the GPU without any problems.

The laptop is seven years old, haha, so it is frustratingly slow to watch.
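If you want to watch the GPU while training runs, one option is to call nvidia-smi, the tool that ships with the NVIDIA driver, from a second R session. This is only a sketch: it assumes nvidia-smi is on the PATH, which is not always the case on Windows, and gpu_status is a hypothetical helper name. It shows what the driver reports, not what TensorFlow itself sees.

```r
# Return the text output of nvidia-smi, or NULL when the tool cannot be found.
# Assumption: the NVIDIA driver put nvidia-smi on the PATH.
gpu_status <- function() {
  exe <- unname(Sys.which("nvidia-smi"))
  if (!nzchar(exe)) return(NULL)
  tryCatch(
    suppressWarnings(system2(exe, stdout = TRUE, stderr = TRUE)),
    error = function(e) NULL
  )
}

status <- gpu_status()
if (is.null(status)) {
  cat("nvidia-smi was not found on the PATH\n")
} else {
  cat(status, sep = "\n")
}
```

While fit() is running, the process list at the bottom of the output should include the Python or R process that holds the GPU memory.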

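In summary:

First create the conda environment from RStudio. tensorflow-gpu does not get installed properly this way, however, so install it into the virtual environment yourself.
A numpy problem prevents datasets such as imdb from loading, so the Keras for R examples fail to run. Downgrade numpy to version 1.16.1 to fix this.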
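The numpy point can be turned into a small check. The sketch below rests on an assumption that this post does not state: that the loading error comes from numpy 1.16.3 and later, where loading pickled arrays is disabled by default. needs_downgrade is a hypothetical helper, and the cutoff is a parameter so you can adjust it.

```r
# Compare a numpy version string against the first version assumed to break
# dataset loading. The 1.16.3 cutoff is an assumption, not taken from this post.
needs_downgrade <- function(numpy_version, first_bad = "1.16.3") {
  numeric_version(numpy_version) >= numeric_version(first_bad)
}

print(needs_downgrade("1.16.1"))  # the version pinned in this post
print(needs_downgrade("1.16.3"))

# In a session where reticulate can see the r-tensorflow environment,
# the installed version could be read like this:
# np <- reticulate::import("numpy")
# needs_downgrade(np$`__version__`)
```

Later Keras releases may load the dataset without this workaround, so it is worth checking the version before pinning numpy.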

That's all. The situation should be similar on Linux or Mac.

RMaster - Dr. Kim's Homepage