Sunday, April 26, 2020

Run YOLOv3 on Raspberry Pi with the Intel Neural Compute Stick 2

While working on a personal project, I decided to run YOLOv3 on a Raspberry Pi. I mean the full YOLOv3, not the tiny version. Given the availability of decent tutorials on the internet, it did not take too long to get things working. As expected, the inference results were great, but at roughly 12 seconds per frame it became clear that the Pi (3B+) was never designed for such tasks.


As a solution, I added a VPU (Vision Processing Unit) in the form of an Intel Neural Compute Stick 2, also known as the Intel Movidius, or simply NCS2. This is a sub-$100 USB compute stick made solely for neural network inference.


In order to use the Intel Movidius, you need to install the necessary software on the Raspberry Pi. In addition, before a neural network model can run on it, the model needs to be converted to the 'Intermediate Representation' (IR) format. This post covers the necessary steps: installing the software, converting the Darknet YOLOv3 model to IR, and running a demonstration on the Raspberry Pi.

The document is split into the following sections:
  • Install Raspberry Pi Buster
  • Install OpenVINO
  • Prepare a Docker image for conversion to IR
  • Convert to IR
  • Run YOLOv3 Demo on the Raspberry Pi with camera input

Conventions used in this document:


pi$ : shell commands to be given at the Raspberry Pi
linux$ : shell commands to be given at a Linux machine (Ubuntu in my case)
docker$ : shell commands to be given at a docker container

Install Raspberry Pi Buster

Insert a new Raspberry Pi SD card in your Linux machine and check the device name. The following commands are useful for this:

linux$ sudo fdisk -l
linux$ lsblk


(The SD card is sde in this example, but yours may have a different device name; make sure to change it so you do not accidentally overwrite your data.)
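If you prefer to script this check, recent versions of lsblk can emit JSON (`lsblk -J`), which is easy to filter for removable disks. A minimal sketch using a canned sample of that output (the field names match lsblk's JSON, but exact value types can vary between util-linux versions):

```python
import json

# `lsblk -J` emits the block-device tree as JSON; filtering on the
# removable flag makes the SD card easy to spot. The sample below is
# canned for illustration - in practice you would feed in the real
# output of `lsblk -J`.
sample = """{"blockdevices": [
  {"name": "sda", "rm": false, "size": "477G",  "type": "disk"},
  {"name": "sde", "rm": true,  "size": "29.7G", "type": "disk"}
]}"""

devices = json.loads(sample)["blockdevices"]
removable = [d["name"] for d in devices if d["rm"] and d["type"] == "disk"]
print(removable)  # -> ['sde']
```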

Download Raspbian Buster Lite at: 
(buster-lite is fine)

Unzip the downloaded file and put the Buster image to the Raspberry Pi SD card:

linux$ sudo dd if=2020-02-13-raspbian-buster-lite.img of=/dev/sd[e] status=progress bs=4M

Enable SSH by creating an empty file named 'ssh' on the boot partition:

linux$ lsblk
sde      8:64   1  29.7G  0 disk 
├─sde1   8:65   1   256M  0 part 
└─sde2   8:66   1   1.5G  0 part 

linux$ sudo mkdir /mnt/sd
linux$ sudo mount /dev/sde1 /mnt/sd
linux$ sudo touch /mnt/sd/ssh
linux$ sudo umount /mnt/sd

Put the SD card in the Pi and start it up.
Figure out the IP address and log in over SSH:

linux$ ssh pi@<ip address of Pi> 

username: pi
password: raspberry

One of the first things to do is to make the full capacity of the SD card available; do that as follows:

pi$ sudo raspi-config 

Choose: Advanced Options -> Expand Filesystem

Install the following software:
pi$ sudo apt-get update
pi$ sudo apt-get install git
pi$ sudo apt-get install cmake
pi$ sudo apt-get install libatlas-base-dev
pi$ sudo apt-get install python3-pip
pi$ sudo apt install libgtk-3-dev

pi$ pip3 install --upgrade pip
pi$ pip3 install numpy

Install OpenVINO

OpenVINO is an Intel toolkit that contains a copy of OpenCV along with the drivers and tools needed to manage models and run them on the Intel Movidius. There is a runtime version for the Raspberry Pi; follow the installation steps below. They are based on the following document: https://docs.openvinotoolkit.org/latest/_docs_install_guides_installing_openvino_raspbian.html



At the Raspberry Pi, do:
pi$ mkdir -p ~/projects/openvino
pi$ cd ~/projects/openvino
pi$ wget https://download.01.org/opencv/2020/openvinotoolkit/2020.1/l_openvino_toolkit_runtime_raspbian_p_2020.1.023.tgz

Untar the download:
pi$ tar -xzvf l_openvino_toolkit_runtime_raspbian_p_2020.1.023.tgz


Make the OpenVINO environment initialize automatically when the Pi starts:

pi$ echo "source /home/pi/projects/openvino/l_openvino_toolkit_runtime_raspbian_p_2020.1.023/bin/setupvars.sh" >> ~/.bashrc
pi$ source ~/.bashrc 

You should see:

[setupvars.sh] OpenVINO environment initialized


Finally add the USB rules:

pi$ sudo usermod -a -G users "$(whoami)"
pi$ sh ~/projects/openvino/l_openvino_toolkit_runtime_raspbian_p_2020.1.023/install_dependencies/install_NCS_udev_rules.sh

At this point the NCS2 should work, and I advise testing it before continuing.

Plug the Intel Movidius USB stick into the Pi, create a file "openvino_fd_myriad.py", and add the code below (as explained in “Run Inference of Face Detection Model Using OpenCV* API” at


import cv2 as cv

# Load the model.
net = cv.dnn_DetectionModel('face-detection-adas-0001.xml',
                            'face-detection-adas-0001.bin')

# Specify target device.
net.setPreferableTarget(cv.dnn.DNN_TARGET_MYRIAD)
# Read an image.
frame = cv.imread('face.jpg')
if frame is None:
    raise Exception('Image not found!')
# Perform an inference.
_, confidences, boxes = net.detect(frame, confThreshold=0.5)
# Draw detected faces on the frame.
for confidence, box in zip(list(confidences), boxes):
    cv.rectangle(frame, box, color=(0, 255, 0), thickness=3)
# Save the frame to an image file.
cv.imwrite('out.jpg', frame)

Make sure to download the following pre-trained Face Detection model instead of the one mentioned in the above document:

pi$ wget https://download.01.org/opencv/2019/open_model_zoo/R3/20190905_163000_models_bin/face-detection-adas-0001/FP32/face-detection-adas-0001.xml
pi$ wget https://download.01.org/opencv/2019/open_model_zoo/R3/20190905_163000_models_bin/face-detection-adas-0001/FP32/face-detection-adas-0001.bin


  • Upload a photo to the Pi and name it face.jpg
  • Run the face detection code: pi$ python3 openvino_fd_myriad.py

If all works, the outcome (out.jpg) should be something like this:






Prepare a Docker image for conversion to IR

Converting a model to the OpenVINO Intermediate Representation (IR) needs to be done with a full OpenVINO toolkit installation, and thus not on the Raspberry Pi, which only has the runtime version. In this post I use an Ubuntu Linux machine; it should also work on macOS and Windows, but the steps may differ.

To make things more manageable, and to avoid messing up your current Linux installation, I created a Dockerfile that contains all the components needed for the conversion.


linux$ mkdir ~/projects
linux$ cd ~/projects
linux$ git clone https://github.com/brunokeymolen/devops.git
linux$ cd ~/projects/devops/docker-images/openvino-movidius


Download a full OpenVINO toolkit from the following link and put it next to the Dockerfile:


linux$:~/projects/devops/docker-images/openvino-movidius(master)$ ls -lGg
total 496316
-rw-r--r-- 1      2766 Apr 12 15:02 Dockerfile
-rw-r--r-- 1 508213676 Apr  5 11:04 l_openvino_toolkit_p_2020.1.023.tgz
-rw-r--r-- 1       431 Apr  6 20:07 README


Open the Dockerfile and, if needed, change:
ARG OPENVINO_TOOLKIT_NAME=l_openvino_toolkit_p_2020.1.023.tgz


Build the Docker image:
linux$ docker build -t openvino-movidius .

If all went well, the image should look something like this:
linux$ docker image ls | grep openvino-movidius
openvino-movidius            latest              89b2c076f373        2 weeks ago         3.86GB

Get the YOLOv3 scripts to convert the weights.

First, on the Linux host, download YOLOv3 (choose whether you want yolov3 or yolov3-tiny). Because it is needed only temporarily, I placed it in the /tmp directory; adapt the location if you want.


linux$ mkdir /tmp/openvino
linux$ cd /tmp/openvino
linux$ git clone https://github.com/mystic123/tensorflow-yolo-v3.git
linux$ cd tensorflow-yolo-v3
linux$ git checkout ed60b90



Get the coco class names and download the model weights to the same directory:
linux$ wget https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names
linux$ wget https://pjreddie.com/media/files/yolov3.weights
linux$ wget https://pjreddie.com/media/files/yolov3-tiny.weights

The result should be similar to this:
linux$ :/tmp/openvino/tensorflow-yolo-v3((HEAD detached at ed60b90))$ ls -lGg
total 276868
-rw-r--r-- 1       625 Apr 26 09:32 coco.names
-rw-r--r-- 1      3219 Apr 26 09:30 CODE_OF_CONDUCT.md
-rw-r--r-- 1      1552 Apr 26 09:30 convert_weights_pb.py
-rw-r--r-- 1      1474 Apr 26 09:30 convert_weights.py
-rw-r--r-- 1      3225 Apr 26 09:30 demo.py
-rw-r--r-- 1     11357 Apr 26 09:30 LICENSE
-rw-r--r-- 1      2335 Apr 26 09:30 README.md
-rw-r--r-- 1     10595 Apr 26 09:30 utils.py
-rw-r--r-- 1      9306 Apr 26 09:30 yolo_v3.py
-rw-r--r-- 1      4030 Apr 26 09:30 yolo_v3_tiny.py
-rw-r--r-- 1  35434956 Apr 26 09:33 yolov3-tiny.weights
-rw-r--r-- 1 248007048 Apr 26 09:32 yolov3.weights




Run the container (sharing the /tmp/openvino directory):
linux$ docker run -ti --device-cgroup-rule='c 189:* rmw' -v /dev/bus/usb:/dev/bus/usb -v /tmp/openvino:/mnt/host openvino-movidius




Convert to IR

There are two steps in the conversion to IR:
  • convert the YOLO weights file to a .pb file
  • convert the .pb file to IR
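The two-step pipeline can be sketched as a small driver script. This is illustrative glue only: it assembles the exact command lines used in this post (with the paths from the openvino-movidius container) without running them.

```python
# Illustrative only: build the two conversion command lines used in
# this post. They are meant to be run inside the openvino-movidius
# container, not on the host.
MO_ROOT = "/opt/intel/openvino_2020.1.023/deployment_tools/model_optimizer"

# Step 1: Darknet weights -> frozen TensorFlow graph (.pb)
step1 = ("python3 convert_weights_pb.py "
         "--class_names coco.names "
         "--data_format NHWC "
         "--weights_file yolov3.weights")

# Step 2: frozen .pb -> OpenVINO IR (.xml + .bin)
step2 = (f"python3 {MO_ROOT}/mo_tf.py "
         "--input_model frozen_darknet_yolov3_model.pb "
         f"--tensorflow_use_custom_operations_config {MO_ROOT}/extensions/front/tf/yolo_v3.json "
         "--batch 1 --generate_deprecated_IR_V7")

print(step1)
print(step2)
```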

Convert the YOLO weights file to a .pb file
This step uses the scripts from the tensorflow-yolo-v3.git repository, which is in the /tmp directory on the Linux host but needs to be executed from within the Docker container; we use the shared directory for that.

docker$ cd /mnt/host/tensorflow-yolo-v3/
docker$ python3 convert_weights_pb.py --class_names coco.names --data_format NHWC --weights_file yolov3.weights

In case you need Yolov3-tiny:
docker$ python3 convert_weights_pb.py --class_names coco.names --data_format NHWC --weights_file yolov3-tiny.weights --tiny

You will notice a bunch of warnings but at the end, the following message appears:
1186 ops written to frozen_darknet_yolov3_model.pb.

Check if the .pb file is created:
docker$:/mnt/host/tensorflow-yolo-v3# ls -lGg | grep frozen_darknet_yolov3_model.pb
-rw-r--r-- 1 248192514 Apr 26 07:45 frozen_darknet_yolov3_model.pb

Convert the .pb file to IR

docker$ cd /opt/intel/openvino_2020.1.023/deployment_tools/model_optimizer 
docker$ export MO_ROOT=`pwd`

YOLOv3:
docker$ python3 mo_tf.py --input_model /mnt/host/tensorflow-yolo-v3/frozen_darknet_yolov3_model.pb --tensorflow_use_custom_operations_config $MO_ROOT/extensions/front/tf/yolo_v3.json --batch 1 --generate_deprecated_IR_V7

YOLOv3-tiny (note that convert_weights_pb.py writes frozen_darknet_yolov3_model.pb by default, so rename the output or pass --output_graph when converting the tiny model):
docker$ python3 mo_tf.py --input_model /mnt/host/tensorflow-yolo-v3/frozen_darknet_yolov3_tiny_model.pb --tensorflow_use_custom_operations_config $MO_ROOT/extensions/front/tf/yolo_v3_tiny.json --batch 1 --generate_deprecated_IR_V7

The result is the YOLOv3 model in IR format.
docker$  ls -lGg
...
-rw-r--r-- 1 247691380 Apr 26 07:55 frozen_darknet_yolov3_model.bin
-rw-r--r-- 1     29847 Apr 26 07:55 frozen_darknet_yolov3_model.mapping
-rw-r--r-- 1    108182 Apr 26 07:55 frozen_darknet_yolov3_model.xml
...
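The .xml file describes the network topology, while the .bin file holds the weights. As a sketch of what the .xml contains, the snippet below parses a miniature, hand-written IR-style fragment; the real frozen_darknet_yolov3_model.xml follows the same net/layers/layer element structure, just far larger.

```python
import xml.etree.ElementTree as ET

# A toy fragment mimicking the structure of an OpenVINO IR .xml file.
# The layer names and ids here are made up for illustration.
toy_ir = """
<net name="frozen_darknet_yolov3_model" version="7">
  <layers>
    <layer id="0" name="inputs" type="Input"/>
    <layer id="1" name="conv2d_1" type="Convolution"/>
  </layers>
</net>
"""

root = ET.fromstring(toy_ir)
for layer in root.iter("layer"):
    # Each <layer> entry names one node of the network graph.
    print(layer.get("name"), layer.get("type"))
```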


Copy them to the host and scp them to the Raspberry Pi.

docker$ cp frozen_darknet_yolov3_model.bin /mnt/host/tensorflow-yolo-v3/.
docker$ cp frozen_darknet_yolov3_model.xml /mnt/host/tensorflow-yolo-v3/.

pi$ mkdir ~/models

linux$ cd /tmp/openvino/tensorflow-yolo-v3
linux$ scp frozen_darknet_yolov3_model.xml pi@<ip address>:~/models/.
linux$ scp frozen_darknet_yolov3_model.bin pi@<ip address>:~/models/.
linux$ scp coco.names  pi@<ip address>:~/models/.

Run YOLOv3 Demo on the Raspberry Pi with camera input

In this step, we use the previously converted Darknet YOLOv3 model.



The OpenCV Open Model Zoo has the following object detection demo:

At the Raspberry Pi:
pi$ mkdir ~/yolov3demo
pi$ cd ~/yolov3demo
pi$ wget https://raw.githubusercontent.com/opencv/open_model_zoo/master/demos/python_demos/object_detection_demo_yolov3_async/object_detection_demo_yolov3_async.py

The code has video output, so if you connect over SSH, log in with X forwarding as follows: 
linux$ ssh pi@<ip address> -Y

You might need to install X on the Pi for the ssh -Y option; I did that by installing xterm (which pulls in all the necessary libraries):
pi$ sudo apt install xterm


If a camera is attached to the Raspberry Pi:
pi$ cd ~/yolov3demo
pi$ python3 object_detection_demo_yolov3_async.py -m ~/models/frozen_darknet_yolov3_model.xml --labels ~/models/coco.names -d MYRIAD -i cam -pc
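Under the hood, the demo script decodes the raw YOLO output tensors into boxes. A minimal sketch of that decoding math for a single grid cell, following the YOLOv3 paper (the function and variable names here are illustrative, not taken from the demo code, which also handles multiple output scales and non-maximum suppression):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, col, row, anchor_w, anchor_h,
               grid_size, input_size):
    """Decode raw YOLOv3 offsets for one cell into (cx, cy, w, h) pixels."""
    cell = input_size / grid_size            # size of one grid cell in pixels
    cx = (col + sigmoid(tx)) * cell          # box center x
    cy = (row + sigmoid(ty)) * cell          # box center y
    w = anchor_w * math.exp(tw)              # box width, scaled from the anchor
    h = anchor_h * math.exp(th)              # box height, scaled from the anchor
    return cx, cy, w, h

# Example: zero offsets put the box center in the middle of cell (6, 6)
# of a 13x13 grid for a 416x416 input, with the box matching the anchor.
# (116, 90) is one of the real YOLOv3 anchors for the coarsest scale.
print(decode_box(0, 0, 0, 0, col=6, row=6, anchor_w=116, anchor_h=90,
                 grid_size=13, input_size=416))
# -> (208.0, 208.0, 116.0, 90.0)
```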




Inference Server
If you don't have a camera, no worries: check my next post, where I'll explain how to turn your Raspberry Pi, with the Intel Movidius, into an inference server accessible over a web interface.




2 comments:

  1. Hey! Thanks for the tutorial, very well-written. So what was the frame rate that you eventually got? Like, does YOLOv3 (not the tiny version) run in real time when using the NCS2 with a Pi?

     Reply: On my current Raspberry Pi 3, the forward pass is around 540 ms (based on a few individual frames, not benchmarks); the postprocessing takes another 250 ms, but that's not optimized and it depends on the use case.

     Stats: {'forward-pass-ms': 538, 'postprocess-ms': 247, 'total-ms': 833, 'class-info': [[0, 'person', 1.0], [39, 'bottle', 0.98583984375], [56, 'chair', 0.97216796875], [41, 'cup', 0.82275390625]], 'roi': [{'top-left': (1295, 632), 'bottom-right': (1737, 1980), 'label': 'person', 'confidence': 1.0}, {'top-left': (404, 1625), 'bottom-right': (575, 2159), 'label': 'bottle', 'confidence': 0.98583984375}, {'top-left': (2303, 1292), 'bottom-right': (2975, 2288), 'label': 'chair', 'confidence': 0.97216796875}, {'top-left': (711, 2205), 'bottom-right': (910, 2303), 'label': 'cup', 'confidence': 0.82275390625}]}