It has been a while since I published my last blog post. I am back! Life has been a little crazy lately. I am in the midst of my senior year in high school.
Object Detection with YOLO
When we talk about object detection, there are really two smaller tasks embedded in one larger task. First, the computer needs to recognize all the types of objects in the photo. Second, the computer needs to find the precise locations of those objects. Accomplishing both of these well is harder than you might think.
Since AlexNet, there have been many great object detection deep learning models. In 2015, the groundbreaking ResNet became the front-runner in the ImageNet challenge. In 2017, K. He et al. proposed Mask R-CNN for both classification and localization. In other words, Mask R-CNN can detect and localize objects simultaneously. How great is that! But there is one catch: implementing Mask R-CNN on a budget self-driving car, especially my self-driving car, is next to impossible. The detection framerate on a GTX 1080 is about 0.75 frames per second, while the desirable framerate would be around 30 fps.
This is where YOLO comes in. According to the developers:
“You only look once (YOLO) is a state-of-the-art, real-time object detection system. On a Pascal Titan X it processes images at 30 FPS and has a mAP of 57.9% on COCO test-dev.”
Redmon, Joseph and Farhadi, Ali
Of course, YOLO’s detection accuracy is a little behind Fast Mask R-CNN (the newer version of the original paper). But most developers are willing to make that trade-off. YOLO works surprisingly well in many cases at an incredible 30 fps. Did I mention that you can download pretrained models and integrate them into your system very quickly?
Integrating YOLO In ROS
Now, it’s time to integrate YOLO into ROS. If you are unfamiliar, ROS (Robot Operating System) is the robotics middleware that I use for the golf cart.
Convert YOLOv3 to Keras

The golf cart uses Python and the machine learning library Keras. The first step is to convert the latest version of YOLO (v3) to Keras. I downloaded the pretrained YOLO models from the official website. Please note that I am using YOLOv3-tiny in my project. As you might expect, as the framerate goes up, the detection accuracy goes down. I simply don’t have the processing power to handle the larger (better) models. Then I used an open source project to do the conversion. If you want to do this on your own, you can find very good instructions on that project’s README page. The result is a *.h5 file that my Python code can load.
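To give you an idea of what using the converted model looks like, here is a minimal sketch. The file name `yolov3-tiny.h5` and the preprocessing details are assumptions for illustration; your converted file name may differ, and the raw network outputs still need to be decoded into boxes afterwards.

```python
import numpy as np


def preprocess(frame):
    """Normalize a uint8 RGB frame to the [0, 1] float32 batch YOLOv3 expects.

    Resizing the camera frame to the network input size (416x416 for
    YOLOv3-tiny) is assumed to happen before this step.
    """
    batch = frame.astype(np.float32) / 255.0   # scale pixel values to [0, 1]
    return np.expand_dims(batch, axis=0)       # add the batch dimension


if __name__ == "__main__":
    # Only needed at inference time; 'yolov3-tiny.h5' is a hypothetical
    # name for the file produced by the conversion project.
    from keras.models import load_model

    model = load_model("yolov3-tiny.h5")
    frame = np.zeros((416, 416, 3), dtype=np.uint8)  # stand-in for a camera frame
    outputs = model.predict(preprocess(frame))        # raw feature maps, decoded later
```

The nice part is that once the weights live in a .h5 file, loading them is one line of Keras, and the same preprocessing code can be reused inside the ROS node.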
Creating the Object Detection Node

I created an object detection node in ROS. This node listens to the camera input and performs object detection. The node publishes the detection results with a specific message type, and it can also publish a detection visualization. The result looks something like this:
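The structure of such a node can be sketched in a few lines of rospy. The topic names, the use of `cv_bridge`, and the `to_pixel_box` helper are all my assumptions for this sketch, not the exact names used on the golf cart; YOLO itself outputs boxes normalized to the image size, which is why a conversion helper like this is needed before drawing.

```python
#!/usr/bin/env python


def to_pixel_box(box, width, height):
    """Convert a (cx, cy, w, h) box normalized to [0, 1] into pixel corners.

    Returns (left, top, right, bottom) in pixel coordinates.
    """
    cx, cy, w, h = box
    left = int((cx - w / 2.0) * width)
    top = int((cy - h / 2.0) * height)
    right = int((cx + w / 2.0) * width)
    bottom = int((cy + h / 2.0) * height)
    return left, top, right, bottom


if __name__ == "__main__":
    import rospy
    from sensor_msgs.msg import Image
    from cv_bridge import CvBridge

    bridge = CvBridge()
    # Hypothetical topic name for the visualization output.
    pub = rospy.Publisher("/detection/visualization", Image, queue_size=1)

    def on_frame(msg):
        frame = bridge.imgmsg_to_cv2(msg, "bgr8")
        # Run YOLO on the frame here, convert each normalized box with
        # to_pixel_box(), draw the boxes, then republish the frame.
        pub.publish(bridge.cv2_to_imgmsg(frame, "bgr8"))

    rospy.init_node("object_detection")
    # Hypothetical camera topic; use whatever your camera driver publishes.
    rospy.Subscriber("/camera/image_raw", Image, on_frame)
    rospy.spin()
```

The subscriber callback is where the detection happens, and publishing the annotated frame back out makes it easy to inspect the results live in rviz or rqt_image_view.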

Conclusion
That’s how I implemented YOLO in ROS using rospy. Once again, the self-driving car project is open source on GitHub. You can also download the object detection node and integrate it into your own project. I would love to hear your project ideas if you end up doing that.
Please like this post if you enjoyed reading it. (I would really appreciate it.) Also, feel free to check out these posts documenting my process of building this self-driving golf cart. You can always reach me at contact@neilnie.com. Thanks for checking out my blog!
- Drive by wire system (DBW)
- Engineering
- Robot Operating System