After our somewhat unsuccessful demo last Wednesday, my partner Michael Meng exclaimed, “There is no hope for deep learning.” The future is not that grim, but Michael is right to a certain degree. Deep learning has flaws, and our deep-learning-powered self-driving golf cart certainly has lots of them. Today, I would like to discuss some of those flaws and how we will improve in the future.
If you are in the mood for some cheerful news, here is the post about our successful self-driving test and a video:
Deep Learning & Computer Vision
In case you are wondering, the exact jargon for our algorithm is supervised machine learning and behavioral cloning with Convolutional Neural Networks. What do those words mean?! First, the computer learns from labeled human driving data. Second, the algorithm does its own feature extraction; we don’t tell it what features to look for. As powerful as this is, it has many weaknesses.
Today, I will mainly focus on the steering ConvNet. If you want to check out my discussion about the segmentation network, you can find it here.
Our dataset gave us many issues. Simply put, the knowledge that the algorithm learned from our training dataset can’t be transferred to our testing environment. Our dataset contains around 60,000 frames of driving data from California. The weather in those frames is mostly sunny or slightly overcast.
- First, New England winter weather is only ever so slightly different from California’s. 🤔
- Second, based on our simulations, the network is good at detecting and following lane lines (not surprising). Bummer, Deerfield’s driveways don’t have lane lines. 😕
- Third, the network doesn’t perform well with shadows. Again, this is a common problem in deep learning and computer vision. There are methods, such as image augmentation, that try to address this kind of overfitting. In our system, the cart performed well with no shadows and decent lighting. However, when the environment changes, especially the lighting conditions, the algorithm breaks down. It pains me to write this because this problem is exactly what deep learning tries to address: the inherent limitation of human-extracted features in old-school computer vision techniques. (I say this knowing that deep learning does outperform these old techniques most of the time.)
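To make the image augmentation idea concrete, here is a minimal sketch of brightness and shadow augmentation. It assumes grayscale frames stored as nested lists of 0-255 integers. The function names, the brightness range, and the pure-Python representation are assumptions for illustration, not a description of any particular training pipeline.

```python
import random


def adjust_brightness(image, factor):
    """Scale every pixel of a grayscale image by factor, clamped to 0-255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def add_shadow(image, start_col, end_col, darkness=0.5):
    """Darken the columns in [start_col, end_col) to imitate a cast shadow."""
    return [
        [round(p * darkness) if start_col <= c < end_col else p
         for c, p in enumerate(row)]
        for row in image
    ]


def augment(image, rng):
    """Return a randomly relit copy of image; the steering label stays the same."""
    width = len(image[0])
    # Assumed range: anywhere from much darker to noticeably brighter.
    relit = adjust_brightness(image, rng.uniform(0.4, 1.6))
    # Pick two distinct column boundaries for the shadow band.
    start, end = sorted(rng.sample(range(width + 1), 2))
    return add_shadow(relit, start, end)


if __name__ == "__main__":
    frame = [[100, 120, 140, 160] for _ in range(3)]
    print(augment(frame, random.Random(0)))
```

Each training frame could be passed through something like `augment` several times, so the network sees the same road under many lighting conditions while the recorded steering angle stays fixed.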
Just like in the classroom, we want to apply what we have learned to the real world. Our model is bad at that.
Those problems show how limited a deep learning model can be. The lack of appropriate data is a common problem in deep learning. If you are into math, check out the bias-variance trade-off. More and better training data, with some improvement in network architecture, will partially address these issues.
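For the math-inclined, the standard form of that trade-off is worth writing down. Assume the true steering angle is y = f(x) + ε, where the noise ε has mean zero and variance σ², and that f̂ is the model learned from a randomly drawn training set. These symbols are introduced here only for illustration. The expected squared error at a frame x then splits into three parts:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

A model that memorizes sunny California frames sits on the high-variance side: a different training set would give a noticeably different model, which is one way to read its trouble with new lighting.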
Lessons about Robotics & Engineering
I know this post is supposed to be about Deep Learning, but I just want to throw in some engineering stuff. There is a lot to be learned about robotics and engineering from this project.
The most annoying experience is knowing that your system once performed perfectly but is not reliable enough to replicate that perfect performance.
We realized the power and importance of redundancy and reliability. (Generally speaking, we try to avoid redundancy in life.) In engineering, redundancy is the duplication of critical components or functions of a system with the intention of increasing reliability of the system, usually in the form of a backup or fail-safe, or to improve actual system performance.
First, we only had one camera on the front of the vehicle. We need to add more. We humans don’t simply rely on one eye to sense the world. Our vision, hearing, somatic sensation, taste, and olfaction systems work together to help us navigate this world. From an engineering standpoint as well, increasing the number and diversity of sensors can improve the reliability and robustness of the cart’s decision-making process.
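One simple way to picture what redundant sensors buy you is median voting. This is a hypothetical sketch, not something running on the cart: it assumes each sensor pipeline produces its own steering estimate in degrees and reports `None` when it fails. The function name and the `min_sensors` threshold are made up for illustration.

```python
from statistics import median


def fuse_steering(estimates, min_sensors=2):
    """Combine redundant steering estimates (degrees) into one command.

    A failed sensor reports None. Returns None when fewer than min_sensors
    are alive, so the caller can stop the cart instead of guessing.
    """
    alive = [e for e in estimates if e is not None]
    if len(alive) < min_sensors:
        return None
    # The median ignores a single wild outlier among three or more readings.
    return median(alive)


if __name__ == "__main__":
    print(fuse_steering([2.0, 2.5, 40.0]))   # one sensor is way off
    print(fuse_steering([2.0, None, 3.0]))   # one sensor is dead
    print(fuse_steering([None, None, 1.0]))  # too few sensors left
```

With a single camera there is nothing to vote with: one bad reading becomes one bad steering command.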
Secondly, even though we are proud of our custom-designed, custom-built steering mechanism, it’s far from perfect. The imprecision in the steering system certainly prevented the vehicle from reliably controlling itself. If you only turn the gear by a small amount, it will not precisely translate into the correct wheel movement. In essence, the cart could only make large adjustments because small ones did not have any actual effect on the direction of the vehicle.
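The steering problem can be modeled as a dead zone. The sketch below is a toy model under assumed numbers: the 3-degree dead zone is invented for illustration and was not measured on the cart, and `wheel_response` is a hypothetical name.

```python
def wheel_response(command_deg, dead_zone_deg=3.0):
    """Wheel angle change produced by a steering command, with a dead zone.

    Commands no larger than dead_zone_deg are absorbed by slack in the
    mechanism; larger ones move the wheels by whatever is left over.
    """
    if abs(command_deg) <= dead_zone_deg:
        return 0.0
    sign = 1.0 if command_deg > 0 else -1.0
    return sign * (abs(command_deg) - dead_zone_deg)


if __name__ == "__main__":
    for command in (1.0, -2.5, 5.0, -10.0):
        print(command, "->", wheel_response(command))
```

In this model, the small corrections a network makes while tracking a path never reach the wheels, so the cart drifts until the error is big enough to trigger a large correction.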
What will we do?
With all of this being said, I still have lots of hope in deep learning and computer vision. Some of the issues above are unique to our own system; others have puzzled researchers for years. I really believe that deep learning will be able to power autonomous vehicles, though of course not with cameras alone. (Check out Apple’s paper on Lidar and deep learning.)
We are considering the following:
- improve our steering mechanism.
- add more cameras and other sensors (radar, lidar?).
- collect our own dataset at Deerfield.
- improve network architecture (try ConvLSTM?).
- use segmentation results to help steer the cart.
- use GPS for global path planning.
If you liked the post, please hit the like button below. If you have any questions, comments or concerns, please don’t hesitate to comment below or reach out to me at firstname.lastname@example.org. Thanks for stopping by!