Real-Time Detection of Stairs
Finding an object detection use case
During the ideation phase, our group decided to develop a use case that could serve society free of charge via an open-source platform. We quickly realized that visually impaired people would be an ideal target audience for object recognition. One of their common problems is detecting obstacles in urban environments. Therefore, we decided to develop an application capable of detecting stairs, including their direction (up or down).
After defining our target use case, the technical implementation consisted of the following steps: data acquisition and labeling, data pre-processing, model training and selection, and deployment on the Jetson Nano device.
Model training and selection
Every object recognition task starts with a set of images showing the objects to be recognized. To build this set, we used our smartphones to take many pictures of stairs under different conditions (inside, outside, top, bottom, etc.). In addition, we retrieved images automatically by querying the Microsoft Bing API, Wikidata, and DBpedia. In total, over 1,000 images were collected. Below you can see some examples.
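When images come from several sources, the same file can end up in the collection more than once. The following is a minimal sketch of one way to drop byte-identical copies before labeling. It is an illustration only: the function name `deduplicate_images` and the directory layout are assumptions, and the write-up does not say whether or how duplicates were handled.

```python
import hashlib
from pathlib import Path
from typing import List, Set


def deduplicate_images(image_dir: str) -> List[Path]:
    """Return one path per distinct file content found under image_dir.

    Hypothetical helper: only byte-identical files count as duplicates.
    Resized or re-encoded copies of the same picture are NOT detected.
    """
    seen_digests: Set[str] = set()
    unique_paths: List[Path] = []
    # Sort so that the copy that is kept is deterministic across runs.
    for path in sorted(Path(image_dir).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen_digests:
            continue  # same bytes as a file that was already kept
        seen_digests.add(digest)
        unique_paths.append(path)
    return unique_paths
```

Hashing the raw bytes is cheap and has no false positives, but a perceptual hash would be needed to catch near-duplicates such as the same photo at a different resolution.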
We trained several networks to compare both prediction accuracy and recognition speed for our particular use case. Guided by the Jetson Nano inference benchmarks provided by NVIDIA, we chose a Faster R-CNN using ResNet50 as a feature extractor, an SSD Mobilenet-V2, and a Tiny YOLO network (TinyYoloV4). In each case, we started from a pre-trained model and fine-tuned it on our collected and labeled images.
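Fine-tuning a YOLO-family model with Darknet-style tooling usually expects one text line per labeled box, in the form `class x_center y_center width height`, with all coordinates normalized by the image size. The sketch below converts a pixel-space box into that format. The function name and the class numbering (0 for upward stairs, 1 for downward stairs) are assumptions made for illustration; the write-up does not state which labeling tool or class IDs were used.

```python
from typing import Tuple


def to_yolo_label(
    class_id: int,
    box: Tuple[float, float, float, float],
    image_width: int,
    image_height: int,
) -> str:
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    # YOLO stores the box center and size, each divided by the image size.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: an upward staircase (class 0) in a 400x500 image.
print(to_yolo_label(0, (100, 50, 300, 250), 400, 500))
# prints "0 0.500000 0.300000 0.500000 0.400000"
```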
Due to the poor performance of Faster R-CNN, only SSD Mobilenet-V2 and TinyYoloV4 were considered for use on the Jetson Nano. As can be seen in the figure above, SSD Mobilenet-V2 outperformed TinyYoloV4 only in staircase detection. In contrast, TinyYoloV4 performed better at distinguishing between upward and downward stairs. Since this distinction is an important aspect of our use case, we chose TinyYoloV4 as our main model and deployed the trained TinyYoloV4 model on the Jetson Nano.
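The comparison above separates two questions: whether a staircase is found at all, and whether its direction is classified correctly. A minimal sketch of how these two scores could be computed is shown below. It assumes one labeled staircase per image and an intersection-over-union (IoU) threshold of 0.5 for counting a detection; both are assumptions for illustration, since the write-up does not describe the evaluation protocol that was actually used.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Sample = Tuple[Box, str, Optional[Box], Optional[str]]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def evaluate(samples: List[Sample], iou_threshold: float = 0.5) -> Tuple[float, float]:
    """Return (detection_rate, direction_accuracy).

    Each sample is (true_box, true_direction, predicted_box, predicted_direction);
    the predicted fields are None when the model reported nothing.
    """
    detected = 0
    direction_correct = 0
    for true_box, true_dir, pred_box, pred_dir in samples:
        if pred_box is None or iou(true_box, pred_box) < iou_threshold:
            continue  # staircase missed or box too far off
        detected += 1
        if pred_dir == true_dir:
            direction_correct += 1
    detection_rate = detected / len(samples) if samples else 0.0
    # Direction accuracy is measured only on staircases that were found.
    direction_accuracy = direction_correct / detected if detected else 0.0
    return detection_rate, direction_accuracy
```

Reporting the two scores separately makes it possible for one model to lead on detection while another leads on direction, which is the situation described above.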
Key takeaways
The cheap fisheye camera on the Jetson produces quite pixelated images, but it performs better than expected.
We had problems detecting stairs that are poorly lit. Therefore, our method may need an infrared camera or additional sensors to work reliably in all lighting conditions.
The angle at which the staircase is seen plays an important role in staircase detection. In this regard, further research should be done to find the best position to wear the device.
It seems that sometimes the model recognizes the stair railing and not the stair itself. Therefore, the training images should be more accurately labeled with polygons, and additional images of stairs without railings should be taken.