Autonomous Mobile Robot's Vision System
A Paper Presented
Priyanka Rahul Dahiwal
Submitted to the Faculty of Technische
Universität Kaiserslautern in fulfillment of the requirements of
the subject of Scientific Writing for the degree of
MASTER OF SCIENCE
COMMERCIAL VEHICLE TECHNOLOGY
Table of Contents
List of Figures
List of Acronyms
Abstract
Introduction
Autonomous Mobile Robot's Vision System
Vision Sensors
Image Representation
Object Detection
Invariant Local Features
Object Recognition using SIFT (Scale Invariant Feature Transform)
Conclusion
References
List of Figures
Figure 1: Classification of AMR systems
Figure 2: Working Principle of CCD Camera
Figure 3: Working Principle of CMOS Camera
Figure 4: Original Image with RGB Filters
Figure 5: HSV Color Model
Figure 6: YCbCr Color Model
Figure 7: Lab Color Model
Figure 8: Causes of Discontinuities Leading to an Edge
Figure 9: Canny Edge Detection Algorithm
Figure 10: Corner Detection Using a Window Function
Figure 11: Harris Corner Detection Algorithm
Figure 12: FAST Algorithm
Figure 13: Corner as Edge
Figure 14: SIFT Algorithm
Figure 15: Object Recognition using SIFT Algorithm
List of Acronyms
AMR Autonomous Mobile Robot
CCD Charge Coupled Devices
CMOS Complementary Metal-Oxide-Semiconductor
RGB Red Green Blue
HSV Hue Saturation Value
YCbCr Luminance, Chrominance blue, Chrominance red
FAST Features from Accelerated Segment Test
SIFT Scale Invariant Feature Transform
Abstract
This paper gives an overview of the Autonomous Mobile Robot's (AMR) vision system. It explains how an AMR captures an image through vision sensors, namely charge coupled device (CCD) and complementary metal-oxide-semiconductor (CMOS) cameras. After an image is acquired, it is represented in one of various color spaces. These include the Red Green Blue (RGB), Hue Saturation Value (HSV), and Luminance Chrominance blue Chrominance red (YCbCr) color spaces. After acquisition and representation in the respective color space, objects are detected through various algorithms, depending on the application. During object detection, the object's edges or corners are detected to separate it from the rest of the image. The Canny edge detection and Harris corner detection algorithms are generally used for autonomous mobile robots. Features from Accelerated Segment Test (FAST) is also a corner detection algorithm. FAST requires less computation than the other algorithms and is therefore used for real-time applications of AMRs. As edges and corners are sensitive to rotation and scale variation respectively, local features that are invariant to these properties are chosen and detected instead. Object recognition is then performed on these local features using the Scale Invariant Feature Transform (SIFT).
Introduction
An Autonomous Mobile Robot is a robot that is able to navigate through its environment autonomously while performing goal-oriented tasks (Berns, 2019). Nowadays, these robots are gaining importance in every field, from industrial to personal areas. For AMRs, as the degree of autonomy of task execution increases, the degree of unstructuredness of the environment also increases. As the unstructuredness of the environment increases, it becomes more challenging for robots to navigate safely without damaging people, the surroundings, or the robot itself.
Figure 1. Classification of AMR- systems. (Berns, 2019)
To overcome this challenge, various sensor systems and corresponding signal processing systems are installed on robots. These sensor systems collect the external and internal states of the robot's environment and help the robot perform its designated tasks without damaging the surroundings, humans, or the robot itself. These sensors can be classified as distance, position, vision, and laser sensors. Among these, vision sensors are inexpensive and efficient to use.
As mobile robots also work in an aware environment, where human interaction is unavoidable, an AMR's vision system must be precise and accurate. The system's performance and accuracy depend on the overall performance of every unit of the vision system.
Consider the example of automated guided vehicles, which are part of autonomous transportation in industry. These robots transport goods from one place to another through an aware environment. This environment consists of the robot's trajectories, humans, machinery, and other guided vehicles. In this scenario, the robot should follow its trajectory and be capable of identifying the starting point, the destination, and the goods to transport. Along with this, it should recognize objects in the moving environment, such as humans, machinery, and other guided vehicles. After recognizing these objects, it should act accordingly. For example, if a human is recognized in a frame, the robot should stop or take another trajectory; after identifying the destination,
it should drop the goods; and if another guided vehicle comes close, it should stop or send a stop signal to the other vehicle.
To achieve this goal, the robot's vision system should be efficient, accurate, and cost-effective. Depending upon the requirements of the application, the components, the detection method (edge or corner), and the respective algorithms must be chosen such that these conditions are fulfilled.
The future of robot vision is vast and challenging. Before moving toward that future, understanding the basic concepts of a robot's vision system helps to build the pillars on which future advancements rest.
Autonomous Mobile Robot's Vision System
The vision system of an AMR consists of vision sensors and a digital image processing unit. Vision sensors are cameras, which capture the scene in the form of an image. This image is provided as input to the digital image processing unit. Based on its output, actuators receive signals and act according to the situation.
Vision Sensors
Vision sensors can be considered the eyes of an AMR. They are cameras installed on the AMR, whose main task is to continually capture images as the AMR moves through the environment. This is a time-dependent environment, in other words an aware environment. In an aware environment, where human intervention is present, this task needs to be carried out error-free to ensure safety. These sensors are exteroceptive and passive in nature, meaning that they analyze the external environment surrounding the robot and require no emitted input signal to obtain an output. Two types of vision sensors are used for AMRs:
Charge coupled device (CCD) camera. A CCD is a monolithic device made of the semiconductor material silicon. An area of a few square centimeters on a CCD element can represent 576 × 487 pixels (Silva, year).
The CCD camera is built with several photodiodes. When the photodiodes are exposed to photons (light), a charge is induced in them. This charge depends on the intensity of the incident light, and electrical signals are generated corresponding to the charge intensity. These electrical signals are read out sequentially, i.e., the charge is transferred from the photodiodes to analog-to-digital converters one after another. Each photodiode can represent one pixel or a combination of neighboring pixels.
Figure 2. Working Principle of CCD Camera (Berns,2019)
CMOS camera. In a CMOS camera, each photodiode is connected in series with a resistor, and the photon beam is continuously converted into an electrical signal (voltage). Each photodiode can be individually accessed, and hence the charge-to-voltage conversion is done in parallel, which in turn increases the processing speed (Berns, 2019).
Figure 3. Working Principle of CMOS Camera (Berns,2019)
Image Representation
The image obtained through the camera serves as input to the object detection algorithms. Before further processing, images are preprocessed. Color images can be represented in various forms depending upon the application. These representations are called color spaces: the RGB color space, the HSV color space, the YCbCr color space, the Lab color space, and the grayscale image.
RGB color space. The RGB color space describes colors as a combination of red, green, and blue. The human visual system works similarly to this color model, and hence this color space is mostly used in computer vision. Its channels are strongly correlated, and it is non-perceptual in nature.
Figure 4. Original Image with RGB Filters (Berns, 2019).
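As a minimal illustration of this layout, an RGB image can be held as a height × width × 3 array; the per-channel views of Figure 4 are then obtained by zeroing the other two channels. The tiny 2 × 2 test image below is made up for illustration:

```python
import numpy as np

# A tiny 2x2 RGB image (values 0-255): red, green, blue, and white pixels.
image = np.array([[[255, 0, 0], [0, 255, 0]],
                  [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# Each channel is a 2D intensity plane; zeroing the green and blue
# channels yields the red-filtered view of the original image.
red_only = image.copy()
red_only[:, :, 1:] = 0

print(image.shape)       # (2, 2, 3)
print(red_only[1, 1])    # only the red component of the white pixel survives
```

Because the channels are stored interleaved per pixel, any of the three filtered views costs only one slice-assignment, which is part of why RGB is convenient for computer vision despite its correlated channels.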
HSV color space. The HSV color model appears as a cone of colors. Its components are hue, saturation, and value. Hue identifies the base color as an angle from 0 to 360 degrees, starting at red (0 degrees) and running through the spectrum, with magenta near the upper end, back to red at 360 degrees. Saturation represents the admixture of gray in a particular color, from 0 to 100 percent; decreasing the saturation increases the gray component and produces a faded color. The saturation scale can also be given from 0 to 1, where 0 is pure gray and 1 is the pure color. The value indicates the brightness and, together with the saturation, the color intensity. The value ranges from 0 to 100 percent, where 0 is black and 100 percent is the full brightness of the color (Berns, 2019).
Figure 5. HSV Color Model (Berns, 2019)
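The RGB-to-HSV conversion can be sketched with Python's standard `colorsys` module; the small wrapper function below (an illustrative helper, not part of any cited work) only rescales hue to the 0-360 degree range described above:

```python
import colorsys

def rgb_to_hsv_degrees(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation 0-1, value 0-1)."""
    # colorsys works on floats in [0, 1]; hue comes back as a fraction
    # of a full turn, so multiply by 360 to obtain degrees.
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h * 360, s, v

print(rgb_to_hsv_degrees(255, 0, 0))      # pure red: hue 0, full saturation
print(rgb_to_hsv_degrees(128, 128, 128))  # gray: saturation 0
```

The gray example shows the property exploited in thresholding: any achromatic pixel has saturation 0 regardless of its brightness, so hue/saturation tests are far less sensitive to lighting than raw RGB comparisons.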
YCbCr color space. The components of this model are luminance (Y), chrominance blue (Cb), and chrominance red (Cr). This color model is fast to compute and easy to compress, and is therefore used in TV. The human eye is sensitive to luminance, but far less sensitive to chrominance blue and chrominance red. During compression, chrominance detail can therefore be removed, as it does not carry much information with respect to human perception.
Figure 6. YCbCr Color Space (Berns, 2019)
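Several YCbCr variants exist; the sketch below uses the ITU-R BT.601 full-range form (the variant used by JPEG) to show why chrominance carries so little information for achromatic content:

```python
def rgb_to_ycbcr(r, g, b):
    """ITU-R BT.601 full-range RGB -> YCbCr conversion (JPEG variant)."""
    y  =       0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# For any gray pixel both chroma channels sit at the neutral value 128,
# so chroma can be subsampled aggressively without visible loss.
print(rgb_to_ycbcr(200, 200, 200))
```

Note that luminance is a weighted sum dominated by green, matching the eye's sensitivity; the two chroma channels encode only the blue and red deviations from that luminance.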
Lab color space. This color space is device-independent. The Lab model is a three-axis system, and the color of an object in this model is measured with a spectrophotometer. The lightness (L) axis is the vertical axis, running from white (+L) to black (-L). The a axis runs horizontally from cyan (-a) to red (+a), and the b axis runs horizontally from blue (-b) to yellow (+b).
Figure 7. Lab color model. (Berns, 2019)
Object Detection
After acquiring an image, analysis of the image is carried out. In this analysis, the objects present in the scene are detected and recognized. Detecting an object means separating the object of interest from the rest of the image. For example, an industrial carrier robot captures a scene containing a person, its trajectory, its destination, and machinery. In this situation, the robot needs to separate the person from the machinery, its trajectory from other routes, and its destination point from those of other robots. The detected objects can thus be divided into a person, machinery, a trajectory, and a destination point. The assignment of these objects to their respective classes is done in the recognition phase; the classes here are the person, machinery, route, and destination-point classes.
Objects can be detected based on various features. These features include geometric (size, shape), visual (color, texture, image features), physical (weight, temperature, motion), acoustic (noise, acoustic pattern), chemical (emission) (Berns,2019).
Visual features are detected using images captured by cameras (CCD and CMOS).
Visual features. Visual features can be extracted from images. Extracting them is fast and cheap, and since they contain much information, a large amount of data is available to process. These visual features include
shape (contour detection)
edges (edge filtering)
material (texture extraction, color thresholding)
motion (optical flow, temporal images)
distance (stereo-vision) (Berns,2019)
An object can be detected by detecting the object's edges, by detecting its corners, or by detecting special invariant visual features of the object.
Object detection is based on two fundamental properties of the image: discontinuity of intensity values and similarity of intensity values (Gonzalez & Woods, 2002).
Edge detection of objects. Detection of the edges of an object is based on the discontinuity of intensity values in an image. An edge can be defined as the boundary formed by a set of connected pixels between two regions. The shape information of an object is recovered through edge detection. Edges can be caused by various factors such as surface normal discontinuity, depth discontinuity, surface color discontinuity, and illumination discontinuity. These discontinuities can be understood from the following figure.
Figure 8. Causes of discontinuities leading to Edge (Berns, 2019)
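The idea that an edge is an intensity discontinuity can be made concrete with a toy example. The 5 × 5 patch below is made up for illustration, and a plain horizontal difference stands in for a real gradient filter:

```python
import numpy as np

# A 5x5 grayscale patch with a vertical step from dark (10) to bright (200).
patch = np.full((5, 5), 10, dtype=float)
patch[:, 3:] = 200

# Horizontal intensity differences: large only where the step (edge) lies.
gx = np.abs(np.diff(patch, axis=1))
edge_columns = np.where(gx[0] > 50)[0]
print(edge_columns)   # index of the column pair containing the discontinuity
```

Inside either uniform region the difference is zero; only across the step does it jump, which is exactly the discontinuity the gradient-based detectors of the next section look for.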
Detection of edges can be done with various methods, such as thresholding the gradient of a smoothed image, the Marr-Hildreth algorithm, or the Canny edge detection algorithm. Of these, Canny's method is mostly used for AMRs.
Canny edge detector algorithm. The performance of this algorithm is superior to that of other edge detection methods. The method is more complex, but the results are worth the complexity. The objectives of Canny's method can be summarized as follows:
Low error rate: all edges should be detected, without false results.
Well-localized edge points: the edges found should correspond to true edges, and the distance between a point marked as an edge by the detector and the true edge point should be minimal.
Single edge point response: the detector should return only one edge point for each true edge point; in other words, there should be no multiple detections of a single true edge point (Gonzalez & Woods, 2018).
Steps for Canny Edge Detection Algorithm.
Gaussian filter smoothing of the input image.
Gradient magnitude and angle images are calculated.
Nonmaxima suppression is applied to the gradient magnitude image.
Double thresholding and connectivity analysis are performed to detect and link edges (Gonzalez & Woods, 2018).
Figure 9. Canny Edge Detection Algorithm (Berns, 2019).
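The four steps above can be sketched in NumPy. This is a deliberately simplified illustration, not a full Canny implementation: the nonmaxima suppression here works only row-wise instead of along the gradient direction, zero padding produces spurious responses at the image border, and `np.roll` wraps around at the edges. All function names and the test image are illustrative.

```python
import numpy as np

def convolve2d(img, kernel):
    # Naive 'same'-size filtering with zero padding (illustration only).
    kh, kw = kernel.shape
    pad = np.pad(img, ((kh // 2,), (kw // 2,)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(pad[i:i + kh, j:j + kw] * kernel)
    return out

def canny_sketch(img, low, high):
    # 1. Gaussian smoothing (3x3 approximation of a Gaussian kernel).
    g = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16.0
    smoothed = convolve2d(img, g)
    # 2. Gradient magnitude via Sobel filters (the angle image, used by
    #    real Canny for step 3, is omitted in this sketch).
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    gx = convolve2d(smoothed, sx)
    gy = convolve2d(smoothed, sx.T)
    mag = np.hypot(gx, gy)
    # 3. Crude nonmaxima suppression: keep only row-wise local maxima.
    nms = np.where((mag >= np.roll(mag, 1, 1)) &
                   (mag >= np.roll(mag, -1, 1)), mag, 0)
    # 4. Double thresholding and linking: keep strong edges, plus weak
    #    edges that are 8-connected to a strong one.
    strong = nms >= high
    weak = (nms >= low) & ~strong
    keep = strong.copy()
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            keep |= weak & np.roll(np.roll(strong, di, 0), dj, 1)
    return keep

# Vertical step image: dark left half, bright right half.
img = np.zeros((8, 8))
img[:, 4:] = 255.0
edges = canny_sketch(img, low=200, high=600)
print(edges[4])   # the step (columns 3-4) fires; zero padding also
                  # triggers a spurious response at the right border
```

A production implementation (e.g., in an image processing library) would suppress along the interpolated gradient direction and handle borders explicitly; the sketch only shows how the four stages feed into one another.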
Problems related to edge detection in AMRs. As autonomous mobile robots are in motion, their perception of the environment changes continuously. As a result, the edges of objects also change position, which makes edges an unstable feature. Moving edges can lead to ambiguity because of the aperture problem.
To avoid this situation, the object's corners are detected instead, which improves object detection.