Image Annotation for Deep Learning
Innovators in artificial intelligence (AI) have developed ways for machines to think more like humans. Deep learning, a type of machine learning that imitates how the human brain works, uses artificial neural networks (ANNs) to classify, differentiate, solve problems, and even learn on their own based on feedback about their successes and errors. These networks employ a variety of interconnected processing elements that work together: an input layer that receives data, hidden layers that perform computations, and an output layer that produces results or initiates action.
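The layer structure described above can be sketched as a tiny forward pass. The layer sizes, the random weights, and the choice of ReLU and softmax here are illustrative assumptions for the sketch, not details from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 4 input features, one hidden layer of 8 units,
# and 3 output classes.
W1 = rng.normal(scale=0.1, size=(4, 8))   # input -> hidden weights
b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 3))   # hidden -> output weights
b2 = np.zeros(3)

def forward(x):
    """One pass through the network: input layer -> hidden layer -> output layer."""
    hidden = np.maximum(0.0, x @ W1 + b1)   # hidden layer with ReLU activation
    logits = hidden @ W2 + b2               # output layer
    exp = np.exp(logits - logits.max())     # softmax turns scores into probabilities
    return exp / exp.sum()

probs = forward(np.array([0.5, -1.2, 0.3, 0.9]))
print(probs)  # three class probabilities that sum to 1
```

During training, feedback arrives as a loss computed on labeled examples, and the weights are adjusted to reduce it; this sketch shows only the inference path.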
Training a Deep Learning Model
Deep learning neural networks need training to work as intended, and data scientists accomplish this using labeled data sets. For deep learning with a machine vision component, training involves more than teaching a model to identify an image, although that is part of it. A deep learning model may also need to detect how close or large something is, direct actions based on reaching a particular threshold, or make decisions based on probability.
Data scientists use the process of image annotation to prepare data to train models using computer vision and to validate that the model is producing the desired results.
In general, image annotation uses three processes to make images recognizable and usable to deep learning models:
- Classification: Image classification teaches the model to identify objects based on their properties. Images used to train a classification model typically contain one main object.
- Object detection: This type of image annotation trains the model to identify objects and locate them accurately. Object detection data uses "bounding boxes": when an image contains multiple items, drawing bounding boxes around the items of interest trains the model to identify and locate them.
Drawing bounding boxes effectively takes skill: each box must enclose the entire object yet be drawn tightly enough to exclude background that the model could mistakenly learn as part of the object.
- Image segmentation: Image annotation for deep learning often requires detailed precision. Image segmentation tells the model, down to the pixel level, what is part of a particular object and what is not. Unlike bounding boxes, which can overlap and capture background, segmentation masks follow object boundaries exactly, training the model to identify objects with high accuracy.
The precision necessary requires the right tools. Image annotators use a pen or stylus to outline the object and then shade it to differentiate it from other objects in the image. If an image was taken at night or in low light, the annotator may have to adjust its brightness and contrast to make it clear which parts of the image belong to the object the model is being trained to identify and classify. The process may also involve instance segmentation for nested images, which separates regions of the image so that individual objects can be distinguished.
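How tightly a bounding box is drawn can be quantified with intersection-over-union (IoU), a standard overlap measure between an annotated box and the object it should enclose. The box coordinates below are made-up values for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width/height clamp to zero if the boxes do not intersect.
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

# A tightly drawn box vs. a loose box around the same ground-truth object.
ground_truth = (10, 10, 50, 50)
tight = (12, 11, 49, 52)
loose = (0, 0, 80, 80)
print(iou(ground_truth, tight))  # high overlap
print(iou(ground_truth, loose))  # much lower: the loose box is mostly background
```

A loose box scores poorly because most of its area is background the model could wrongly learn as part of the object, which is exactly why tight annotation matters.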
How Image Annotation Delivers Results
When a deep learning model is trained with carefully labeled images, the technology can produce groundbreaking results.
A familiar example is self-driving cars. The deep learning models that power autonomous vehicles use image data from cameras as well as data from sensors, mapping and localization systems, and the vehicle itself. How the model is trained is crucial for the safety of passengers and pedestrians. Within view of a self-driving car's camera, there could be stationary objects such as utility poles or trees, moving objects such as other vehicles or animals, and people. A deep learning model must learn to classify them all and control the vehicle accordingly.
Deep learning has also proven to be an effective tool in medical diagnosis, sometimes delivering results equal to or better than those of human experts. Researchers at Johns Hopkins developed technology that detects age-related macular degeneration as accurately as human ophthalmologists.
A Duke University Health System app, Autism & Beyond, built on Apple's ResearchKit framework, is another example of a deep learning application. It used the front-facing camera of an iPhone and facial recognition algorithms in a study on diagnosing autism in young children.
Data Quality is Key
Training a deep learning model is wasted effort if the data set is inaccurately labeled or the image annotation process is careless and inconsistent. The model will produce results that reflect the quality of its training data.
Take as much care in image annotation and developing training and validation data sets as you devote to building the deep learning ANN. The payoff will be a model you can count on for reliable performance.
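One routine part of developing those data sets is carving annotated examples into separate training and validation splits, so the model is validated on images it never saw during training. The file names, labels, and 80/20 split below are illustrative assumptions.

```python
import random

# Hypothetical annotated data set: (image path, label) pairs.
annotations = [(f"img_{i:03d}.jpg", i % 3) for i in range(100)]

def split(data, val_fraction=0.2, seed=42):
    """Shuffle once with a fixed seed, then carve off a validation slice."""
    items = data[:]
    random.Random(seed).shuffle(items)
    n_val = int(len(items) * val_fraction)
    return items[n_val:], items[:n_val]  # (train, validation)

train_set, val_set = split(annotations)
print(len(train_set), len(val_set))  # 80 training examples, 20 held out
```

Fixing the shuffle seed keeps the split reproducible, so validation scores remain comparable across training runs.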
Opinions expressed by Daivergent contributors are their own.