One of the most exciting applications of Artificial Intelligence & Machine Learning is in image processing, and incidentally, we’re mapmakers! Let’s jump down a rabbit hole, shall we?
Artificial Intelligence (AI) and Machine Learning (ML) have been widely used in various applications, including image processing and analysis. In image processing, AI and ML can detect missing roads and buildings from aerial images. This task is vital in many fields, such as urban planning, disaster management, and mobility.
Roads detected by Facebook’s ML algorithms, as seen on the RAPiD Editor.
Pre-processing
The first step in detecting missing roads and buildings from aerial images is to pre-process the images. This step includes tasks such as image resizing, normalisation, and segmentation. A good data scientist standardises all input images to the same size to avoid biases that may creep in due to the availability of detail or the lack of it.
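As a minimal sketch of the resizing step, the function below brings every image to one common size using nearest-neighbour sampling in NumPy. The function name `resize_nearest` and the 256 x 256 target are assumptions made for illustration; a real pipeline would more likely use an image library with better interpolation.

```python
import numpy as np

def resize_nearest(image: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Resize a (H, W) or (H, W, C) array to (out_h, out_w) by nearest-neighbour sampling."""
    in_h, in_w = image.shape[:2]
    # Map each output row/column index back to a source row/column index.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    # Broadcasting the two index arrays picks one source pixel per output pixel.
    return image[rows[:, None], cols]

# Two hypothetical aerial tiles of different sizes end up with the same shape.
tile_a = np.zeros((300, 200, 3), dtype=np.uint8)
tile_b = np.zeros((1024, 768, 3), dtype=np.uint8)
batch = [resize_nearest(t, 256, 256) for t in (tile_a, tile_b)]
```

Once every tile has the same shape, the images can be stacked into a single array and fed to the later steps together.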
The next pre-processing step involves standardising the brightness and contrast in images. As with any photograph, including those taken on the ground, different lighting conditions affect camera sensors differently. In cases where the sensors are not properly attuned to the prevalent lighting conditions, the highlights, shadows and mid-tones can appear unbalanced. Unbalanced images end up over-exposed, under-exposed or washed out. A sensor error can also propagate these conditions through the entire series of images. Since aerial and satellite images are taken using passes over a given patch of land, re-capturing the images is often either not an option or expensive in terms of time and money. Standardisation of images in terms of brightness and contrast helps in the following ways:
- Variations in brightness and contrast can affect the algorithm's ability to detect and classify objects. Normalisation improves the confidence with which objects are detected.
- Normalising the brightness and contrast can improve the overall quality of the images, making it easier for the algorithms to work with them. This normalisation can improve the algorithms' accuracy and the system's overall performance.
- Normalising the brightness and contrast can reduce the effects of lighting conditions, such as shadows and reflections, making the algorithms more robust to variations in lighting.
- Normalising the brightness and contrast can also reduce the effects of sensor noise and other image artefacts, which can be a significant source of error in image processing systems.
- Normalising the brightness and contrast can also improve the overall consistency of the images in a dataset, which is important for training ML models.
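One simple way to standardise brightness and contrast is a percentile-based contrast stretch, sketched below in NumPy. This is only one possible technique, not necessarily the one used in any particular mapping pipeline; the function name and the 2nd/98th percentile defaults are assumptions for illustration.

```python
import numpy as np

def normalise_brightness_contrast(image, low_pct=2.0, high_pct=98.0):
    """Rescale intensities so the chosen percentiles map to 0.0 and 1.0."""
    img = image.astype(np.float64)
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:
        # A completely flat image carries no contrast to stretch.
        return np.zeros_like(img)
    # Values outside the percentile range are clipped, which limits the
    # influence of a few very dark or very bright pixels.
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)

# The same scene captured "dark" and "bright and contrasty" (synthetic data).
dark = np.linspace(10, 60, 100).reshape(10, 10)
bright = dark * 2 + 50
same_after_normalising = np.allclose(
    normalise_brightness_contrast(dark),
    normalise_brightness_contrast(bright),
)
```

In this synthetic case the two captures differ only by a brightness offset and a contrast factor, so they normalise to the same values. Real lighting differences are rarely that clean, but the effect on consistency across a dataset is similar.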
Extracting Objects & Image Segmentation
Once the Data Scientist pre-processes the images, the next step is to segment them to extract objects. Image segmentation separates the objects in the image, such as roads, buildings and trees, among other things. Once separated, the objects are taken aside, and the location is marked, which can then be further used to populate a map.
Several image segmentation algorithms can segment images, including:
- Thresholding: This is a simple and basic algorithm that segments an image based on a threshold value. The algorithm converts the image into a binary image by setting all pixels with a value greater than the threshold value to white and all pixels with a value less than the threshold value to black.
- Watershed algorithm: This algorithm floods the image from different markers, typically defined as the local minima of the image. The algorithm finds the local minima and then "floods" the image from these markers, separating the different regions of the image.
- Region-based algorithms: These algorithms segment an image by grouping pixels that are similar in some way, such as colour, texture, or intensity. These algorithms are typically based on clustering techniques, such as k-means or mean-shift.
- Edge-based algorithms: These algorithms segment an image by detecting edges, or boundaries, in the image. These algorithms are typically based on gradient-based techniques, such as the Canny edge detector or the Sobel operator.
- Object-based algorithms: These algorithms segment an image by detecting objects, such as roads, buildings, or people, in the image. These algorithms typically use object detection techniques, such as YOLO, R-CNN, or RetinaNet.
- Deep learning-based algorithms: These algorithms segment an image using a deep neural network, which is trained to segment the image using a dataset of labeled images. These algorithms can be more accurate than the other techniques, but they are also more computationally expensive.
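Thresholding, the first technique in the list above, is simple enough to sketch in a couple of lines of NumPy. The function name is an assumption, and so is the choice to send pixels exactly equal to the threshold to black, since the description above only covers values greater than or less than the threshold.

```python
import numpy as np

def threshold(gray: np.ndarray, t: float) -> np.ndarray:
    """Return a binary image: 255 (white) where gray > t, otherwise 0 (black)."""
    # Assumption: pixels exactly equal to t are treated as black.
    return np.where(gray > t, 255, 0).astype(np.uint8)

gray = np.array([[10, 200],
                 [128, 129]])
binary = threshold(gray, 128)
# binary is [[0, 255], [0, 255]]
```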
It's worth noting that the choice of algorithm depends on the characteristics of the images, the required outcome, and the computational resources available.
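To give a feel for the edge-based family from the list, here is a rough sketch of the Sobel operator written in plain NumPy. The helper names and the edge threshold are assumptions for illustration; the two 3x3 kernels are the standard Sobel kernels.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def filter3x3(gray: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide a 3x3 kernel over a 2-D image, repeating border pixels as padding."""
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    h, w = gray.shape
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_edges(gray: np.ndarray, edge_threshold: float) -> np.ndarray:
    """Return a boolean mask that is True where the gradient magnitude is large."""
    gx = filter3x3(gray, SOBEL_X)  # horizontal intensity change
    gy = filter3x3(gray, SOBEL_Y)  # vertical intensity change
    return np.hypot(gx, gy) > edge_threshold

# Synthetic tile: dark on the left, bright on the right.
tile = np.zeros((6, 6))
tile[:, 3:] = 100
edges = sobel_edges(tile, edge_threshold=200)
```

On this synthetic tile, the mask is True only in the two columns on either side of the dark-to-bright boundary, which is the kind of outline an edge-based method would then try to turn into a road or building boundary.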
Accuracy of AI and the role of the Community
AI algorithms depend a lot on rules they already know to detect and predict objects of interest. For example, an algorithm will almost always connect two roads that nearly intersect each other. In reality, there may be a barrier between them which is not clear enough for the algorithm to see. Roads and their connectivity to each other are a crucial component of navigation. Where these connections are incorrect, the navigation system will try to route the user through impossible routes, causing bad routing and a good deal of anguish.
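The failure mode described above can be illustrated with a deliberately naive rule, sketched below. It is a hypothetical example written for this post, not how any particular production system works: it proposes a connection between any two road endpoints that lie within `max_gap` of each other, and it has no way of knowing whether a barrier sits in that gap.

```python
import math

def near_miss_connections(endpoints, max_gap):
    """Return index pairs of road endpoints that lie within max_gap of each other."""
    pairs = []
    for i in range(len(endpoints)):
        for j in range(i + 1, len(endpoints)):
            # Hypothetical rule: close enough means "probably connected".
            if math.dist(endpoints[i], endpoints[j]) <= max_gap:
                pairs.append((i, j))
    return pairs

# Endpoints 0 and 1 are 3 units apart; endpoint 2 is far away.
endpoints = [(0.0, 0.0), (0.0, 3.0), (50.0, 50.0)]
proposed = near_miss_connections(endpoints, max_gap=5.0)
# proposed is [(0, 1)], whether or not a wall stands between the two roads
```

The rule is right most of the time, which is exactly why its mistakes are hard to spot without someone who knows the area.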
Generally speaking, a community knows their area better than anyone else. Open data and mapping projects are created with the same ethos, allowing the community to be in control via direct access or feedback loops to make maps that serve the people interacting with the area. The challenge with community inputs is that they are not standardised, and that the frequency and enthusiasm with which a community contributes to the maps of its area vary.
The Road Ahead
AI is growing to be more ubiquitous by the day, and the scale at which algorithms can process and output data will solve the most significant problem statements faced by maps today: concurrency, vintage and the ability to react to changes in the world. A reactive community that gives feedback to the algorithm where it goes wrong will help it improve and increase the accuracy with which it detects and predicts objects on the map.
Cheers!