Information on entry-level machine learning
In this article, basic information about image recognition with Core ML is explained. For clarity, the subject is worked through with sample code.
What is CoreML?
Apple's machine learning framework.
With Core ML, you can integrate trained machine learning models into your app.
A trained model is the result of applying a machine learning algorithm to a set of training data. The model makes predictions based on new input data. For example, a model that's been trained on a region's historical house prices may be able to predict a house's price when given the number of bedrooms and bathrooms.
-Apple Documentation
Note: To use the Core ML framework, you need Xcode 9 or later.
What is Machine Learning?
Machine Learning (ML) is a branch of Artificial Intelligence (AI). Its purpose is to take input data (images, text, speech, statistics) and predict features or behaviors found in that data.
ML allows computers, in our case smartphones, to find hidden insights without being explicitly programmed where to look.
Examples:
- Face detection
- Facial features detection
- Image Recognition
- Age prediction
input --> Trained Model --> output
Why is machine learning important?
ML can perform tasks that we thought only humans could do. It adds a human touch to your apps. It makes apps feel "smart".
Conclusion
- ML is not new (only native iOS support is new)
- ML field is growing exponentially
- ML can make your apps do more
Creating User Interface
For a basic application, an image view, a label, and a button will suffice.
Note: The designed version of the application is available in the GitHub repo.
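If you'd rather not use the storyboard, a rough programmatic sketch of the same layout might look like the following. The frame values are arbitrary example numbers; the rest of the article still assumes the storyboard outlets and action shown below.
// Hypothetical programmatic alternative to the storyboard layout.
// Frame values are arbitrary; the article itself uses @IBOutlet/@IBAction.
override func viewDidLoad() {
    super.viewDidLoad()

    let imageView = UIImageView(frame: CGRect(x: 20, y: 80, width: 335, height: 335))
    imageView.contentMode = .scaleAspectFit

    let resultLabel = UILabel(frame: CGRect(x: 20, y: 430, width: 335, height: 40))
    resultLabel.textAlignment = .center

    let selectButton = UIButton(type: .system)
    selectButton.frame = CGRect(x: 20, y: 480, width: 335, height: 44)
    selectButton.setTitle("Select Photo", for: .normal)
    selectButton.addTarget(self, action: #selector(selectBtnTapped(_:)), for: .touchUpInside)

    [imageView, resultLabel, selectButton].forEach { view.addSubview($0) }
}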
Choosing Photo with UIImagePickerController
First of all, in ViewController.swift, define the outlets for the image view and the label, plus the button's tap handler.
@IBOutlet weak var imageView: UIImageView!
@IBOutlet weak var resultLabel: UILabel!

override func viewDidLoad() {
    super.viewDidLoad()
    // ...
}

@IBAction func selectBtnTapped(_ sender: Any) {
    // ...
}
Before implementing the selectBtnTapped method, declare conformance to UIImagePickerControllerDelegate and UINavigationControllerDelegate next to UIViewController so that the picker can be used.
class ViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate {
...
Now it's ready to present the picker.
@IBAction func selectBtnTapped(_ sender: Any) {
    let picker = UIImagePickerController()
    picker.delegate = self
    picker.allowsEditing = true
    picker.sourceType = .photoLibrary
    self.present(picker, animated: true, completion: nil)
}
And then, implement the didFinishPickingMediaWithInfo function to complete the photo selection.
func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
    imageView.image = info[UIImagePickerControllerEditedImage] as? UIImage
    self.dismiss(animated: true, completion: nil)
}
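The snippet above uses the Swift 4.0-era API that matches Xcode 9. If you are following along with Swift 4.2 or later, the delegate signature and info keys have changed; a roughly equivalent version would be:
// Swift 4.2+ variant of the same delegate method (newer key type and names).
func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey : Any]) {
    imageView.image = info[.editedImage] as? UIImage
    self.dismiss(animated: true, completion: nil)
}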
Finally, add a row named "Privacy - Photo Library Usage Description" (the raw key is NSPhotoLibraryUsageDescription) to the Info.plist file. In the "Value" column, write the message you want shown to the user.
Downloading Model and Creating Functions
Download the Places205-GoogLeNet model from Apple's Machine Learning page.
Let's embed GoogLeNetPlaces.mlmodel into our project.
And then, import CoreML and Vision into ViewController.swift.
import UIKit
import CoreML
import Vision
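If you want to double-check what the model expects, you can inspect the class Xcode generates for the embedded .mlmodel file at runtime; the exact input and output names come from the model file itself:
// Optional sanity check: print the model's expected inputs and outputs.
// (The names and image sizes are defined by the .mlmodel file.)
let modelDescription = GoogLeNetPlaces().model.modelDescription
print(modelDescription.inputDescriptionsByName)
print(modelDescription.outputDescriptionsByName)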
Implement the recognizeImage function below didFinishPickingMediaWithInfo. The function takes one parameter named image, of type CIImage.
func recognizeImage(image: CIImage) {
    // ...
}
Define a selectedImage variable of type CIImage.
@IBOutlet weak var imageView: UIImageView!
@IBOutlet weak var resultLabel: UILabel!
var selectedImage = CIImage()
Convert the image, taken as a UIImage, to a CIImage, and call recognizeImage() inside didFinishPickingMediaWithInfo.
func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
    imageView.image = info[UIImagePickerControllerEditedImage] as? UIImage
    self.dismiss(animated: true, completion: nil)
    // Convert the selected UIImage to a CIImage before handing it to Vision.
    if let uiImage = imageView.image, let ciImage = CIImage(image: uiImage) {
        self.selectedImage = ciImage
        recognizeImage(image: ciImage)
    }
}
Creating Request with VNCoreMLRequest
First of all, assign "I'm investigating..." to resultLabel.text.
func recognizeImage(image: CIImage) {
    resultLabel.text = "I'm investigating..."
}
Assign the return value of the VNCoreMLModel initializer to the model variable.
func recognizeImage(image: CIImage) {
    resultLabel.text = "I'm investigating..."
    if let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) {
    }
}
Create the request with VNCoreMLRequest, passing the model variable we created above as the model parameter.
func recognizeImage(image: CIImage) {
    resultLabel.text = "I'm investigating..."
    if let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) {
        let request = VNCoreMLRequest(model: model, completionHandler: { (vnrequest, error) in
        })
    }
}
Cast the request's results to [VNClassificationObservation] and assign them to results.
func recognizeImage(image: CIImage) {
    resultLabel.text = "I'm investigating..."
    if let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) {
        let request = VNCoreMLRequest(model: model, completionHandler: { (vnrequest, error) in
            if let results = vnrequest.results as? [VNClassificationObservation] {
            }
        })
    }
}
Assign the first result to topResult.
func recognizeImage(image: CIImage) {
    resultLabel.text = "I'm investigating..."
    if let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) {
        let request = VNCoreMLRequest(model: model, completionHandler: { (vnrequest, error) in
            if let results = vnrequest.results as? [VNClassificationObservation] {
                let topResult = results.first
            }
        })
    }
}
Add a DispatchQueue.main.async block so the prediction can be displayed back on the main thread (the Vision completion handler may run on a background queue).
func recognizeImage(image: CIImage) {
    resultLabel.text = "I'm investigating..."
    if let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) {
        let request = VNCoreMLRequest(model: model, completionHandler: { (vnrequest, error) in
            if let results = vnrequest.results as? [VNClassificationObservation] {
                let topResult = results.first
                DispatchQueue.main.async {
                }
            }
        })
    }
}
Compute the confidence of topResult as a percentage and assign it to confidenceRate. Then set resultLabel.text to display it.
func recognizeImage(image: CIImage) {
    resultLabel.text = "I'm investigating..."
    if let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) {
        let request = VNCoreMLRequest(model: model, completionHandler: { (vnrequest, error) in
            if let results = vnrequest.results as? [VNClassificationObservation] {
                let topResult = results.first
                DispatchQueue.main.async {
                    let confidenceRate = (topResult?.confidence ?? 0) * 100
                    self.resultLabel.text = "\(confidenceRate)% it's \(topResult?.identifier ?? "Anonymous")"
                }
            }
        })
    }
}
Handler with VNImageRequestHandler
First, below the request (still inside the if let model block), create a handler constant with VNImageRequestHandler, passing the ciImage.
let handler = VNImageRequestHandler(ciImage: image)
Then wrap the work in DispatchQueue.global(qos: .userInteractive).async so the request is performed on a background queue, complementing the DispatchQueue.main.async block used earlier to display the result.
let handler = VNImageRequestHandler(ciImage: image)
DispatchQueue.global(qos: .userInteractive).async {
}
Create a do-catch block and process the request with handler.perform().
let handler = VNImageRequestHandler(ciImage: image)

DispatchQueue.global(qos: .userInteractive).async {
    do {
        try handler.perform([request])
    } catch {
        print("Err :(")
    }
}
It's done. The final state of the recognizeImage function:
func recognizeImage(image: CIImage) {
    resultLabel.text = "I'm investigating..."
    if let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) {
        // Completion handler runs when the classification finishes.
        let request = VNCoreMLRequest(model: model, completionHandler: { (vnrequest, error) in
            if let results = vnrequest.results as? [VNClassificationObservation] {
                let topResult = results.first
                // UI updates must happen on the main thread.
                DispatchQueue.main.async {
                    let confidenceRate = (topResult?.confidence ?? 0) * 100
                    let rounded = (confidenceRate * 100).rounded() / 100   // keep two decimal places
                    self.resultLabel.text = "\(rounded)% it's \(topResult?.identifier ?? "Anonymous")"
                }
            }
        })
        let handler = VNImageRequestHandler(ciImage: image)
        // Perform the request on a background queue so the UI stays responsive.
        DispatchQueue.global(qos: .userInteractive).async {
            do {
                try handler.perform([request])
            } catch {
                print("Err :(")
            }
        }
    }
}
Final State and Testing
A simple Core ML application using Places205-GoogLeNet is now complete.
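Places205-GoogLeNet classifies scenes. The same code also works with the object-classification models on Apple's page; for example, assuming you have also added Resnet50.mlmodel to the project, only the model line in recognizeImage changes:
// Hypothetical swap: use Resnet50 instead of GoogLeNetPlaces.
// The request, handler, and dispatch code stay exactly the same.
if let model = try? VNCoreMLModel(for: Resnet50().model) {
    // ... build the VNCoreMLRequest and VNImageRequestHandler as before
}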