This is a text version of this video: packagemain #4: Image Recognition in Go using Tensorflow.
TensorFlow is a computation library that represents computations as graphs. Its core is implemented in C++, and there are bindings for several languages, including Go.
In the last few years the field of machine learning has made tremendous progress on addressing the difficult problem of image recognition.
One of the challenges with machine learning is figuring out how to deploy trained models into production environments. After training your model, you can "freeze" it and export it to be used in a production environment.
For some common use cases, organizations are beginning to share their trained models; you can find some in the TensorFlow Models repo.
In this article we'll use one of them, called Inception, to recognize an image.
We'll build a small command-line application that takes a URL to an image as input and outputs the matching labels, ordered by probability.
First of all we need to install TensorFlow, and here Docker can be really helpful, because installing TensorFlow can be complicated. There is a Docker image with TensorFlow but without Go, so I found an image with both TensorFlow and Go to keep the Dockerfile short.
FROM ctava/tensorflow-go
RUN mkdir -p /model && \
curl -o /model/inception5h.zip -s "http://download.tensorflow.org/models/inception5h.zip" && \
unzip /model/inception5h.zip -d /model
WORKDIR /go/src/imgrecognition
COPY . .
RUN go build
ENTRYPOINT [ "/go/src/imgrecognition/imgrecognition" ]
Let's start with a simple main.go that parses the command-line arguments and downloads the image from the URL:
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	// Hide TensorFlow's informational and warning logs.
	os.Setenv("TF_CPP_MIN_LOG_LEVEL", "2")
	if len(os.Args) < 2 {
		log.Fatalf("usage: imgrecognition <image_url>")
	}
	fmt.Printf("url: %s\n", os.Args[1])
	// Get image from URL
	response, err := http.Get(os.Args[1])
	if err != nil {
		log.Fatalf("unable to get image from url: %v", err)
	}
	defer response.Body.Close()
}
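One thing the download step above does not do is check the HTTP status or bound the size of the response. As a minimal sketch (the helper name fetchImage, the maxBytes parameter and the 10-second timeout are assumptions made for illustration, not part of the original program), the same step could be written more defensively like this. It uses io.ReadAll, so it assumes Go 1.16 or newer:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// fetchImage downloads url and returns the response body, refusing bodies
// larger than maxBytes. The name, the size cap and the timeout are
// illustrative choices, not part of the original program.
func fetchImage(url string, maxBytes int64) ([]byte, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return nil, fmt.Errorf("unable to get image from url: %w", err)
	}
	defer resp.Body.Close()

	// Get only reports transport errors, so without this check a 404 page
	// would be passed on as if it were image data.
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}

	// Read one byte more than the limit so an oversized body can be detected.
	data, err := io.ReadAll(io.LimitReader(resp.Body, maxBytes+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > maxBytes {
		return nil, fmt.Errorf("image is larger than %d bytes", maxBytes)
	}
	return data, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: fetch <image_url>")
		return
	}
	data, err := fetchImage(os.Args[1], 10<<20) // 10 MiB cap
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("downloaded %d bytes\n", len(data))
}
```

Returning the bytes instead of the open body also means the caller no longer has to remember to close the response.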
And now we can build and run our program:
docker build -t imgrecognition .
docker run imgrecognition https://www.iaspaper.net/wp-content/uploads/2017/10/Rabbit-Essay.jpg
Now we need to load our model. The model consists of two files, one with the graph and one with the labels:
const (
	graphFile  = "/model/tensorflow_inception_graph.pb"
	labelsFile = "/model/imagenet_comp_graph_label_strings.txt"
)
graph, labels, err := loadModel()
if err != nil {
log.Fatalf("unable to load model: %v", err)
}
// tf is the TensorFlow Go package: github.com/tensorflow/tensorflow/tensorflow/go
func loadModel() (*tf.Graph, []string, error) {
	// Load inception model
	model, err := ioutil.ReadFile(graphFile)
	if err != nil {
		return nil, nil, err
	}
	graph := tf.NewGraph()
	if err := graph.Import(model, ""); err != nil {
		return nil, nil, err
	}
	// Load labels, one per line
	f, err := os.Open(labelsFile)
	if err != nil {
		return nil, nil, err
	}
	defer f.Close()
	scanner := bufio.NewScanner(f)
	var labels []string
	for scanner.Scan() {
		labels = append(labels, scanner.Text())
	}
	return graph, labels, scanner.Err()
}
Now we can finally start using the TensorFlow Go package.
To work with our image we need to normalize it, because the Inception model expects its input in a certain format: it was trained on images from ImageNet, and it expects them to be 224x224. But that's a bit tricky. Let's see:
func normalizeImage(body io.ReadCloser) (*tf.Tensor, error) {
	// Read the whole image into memory: DecodeJpeg takes the raw bytes as a string tensor.
	var buf bytes.Buffer
	if _, err := io.Copy(&buf, body); err != nil {
		return nil, err
	}
	tensor, err := tf.NewTensor(buf.String())
	if err != nil {
		return nil, err
	}
	graph, input, output, err := getNormalizedGraph()
	if err != nil {
		return nil, err
	}
	session, err := tf.NewSession(graph, nil)
	if err != nil {
		return nil, err
	}
	defer session.Close()
	normalized, err := session.Run(
		map[tf.Output]*tf.Tensor{
			input: tensor,
		},
		[]tf.Output{
			output,
		},
		nil)
	if err != nil {
		return nil, err
	}
	return normalized[0], nil
}

// Creates a graph to decode, resize and normalize an image.
// op is github.com/tensorflow/tensorflow/tensorflow/go/op
func getNormalizedGraph() (graph *tf.Graph, input, output tf.Output, err error) {
	s := op.NewScope()
	input = op.Placeholder(s, tf.String)
	// 3 channels: return an RGB image
	decode := op.DecodeJpeg(s, input, op.DecodeJpegChannels(3))
	// Sub: returns x - y element-wise
	output = op.Sub(s,
		// make it 224x224: inception specific
		op.ResizeBilinear(s,
			// inserts a dimension of 1 into a tensor's shape.
			op.ExpandDims(s,
				// cast image to float type
				op.Cast(s, decode, tf.Float),
				op.Const(s.SubScope("make_batch"), int32(0))),
			op.Const(s.SubScope("size"), []int32{224, 224})),
		// mean = 117: inception specific
		op.Const(s.SubScope("mean"), float32(117)))
	graph, err = s.Finalize()
	return graph, input, output, err
}
All operations in TensorFlow Go are done with sessions, so we need to initialize, run and close them. In getNormalizedGraph we define the rules of normalization.
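To make the arithmetic of the normalization graph more concrete, here is a rough plain-Go sketch of three of its steps: the cast to float, the added batch dimension, and the subtraction of the mean of 117. It deliberately leaves out the bilinear resize to 224x224 and does not use TensorFlow at all; the names toBatch and demoBatch and the 2x2 test image are made up for illustration.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// mean mirrors the constant 117 that the graph subtracts from every channel.
const mean = 117

// toBatch converts img into a [1][height][width][3] batch of float32 values
// and subtracts mean from each channel. It mirrors the Cast, ExpandDims and
// Sub steps of the graph; resizing is intentionally left out.
func toBatch(img image.Image) [][][][]float32 {
	b := img.Bounds()
	rows := make([][][]float32, b.Dy())
	for y := 0; y < b.Dy(); y++ {
		rows[y] = make([][]float32, b.Dx())
		for x := 0; x < b.Dx(); x++ {
			// RGBA returns 16-bit channels; shift them down to 0..255.
			r, g, bl, _ := img.At(b.Min.X+x, b.Min.Y+y).RGBA()
			rows[y][x] = []float32{
				float32(r>>8) - mean,
				float32(g>>8) - mean,
				float32(bl>>8) - mean,
			}
		}
	}
	// The outer slice of length 1 is the batch dimension.
	return [][][][]float32{rows}
}

// demoBatch builds a tiny 2x2 image (hypothetical data) and normalizes it.
func demoBatch() [][][][]float32 {
	img := image.NewRGBA(image.Rect(0, 0, 2, 2))
	img.Set(0, 0, color.RGBA{R: 255, G: 117, B: 0, A: 255})
	return toBatch(img)
}

func main() {
	batch := demoBatch()
	// Shape: batch, height, width, channels.
	fmt.Println(len(batch), len(batch[0]), len(batch[0][0]), len(batch[0][0][0]))
	// The pixel set above, after subtracting the mean.
	fmt.Println(batch[0][0][0])
}
```

In the real program TensorFlow does this work inside the session, which is why normalizeImage only has to feed the raw bytes in and read one tensor out.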
We need to init one more session, this time on our initial model graph, to find matches:
// Get normalized tensor
tensor, err := normalizeImage(response.Body)
if err != nil {
	log.Fatalf("unable to make a tensor from image: %v", err)
}
// Create a session for inference over the model graph.
session, err := tf.NewSession(graph, nil)
if err != nil {
	log.Fatalf("could not init session: %v", err)
}
defer session.Close()
output, err := session.Run(
	map[tf.Output]*tf.Tensor{
		graph.Operation("input").Output(0): tensor,
	},
	[]tf.Output{
		graph.Operation("output").Output(0),
	},
	nil)
if err != nil {
	log.Fatalf("could not run inference: %v", err)
}
It returns a list of probabilities, one for each label. What we need now is to loop over all the probabilities, find the matching label in the labels slice, and print the top 5.
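res := getTopFiveLabels(labels, output[0].Value().([][]float32)[0])
for _, l := range res {
	fmt.Printf("label: %s, probability: %.2f%%\n", l.Label, l.Probability*100)
}

type Label struct {
	Label       string
	Probability float32
}

// Labels implements sort.Interface, ordering by descending probability.
type Labels []Label

func (a Labels) Len() int           { return len(a) }
func (a Labels) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }
func (a Labels) Less(i, j int) bool { return a[i].Probability > a[j].Probability }

func getTopFiveLabels(labels []string, probabilities []float32) []Label {
	var resultLabels []Label
	for i, p := range probabilities {
		if i >= len(labels) {
			break
		}
		resultLabels = append(resultLabels, Label{Label: labels[i], Probability: p})
	}
	sort.Sort(Labels(resultLabels))
	if len(resultLabels) > 5 {
		resultLabels = resultLabels[:5]
	}
	return resultLabels
}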
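The same selection can be tried out without TensorFlow. The following is a small standalone sketch that uses sort.Slice instead of a sort.Interface type and takes the number of results as a parameter; the function name topLabels and the sample labels and probabilities are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Label pairs a class name with its predicted probability.
type Label struct {
	Label       string
	Probability float32
}

// topLabels returns the k most probable labels, highest first. It pairs
// labels[i] with probabilities[i] and ignores any unmatched tail.
func topLabels(labels []string, probabilities []float32, k int) []Label {
	n := len(labels)
	if len(probabilities) < n {
		n = len(probabilities)
	}
	result := make([]Label, 0, n)
	for i := 0; i < n; i++ {
		result = append(result, Label{Label: labels[i], Probability: probabilities[i]})
	}
	// Sort by descending probability.
	sort.Slice(result, func(i, j int) bool {
		return result[i].Probability > result[j].Probability
	})
	// Clamp k so the slice expression below can never panic.
	if k < 0 {
		k = 0
	}
	if k > len(result) {
		k = len(result)
	}
	return result[:k]
}

func main() {
	// Hypothetical values standing in for the model output.
	labels := []string{"cat", "rabbit", "dog"}
	probabilities := []float32{0.10, 0.85, 0.05}
	for _, l := range topLabels(labels, probabilities, 5) {
		fmt.Printf("label: %s, probability: %.2f%%\n", l.Label, l.Probability*100)
	}
}
```

Sorting the full slice is fine for the roughly thousand ImageNet classes; a heap would only start to pay off for much larger outputs.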
That's it, now we can test our image and see what the program says:
docker build -t imgrecognition .
docker run imgrecognition https://www.iaspaper.net/wp-content/uploads/2017/10/Rabbit-Essay.jpg
label: rabbit, probability: 86.72%
...
Here we used a pre-trained model, but it's also possible to train our own models from Go in TensorFlow, and I will definitely do a video about it.