Alex Pliutau
Image Recognition in Go using Tensorflow

This is a text version of this video: packagemain #4: Image Recognition in Go using Tensorflow.

TensorFlow is a computation library that represents computations as graphs. Its core is implemented in C++, and there are bindings for several languages, including Go.

In the last few years the field of machine learning has made tremendous progress on addressing the difficult problem of image recognition.

One of the challenges with machine learning is figuring out how to deploy trained models into production environments. After training your model, you can "freeze" it and export it to be used in a production environment.

For some common use cases, organizations are beginning to share their trained models; you can find some of them in the TensorFlow Models repo.

In this article we'll use one of them, called Inception, to recognize an image.

We'll build a small command-line application that takes a URL to an image as input and outputs the matching labels ordered by probability.

First of all we need to install TensorFlow, and Docker can be really helpful here, because installing TensorFlow can be complicated. There is a Docker image with TensorFlow but without Go, so I found an image with both TensorFlow and Go to keep the Dockerfile small.

FROM ctava/tensorflow-go

RUN mkdir -p /model && \
  curl -o /model/inception5h.zip -s "http://download.tensorflow.org/models/inception5h.zip" && \
  unzip /model/inception5h.zip -d /model

WORKDIR /go/src/imgrecognition
COPY . .
RUN go build
ENTRYPOINT [ "/go/src/imgrecognition/imgrecognition" ]

Let's start with a simple main.go that parses the command-line arguments and downloads the image from the URL:

func main() {
    os.Setenv("TF_CPP_MIN_LOG_LEVEL", "2")

    if len(os.Args) < 2 {
        log.Fatalf("usage: imgrecognition <image_url>")
    }
    fmt.Printf("url: %s\n", os.Args[1])

    // Get image from URL
    response, e := http.Get(os.Args[1])
    if e != nil {
        log.Fatalf("unable to get image from url: %v", e)
    }
    defer response.Body.Close()
}

And now we can build and run our program:

docker build -t imgrecognition .
docker run imgrecognition https://www.iaspaper.net/wp-content/uploads/2017/10/Rabbit-Essay.jpg

Now we need to load our model. The model consists of two files, the graph and the labels:

const (
    graphFile  = "/model/tensorflow_inception_graph.pb"
    labelsFile = "/model/imagenet_comp_graph_label_strings.txt"
)

modelGraph, labels, err := loadModel()
if err != nil {
    log.Fatalf("unable to load model: %v", err)
}

func loadModel() (*tf.Graph, []string, error) {
    // Load inception model
    model, err := ioutil.ReadFile(graphFile)
    if err != nil {
        return nil, nil, err
    }
    graph := tf.NewGraph()
    if err := graph.Import(model, ""); err != nil {
        return nil, nil, err
    }

    // Load labels
    // Load labels
    f, err := os.Open(labelsFile)
    if err != nil {
        return nil, nil, err
    }
    defer f.Close()
    scanner := bufio.NewScanner(f)
    var labels []string
    for scanner.Scan() {
        labels = append(labels, scanner.Text())
    }

    return graph, labels, scanner.Err()
}

Now we can finally start using the TensorFlow Go package.
To work with our image we first need to normalize it, because the Inception model expects input in a certain format: it was trained on ImageNet images, which are 224x224. That's a bit tricky, so let's look at it step by step:

func normalizeImage(body io.ReadCloser) (*tf.Tensor, error) {
    var buf bytes.Buffer
    if _, err := io.Copy(&buf, body); err != nil {
        return nil, err
    }

    tensor, err := tf.NewTensor(buf.String())
    if err != nil {
        return nil, err
    }

    graph, input, output, err := getNormalizedGraph()
    if err != nil {
        return nil, err
    }

    session, err := tf.NewSession(graph, nil)
    if err != nil {
        return nil, err
    }
    defer session.Close()

    normalized, err := session.Run(
        map[tf.Output]*tf.Tensor{
            input: tensor,
        },
        []tf.Output{
            output,
        },
        nil)
    if err != nil {
        return nil, err
    }

    return normalized[0], nil
}

// getNormalizedGraph creates a graph to decode, resize and normalize an image
func getNormalizedGraph() (graph *tf.Graph, input, output tf.Output, err error) {
    s := op.NewScope()
    input = op.Placeholder(s, tf.String)
    // 3 channels = RGB image
    decode := op.DecodeJpeg(s, input, op.DecodeJpegChannels(3))

    // Sub: returns x - y element-wise
    output = op.Sub(s,
        // make it 224x224: inception specific
        op.ResizeBilinear(s,
            // insert a dimension of 1 into the tensor's shape (a batch of one image)
            op.ExpandDims(s,
                // cast image to float type
                op.Cast(s, decode, tf.Float),
                op.Const(s.SubScope("make_batch"), int32(0))),
            op.Const(s.SubScope("size"), []int32{224, 224})),
        // mean = 117: inception specific
        op.Const(s.SubScope("mean"), float32(117)))
    graph, err = s.Finalize()

    return graph, input, output, err
}
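To make the mean-subtraction step concrete: after the cast to float, each RGB channel value v becomes v - 117, the mean pixel value used when Inception was trained. A tiny stdlib-only illustration (normalizePixel is my own name, not part of the TensorFlow API):

```go
package main

import "fmt"

// normalizePixel mirrors the op.Sub step in the graph above:
// each channel value is shifted by the Inception training mean (117).
func normalizePixel(v uint8) float32 {
	return float32(v) - 117
}

func main() {
	fmt.Println(normalizePixel(255)) // 138
	fmt.Println(normalizePixel(0))   // -117
}
```

TensorFlow performs the same arithmetic, just over the whole 224x224x3 tensor at once.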

All operations in the TensorFlow Go package are done within sessions, so we need to initialize, run and close them. In getNormalizedGraph we define the rules of normalization.

Now we turn the downloaded image into a tensor and init one more session over the model graph to find the matches:

tensor, err := normalizeImage(response.Body)
if err != nil {
    log.Fatalf("unable to normalize image: %v", err)
}

// Create a session for inference over modelGraph.
session, err := tf.NewSession(modelGraph, nil)
if err != nil {
    log.Fatalf("could not init session: %v", err)
}
defer session.Close()

output, err := session.Run(
    map[tf.Output]*tf.Tensor{
        modelGraph.Operation("input").Output(0): tensor,
    },
    []tf.Output{
        modelGraph.Operation("output").Output(0),
    },
    nil)
if err != nil {
    log.Fatalf("could not run inference: %v", err)
}

It returns a list of probabilities, one per label. Now we loop over the probabilities, pair each one with its label from the labels slice, and print the top five.

res := getTopFiveLabels(labels, output[0].Value().([][]float32)[0])
for _, l := range res {
    fmt.Printf("label: %s, probability: %.2f%%\n", l.Label, l.Probability*100)
}

func getTopFiveLabels(labels []string, probabilities []float32) []Label {
    var resultLabels []Label
    for i, p := range probabilities {
        if i >= len(labels) {
            break
        }
        resultLabels = append(resultLabels, Label{Label: labels[i], Probability: p})
    }

    sort.Sort(Labels(resultLabels))
    if len(resultLabels) > 5 {
        resultLabels = resultLabels[:5]
    }
    return resultLabels
}

That's it. Let's run the program on a test image and see what it says:

docker build -t imgrecognition .
docker run imgrecognition https://www.iaspaper.net/wp-content/uploads/2017/10/Rabbit-Essay.jpg
label: rabbit, probability: 86.72%
...

Here we used a pre-trained model, but it's also possible to train models from Go with TensorFlow, and I will definitely make a video about that.

Full code of this program

Top comments (3)

erikobryant

Hey Alex Pliutau,

My name is Erik O’Bryant and I’m assembling a team of developers to create an AI operating system. An OS like this would use AI to interpret and execute user commands (just imagine being able to type plain English into your terminal and having your computer do exactly what you tell it). You seem to know a lot about AI development and so I was wondering if you’d be interested in joining my team and helping me develop the first ever intelligent operating system. If you’re interested, please shoot me a message at erockthefrog@gmail.com and let me know.

nimzo

Thanks for the intro.
The Inception GitHub page does not exist anymore.
Also, the tutorial would be much more useful if it didn't use Docker.

Oscar Mendoza

Thanks for the post!
Waiting for the training model video :)