Giving a computer eyes is no easy task. Yes, you can grab a webcam feed, but that doesn't mean the computer can parse what it's looking at.
Recent developments have pushed the field forward. With deep learning, computers can now make basic observations about different objects in many different positions.
So if you don't know anything about deep learning or neural networks, how do you get started in the field of computer vision?
Of course, you cannot start with the most complicated concepts and work your way backwards. You have to start with the basics.
At the most basic level, you can do pattern recognition. To reduce complexity, I recommend starting out by learning Python as opposed to C++.
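To make "pattern recognition" a bit more concrete, here is a toy sketch in plain Python. It compares a tiny 3x3 black-and-white glyph against a few known templates and picks the closest one. The glyph shapes and the function names are made up for this illustration; real OCR engines are far more sophisticated than this.

```python
# Toy pattern recognition: classify a 3x3 binary glyph by comparing it
# with known templates. The glyph shapes below are invented for illustration.
TEMPLATES = {
    "I": ("010",
          "010",
          "010"),
    "L": ("100",
          "100",
          "111"),
    "T": ("111",
          "010",
          "010"),
}


def hamming_distance(a, b):
    """Count the pixels that differ between two glyphs of the same size."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(a, b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognize(glyph):
    """Return the label of the template that differs least from the glyph."""
    return min(TEMPLATES, key=lambda label: hamming_distance(TEMPLATES[label], glyph))


# An "L" with one noisy pixel flipped in the top-right corner.
noisy_l = ("101",
           "100",
           "111")
print(recognize(noisy_l))
```

Even with one wrong pixel, the noisy glyph is still closest to the "L" template, which is the basic idea behind matching an input against known patterns.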
Character recognition is one of the most basic tasks in computer vision.
We can recognize basic characters (a, b, c) from an image. This is called "Optical Character Recognition" (OCR). Tesseract is a free OCR engine. On Debian or Ubuntu you can install it with:
apt-get install tesseract-ocr
In the terminal you can run the following. Tesseract adds the .txt extension to the output name itself, so you only pass "output":
tesseract example.png output
cat output.txt
where example.png is this image:
You can use Python to interact with Tesseract. Install the modules Pillow and pytesseract:
pip install Pillow
pip install pytesseract
Then you can run this code, which prints the text found in the image to the terminal:
#!/usr/bin/python3
from PIL import Image
import pytesseract

def ocr_core(filename):
    text = pytesseract.image_to_string(Image.open(filename))
    return text

print(ocr_core('example.png'))
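The text that comes back from OCR is often a little messy: it may contain blank lines, stray spaces, or a trailing form-feed character. Here is a minimal sketch of how you could tidy it up. The names clean_ocr_text and ocr_clean are hypothetical helpers written for this example; they are not part of pytesseract.

```python
def clean_ocr_text(raw_text):
    """Strip surrounding whitespace from each line and drop empty lines."""
    lines = (line.strip() for line in raw_text.splitlines())
    return "\n".join(line for line in lines if line)


def ocr_clean(filename):
    """Run OCR on an image file and return the tidied text."""
    # Imported here so clean_ocr_text also works without the OCR packages.
    from PIL import Image
    import pytesseract

    return clean_ocr_text(pytesseract.image_to_string(Image.open(filename)))
```

With Tesseract, Pillow and pytesseract installed, print(ocr_clean('example.png')) would be used the same way as ocr_core in the script.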
This is a very basic example of computer vision. There's a lot more you can do using all kinds of techniques. However, I think that any introduction to a field should be as simple as possible.