My internship organization, BOPRC, has a large amount of water meter data, but the current method of data entry is cumbersome and inefficient, requiring employees to manually take pictures and enter the information into an Excel spreadsheet. This approach is time-consuming and error-prone, increasing labour costs.
To solve this challenge, we set out to develop a new approach based on AI technology. By integrating a GPT-style chatbot into the web front end, employees can upload photos of water meters and get the text information recognized automatically. In this way, they can enter data into the back-end database much more quickly, automating a routine office task.
This project not only improves the efficiency of data entry but also reduces BOPRC's labour costs. In the future, we hope to further optimize the recognition accuracy and expand the system's functionality to make it more intelligent and easier to use.
First of all, I experimented with a number of models. Unfortunately, the OpenAI API is a paid service (and I was a bit shy about asking my unit for funds). Fortunately, when I tried to deploy the LLaMA2 model released by Meta (Facebook), I found that it has good language understanding and generation ability and performs well across multiple language tasks; it is a versatile and highly customizable language model. Most importantly, it can be deployed locally on my M1 MacBook Pro.
So my idea was to build a front-end page using my self-taught knowledge of React, deploy the LLaMA2 model locally as the back-end program, and implement a simple demo first.
There are several ways to deploy LLaMA locally. In general, using llama.cpp is the most efficient, and it also supports GPU acceleration on the M1.
First, we need to clone the repository from GitHub:
git clone https://github.com/ggerganov/llama.cpp.git
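After cloning, the project has to be compiled before the server binary exists. The post skips this step, so here is a minimal sketch; at the time LLaMA2 came out, a plain `make` was the standard build, and on Apple Silicon it enables Metal GPU support by default:

```shell
cd llama.cpp
# Build llama.cpp; on an M1 Mac, Metal acceleration is enabled by default.
make
```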
Since LLaMA2 comes in a number of model sizes, if you want to run it locally (and aren't chasing performance for the moment) I recommend 7B-Chat, which is about 4 GB:
curl -L https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf --output ./models/llama-2-7b-chat.Q4_K_M.gguf
LLaMA models now use the '.gguf' format; the previous '.bin' format can no longer be used. If you need other models, you can go to this URL to view and download them. Please put the models in the models directory.
Once everything is ready, we can try LLaMA in a local deployment:
./server -m ./models/llama-2-7b-chat.Q4_K_M.gguf
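With the server running, you can sanity-check it from another terminal. A hedged example using the llama.cpp server's /completion endpoint (it listens on port 8080 by default; the prompt and n_predict values here are arbitrary):

```shell
# Send a test prompt to the local llama.cpp server and print the JSON reply.
curl -s http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, how are you?", "n_predict": 32}'
```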
So how do you connect an AI model running on the back-end to a front-end web page? We need a couple of tool libraries: FastAPI and llama-cpp-python.
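As an assumption about the setup, both libraries (plus uvicorn, an ASGI server to actually run the FastAPI app) can be installed with pip:

```shell
# Install the web framework, an ASGI server, and the Python bindings
# for llama.cpp (the package name is llama-cpp-python).
pip install fastapi uvicorn llama-cpp-python
```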
After installing these two, we can start writing our backend Python program:
Since I'm using a Python virtual environment, the commands to start the back-end program look like this:
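The original startup commands aren't included, so here is a plausible reconstruction, assuming the backend lives in app.py and the virtual environment is named venv:

```shell
# Activate the virtual environment, then launch the FastAPI app with uvicorn.
source venv/bin/activate
uvicorn app:app --host 127.0.0.1 --port 8000
```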
Now we can move on to the front-end tweaks. Here I used a React page I built myself (I had originally wanted to use the OpenAI API, but it has to be paid for). You can find the code on my GitHub; I won't expand too much on the front-end code, as my front-end knowledge is very limited:
This demo is only a demonstration of the experiments I have conducted so far. Next, I will do more research and try to connect the local AI model to a database to meet the customer's specific needs. I welcome everyone to make suggestions on my demo code; I will take the time to reply and make changes! Thanks for reading, and may you all have a great day!