In today’s tech-driven world, automation is revolutionizing recruitment. Imagine having a virtual IT interviewer that not only interacts intelligently but also communicates verbally with candidates. This post will guide you through building an IT interviewer using Ollama and Python, integrating audio capabilities for a more immersive experience.
📚 Introduction
Finding the right talent can be challenging and time-consuming. With advancements in AI and audio processing, it's possible to automate the initial interview phase. This project showcases how to create an interactive IT interviewer that asks questions and processes answers through voice, using Ollama and Google Cloud's Speech-to-Text and Text-to-Speech APIs.
🚀 What You Will Learn
- How to set up Ollama for conversation handling.
- How to integrate Google Cloud’s Speech-to-Text and Text-to-Speech APIs for audio capabilities.
- How to structure a Python project to automate interviews.
🛠️ Prerequisites
- Python 3.7+
- Google Cloud Account: For Speech-to-Text and Text-to-Speech APIs.
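- Ollama: For conversational AI, either a local installation or an endpoint you can reach.
- FFmpeg: Used by pydub to decode and play the MP3 files.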
📂 Project Setup
1. Clone the Repository
Start by cloning the project repository:
git clone https://github.com/josmel/ollama-it-interviewer.git
cd ollama-it-interviewer
2. Create and Activate a Virtual Environment
Set up a virtual environment to manage dependencies:
python -m venv venv
source venv/bin/activate
3. Install Dependencies
Install the required Python packages:
pip install -r requirements.txt
4. Configure Google Cloud
a. Enable the APIs
Enable the Speech-to-Text and Text-to-Speech APIs in your Google Cloud Console.
b. Create Service Account and Download JSON Key
- Go to IAM & Admin > Service accounts.
- Create a new service account, grant it the necessary roles, and download the JSON credentials file.
c. Set the Environment Variable
Set the environment variable to point to your credentials file:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account-file.json"
Replace /path/to/your/service-account-file.json with the actual path to your credentials file.
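Before running anything that talks to Google Cloud, it can help to confirm that the variable is set and points at a real file. Here is a minimal sketch of such a check. The helper name credentials_path is hypothetical and is not part of the repository.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def credentials_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the credentials path, or raise if it is unset or missing.

    Hypothetical helper for illustration; not part of the repository.
    """
    env = os.environ if env is None else env
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Credentials file not found: {path}")
    return path
```

Calling credentials_path() at the top of the interviewer script would surface a configuration mistake before any audio is generated.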
5. Prepare Audio Files
Add sample audio files to the audio_samples/ directory. You need a candidate-response.mp3 file to simulate a candidate's response. You can record your voice or use a text-to-speech tool to generate this file.
6. Update Configuration
Edit src/config.py to set your Ollama endpoint and model:
OLLAMA_API_URL = 'https://api.ollama.com/v1/conversations'  # Or replace with the URL of your local Ollama instance
OLLAMA_MODEL = 'your-ollama-model'  # Replace with the name of your Ollama model
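Since the project already loads a .env file with python-dotenv, one option is to let config.py read these two values from the environment and fall back to the placeholders above. This is only a sketch under that assumption: using OLLAMA_API_URL and OLLAMA_MODEL as environment variable names, and the load_config helper itself, are my additions rather than something the repository does.

```python
import os

# Placeholder defaults taken from the configuration shown above.
DEFAULT_API_URL = "https://api.ollama.com/v1/conversations"
DEFAULT_MODEL = "your-ollama-model"


def load_config(env=None):
    """Return (api_url, model), preferring environment values over defaults.

    Hypothetical helper; the environment variable names are an assumption.
    """
    env = os.environ if env is None else env
    api_url = env.get("OLLAMA_API_URL", DEFAULT_API_URL).strip()
    model = env.get("OLLAMA_MODEL", DEFAULT_MODEL).strip()
    if not api_url.startswith(("http://", "https://")):
        raise ValueError(f"OLLAMA_API_URL must be an HTTP(S) URL, got {api_url!r}")
    return api_url, model


# Keep the module-level names that the rest of the project imports.
OLLAMA_API_URL, OLLAMA_MODEL = load_config()
```

With this in place, switching between endpoints becomes a matter of editing .env instead of the source file.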
7. Run the Project
Run the interviewer script:
# Option 1: Run as a module from the project root
python3 -m src.interviewer
or
# Option 2: Ensure PYTHONPATH is set and run directly
export PYTHONPATH=$(pwd)
python3 src/interviewer.py
📝 Detailed Explanation
interviewer.py
The main script orchestrates the interview process:
import os

from dotenv import load_dotenv
from pydub import AudioSegment
from pydub.playback import play

from src.ollama_api import ask_question
from src.speech_to_text import recognize_speech
from src.text_to_speech import synthesize_speech

# Load environment variables
load_dotenv()

# Configure FFmpeg for macOS/Linux
os.environ["PATH"] += os.pathsep + '/usr/local/bin/'


def main():
    # Synthesize the question and play it to the candidate
    question = "Tell me about your experience with Python."
    synthesize_speech(question, "audio_samples/question.mp3")
    question_audio = AudioSegment.from_mp3("audio_samples/question.mp3")
    play(question_audio)

    # Transcribe the (pre-recorded) candidate answer
    candidate_response = recognize_speech("audio_samples/candidate-response.mp3")

    # Send the transcript to Ollama and print its reply
    ollama_response = ask_question(candidate_response)
    print(f"Ollama Response: {ollama_response}")

    # Speak Ollama's reply back to the candidate
    synthesize_speech(ollama_response, "audio_samples/response.mp3")
    response_audio = AudioSegment.from_mp3("audio_samples/response.mp3")
    play(response_audio)


if __name__ == "__main__":
    main()
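The script above handles a single question. If you want to extend it to several, one possible shape is a loop that receives its speaking, listening, and evaluating steps as functions, so the flow can be tried out without audio hardware or network access. This is a sketch, not the repository's code: run_interview and its parameter names are hypothetical, and sending the question together with the answer to the model is my own design choice.

```python
from typing import Callable, Iterable, List, Tuple


def run_interview(
    questions: Iterable[str],
    speak: Callable[[str], None],
    listen: Callable[[], str],
    evaluate: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Ask each question, capture the answer, and collect the model's reply.

    Returns a list of (question, answer, feedback) tuples.
    """
    transcript = []
    for question in questions:
        speak(question)
        answer = listen()
        if not answer:
            # Nothing was recognized; record the gap instead of querying the model.
            transcript.append((question, "", ""))
            continue
        feedback = evaluate(f"Question: {question}\nAnswer: {answer}")
        speak(feedback)
        transcript.append((question, answer, feedback))
    return transcript
```

In the real project, speak could wrap synthesize_speech followed by play, listen could wrap recognize_speech, and evaluate could simply be ask_question.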
ollama_api.py
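Handles interaction with the Ollama API:
import requests

from src.config import OLLAMA_API_URL, OLLAMA_MODEL


def ask_question(question):
    # Send the candidate's answer to the configured Ollama endpoint
    response = requests.post(
        OLLAMA_API_URL,
        json={"model": OLLAMA_MODEL, "input": question},
        timeout=60,
    )
    # Fail loudly on HTTP errors instead of trying to parse an error body
    response.raise_for_status()
    response_data = response.json()
    return response_data["output"]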
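If you run Ollama locally, note that its REST API uses a different request shape from the one above: the generate endpoint lives at /api/generate on port 11434 by default, takes the text in a prompt field, and returns it in a response field. Here is a hedged sketch of that variant. The function names are hypothetical, the model name llama3 is only an example (use whichever model you have pulled), and I use urllib from the standard library to keep the snippet self-contained, whereas the project itself uses requests.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
LOCAL_OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def extract_reply(response_data: dict) -> str:
    """Pull the generated text out of a /api/generate response."""
    if "error" in response_data:
        # Defensive check in case an error body reaches this point.
        raise RuntimeError(f"Ollama error: {response_data['error']}")
    return response_data["response"].strip()


def ask_local_ollama(prompt: str, model: str = "llama3",
                     url: str = LOCAL_OLLAMA_URL, timeout: float = 60.0) -> str:
    """Send one prompt to a local Ollama server and return its reply."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))
```

Setting stream to False asks the server for a single JSON object rather than a stream of partial results, which keeps the parsing trivial for a short interview reply.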
speech_to_text.py
Converts audio to text using Google Cloud:
from google.cloud import speech


def recognize_speech(audio_file):
    client = speech.SpeechClient()

    # Read the whole recording into memory
    with open(audio_file, "rb") as f:
        content = f.read()

    audio = speech.RecognitionAudio(content=content)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.MP3,
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    response = client.recognize(config=config, audio=audio)

    # Join the top alternative of every result; empty string if nothing was recognized
    return " ".join(
        result.alternatives[0].transcript.strip() for result in response.results
    )
text_to_speech.py
Converts text to audio using Google Cloud:
import os

from google.cloud import texttospeech


def synthesize_speech(text, output_file):
    # Verify that the environment variable is set
    assert 'GOOGLE_APPLICATION_CREDENTIALS' in os.environ, "GOOGLE_APPLICATION_CREDENTIALS not set"

    client = texttospeech.TextToSpeechClient()
    synthesis_input = texttospeech.SynthesisInput(text=text)

    # Pick an English voice with a neutral gender
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US",
        ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
    )
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )
    response = client.synthesize_speech(
        input=synthesis_input, voice=voice, audio_config=audio_config
    )

    # Write the returned MP3 bytes to disk
    with open(output_file, "wb") as out:
        out.write(response.audio_content)
    print(f"Audio content written to file {output_file}")
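Model replies can get long, and Cloud Text-to-Speech caps how much text a single request may carry. The sketch below splits a reply into smaller pieces before synthesis. Treat the 5,000-byte default as an assumption to verify against the current quota documentation; split_for_tts and _pack are hypothetical helpers, not part of the repository.

```python
import re


def _pack(pieces, max_bytes):
    """Greedily join pieces with spaces without exceeding max_bytes per chunk."""
    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}" if current else piece
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def split_for_tts(text, max_bytes=5000):
    """Split text into chunks sized for one synthesis request each.

    Breaks on sentence boundaries where possible. A sentence longer than
    max_bytes is broken on word boundaries; a single word longer than
    max_bytes is left intact.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces = []
    for sentence in sentences:
        if len(sentence.encode("utf-8")) <= max_bytes:
            pieces.append(sentence)
        else:
            pieces.extend(_pack(sentence.split(), max_bytes))
    return _pack(pieces, max_bytes)
```

Each chunk could then be passed to synthesize_speech with its own output file, and the resulting pydub segments concatenated with + before playback.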
🎉 Conclusion
By integrating Ollama and Google Cloud’s audio capabilities, you can create a virtual IT interviewer that enhances the recruitment process by automating initial candidate interactions. This project demonstrates the power of combining conversational AI with audio processing in Python.
Give it a try and share your thoughts in the comments! If you encounter any issues or have suggestions, feel free to ask.
📂 Project Structure
ollama-it-interviewer/
│
├── audio_samples/
│   └── candidate-response.mp3
│
├── src/
│   ├── interviewer.py
│   ├── ollama_api.py
│   ├── speech_to_text.py
│   ├── text_to_speech.py
│   └── config.py
│
├── requirements.txt
├── README.md
└── .gitignore
🛠️ Resources
- Ollama
- Google Cloud Speech-to-Text
- Google Cloud Text-to-Speech
- Python pydub
💬 Questions or Comments?
Feel free to leave any questions or comments below. I’m here to help!
Repository: https://github.com/josmel/ollama-it-interviewer