Prashant Iyer for LLMWare

🤖Dueling AIs: Questioning and Answering with Language Models🚀

You've probably asked a question to a language model before and then had it give you an answer. After all, this is what we most commonly use language models for.

But have you ever received a question from a language model? While less common, this application of AI has diverse use cases in areas like education, where you might want a model to generate practice questions for a test, and in sales enablement, where you might quiz your sales team about your products to sharpen their ability to sell.

Now, what if we had a face-off⚔️ between two different models: one that asked questions about a topic and another that answered them? All without human intervention?

In this article, we're going to look at exactly that. We'll provide a sample passage about OpenAI's AI safety team as context to our models. We'll then let our models duel it out! One model will ask questions based on this passage, and another model will respond!



Our AI Models🤖

Introducing slim-q-gen-tiny-tool. This will be our question model, capable of generating 3 different types of questions:

  • Multiple choice questions
  • Boolean (true/false) questions
  • General open-ended questions

Facing off against this will be bling-phi-3-gguf! This will be our answer model, giving appropriate responses to any of the above types of questions.

One important note: both of these models are GGUF quantized, meaning they are smaller, faster versions of their original counterparts. For us, that means we can run them on just a CPU, with no need for resources like GPUs!


Step 1: Providing input parameters✏️

This is what our function signature for this example looks like.

def ask_and_answer_game(source_passage, q_model="slim-q-gen-tiny-tool", number_of_tries=10, question_type="question",
                        temperature=0.5):
  • source_passage is the text input that we will provide to our models.
  • q_model is our questioning model.
  • number_of_tries is the number of questions we will attempt to generate (more on this later!).
  • question_type can be either "multiple choice", "boolean", or "question", corresponding to each of the types of questions we saw above.
  • temperature is a value ranging from 0 to 1 that determines how much variance we will see in our generated questions. Here, the value of 0.5 is relatively high so that we get a good variety of questions with little repetition!
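To make the setup concrete, here is a hypothetical call to this function (the passage string below is abbreviated for brevity; the full sample passage appears in the Results section):

# Hypothetical usage: passage abbreviated, full text shown later in this article
sample_passage = ("OpenAI said Tuesday it has established a new committee to make "
                  "recommendations to the company's board about safety and security ...")

ask_and_answer_game(sample_passage, question_type="boolean", number_of_tries=5)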

Step 2: Loading in our models🪫🔋

With the inputs taken care of, let's now load in both our models.

q_model = ModelCatalog().load_model(q_model, sample=True, temperature=temperature)

Notice that we set sample=True to increase variety in our model output (the questions generated).

Now, for the answer model.

answer_model = ModelCatalog().load_model("bling-phi-3-gguf")

We won't mess with the sample or temperature options here because we want concise, fact-based answers from this model.
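One small note if you're coding along: both of these calls assume that ModelCatalog has already been imported from the llmware library.

# Standard llmware import for loading models from the catalog
from llmware.models import ModelCatalog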


Step 3: Generating our questions🤔💬

We'll try to generate questions number_of_tries times, which in this case is 10. We'll then update our questions list with only the unique questions, to avoid repetition.

questions = []

# Loop number_of_tries times
for x in range(0, number_of_tries):
    response = q_model.function_call(source_passage, params=[question_type])
    new_q = response["llm_response"]["question"]

    # Check to see that the question generated is unique
    if new_q and new_q not in questions:
        questions.append(new_q)

An important function here is q_model.function_call(). This is how the llmware library lets you prompt language models with just a single function call. Here, we pass in the source text and question type as arguments.

The function returns response, a dictionary with a lot of information about the call, but we're only interested in the question key, which is located inside the llm_response dictionary.
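To give a rough sense of that structure, a trimmed, illustrative response might look something like this (the exact keys and metadata returned can vary; only llm_response matters for us here):

# Illustrative shape only: real responses include additional metadata about the call
response = {
    "llm_response": {
        "question": ["Who will lead OpenAI's new safety and security committee?"]
    },
    # ... other keys describing the call omitted
}

new_q = response["llm_response"]["question"]   # a list containing the question text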


Step 4: Responding to our questions📝

Now that the questions have been generated, the duel is on! Let's now use our answer model to respond to these questions. We'll loop through our questions list, pass the source passage to the model as context, and ask each question.

# Loop through each question
for i, question in enumerate(questions):
    # Print out the question
    print(f"\nquestion: {i} - {question}")

    # Validate the question list and run inference
    if isinstance(question, list) and len(question) > 0:
        response = answer_model.inference(question[0], add_context=source_passage)

        # Print out the answer
        print(f"response: ", response["llm_response"])

It is important to note that our question model returns each question as a list, with the first element (question[0]) containing the actual string corresponding to the question.

For each question, we then need to perform some validation:

  • Check to see that the question is of the correct data type (list)
  • Check to see that the question is not empty.

Then, the answer_model.inference() function will ask our model the question, passing in source_passage as context.

Finally, we print out the response.
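As an optional tweak (not part of the original example), you could make the loop a little more forgiving by normalizing each question before inference, so that a plain string is handled as well as a single-element list:

# Optional defensive variant: accept either a list or a plain string
for i, question in enumerate(questions):
    # Take the first element if the question is a non-empty list, otherwise use it as-is
    q_text = question[0] if isinstance(question, list) and question else question

    if isinstance(q_text, str) and q_text.strip():
        print(f"\nquestion: {i} - {q_text}")
        response = answer_model.inference(q_text, add_context=source_passage)
        print("response: ", response["llm_response"])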


Results!✅

Let's quickly look at our sample passage. This passage was taken from a CNBC news story in May 2024 about OpenAI's work with safety and security.

"OpenAI said Tuesday it has established a new committee to make recommendations to the company’s board about safety and security, weeks after dissolving a team focused on AI safety. In a blog post, OpenAI said the new committee would be led by CEO Sam Altman as well as Bret Taylor, the company’s board chair, and board member Nicole Seligman. The announcement follows the high-profile exit this month of an OpenAI executive focused on safety, Jan Leike. Leike resigned from OpenAI leveling criticisms that the company had under-invested in AI safety work and that tensions with OpenAI’s leadership had reached a breaking point."

Now, let's see what our output looks like!

Sample output

We can see all the questions that were asked about the passage, as well as concise, fact-based responses given to them!

Note that there are only 9 questions here while we provided number_of_tries=10. This means that one question generated was a duplicate and was ignored.


Conclusion

And with that, we're done with this example! Recall that we used the llmware library to:

  1. Load in a question model and an answer model
  2. Generate unique questions about a source passage
  3. Respond to each question accurately

And remember that we did all of this on just a CPU! 💻
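If you'd like everything in one place, here is a minimal end-to-end sketch that stitches the snippets above back into the ask_and_answer_game function. It follows the same logic as the steps above; treat it as a starting point rather than the canonical source code, which lives on our GitHub.

from llmware.models import ModelCatalog


def ask_and_answer_game(source_passage, q_model="slim-q-gen-tiny-tool", number_of_tries=10,
                        question_type="question", temperature=0.5):

    # Step 2: load the question model (sampling on for variety) and the answer model
    q_model = ModelCatalog().load_model(q_model, sample=True, temperature=temperature)
    answer_model = ModelCatalog().load_model("bling-phi-3-gguf")

    # Step 3: generate questions, keeping only the unique ones
    questions = []
    for _ in range(number_of_tries):
        response = q_model.function_call(source_passage, params=[question_type])
        new_q = response["llm_response"]["question"]
        if new_q and new_q not in questions:
            questions.append(new_q)

    # Step 4: answer each question, using the passage as context
    for i, question in enumerate(questions):
        print(f"\nquestion: {i} - {question}")
        if isinstance(question, list) and len(question) > 0:
            response = answer_model.inference(question[0], add_context=source_passage)
            print("response: ", response["llm_response"])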

Check out our YouTube video on this example!

If you made it this far, thank you for taking the time to go through this topic with us ❤️! For more content like this, make sure to visit our dev.to page.

The source code for many more examples like this one is on our GitHub. You can find this example here.

Our repository also contains a notebook for this example that you can run yourself using Google Colab, Jupyter or any other platform that supports .ipynb notebooks.

Join our Discord to interact with a growing community of AI enthusiasts of all levels of experience!

Please be sure to visit our website llmware.ai for more information and updates.
