McDeepNet: Training a TensorFlow Model on McDonald’s Reviews

#python #programming #openai #ai

Introduction

Ever wondered what a Machine Learning (ML) model would say about a Big Mac or fries? Let’s find out. McDeepNet is a playful ML experiment trained on a dataset of McDonald’s reviews.
Check out the source code and the app:

The Dataset Overview

The Kaggle dataset (McDonald’s Store Reviews) comprises 20,000 reviews of various McDonald’s stores.

Link to Dataset

The Technology Powering McDeepNet

TensorFlow and Keras: framework and functionalities needed for model building and training.
Pandas and NumPy: data processing and manipulation, enabling efficient handling of the review dataset.
Streamlit: used to convert our RNN model into an interactive web application, allowing for easy demonstration and user interaction.
Plotly Express: visualizing the insights derived from our model, making the data analysis both accessible and engaging.

Unraveling McDeepNet's Functionality

Let’s break it down:

a. Input Layer: Represents the initial input of text data (e.g., McDonald's reviews).
b. Tokenizer: This stage converts the input text into numerical sequences, making it understandable for the neural network.
c. Pre-trained RNN Model: This is the core of McDeepNet, where the sequential data is actually processed. It consists of:
c.1 An Embedding Layer: Converts the sequences into dense vectors of fixed size.
c.2 LSTM Layers: Part of the RNN architecture, responsible for learning from the sequence data.
d. Output Layer: Generates the model's predictions from the input; in the architecture shown later, this is a softmax over the vocabulary.
e. Output: The final result produced by the model, such as generated text or analysis results.
f. Application Layer: The Streamlit web app, in McDeepNet's case.
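
To make the tokenizer stage concrete, here is a minimal pure-Python sketch of what word-level tokenization and padding do to a review. It is an illustration only: `build_word_index`, `texts_to_ids`, and `pad_post` are hypothetical helpers written for this example, not Keras functions, and the real Keras `Tokenizer` handles more cases (character filters, vocabulary limits, out-of-vocabulary tokens).

```python
from collections import Counter


def build_word_index(texts):
    """Map each word to an integer id, most frequent first, starting at 1 (0 is kept for padding)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending frequency, then alphabetically so the mapping is deterministic
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i for i, word in enumerate(ordered, start=1)}


def texts_to_ids(texts, word_index):
    """Replace each known word with its id; unknown words are skipped."""
    return [[word_index[w] for w in text.lower().split() if w in word_index] for text in texts]


def pad_post(ids, max_length):
    """Keep the last max_length ids, then right-pad with zeros up to max_length."""
    return (ids[-max_length:] + [0] * max_length)[:max_length]


reviews = ["the fries were cold", "the burger was great", "cold fries again"]
word_index = build_word_index(reviews)
sequences = texts_to_ids(["the fries were great"], word_index)
padded = [pad_post(seq, 6) for seq in sequences]
print(word_index["the"], sequences, padded)
# prints: 3 [[3, 2, 8, 6]] [[3, 2, 8, 6, 0, 0]]
```

The model never sees words, only these integer ids, which is why the same word-to-id mapping has to be reused at inference time.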

Training Phase of McDeepNet

1.1 Crafting the Model Architecture:
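The cornerstone of McDeepNet is its RNN model, built to handle sequences of up to 442 tokens. Because the tokenizer used in the data preparation step works at the word level, that limit counts words rather than characters.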

Code Snippet:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

# Set the maximum sequence length
max_length = 442

# Specify the vocabulary size (unique words in the dataset)
vocab_size = 10000  # Example value

# Define the embedding dimension
embedding_dim = 256  # Example value

# Constructing the RNN model architecture
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length))
model.add(LSTM(128, return_sequences=True))
model.add(LSTM(128))
model.add(Dense(64, activation='relu'))
model.add(Dense(vocab_size, activation='softmax'))

# Compiling the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Summarizing the model
model.summary()
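
As a sanity check on what `model.summary()` should report for the example values above, the trainable parameter counts can be worked out by hand. The sketch below uses the standard parameter formulas for Embedding, LSTM, and Dense layers; the helper function names are made up for this example.

```python
def embedding_params(vocab_size, embedding_dim):
    # One embedding vector per vocabulary entry
    return vocab_size * embedding_dim


def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per unit
    return input_dim * units + units


vocab_size, embedding_dim = 10000, 256
layers = {
    "embedding": embedding_params(vocab_size, embedding_dim),
    "lstm_1": lstm_params(embedding_dim, 128),
    "lstm_2": lstm_params(128, 128),
    "dense_64": dense_params(128, 64),
    "dense_vocab": dense_params(64, vocab_size),
}
total = sum(layers.values())

for name, count in layers.items():
    print(f"{name:12s} {count:>10,}")
print(f"{'total':12s} {total:>10,}")
```

Most of the roughly 3.5 million parameters sit in the embedding table and the final softmax layer, so `vocab_size` is the main lever on model size.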

1.2 Preparing the Training Data:
Before training, the 20,000 McDonald's reviews are cleaned, tokenized, converted to integer sequences, and padded to a uniform length.

Code Snippet for Data Preparation:

import re

import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Importing the dataset
df = pd.read_csv('mcdonalds_reviews.csv')  # Replace this with the path to your dataset
# Assuming the reviews are stored in a column named 'review';
# missing values are dropped so that .lower() below only ever sees strings
reviews = df['review'].dropna().astype(str).tolist()

# Text preprocessing steps
# Cleaning involves lowercasing and removing non-alphabetic characters
cleaned_reviews = [re.sub(r'[^a-zA-Z\s]', '', review.lower()) for review in reviews]

# Tokenizer initialization and configuration
tokenizer = Tokenizer(num_words=10000)  # Setting the limit to the top 10,000 words
tokenizer.fit_on_texts(cleaned_reviews)

# Transforming text into integer sequences
sequences = tokenizer.texts_to_sequences(cleaned_reviews)

# Padding sequences to ensure uniform length
max_length = 442
padded_sequences = pad_sequences(sequences, maxlen=max_length, padding='post')

# The padded sequences are now ready to be fed into the RNN model
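
To see what the cleaning step does to a single review, here is a small standalone example using the same regular expression as above. The sample sentence and the `clean_review` helper are invented for illustration.

```python
import re


def clean_review(text):
    """Lowercase the text and drop every character that is not an ASCII letter or whitespace."""
    return re.sub(r'[^a-zA-Z\s]', '', text.lower())


sample = "The Big Mac was GREAT!!! 10/10, would order again."
print(repr(clean_review(sample)))
# prints: 'the big mac was great  would order again'
```

Note that digits are stripped along with punctuation, so ratings such as "10/10" disappear, and contractions lose their apostrophe ("didn't" becomes "didnt"). Whether that matters depends on how much of that detail the generated reviews should keep.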

1.3 The Model Training Process:
With the data prepared, the model is fit on a training split and evaluated against a held-out validation split after each epoch.

Code Snippet for Model Training:

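import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Assuming `model`, `sequences`, and `max_length` have been previously defined.
# The model emits one softmax over the vocabulary per input sequence, so each
# training example pairs a prefix of a review with the word that follows it.
# (This next-word framing is an assumption about the intended setup.)
prefixes, next_words = [], []
for seq in sequences:
    for i in range(1, len(seq)):
        prefixes.append(seq[:i])
        next_words.append(seq[i])

# Pad the prefixes the same way as in the data preparation step
X = pad_sequences(prefixes, maxlen=max_length, padding='post')
y = np.array(next_words)  # Integer word indices, one label per prefix

# Integer labels pair with the sparse variant of the loss, which avoids
# one-hot encoding every label over the full vocabulary
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Dividing the data into training and validation subsets
# Here, we use an 80% training and 20% validation split as an example
train_size = int(len(X) * 0.8)
X_train, X_val = X[:train_size], X[train_size:]
y_train, y_val = y[:train_size], y[train_size:]

# Setting the number of epochs for the training process
epochs = 10  # This is an illustrative figure

# Commencing the training
history = model.fit(X_train, y_train, epochs=epochs, validation_data=(X_val, y_val))

# Option to save the model post-training
model.save('text_generation_model.h5')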

Model Loading Phase

2.1 Retrieving the Trained Model:
After training, the model is preserved for future applications.

Code Snippet for Model Loading:

from tensorflow.keras.models import load_model

# Specify the file path of the saved model
model_path = 'text_generation_model.h5'

# Reloading the pre-trained model
model = load_model(model_path)

# The model is now primed for inference tasks
# You can employ methods like model.predict() for your specific needs

Reloading the Tokenizer

2.2 Accessing the Pre-Trained Tokenizer:
The tokenizer, fitted on the training dataset, converts new text inputs into the integer sequences that the model understands. The same fitted tokenizer must be reused at inference time so that each word maps to the id the model saw during training.

Code Snippet for Tokenizer Loading:

import pickle
from tensorflow.keras.models import load_model

# Reinstating the model (optional if already loaded)
model = load_model('text_generation_model.h5')

# Opening and loading the saved tokenizer
with open('tokenizer.pickle', 'rb') as handle:
    tokenizer = pickle.load(handle)
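
The loading snippet above expects a `tokenizer.pickle` file, but the training code shown in this article never writes one. Below is a minimal sketch of the save side, assuming the tokenizer is pickled right after it is fitted. `save_tokenizer` and `load_tokenizer` are hypothetical helper names, and the demo uses a plain dict as a stand-in for the fitted Keras `Tokenizer` so that it runs without TensorFlow.

```python
import os
import pickle
import tempfile


def save_tokenizer(tokenizer, path='tokenizer.pickle'):
    """Serialize the fitted tokenizer so the app can reuse the same word-to-id mapping."""
    with open(path, 'wb') as handle:
        pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_tokenizer(path='tokenizer.pickle'):
    """Load a tokenizer previously written by save_tokenizer."""
    with open(path, 'rb') as handle:
        return pickle.load(handle)


# Round-trip demo with a plain dict standing in for the fitted tokenizer
stand_in = {'word_index': {'fries': 1, 'burger': 2}}
demo_path = os.path.join(tempfile.gettempdir(), 'tokenizer_demo.pickle')
save_tokenizer(stand_in, demo_path)
restored = load_tokenizer(demo_path)
os.remove(demo_path)
print(restored == stand_in)  # prints: True
```

In the training script, `save_tokenizer(tokenizer)` would go right after `tokenizer.fit_on_texts(...)`. Since unpickling can execute arbitrary code, only load pickle files you created yourself.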

Preparing for Inference

3.1 Inference Readiness of the Model:
With both the model and the tokenizer loaded, the setup is ready to generate text from new inputs. All that is missing is a UI.
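
Text generation with a temperature setting usually works by rescaling the model's next-word probabilities before sampling from them. The sketch below shows one common way to do that in plain Python. It is not the app's actual generation code: `apply_temperature`, `generate_words`, and the `predict_next` callable are hypothetical names introduced for illustration, and a toy predictor stands in for the trained model.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution: values below 1 sharpen it, values above 1 flatten it."""
    # Work in log space; the tiny epsilon guards against log(0)
    logits = [math.log(p + 1e-12) / temperature for p in probs]
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def generate_words(predict_next, index_word, seed_ids, num_words, temperature=1.0, rng=None):
    """Sample up to num_words words one at a time, feeding each pick back in as context.

    predict_next: callable taking a list of word ids and returning next-word probabilities
    index_word:   dict mapping word id -> word
    Returns the generated words and, per step, a (word, probability) pair for the chosen word.
    """
    rng = rng or random.Random()
    ids = list(seed_ids)
    words, chosen_probs = [], []
    for _ in range(num_words):
        probs = apply_temperature(predict_next(ids), temperature)
        next_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        if next_id not in index_word:
            # Ids without a word (such as 0, the padding slot) are skipped,
            # so fewer than num_words words may be returned
            continue
        ids.append(next_id)
        words.append(index_word[next_id])
        chosen_probs.append((index_word[next_id], probs[next_id]))
    return words, chosen_probs


# Toy stand-in for the trained model: always favors word id 1
def fake_predict(ids):
    return [0.0, 0.7, 0.2, 0.1]


index_word = {1: 'fries', 2: 'burger', 3: 'shake'}
words, word_probs = generate_words(
    fake_predict, index_word, seed_ids=[2], num_words=5, temperature=0.5, rng=random.Random(0)
)
print(' '.join(words))
```

With the Keras objects from the earlier sections, `predict_next` could wrap `model.predict(pad_sequences([ids], maxlen=max_length, padding='post'), verbose=0)[0]`, `index_word` would be `tokenizer.index_word`, and `seed_ids` would come from `tokenizer.texts_to_sequences([seed_text])[0]`.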

Developing a Streamlit Web Application Interface

Crafting a user-friendly interface with Streamlit is a straightforward process.
This interface will allow users to input initial text (seed text), adjust generation settings, and then receive custom-generated reviews.

Code Snippet for Streamlit Interface:

import streamlit as st

# Setting up the title and subtitle for the web application
st.title("🍔 McDeepNet 🍔")
st.subheader("Experience AI-Generated McDonald's Reviews")

# Creating a form for user inputs
with st.form(key='user_input_form'):
    seed_text = st.text_input(label='Enter Seed Text for Review Generation')
    num_words = st.number_input('Select Number of Words to Generate', min_value=1, max_value=100, value=5)
    temperature = st.slider('Adjust Generation Creativity (Temperature)', min_value=0.1, max_value=3.0, value=1.0, step=0.1)
    submit_button = st.form_submit_button(label='Generate Review')

# Additional instructions or user guidance can be added here

3.2 Visualizing the Generated Results

Once the user inputs their preferences, McDeepNet produces a unique review.

Code Snippet for Result Generation and Visualization:

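from collections import Counter

import pandas as pd
import plotly.express as px
import streamlit as st

# generate_sentence and create_tree_diagram are helper functions from the app's source (not listed here);
# model, tokenizer, and max_length are assumed to be loaded as in the earlier sections

# Set up the UI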
st.title("🍔 McDeepNet 🍔")
st.subheader("Trained on 20k McDonald's Reviews")
st.write("Welcome to McDeepNet! This project uses a Machine Learning (ML) model trained on 20,000 McDonald's reviews. It's an interesting application that employs Recurrent Neural Networks (RNNs) to learn patterns from these reviews and, subsequently, generates a unique review of its own. The model can produce varying types of output based on a seed text and a temperature parameter provided by the user.")
st.markdown("""
- [Checkout my GitHub](https://github.com/zanepearton)
- [My dev.to Article](https://dev.to/zanepearton/mcdeepnet-training-tensorflow-on-mcdonalds-reviews-21e)
""")

# Form to take user inputs
with st.form(key='my_form'):
    seed_text = st.text_input(label='Enter the seed text for sentence completion')
    num_words = st.number_input(label='Enter the number of words to generate', min_value=1, max_value=100, value=50)
    temperature = st.slider(label='Set temperature', min_value=0.1, max_value=3.0, value=1.0, step=0.1)
    submit_button = st.form_submit_button(label='Generate Text')

# Generate and display the output on form submission
if submit_button:
    sentence, word_probs = generate_sentence(model, tokenizer, max_length, seed_text, num_words, temperature)

    st.text_area("Generated Text", value=sentence, height=150)

    # Count word frequencies
    word_freq = Counter(sentence.split())

    # Create and display the tree diagram
    fig_tree = create_tree_diagram(word_probs)
    st.plotly_chart(fig_tree)

    # Create a DataFrame for the frequencies
    freq_df = pd.DataFrame(list(word_freq.items()), columns=['Word', 'Frequency'])

    # Create a Plotly Express scatter plot
    fig_scatter = px.scatter(freq_df, x='Word', y='Frequency', size='Frequency', title='Word Frequencies', 
                             hover_name='Word', size_max=60)
    st.plotly_chart(fig_scatter)