Akshay Keerthi

Building Environmental Analyzer using Lyzr SDK

The Environmental Data Navigator is not just another data analysis tool; it’s a gateway to unlocking the insights hidden within vast environmental datasets. With an intuitive user interface powered by Streamlit, users can seamlessly upload their data and embark on a journey of exploration and discovery.


One of the key features of the app is its ability to simplify the complexities of data analysis. Users can upload CSV files containing environmental data with ease, and the app takes care of the rest. Leveraging the power of Lyzr’s DataAnalyzr agent, the app performs insightful analyses and presents users with descriptive summaries and actionable queries.

Why use Lyzr SDKs?

With Lyzr SDKs, crafting your own GenAI application is a breeze, requiring only a few lines of code to get up and running swiftly.

Check out the Lyzr SDKs

Let's get started!
Create a new Python file and add the following imports:

import os
from pathlib import Path
import streamlit as st
import pandas as pd  # Import pandas module
from utils import utils
from lyzr import DataConnector, DataAnalyzr

This Python script sets up a Streamlit web application for data analysis. It imports the necessary libraries, Streamlit and pandas, along with custom utility functions from a module called "utils". From the "lyzr" module it imports the DataConnector and DataAnalyzr classes that power the analysis. The script defines functions for data uploading, analysis, and visualization, configures the Streamlit app layout, and handles user interactions.

Next, the script initializes the OpenAI API key using Streamlit's secrets management. The key stored securely under "apikey" in Streamlit's secrets is assigned to the OPENAI_API_KEY environment variable, ensuring secure access to the OpenAI API within the application.

# Set OpenAI API key
os.environ["OPENAI_API_KEY"] = st.secrets["apikey"]
# create directory if it doesn't exist
data = "data"
plot = 'plot'
os.makedirs(data, exist_ok=True)
os.makedirs(plot, exist_ok=True)

This code snippet creates directories named “data” and “plot” if they don’t already exist. It uses the os.makedirs() function to create the directories. The exist_ok=True argument ensures that the function does not raise an error if the directories already exist; it simply skips creating them.
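As a quick sanity check, the call is idempotent: running it twice on the same path raises no error when exist_ok=True is set. A minimal demonstration using a throwaway temporary directory:

```python
import os
import tempfile

# Throwaway parent directory so the demo doesn't touch the project tree
base = tempfile.mkdtemp()
target = os.path.join(base, "data")

os.makedirs(target, exist_ok=True)  # creates the directory
os.makedirs(target, exist_ok=True)  # already exists -- silently skipped

print(os.path.isdir(target))  # True
```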

def data_uploader():
    st.subheader("Upload Data file")
    # Upload csv file
    uploaded_file = st.file_uploader("Choose csv file", type=["csv"])
    if uploaded_file is not None:
        # Save the uploaded file to the "data" directory
        utils.save_uploaded_file(uploaded_file)
    else:
        # Clear out files left over from a previous upload
        utils.remove_existing_files(data)
        utils.remove_existing_files(plot)

This function, data_uploader(), is responsible for allowing users to upload a CSV file. It first displays a subheader "Upload Data file" using Streamlit's st.subheader() function. Then, it provides a file uploader component using st.file_uploader() with a label "Choose csv file" and restricts the file type to CSV using the type parameter.

If a file is uploaded (uploaded_file is not None), it calls a function utils.save_uploaded_file() from a module named "utils" to save the uploaded file.

If no file is uploaded, it calls functions utils.remove_existing_files(data) and utils.remove_existing_files(plot) to remove any existing files in the "data" and "plot" directories, respectively. This ensures that previous files are cleared if no new file is uploaded.
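The "utils" module itself isn't shown in the post, so here is a minimal sketch of what save_uploaded_file() and remove_existing_files() might look like -- the names come from the calls above, but the signatures and behavior are assumptions:

```python
import os

def save_uploaded_file(uploaded_file, directory="data"):
    """Write a Streamlit UploadedFile's bytes into the given directory (assumed helper)."""
    os.makedirs(directory, exist_ok=True)
    file_path = os.path.join(directory, uploaded_file.name)
    with open(file_path, "wb") as f:
        f.write(uploaded_file.getbuffer())
    return file_path

def remove_existing_files(directory):
    """Delete every regular file inside the given directory (assumed helper)."""
    if not os.path.isdir(directory):
        return
    for filename in os.listdir(directory):
        path = os.path.join(directory, filename)
        if os.path.isfile(path):
            os.remove(path)
```

Streamlit's UploadedFile exposes a name attribute and a getbuffer() method, which is all the sketch relies on.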

def analyzr():
    path = utils.get_files_in_directory(data)
    path = path[0]

    dataframe = DataConnector().fetch_dataframe_from_csv(file_path=Path(path))
    analyzr_instance = DataAnalyzr(df=dataframe, api_key=st.secrets["apikey"])

    return analyzr_instance

The analyzr() function orchestrates the data analysis for the uploaded CSV file. It retrieves the first file path in the "data" directory via utils.get_files_in_directory(), loads that file into a Pandas DataFrame using the DataConnector class, and then creates an instance of the Lyzr DataAnalyzr class, passing the DataFrame and the API key for authentication.

This instance encapsulates the data analysis operations, allowing insights and queries to be generated. Finally, the function returns the analyzr_instance so that later steps can display descriptions and run queries against the data.

def file_checker():
    file = []
    for filename in os.listdir(data):
        file_path = os.path.join(data, filename)
        file.append(file_path)

    return file

The file_checker() function iterates through the files in the "data" directory to verify their existence. It initializes an empty list file to store file paths. Then, for each file in the directory obtained via os.listdir(data), it constructs the full file path using os.path.join() and appends it to the file list. Finally, the function returns the list of file paths.
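To see the pattern in isolation, the same listdir/join loop can be pointed at a throwaway directory (the CSV file names here are purely illustrative):

```python
import os
import tempfile

# Throwaway stand-in for the "data" directory, seeded with two dummy files
data = tempfile.mkdtemp()
for name in ("air_quality.csv", "water_quality.csv"):
    with open(os.path.join(data, name), "w") as f:
        f.write("col\n1\n")

# Same logic as file_checker(): collect the full path of every file
paths = []
for filename in os.listdir(data):
    paths.append(os.path.join(data, filename))

print(len(paths))  # 2
```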

# Function to display the dataset description
def display_description(analyzr):
    description = analyzr.dataset_description()
    if description is not None:
        st.subheader("Dataset Description:")
        st.write(description)

# Function to display queries
def display_queries(analyzr):
    queries = analyzr.ai_queries_df()
    if queries is not None:
        st.subheader("These Queries you can run on the data:")
        st.write(queries)

The display_description(analyzr) function presents the dataset description generated by the Lyzr DataAnalyzr instance passed as an argument. It calls the dataset_description() method of the analyzr object to obtain the description, and if it's not empty, it displays it with a subheader "Dataset Description" and writes it to the Streamlit app interface using st.write().

Similarly, the display_queries(analyzr) function showcases the queries available for the dataset. It invokes the ai_queries_df() method of the analyzr object to fetch the queries, and if there are any, it presents them with a subheader "These Queries you can run on the data" and writes them to the Streamlit app interface using st.write().

# Modify DataConnector class to specify encoding when reading CSV file
class DataConnector:
    def fetch_dataframe_from_csv(self, file_path):
        """Fetches a Pandas DataFrame from a CSV file.

        Args:
            file_path (Path): Path to the CSV file.

        Returns:
            dataframe (DataFrame): Pandas DataFrame containing the data from the CSV file.
        """
        try:
            # Specify encoding as 'latin1' when reading the CSV file
            dataframe = pd.read_csv(file_path, encoding='latin1')
            return dataframe
        except Exception as e:
            raise RuntimeError(f"Error occurred while reading CSV file '{file_path}': {str(e)}")

The DataConnector class has been modified to include a specific encoding parameter when reading CSV files. In the fetch_dataframe_from_csv() method, the encoding parameter is set to 'latin1' to handle characters that may not be encoded correctly with the default encoding. This modification ensures that the method can handle a wider range of characters and prevents potential encoding errors when reading CSV files. If an error occurs during the reading process, a RuntimeError is raised, providing information about the file path and the encountered error for debugging purposes.
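The effect of the encoding argument can be reproduced with the standard library alone. This sketch writes a CSV containing latin1-only bytes, shows that decoding it as UTF-8 fails while latin1 succeeds -- the same reason pd.read_csv(file_path, encoding='latin1') is used above:

```python
import csv
import tempfile

# Write a CSV containing characters outside ASCII, encoded as latin1
with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", encoding="latin1",
                                 newline="", delete=False) as f:
    writer = csv.writer(f)
    writer.writerow(["city", "temp"])
    writer.writerow(["São Paulo", "25°C"])
    path = f.name

# Decoding those bytes as UTF-8 raises UnicodeDecodeError...
try:
    with open(path, encoding="utf-8") as f:
        f.read()
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# ...while latin1 maps every byte to a character, so the read succeeds
with open(path, encoding="latin1") as f:
    rows = list(csv.reader(f))

print(decoded_ok)   # False
print(rows[1][0])   # São Paulo
```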

In conclusion, we’ve embarked on an exciting journey to build an Environmental Data Navigator using Streamlit and Lyzr. Our app serves as a testament to the transformative potential of technology in simplifying complex data analytics tasks. By empowering users with the tools and insights needed to navigate environmental datasets, we’re fostering a culture of data-driven decision-making towards a sustainable future.

As we wrap up our tutorial, let’s not forget that this is just the beginning. There are endless possibilities for further enhancing our Environmental Data Navigator. Consider adding features such as interactive visualizations, advanced analytics algorithms, or integration with real-time environmental data sources. The sky’s the limit!

App link:

Source Code:

Connect with Lyzr
To learn more about Lyzr and its SDKs, visit our website or get in touch with our team:

Book a Demo: Book a Demo
Discord: Join our Discord community
Slack: Join our Slack channel
