DEV Community

Cover image for I got tired of procrastination, so I built this AI tool to make me productive! 🫡
Sunil Kumar Dash for Composio

Posted on • Updated on

I got tired of procrastination, so I built this AI tool to make me productive! 🫡

TL;DR

Lately, I have been procrastinating a lot, binge-watching Netflix, doom-scrolling social media, etc.

Finally, I decided to get rid of my procrastination. What better way than to code an AI agent that keeps me on track and periodically reminds me if I should start indulging in media consumption?

cat-ai-gif

So, here's how I did this.

  • Configure OpenAI GPT-4o, multi-modal AI model.
  • Use a screen analyzer tool to monitor the screen.
  • Pass the screenshots of the screen to GPT at regular intervals.
  • Rendering the message from GPT as a notification in the system.

Composio - Your AI Agent Tooling Platform

Here’s a quick introduction about us.

Composio is an open-source tooling infrastructure for building robust and reliable AI applications. We provide over 100+ tools and integrations across industry verticals from CRM, HRM, and Sales to Productivity, Dev, and Social Media.

They also provide local tools such as CodeAnalyser, RAG, SQL, etc.

Guy Struggling gif

Please help us with a star. 🥹

It would help us to create more articles like this 💖

Star the Composio.dev repository ⭐


Tech Stack

To successfully complete the project, you will need the following.

  • OpenAI SDK and API key: To interact with the LLM.
  • Composio: For accessing image analysing tool.
  • PyAutoGUI: To automate interactions on the screen.
  • Osascript: To execute AppleScript commands for controlling macOS applications.

So, let’s get started.


Let's Get Started 🔥

Begin by creating a Python virtual environment.



python -m venv ai-friend
cd ai-friend
source bin/activate


Enter fullscreen mode Exit fullscreen mode

Now, install the following dependencies.



pip install composio-core
pip install composio-openai openai
pip install pyautogui


Enter fullscreen mode Exit fullscreen mode

Next, Create a .env file and add environment variables for the OpenAI API key.



OPENAI_API_KEY=your API key


Enter fullscreen mode Exit fullscreen mode

To create an OpneAI API key, go to the official site and create an API key in the dashboard.

OpenAI API key dashboard

Set Up Composio

You can use the CLI to set up Composio easily.

First, log in to your account by running the following command.



composio login


Enter fullscreen mode Exit fullscreen mode

This will redirect you to login/signup to Composio.

Composio login page

Upon logging in, a screen with a key will appear.

login page

Copy it and paste it into the terminal.

Now, update apps.



composio apps update


Enter fullscreen mode Exit fullscreen mode

Now, you are ready to move to the coding part.


Building the AI Friend

Now that you have set up the environment, let's hop on to the coding part.

First, import the libraries and initialize the toolsets.



import dotenv
from openai import OpenAI

from composio_openai import App, ComposioToolSet
from composio.utils.logging import get as get_logger

logger = get_logger(__name__)


# Load environment variables from .env
dotenv.load_dotenv()

# Initialize tools.
openai_client = OpenAI()
composio_toolset = ComposioToolSet()

# Retrieve actions
actions = composio_toolset.get_tools(apps=[App.SYSTEMTOOLS, App.IMAGEANALYSERTOOL])


Enter fullscreen mode Exit fullscreen mode

So, in the above code block,

  • We imported all the required libraries and modules.
  • Loaded the variables defined in the .env file.
  • Created an instance of OpenAI() and ComposioToolSet.
  • Retrieved the Actions from SYSTEMTOOLS and IMAGEANALYSERTOO.

So, here is what these tools do.

  • SYSTEM TOOLS: The system tools have two Actions: push notifications and screen capture.
  • IMAGEANALYSERTOOL: This tool has only one Action: analyzes images using multi-modal LLMs like GPT-4o and Claude Sonnet, etc.

If you want to examine the code and how it works, check the code files for system tools and the image analyser tool.

Note: Actions in Composio are tasks that your agent can perform, such as clicking a screenshot, sending a notification, or sending a mail.

Set Up OpenAI Assistant

Now, define a clear and concise prompt for the agent. This is crucial for agent performance. You can alter the prompts based on your requirements.



assistant_instruction = (
    """You are an intelligent and proactive personal productivity assistant.
    Your primary tasks are:
    1. Regularly capture and analyze screenshots of the user's screen.
    2. Monitor user activity and provide timely, helpful interventions.

    Specific responsibilities:
    - Every few seconds, take a screenshot and analyze its content.
    - Compare recent screenshots to identify potential issues or patterns.
    - If you detect that the user is facing a technical or workflow problem:
        - Notify them with concise, actionable solutions.
        - Prioritize non-intrusive suggestions that can be quickly implemented.
    - If you notice extended use of potentially distracting websites or applications (e.g., social media, video streaming):
        - Gently remind the user about their productivity goals.
        - Suggest a brief break or a transition to a more focused task.
    - Maintain a balance between being helpful and not overly disruptive.
    - Tailor your interventions based on the time of day and the user's apparent work patterns.

    Operational instructions:
    - You will receive a 'CHECK' message at regular intervals. Upon receiving this:
        1. Take a screenshot using the screenshot tool.
        2. Then, analyse that screenshot using the image analyser tool.
        3. Then, check if the user uses distracting websites or applications.
        4. If they are, remind them to do something productive.
        5. If they are not, check if the user is facing a technical or workflow problem based on previous history.
        6. If they are, notify them with concise, actionable solutions.
        7. Try to maintain a history of the user's activity and notify them if they are doing something wrong.

    Remember: Your goal is to enhance productivity while respecting the user's autonomy and work style."""
)
assistant = openai_client.beta.assistants.create(
    name="Personal Productivity Assistant",
    instructions=assistant_instruction,
    model="gpt-4-turbo",
    tools=actions,  # type: ignore
)
# create a thread
thread = openai_client.beta.threads.create()
print("Thread ID: ", thread.id)
print("Assistant ID: ", assistant.id)



Enter fullscreen mode Exit fullscreen mode

In the above code block,

  • A detailed assistant instruction is provided.
  • Created a new assistant instance with the previously defined instruction, model name, and previously defined actions.
  • Finally, create a thread for interaction with the models.

Define and Run the Assistant

Now, define a function for running the assistants.



def check_and_run_assistant():
    logger.info("Checking and running assistant")

    # Send 'CHECK' message to the assistant
    message = openai_client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content="CHECK",
    )

    # Execute Agent
    run = openai_client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=assistant.id,
    )

    # Execute function calls
    run_after_tool_calls = composio_toolset.wait_and_handle_assistant_tool_calls(
        client=openai_client,
        run=run,
        thread=thread,
    )

# Run the assistant check every 10 seconds
while True:
    check_and_run_assistant()



Enter fullscreen mode Exit fullscreen mode

Here’s what is going on in the above code.

  • Send a 'CHECK' Message: This sends a "CHECK" message to the assistant in the specified thread to ensure the model is responsive.
  • Execute Agent: Creates a run for the assistant using the specified thread and assistant IDs.
  • Handle Tool Calls: Waits for and handles tool calls made by the assistant using the Composio toolset.
  • Loop the Agent: Loop the agent so it runs and monitors your workflow continuously.

Finally, execute the file by running the Python file and letting your new AI friend keep you focused on your goals.

The agent monitors your screen and sends a notification when it sees you doing something you should not.

The complete code can be found here

Here is an example of the agent in action.👇


Next Steps

In this article, you built your personalised AI friend that monitors your activity. However, adding external integrations such as a Calendar or Gmail tool can be even more helpful. This lets you know if you have some events to attend or essential emails to respond to.

You can easily do it with Composio’s wide array of integrations, from GitHub and Calendar to Slack, Discord, and more.

If you want to see more AI-related articles, let me know in the comments and give us a star on GitHub.

 
 

star the repo
Star the Composio repository ⭐

 
 

Thank you for reading!

Top comments (13)

Collapse
 
anna_lapushner profile image
anna lapushner

Binge watching Netflix and doom scrolling lol
You're good to encourage AI adoption and welcoming AI as your friend.
Thank you for the personalization code snippets. They look useful!

Collapse
 
sunilkumrdash profile image
Sunil Kumar Dash

Thank you so much, @anna_lapushner.

Collapse
 
z2lai profile image
z2lai

Super creative AI use case! I'm curious is there any privacy or security concerns of having "AI monitor your screen"? Like what if it picked up sensitive information?

Collapse
 
atsag profile image
Andreas • Edited

I would write exactly the same @z2lai . We are advancing technology, but we are building on moving ground... which means that we may eventually either have to embrace vendor lock-in or stay behind :(
Unless, of course, we change our attitude and focus more on the solidity of a solution than its flashy appeal.

Collapse
 
sunilkumrdash profile image
Sunil Kumar Dash

The screenshots go to OpenAI, so if they are true to their word, it shouldn't be a problem. :)

Collapse
 
andrewbaisden profile image
Andrew Baisden

Nice one, I have lost count of the amount of hours that get lost in a month because of doom scrolling on websites like Instagram 😂

Collapse
 
sunilkumrdash profile image
Sunil Kumar Dash

thanks Andrew.

Collapse
 
rnsjey profile image
JM Arenas

This is great, I will give it a try.

Collapse
 
sunilkumrdash profile image
Sunil Kumar Dash

Thank you so much.

Collapse
 
time121212 profile image
tim brandom

This is great, I will give it a try.

Collapse
 
kamran2121 profile image
kamran2121

That's a great use case of multi-modal LLMs. Thanks for the post.

Collapse
 
sunilkumrdash profile image
Sunil Kumar Dash

Thanks Kamran.

Collapse
 
reed1 profile image
reed1

I have a simpler idea. How about monitoring activity by listening to window class and window title change? vscode should be working, reddit chrome should be not working. It's simple to make the listener on my wm (i3wm). I can make the script to make something like this:

window_title,window_class,timestamp
project ABC,vscode,12:02:02
reddit,chrome,12:03:50
IDLE,IDLE,12:06:00
project ABC,vscode,12:08:32

from that data, can I leverage Composio to help me?