shrey vijayvargiya

So ChatGPT can hear now!!

Under the Hood

This all began when I read a tweet from Sam Altman, the CEO and co-founder of OpenAI.

Okay, so ChatGPT can now take images and listen to our voice as a type of prompt.
This is cool, check out this tweet from the OpenAI team.
OpenAI team tweet

How can one actually use the image upload feature in ChatGPT to solve a real-life problem right away?

Jarvis days are near

Being a developer, I have always been fascinated by Jarvis in Iron Man.

I mean, I badly want my own personal Jarvis to help me write code, develop websites, send emails, write these blogs and do much more.

It will save time and cost, and I can literally think about business ideas instead of doing repetitive work.

But how will this lead to advanced AI-powered assistants like Jarvis?

Well, it's simple: ChatGPT can now listen and take images, so basically it can watch. We can integrate our laptop camera and microphone with ChatGPT, and with that the major work of providing the prompt is done.

Our Jarvis can now watch via the camera, listen through the microphone and respond through the laptop speaker, so it can easily work as Jarvis.
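
As a rough sketch of that loop (not how ChatGPT itself is wired up), the four pieces can be plugged together like this. Every name here, including `Assistant` and its four callables, is hypothetical; real camera, microphone, model and speaker code would replace the stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Assistant:
    """Hypothetical glue object: watch, listen, think, speak."""
    capture_image: Callable[[], bytes]        # assumed: returns one camera frame
    record_audio: Callable[[], bytes]         # assumed: returns one microphone clip
    ask_model: Callable[[bytes, bytes], str]  # assumed: multimodal model call
    speak: Callable[[str], None]              # assumed: text-to-speech output

    def step(self) -> str:
        """Run one watch/listen/answer cycle and return the spoken reply."""
        image = self.capture_image()
        audio = self.record_audio()
        reply = self.ask_model(image, audio)
        self.speak(reply)
        return reply
```

Keeping each piece as a plain callable means the camera, the model or the speaker can be swapped out without touching the loop itself.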

Semi or Pseudo Jarvis

Recently, I came across a product called Hyperwrite AI.
HyperWrite - Your Personal AI Writing Assistant

Hyperwrite can basically do anything in your browser from a simple text instruction.
For example, if I write "post a tweet on my behalf":

  • It will open the browser
  • Write the tweet message in the text box
  • Click the post-tweet button
  • And finally post the tweet on the user's behalf.

Wow, this is crazy. Watch this Instagram video for more details.
AI Agent that Operates Browser like Human🤯

How it works under the hood is simple:

  • First, it takes the user's prompt
  • Then it uses web scraping and testing toolkits like Selenium, along with other libraries, to scrape data and drive the browser
  • Finally, it performs the task once it reaches the target website in the browser.
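
The three steps above can be sketched roughly as follows. This is only my illustration of the prompt-to-steps-to-execution idea, not Hyperwrite's actual code: `Step`, `plan_tweet`, `RecordingBrowser` and `run` are all made-up names, the URL and element names are placeholders, and a simple regex stands in for whatever model turns the prompt into a plan. The stand-in browser only records actions; a real one would forward them to a toolkit like Selenium.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    action: str      # "open", "type", or "click"
    target: str      # a URL or a placeholder element name
    text: str = ""   # only used by "type"


def plan_tweet(prompt: str) -> list[Step]:
    """Step 1: turn a prompt like 'post a tweet: hello' into browser steps."""
    match = re.match(r"\s*post a tweet:\s*(.+)", prompt, re.IGNORECASE | re.DOTALL)
    message = match.group(1).strip() if match else ""
    if not message:
        raise ValueError(f"unsupported prompt: {prompt!r}")
    return [
        Step("open", "https://example.com/compose"),  # placeholder URL
        Step("type", "tweet-box", message),           # placeholder element name
        Step("click", "post-button"),                 # placeholder element name
    ]


@dataclass
class RecordingBrowser:
    """Stand-in for a real browser driver: it only logs what it was asked to do."""
    log: list[tuple] = field(default_factory=list)

    def open(self, url: str) -> None:
        self.log.append(("open", url))

    def type(self, element: str, text: str) -> None:
        self.log.append(("type", element, text))

    def click(self, element: str) -> None:
        self.log.append(("click", element))


def run(steps: list[Step], browser: RecordingBrowser) -> None:
    """Steps 2 and 3: walk the plan and perform each action in the browser."""
    for step in steps:
        if step.action == "open":
            browser.open(step.target)
        elif step.action == "type":
            browser.type(step.target, step.text)
        elif step.action == "click":
            browser.click(step.target)
        else:
            raise ValueError(f"unknown action: {step.action!r}")
```

Calling `run(plan_tweet("post a tweet: Hello world"), RecordingBrowser())` walks the same open, type and click sequence described in the tweet example.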

This is crazy. Now, with one command, I can literally schedule my threads.
Yeah, threads can't be scheduled, but I can tell this AI tool to write a Twitter thread for me and post it after one hour, and it will work.

Let's try something different. I also watched MusicGen in action.

This is another crazy application: we can now create music from text!!!

Just imagine, we can tell Jarvis to give us

  • YouTube background music from our scripts
  • IG reels music
  • Music for short video clips
  • Actual music

and whatnot, all from one piece of text, the so-called prompt.
Try it now!!!

I am not stopping here. I have been consuming content like a madman for the past 2 weeks and checking out all the AI-based products.
Here is the third one.


Ali Abdaal posted a tweet about it and talked about it in his YT video.

FireCut - Your Lightning-Fast AI Video Editor

This product basically helps the user automatically generate videos directly in Premiere Pro using AI, driven by a prompt.
Yes, we can literally enter the script and Firecut will create videos, edit them, add images and animations in between frames and so on, all in one go.

This is crazy. If I am able to integrate Hyperwrite into Premiere Pro along with Firecut, I will have my own video editor who can edit videos in seconds.
Making IG reels and YT videos becomes so easy now. Jarvis will create YouTube videos in seconds, and Hyperwrite can even post them on my own YT channel if given the instructions properly.

What else do we need?

Is there anything left for us to do?

I mean, 4 or 5 years down the line, everyone can see that AI will be able to do almost everything for us: writing, designing, coding, creating videos and God knows what else.

This is the future, but we do need solid regulations for it.
We can't just leave developers free in the market like this to develop whatever they want.

Time to switch back to LLMs, AI and GPT models; maybe my job is in danger.

Keep developing

