DEV Community

Bakkesh KS

VideoToBlogAI: Transform Your Videos Into Technical Blog Post Using AI

This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.

What I Built

VideoToBlogAI is an AI-powered web application that automates the creation of technical blog posts from video and audio. It converts uploaded media into text, then processes the transcription to generate a well-structured technical blog post.

Key Features:

  • Credit-Based System: Before processing any media, the platform verifies the user's remaining credit balance.

  • Code Extraction: The app automatically extracts code snippets from video and audio files, saving time and effort for developers and technical content creators.

  • Advanced Analysis: It offers word count, character count, and a dynamic table of contents to help users refine their blog posts.
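
As a rough illustration of the analysis features above, here is a minimal sketch that computes a word count, a character count, a table of contents, and the code snippets for a generated post. It assumes the post is stored as Markdown with `#` headings and fenced code blocks; `analyzePost` and its types are hypothetical names and are not taken from the project's source.

```typescript
// Hypothetical helper, not the project's actual code.
// Assumes the generated blog post is Markdown.

interface TocEntry {
  level: number; // 1 for "#", 2 for "##", and so on
  title: string;
}

interface PostAnalysis {
  wordCount: number;    // words outside fenced code blocks
  charCount: number;    // characters in the whole Markdown string
  toc: TocEntry[];      // headings in document order
  codeBlocks: string[]; // contents of each fenced code block
}

// Three backticks, written with escapes so this sample can sit inside a fence.
const FENCE = "\u0060\u0060\u0060";

function analyzePost(markdown: string): PostAnalysis {
  const toc: TocEntry[] = [];
  const codeBlocks: string[] = [];
  const proseLines: string[] = [];
  let insideFence = false;
  let currentBlock: string[] = [];

  for (const line of markdown.split("\n")) {
    // A line starting with the fence marker opens or closes a code block.
    if (line.trim().indexOf(FENCE) === 0) {
      if (insideFence) {
        codeBlocks.push(currentBlock.join("\n"));
        currentBlock = [];
      }
      insideFence = !insideFence;
      continue;
    }
    if (insideFence) {
      currentBlock.push(line);
      continue;
    }
    // Headings feed the table of contents; only the title counts as prose.
    const heading = /^(#{1,6})\s+(.+)$/.exec(line);
    if (heading) {
      const title = heading[2].trim();
      toc.push({ level: heading[1].length, title: title });
      proseLines.push(title);
    } else {
      proseLines.push(line);
    }
  }

  const words = proseLines
    .join("\n")
    .split(/\s+/)
    .filter((word) => word.length > 0);

  return {
    wordCount: words.length,
    charCount: markdown.length,
    toc: toc,
    codeBlocks: codeBlocks,
  };
}
```

Counting words only outside code fences is a design choice of this sketch; the app may count them differently.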

How it Works:

1. Upload Your Content: Users register and upload MP4 videos or MP3 audio files (up to 30 MB), which are sent to the backend. Before processing any media, the platform verifies the user's credit balance.

User Schema:

- email: String, required, unique
- password: String, required
- username: String, required, unique
- secondsRemaining: Number, default: 1200
- role: String, enum: ["user", "admin"], default: "user", required
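
To show how this schema could back the credit check from step 1, here is a minimal sketch in plain TypeScript. It assumes credits are spent per second of media, rounded up, which the post does not state; `newUser` and `checkCredits` are hypothetical helpers, not the project's actual model code.

```typescript
// Hypothetical sketch based on the User schema fields above.

type Role = "user" | "admin";

interface User {
  email: string;
  password: string; // storage format is not specified in the post
  username: string;
  secondsRemaining: number;
  role: Role;
}

// Defaults taken from the schema: 1200 seconds of credit, role "user".
const DEFAULT_SECONDS_REMAINING = 1200;

function newUser(email: string, password: string, username: string): User {
  return {
    email: email,
    password: password,
    username: username,
    secondsRemaining: DEFAULT_SECONDS_REMAINING,
    role: "user",
  };
}

interface CreditCheck {
  allowed: boolean;
  secondsNeeded: number; // media duration rounded up to whole seconds
  secondsAfter: number;  // balance after processing, or unchanged if refused
}

// Assumption: one credit second is spent per second of media.
function checkCredits(user: User, mediaDurationSeconds: number): CreditCheck {
  if (!(mediaDurationSeconds > 0)) {
    throw new RangeError("mediaDurationSeconds must be a positive number");
  }
  const secondsNeeded = Math.ceil(mediaDurationSeconds);
  const allowed = user.secondsRemaining >= secondsNeeded;
  return {
    allowed: allowed,
    secondsNeeded: secondsNeeded,
    secondsAfter: allowed
      ? user.secondsRemaining - secondsNeeded
      : user.secondsRemaining,
  };
}
```

The function only reports the outcome and leaves the user object untouched, so the caller decides when to persist the new balance.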

Blog Post Schema:

- blogPostId: String, required, unique
- userId: ObjectId (ref: "User"), required
- videoUrl: String, required
- text: String, required
- createdAt: Date, default: Date.now
- status: String, default: "completed"
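
For readers who prefer types to field lists, the blog post schema above might be mirrored in plain TypeScript as in the sketch below. `buildBlogPost` is a hypothetical helper that applies the two schema defaults (`createdAt` and `status`) and rejects blank required fields; the real project presumably leaves this to its database layer.

```typescript
// Hypothetical sketch mirroring the Blog Post schema fields above.

interface BlogPost {
  blogPostId: string;
  userId: string; // ObjectId of the owning User, kept as a string here
  videoUrl: string;
  text: string;
  createdAt: Date;
  status: string;
}

// Required fields must be supplied; defaulted fields are optional.
interface NewBlogPost {
  blogPostId: string;
  userId: string;
  videoUrl: string;
  text: string;
  createdAt?: Date;
  status?: string;
}

const REQUIRED_FIELDS = ["blogPostId", "userId", "videoUrl", "text"] as const;

function buildBlogPost(input: NewBlogPost): BlogPost {
  for (const key of REQUIRED_FIELDS) {
    if (input[key].trim() === "") {
      throw new Error(key + " is required");
    }
  }
  return {
    blogPostId: input.blogPostId,
    userId: input.userId,
    videoUrl: input.videoUrl,
    text: input.text,
    // Schema defaults: current time and "completed".
    createdAt: input.createdAt !== undefined ? input.createdAt : new Date(),
    status: input.status !== undefined ? input.status : "completed",
  };
}
```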

2. AI-Powered Transcription: The backend stores the media in an uploads folder and sends it to the AssemblyAI speech-to-text API, which converts it into text.

3. Blog Generation: The transcript is then sent to the Google Gemini language model, which generates the technical blog post. The post is saved in the MongoDB database.
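
The three steps above can be sketched as one small pipeline. To keep the example self-contained, the AssemblyAI client, the Gemini client, and the MongoDB layer are represented by the hypothetical interfaces `Transcriber`, `BlogGenerator`, and `PostStore`; the prompt wording and the use of the upload path as `videoUrl` are also assumptions, not details from the project.

```typescript
// Hypothetical sketch of the upload -> transcribe -> generate -> save flow.
// The three interfaces stand in for the real services so the flow can run
// without network access.

interface Transcriber {
  transcribe(filePath: string): Promise<string>; // e.g. a wrapper around AssemblyAI
}

interface BlogGenerator {
  generate(prompt: string): Promise<string>; // e.g. a wrapper around Gemini
}

interface PostStore {
  // Saves the post and resolves with its blogPostId.
  save(post: { userId: string; videoUrl: string; text: string }): Promise<string>;
}

interface PipelineDeps {
  transcriber: Transcriber;
  generator: BlogGenerator;
  store: PostStore;
}

// Assumed prompt wording; the project's real prompt is not shown in the post.
function buildPrompt(transcript: string): string {
  return (
    "Write a well-structured technical blog post in Markdown based on this transcript. " +
    "Include any code that is dictated or described.\n\nTranscript:\n" +
    transcript
  );
}

async function videoToBlog(
  filePath: string,
  userId: string,
  deps: PipelineDeps
): Promise<{ blogPostId: string; text: string }> {
  // Step 2: speech-to-text.
  const transcript = await deps.transcriber.transcribe(filePath);
  if (transcript.trim() === "") {
    throw new Error("Transcription returned no text");
  }
  // Step 3: blog generation, then persistence.
  const text = await deps.generator.generate(buildPrompt(transcript));
  const blogPostId = await deps.store.save({
    userId: userId,
    videoUrl: filePath, // assumption: the stored URL is the upload path
    text: text,
  });
  return { blogPostId: blogPostId, text: text };
}
```

Passing the services in as a `deps` object keeps each step replaceable and lets the flow be exercised with in-memory stubs.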

Demo

Project Link: https://shark-app-n5snu.ondigitalocean.app
Demo Video:

Source code:

bakkeshks / VideoToBlogAI

Convert your video to technical blog post using AI

VideoToBlogAI

Project Proposal for the AssemblyAI Challenge

Overview

VideoToBlogAI generates technical blog posts from sources such as local video and audio files (with a 30 MB limit). It uses the Google Gemini AI API for language model tasks and the AssemblyAI API for speech-to-text.

Features

  • User registration and sign-in with error handling.
  • Admin analytics: blog post generation, total hours processed.
  • Content uploads: MP4 videos, MP3 audio (max 30 MB).
  • Credit check: Verify user credits before generating blog posts and manage available time for processing.
  • AI-generated blog posts: view, edit, delete, save.
  • Automatic code extraction from videos, audio files, and YouTube URLs.
  • Analysis: word count, character count, dynamic table of contents, semantic analysis.
  • Rendering: Markdown format with Next.js for a user-friendly interface.
  • Transcription services:
    • Video/audio: AssemblyAI's speech-to-text API.
  • Google Gemini API: Transform transcriptions into blog posts.
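
The upload limits in the list above (MP4 video or MP3 audio, max 30 MB) could be enforced with a check along these lines. This is a sketch under assumptions: `validateUpload` and `UploadInfo` are hypothetical names, 30 MB is read as 30 × 1024 × 1024 bytes, and the MIME-type comparison is an extra safeguard that the post does not mention.

```typescript
// Hypothetical upload validation, not the project's actual middleware.

// Assumption: "30 MB" means 30 * 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 30 * 1024 * 1024;

// Allowed file extensions and the MIME type expected for each.
const ALLOWED_TYPES: Record<string, string> = {
  ".mp4": "video/mp4",
  ".mp3": "audio/mpeg",
};

interface UploadInfo {
  filename: string;
  mimeType: string;
  sizeBytes: number;
}

// Returns a list of problems; an empty list means the upload is acceptable.
function validateUpload(file: UploadInfo): string[] {
  const problems: string[] = [];

  const dot = file.filename.lastIndexOf(".");
  const ext = dot >= 0 ? file.filename.slice(dot).toLowerCase() : "";
  const expectedMime = ALLOWED_TYPES[ext];

  if (expectedMime === undefined) {
    problems.push("Only MP4 video and MP3 audio files are supported.");
  } else if (file.mimeType !== expectedMime) {
    problems.push("File extension and MIME type do not match.");
  }

  if (file.sizeBytes <= 0) {
    problems.push("File is empty.");
  } else if (file.sizeBytes > MAX_UPLOAD_BYTES) {
    problems.push("File exceeds the 30 MB limit.");
  }

  return problems;
}
```

Returning every problem at once, rather than throwing on the first, lets the frontend show the user everything that needs fixing.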

Screenshots: Login Page, Dashboard, Upload Video, View Post, Edit Post, Admin Analytics

Technologies Used:

Frontend: Next.js, Shadcn/UI, Tailwind CSS, Highlight.js
Backend: Node.js, Express.js, MongoDB
AI APIs: Google Gemini AI API (for language model tasks), AssemblyAI API (for speech-to-text)
Authentication: JWT (JSON Web Tokens)

Journey

Building VideoToBlogAI has been a great project. The most challenging part was getting video-to-text conversion to work accurately. Once I achieved this, I was able to use the Google Gemini API to generate technical blog posts.

VideoToBlogAI uses Universal-2, AssemblyAI's state-of-the-art speech-to-text model, to accurately transcribe audio and video content. This integration significantly improves the platform's ability to process diverse media formats and generate high-quality blog posts.

Prize categories:
Sophisticated Speech-to-Text
Team Members:
@bakkeshks
