Generate audio from text
Project Description
Developed an AI-powered Text-to-Audio Generation System that converts written text into natural-sounding speech using a pre-trained deep learning model.
The project integrates the suno/bark model from Hugging Face for high-quality audio synthesis. The backend is implemented using Python Flask, which hosts the model and exposes REST APIs. GPU acceleration is enabled using PyTorch to optimize audio generation performance.
The frontend is developed using HTML and JavaScript, allowing users to input text and receive dynamically generated audio output through the web interface.
Features
Converts user input text into realistic audio output.
Utilizes suno/bark for high-quality voice synthesis.
Uses PyTorch with CUDA support for faster audio generation.
Frontend and backend communicate via HTTP APIs.
Simple and user-friendly UI built with HTML and JavaScript.
Generated audio can be played directly in the browser.
Tech Stack
Python 3.x
Flask
Transformers (Hugging Face library)
PyTorch
HTML5
CSS3
JavaScript
HTML, JavaScript (Frontend)
System Requirements
Python 3.8+
Anaconda
Libraries: numpy, pandas, scikit-learn, matplotlib
Good to have GPU
Deliverables
Project Code
Setup over video call
WhatsApp support
WhatsApp For Project