WhatsApp Voice Message Transcription and AI Assistant


A microservices-based WhatsApp bot that automatically transcribes voice messages and provides AI-powered responses.

Features

  • Real-time voice message transcription
  • AI-generated responses (via the Groq API)
  • Integration with WhatsApp
  • Microservices architecture for scalability

Technologies Used

  • Go
  • Python
  • Flask
  • SQLite
  • WhatsApp API (via whatsmeow)
  • Faster-Whisper (for speech recognition)
  • Groq API

Setup

Prerequisites

To run this application, make sure you have Go and Python installed.

Important

To use the whatsmeow library on Windows, make sure GCC is installed.

Alternatively, run the Go code under WSL. The Python code runs without extra setup on any machine.

Installation

Go

go mod tidy

Python

  1. Create a virtual environment
python -m venv .venv
  2. Activate the virtual environment
# For Linux
source .venv/bin/activate

# For Windows
.venv\Scripts\activate
  3. Install the required packages
pip install -r requirements.txt

Usage

Note

Before you start, set the Flask server's IP address in main.go so that the Go application knows where to send the audio data.

Running the bot requires both the Flask server and the Go application.

  1. Start the Flask server
python main.py

Note the IP address printed in the terminal; you need it to configure the Go application, which sends the audio data to that address.

  2. Start the Go application
go run main.go
  3. Scan the QR code displayed in the terminal to log into WhatsApp

  4. Once logged in, any voice message sent to you is transcribed, and the AI model's response is sent back as a WhatsApp message.
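
If it is unclear which of the addresses printed by Flask is reachable from the Go application, the machine's outbound IPv4 address can be looked up directly on the machine running the Flask server. The snippet below is a minimal sketch and not part of this repository; `guess_lan_ip` is a hypothetical helper name, and the result is only a hint, since machines with several network interfaces may need a different address.

```python
import socket


def guess_lan_ip(fallback: str = "127.0.0.1") -> str:
    """Return the IPv4 address this machine would use for outbound traffic.

    Hypothetical helper, not part of the project.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only asks the OS
        # which local interface would route to the given address.
        # 192.0.2.1 is a documentation-only address and is never contacted.
        sock.connect(("192.0.2.1", 80))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, an offline machine): use the fallback.
        return fallback
    finally:
        sock.close()


print(guess_lan_ip())
```

Compare the printed value with the address shown by Flask before editing main.go.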

Configuration

  1. Change the IP address in main.go, which is the target of the POST request to the Flask server.
  2. To change the transcription model, modify the model_name parameter in the transcribe.py file.
  3. To use a different AI model, update the Model field in the RequestPayload struct within the groq/groq.go file.
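
To illustrate what the model_name setting controls, here is a minimal sketch of a transcription helper built on Faster-Whisper. It is an assumption about how such a helper could look, not a copy of this project's transcribe.py: the function names `join_segments`, `load_model`, and `transcribe_bytes`, the default `"base"` model, and the CPU/int8 settings are all illustrative choices.

```python
import io
from functools import lru_cache
from typing import Iterable


def join_segments(texts: Iterable[str]) -> str:
    """Join segment texts into one string, dropping empty pieces."""
    return " ".join(t.strip() for t in texts if t.strip())


@lru_cache(maxsize=1)
def load_model(model_name: str):
    """Load a Faster-Whisper model once and reuse it across requests."""
    # Imported lazily so the module can be imported without faster-whisper.
    from faster_whisper import WhisperModel

    # device and compute_type are assumptions chosen for CPU-only machines.
    return WhisperModel(model_name, device="cpu", compute_type="int8")


def transcribe_bytes(audio: bytes, model_name: str = "base") -> dict:
    """Transcribe raw audio bytes; returns transcription and language."""
    model = load_model(model_name)
    # transcribe() accepts a file-like object and returns a lazy segment
    # generator plus an info object with the detected language.
    segments, info = model.transcribe(io.BytesIO(audio))
    return {
        "transcription": join_segments(s.text for s in segments),
        "language": info.language,
    }
```

Larger models such as `"small"` or `"medium"` generally trade speed for accuracy, so the right value depends on the hardware running the Flask server.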

API Endpoint

The transcription service exposes a single endpoint:

  • POST /transcribe

    Accepts binary audio data in the request body.
    Returns a JSON object with transcription and language fields.
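
The endpoint can also be exercised without WhatsApp, which helps when debugging the transcription service on its own. The following is a sketch using only the Python standard library. The port 5000 in the usage comment (Flask's default), the example address, the `Content-Type` header, and the helper names are assumptions; only the path and the `transcription` and `language` fields come from the description above.

```python
import json
import urllib.request
from typing import Tuple


def build_transcribe_request(audio: bytes, base_url: str) -> urllib.request.Request:
    """Build a POST request carrying raw audio bytes in the body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/transcribe",
        data=audio,
        method="POST",
        # Assumed header; the server may not require any particular type.
        headers={"Content-Type": "application/octet-stream"},
    )


def parse_transcribe_response(body: bytes) -> Tuple[str, str]:
    """Extract the transcription and language fields from the JSON reply."""
    payload = json.loads(body)
    return payload["transcription"], payload["language"]


# Usage against a running server (address and file name are placeholders):
# with open("voice.ogg", "rb") as f:
#     req = build_transcribe_request(f.read(), "http://192.168.1.20:5000")
# with urllib.request.urlopen(req, timeout=60) as resp:
#     text, lang = parse_transcribe_response(resp.read())
```

Separating request construction from response parsing keeps both parts testable without a live server.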

Contributing

  1. Create a new issue
  2. Fork the repository
  3. Create a new branch
  4. Commit your changes
  5. Push to the branch
  6. Create a new Pull Request

Acknowledgements

  1. YASSERMD/whatsmeow-groq
  2. hoehermann/whatsmeow-transcribe