The Save Gmail to Google Drive add-on lets you automatically download email messages and file attachments from Gmail to your Google Drive. It can save the email messages as PDF files, while attachments are saved in their original format.
Transcribe Gmail Attachments
The latest version of the Gmail add-on adds support for transcribing audio and video attachments in Gmail messages. The transcription is done with the help of OpenAI's Whisper API and the transcript is saved as a new text file in your Google Drive.
Here's a step-by-step guide on how you can transcribe audio and video attachments in Gmail messages to text.
Step 1. Install the Save Gmail to Google Drive add-on from the Google Workspace Marketplace. Open sheets.new to create a new Google Sheet. Go to the Extensions menu > Save Emails > Open App to launch the add-on.
Step 2. Create a new workflow and specify the Gmail search criteria. The add-on will scan the matching email messages for any audio and video files.
OpenAI's speech-to-text API supports a range of audio and video formats including MP3, WAV, MP4, MPEG, and WEBM. The maximum file size is 25 MB and you'll typically be within the limit since Gmail caps attachments at 25 MB.
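As a rough illustration of the limits described above, the sketch below checks whether an attachment looks eligible for transcription. The helper name `isTranscribable` is hypothetical, the extension list only covers the formats named in this article, and treating 25 MB as 25 × 1024 × 1024 bytes is an assumption. It is not the add-on's actual logic.

```javascript
// Formats named in the article; the API may accept others (this list is illustrative)
const SUPPORTED_EXTENSIONS = ['mp3', 'wav', 'mp4', 'mpeg', 'webm'];

// Assumption: the 25 MB limit is interpreted as 25 * 1024 * 1024 bytes
const MAX_BYTES = 25 * 1024 * 1024;

// Hypothetical helper: returns true when the file has a listed extension
// and is smaller than the size limit
const isTranscribable = (fileName, sizeInBytes) => {
  const dot = fileName.lastIndexOf('.');
  if (dot === -1) return false; // no extension, so the format is unknown
  const extension = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.includes(extension) && sizeInBytes < MAX_BYTES;
};
```

For instance, `isTranscribable('voicemail.mp3', 2 * 1024 * 1024)` returns true, while a PDF attachment or a 30 MB video returns false.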
Step 3. On the next screen, check the option that says Save Audio and Video Attachments as text and choose the file format, text or PDF, in which you wish to save the transcript.
You can include markers in the file name. For instance, if you specify the file name as {{Subject}} {{Sender Email}}, the add-on will replace the markers with the actual email subject and the sender's email address.
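The marker substitution can be pictured with a small template function. This is only a sketch of the general idea: the helper `renderFileName` and the marker names in the usage example are illustrative and not taken from the add-on's source.

```javascript
// Hypothetical helper: replaces {{Marker}} placeholders with matching values.
// Markers that have no matching value are left untouched.
const renderFileName = (template, values) =>
  template.replace(/\{\{\s*([^{}]+?)\s*\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match
  );
```

Calling `renderFileName('{{Subject}} {{Sender Email}}', { Subject: 'Weekly call', 'Sender Email': 'amit@example.com' })` returns `Weekly call amit@example.com`.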
You'll also need to specify the OpenAI API key that you can get from the OpenAI dashboard. OpenAI charges you $0.006 per minute of audio or video transcribed, rounded to the nearest second.
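To get a feel for the pricing, here is a minimal cost estimate based on the rate quoted above. The helper name `estimateTranscriptionCost` is hypothetical and the rate is hard-coded from this article, so check OpenAI's pricing page for the current value.

```javascript
// Rate quoted in this article; it may change over time
const PRICE_PER_MINUTE_USD = 0.006;

// Hypothetical helper: estimated charge in USD for a recording of the given length
const estimateTranscriptionCost = (durationInSeconds) => {
  // Billing is described as rounded to the nearest second
  const billedSeconds = Math.round(durationInSeconds);
  return (billedSeconds / 60) * PRICE_PER_MINUTE_USD;
};
```

A ten-minute recording works out to `estimateTranscriptionCost(600).toFixed(2)`, which is `0.06`, so roughly six cents.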
Save the workflow and it will automatically run in the background, transcribing messages as they land in your inbox. You can check the status of the workflow in the Google Sheet itself.
Also see: Speech to Text with Dictation.io
Speech to Text with Google Apps Script
Internally, the add-on uses Google Apps Script to connect to the OpenAI API and transcribe the audio and video files. Here's the source code of the Google Script that you can copy and use in your own projects.
// Define the URL for the OpenAI audio transcription API
const WHISPER_API_URL = 'https://api.openai.com/v1/audio/transcriptions';

// Define your OpenAI API key
const OPENAI_API_KEY = 'sk-putyourownkeyhere';

// Define a function that takes an audio file ID and language as parameters
const transcribeAudio = (fileId, language) => {
  // Get the audio file as a blob using the Drive service
  const audioBlob = DriveApp.getFileById(fileId).getBlob();

  // Send a POST request to the OpenAI API with the audio file.
  // A payload object that contains a blob is sent as multipart form data.
  const response = UrlFetchApp.fetch(WHISPER_API_URL, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${OPENAI_API_KEY}`
    },
    payload: {
      model: 'whisper-1',
      file: audioBlob,
      response_format: 'text',
      language: language
    }
  });

  // Get the transcription from the API response and log it to the console
  const data = response.getContentText();
  Logger.log(data.trim());
};
Please replace the OPENAI_API_KEY value with your own OpenAI API key. Also, make sure that the audio or video file you want to transcribe is stored in your Google Drive and that you have at least view (read) permissions on the file.
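If the API key is wrong or the file is rejected, the request fails, so it can help to check the status code before using the response. The sketch below assumes the fetch call is made with the `muteHttpExceptions: true` option so that error responses are returned instead of thrown. The helper `readTranscript` is hypothetical and is not part of the script above.

```javascript
// Hypothetical helper: accepts the HTTPResponse returned by UrlFetchApp.fetch
// (assumed to be called with muteHttpExceptions: true) and returns the transcript
const readTranscript = (response) => {
  const status = response.getResponseCode();
  const body = response.getContentText();
  if (status !== 200) {
    // Surface the error details returned by the API
    throw new Error(`Transcription failed with HTTP ${status}: ${body}`);
  }
  // With the plain-text response format, the body is the transcript itself
  return body.trim();
};
```

Inside the transcription function, you could then write `Logger.log(readTranscript(response))` instead of reading the content text directly.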
Transcribe Large Audio and Video Files
The Whisper API only accepts audio files that are less than 25 MB in size. If you have a larger file, you can use the Pydub Python package to split the audio file into smaller chunks and then send them to the API for transcription.
If the video file is large in size, you can extract the audio track from the video file using FFmpeg and send that to the API for transcription.
# Extract the audio from video
ffmpeg -i video.mp4 -vn -ab 256k audio.mp3
# Split the audio file into smaller chunks
ffmpeg -i large_audio.mp3 -f segment -segment_time 60 -c copy output_%03d.mp3
FFmpeg will split the input audio file into multiple chunks of roughly 60 seconds each, naming them output_000.mp3, output_001.mp3, and so on, depending on the duration of the input file.
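Once each chunk has been transcribed separately, the pieces need to be put back together in the right order. Here is a minimal sketch, assuming you have collected the results as objects with a `fileName` and a `text` property. Both the helper `joinTranscripts` and that object shape are hypothetical.

```javascript
// Hypothetical helper: combines per-chunk transcripts into a single text.
// Zero-padded names such as output_001.mp3 sort correctly as plain strings.
const joinTranscripts = (chunks) =>
  [...chunks] // copy so the caller's array is not reordered
    .sort((a, b) => a.fileName.localeCompare(b.fileName))
    .map((chunk) => chunk.text.trim())
    .filter((text) => text.length > 0) // skip silent or empty chunks
    .join('\n');
```

Since the chunks are cut at fixed intervals, a sentence may be split across two files, so the combined transcript can need light cleanup at the boundaries.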