VoiceToSub's cloud mode uses OpenAI's Whisper API for fast, high-quality video translation. This guide walks you through creating an API key and connecting it to the desktop app.
Do I need an API key? No. VoiceToSub works entirely for free using local processing. The OpenAI API is an optional upgrade for faster translation (5-8 second delay vs 15-30 seconds locally).
Step 1: Create an OpenAI Account
- Go to platform.openai.com/signup
- Sign up with your email, Google, or Microsoft account
- Verify your email address
- Complete the account setup
If you already have a ChatGPT account, you can use the same login for the API platform.
Step 2: Add Billing (Pay-as-You-Go)
The API uses pay-as-you-go billing. You only pay for what you use.
- Go to platform.openai.com/account/billing
- Click "Add payment method"
- Enter your credit or debit card details
- Add an initial credit balance (e.g., $5 or $10)
How much does it cost?
OpenAI's Whisper API is priced per minute of audio. At the rates this guide assumes ($0.006 per minute), that works out to roughly:
- $0.36/hour of video
- ~$1.80 for a 5-hour binge session
- ~$5 covers roughly 14 hours of video
A $5 initial balance will last most users weeks or even months. Always check OpenAI's current pricing as rates may change.
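The figures above follow from simple per-minute arithmetic. A minimal sketch, assuming a rate of $0.006 per minute (the value implied by $0.36/hour); the function names are illustrative and not part of VoiceToSub:

```python
# Assumed rate: $0.006 per minute of audio, which matches the guide's $0.36/hour.
# Check OpenAI's pricing page for the current value before relying on this.
RATE_PER_MINUTE_USD = 0.006


def estimate_cost_usd(minutes: float, rate_per_minute: float = RATE_PER_MINUTE_USD) -> float:
    """Return the estimated cost in USD for the given minutes of audio."""
    return round(minutes * rate_per_minute, 2)


def hours_covered(budget_usd: float, rate_per_minute: float = RATE_PER_MINUTE_USD) -> float:
    """Return how many hours of audio a budget covers at the given rate."""
    return round(budget_usd / (rate_per_minute * 60), 1)


if __name__ == "__main__":
    print(estimate_cost_usd(60))       # one hour of video
    print(estimate_cost_usd(5 * 60))   # a 5-hour session
    print(hours_covered(5.0))          # hours covered by a $5 balance
```

If OpenAI changes its pricing, pass the new rate as `rate_per_minute` to update the estimates.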
Tip: Set a usage limit. On the OpenAI platform, go to Settings → Limits and set a monthly budget (e.g., $10) to avoid surprises. OpenAI will stop API access when the limit is reached.
Step 3: Generate an API Key
- Go to platform.openai.com/api-keys
- Click "Create new secret key"
- Give it a name like VoiceToSub
- Click "Create secret key"
- Copy the key immediately; it starts with sk- and you won't be able to see it again
Keep your API key secret. Don't share it, post it online, or commit it to code. Anyone with your key can use it and incur charges on your account. If compromised, revoke it immediately on the API keys page and create a new one.
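If you ever use the key in your own scripts, one common way to keep it out of code is to read it from an environment variable. A minimal sketch, assuming the key is exported as `OPENAI_API_KEY` (the variable OpenAI's official SDKs read by default); `load_api_key` and `mask_key` are hypothetical helper names, not part of VoiceToSub:

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    if not key.startswith("sk-"):
        raise ValueError("expected a key starting with 'sk-'")
    return key


def mask_key(key: str) -> str:
    """Return a form that is safe to print: the prefix and last four characters."""
    return key[:3] + "..." + key[-4:]
```

Printing `mask_key(load_api_key())` lets you confirm which key is loaded without exposing the full secret in logs or screenshots.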
Step 4: Configure VoiceToSub
Now connect your API key to the desktop app:
- Click the VoiceToSub menu-bar icon and choose Translate Video…
- When the file picker opens, select a video file
- The app will prompt for your API key on first use; paste your sk-... key
- The key is saved locally in ~/Library/Application Support/VoiceToSub and never sent anywhere except directly to OpenAI
Step 5: Start Translating
- Click the VoiceToSub menu-bar icon → Translate Video…
- Pick any video file from the file picker
- Translation starts automatically using the OpenAI API
- Progress shows in the menu bar — when done, the subtitled MKV opens in Finder
The app extracts audio from the video file and sends it to OpenAI's Whisper API. The API detects the language automatically, translates to English, and returns timestamped subtitles that are embedded into the output MKV.
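The pipeline described above (extract audio, translate, embed subtitles) can be sketched as follows. This is an illustration under stated assumptions, not VoiceToSub's actual code: it assumes `ffmpeg` is installed, that the `openai` Python package is available, and that the `whisper-1` model is used with SRT output. All function names are hypothetical.

```python
from pathlib import Path


def build_extract_command(video: Path, audio_out: Path) -> list[str]:
    """Build an ffmpeg command that extracts mono 16 kHz audio from a video."""
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-vn",            # drop the video stream
        "-ac", "1",       # downmix to mono
        "-ar", "16000",   # resample to 16 kHz
        str(audio_out),
    ]


def translate_to_srt(audio_path: Path, api_key: str) -> str:
    """Send audio to OpenAI's translations endpoint and return English SRT text."""
    from openai import OpenAI  # lazy import: requires `pip install openai`

    client = OpenAI(api_key=api_key)
    with open(audio_path, "rb") as audio_file:
        # With response_format="srt", recent SDK versions return the subtitles as text.
        return client.audio.translations.create(
            model="whisper-1",
            file=audio_file,
            response_format="srt",
        )


def build_mux_command(video: Path, srt: Path, output_mkv: Path) -> list[str]:
    """Build an ffmpeg command that embeds an SRT track into an MKV copy of the video."""
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-i", str(srt),
        "-map", "0", "-map", "1",   # keep all original streams, add the subtitles
        "-c", "copy",               # copy audio and video without re-encoding
        "-c:s", "srt",              # store subtitles as an SRT track
        "-metadata:s:s:0", "language=eng",
        str(output_mkv),
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. Copying the audio and video streams rather than re-encoding them keeps the muxing step fast.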
Frequently Asked Questions
Is my audio stored by OpenAI?
According to OpenAI's data usage policy, API data is not used to train their models. Audio sent via the API is processed and discarded. You can review their API data usage policies for full details.
Can I switch between local and OpenAI?
Yes. The desktop app uses local Whisper by default; passing an OpenAI API key switches to cloud mode. Your API key is remembered between sessions.
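The switching rule described here amounts to a check on whether a key is present. A minimal sketch of that rule; `choose_mode` is a hypothetical name and the app's real logic may differ:

```python
from typing import Optional


def choose_mode(api_key: Optional[str]) -> str:
    """Return 'cloud' when a non-empty API key is available, otherwise 'local'."""
    return "cloud" if api_key and api_key.strip() else "local"
```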
Does the desktop app also support the OpenAI API?
Yes. The VoiceToSub macOS desktop app can use the OpenAI API for its "Translate Video" feature. This lets you translate local video files using the cloud API for faster, higher-quality results. The desktop app produces an MKV file with an embedded English subtitle track that you can turn on or off in your player.
What if I run out of API credits?
Translation will fail with an error. Simply add more credits on the billing page, or switch back to free local processing.
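The guide describes switching back to local processing by hand. If you wrap both modes in your own script, the same fallback can be automated. A minimal sketch, assuming you supply two functions of your own (`cloud` and `local` are placeholders, not VoiceToSub APIs):

```python
from typing import Callable


def translate_with_fallback(
    path: str,
    cloud: Callable[[str], str],
    local: Callable[[str], str],
) -> tuple[str, str]:
    """Try cloud translation first; fall back to local processing on any error.

    Returns a (mode, subtitles) pair so the caller knows which path was used.
    """
    try:
        return "cloud", cloud(path)
    except Exception as err:  # e.g. exhausted credits or a network failure
        print(f"Cloud translation failed ({err}); falling back to local.")
        return "local", local(path)
```

Catching every exception is deliberately broad for a sketch; a real script would likely distinguish billing errors from transient network problems.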
Can I use a different API provider?
Currently, VoiceToSub supports only OpenAI's Whisper API for cloud mode. Since VoiceToSub is open source, you can modify the server to use any speech-to-text provider you prefer.
Is the API key stored securely?
Your API key is stored locally in ~/Library/Application Support/VoiceToSub on your machine. It is sent directly to OpenAI and never touches any VoiceToSub servers.
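Because the key lives in a local directory, its safety depends on who can read files there. The guide doesn't say how the app sets file permissions, so the following is only a sketch of how you could check them yourself; `group_or_world_readable` is a hypothetical helper:

```python
import stat
from pathlib import Path


def group_or_world_readable(directory: Path) -> list[Path]:
    """Return files under `directory` that group members or other users can read."""
    exposed = []
    for path in sorted(directory.rglob("*")):
        if path.is_file() and path.stat().st_mode & (stat.S_IRGRP | stat.S_IROTH):
            exposed.append(path)
    return exposed


if __name__ == "__main__":
    # Directory named in this guide; the files inside it are not documented here.
    app_dir = Path.home() / "Library" / "Application Support" / "VoiceToSub"
    if app_dir.is_dir():
        for path in group_or_world_readable(app_dir):
            print(f"readable by other users: {path}")
```

If a file is listed, `chmod 600` on that file restricts it to your own user account.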