This Xcode project demonstrates OpenAI's Realtime API with WebRTC (Advanced Voice Mode). It is an iOS application built with SwiftUI, AVFoundation, and the WebRTC package. It supports the full Advanced Voice Mode capability set, including interrupting audio playback, sending text events manually, and configuring options such as the system message, realtime audio model, and voice.
A demo video, `ScreenRecording.mov`, shows the iOS application running on macOS.
## Requirements

- iOS 16.0 or later
- An OpenAI API key
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/PallavAg/VoiceModeWebRTCSwift.git
   ```

2. Set up your API key:
   - Replace the `API_KEY` placeholder in the code with your OpenAI API key: `let API_KEY = "your_openai_api_key"`
   - Alternatively, you can specify the OpenAI API key in the app itself.

3. Run the app:
   - Go to the Signing & Capabilities section in Xcode and select your signing team.
   - Build and run the app on your iOS device, macOS device, or simulator.
## Usage

1. Start a connection:
   - Launch the app and enter your API key in Settings if it is not specified already.
   - Select your preferred AI model and voice, then press "Start Connection" to begin the conversation.

2. Interact:
   - Use the text input field or speak into the microphone to interact with the Realtime API.
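When text is typed rather than spoken, the Realtime API expects the client to send JSON events over the WebRTC data channel. As a rough sketch of what that looks like (the event shapes follow OpenAI's published Realtime API client events; the function name here is illustrative and not taken from this repository):

```swift
import Foundation

// Illustrative sketch: builds the two client events the Realtime API expects
// when the user submits a text message. Serialize each payload to Data and
// send it over the session's RTCDataChannel.
func textMessageEvents(for text: String) throws -> [Data] {
    // `conversation.item.create` appends the user's message to the conversation.
    let itemCreate: [String: Any] = [
        "type": "conversation.item.create",
        "item": [
            "type": "message",
            "role": "user",
            "content": [["type": "input_text", "text": text]]
        ]
    ]
    // A follow-up `response.create` asks the model to respond to the new item.
    let responseCreate: [String: Any] = ["type": "response.create"]
    return try [itemCreate, responseCreate].map {
        try JSONSerialization.data(withJSONObject: $0)
    }
}
```

Spoken input needs no such events: once the microphone track is attached to the peer connection, audio streams to the server continuously.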
## Architecture

- `ContentView`: The primary UI that orchestrates the conversation, input, and connection controls.
- `WebRTCManager`: Handles WebRTC connection setup, data channel communication, and audio processing.
- `OptionsView`: Allows customization of the API key, model, and voice settings.
## Troubleshooting

- Microphone permission: Ensure the app has microphone access in iOS Settings.
- Connection issues: Check that your API key is valid and that the OpenAI servers are reachable from your network.
## License

This project is licensed under the MIT License. See the `LICENSE` file for details.