Using OpenAI APIs: Using Image & Audio APIs

Mohlomi Pholoana

Skillsoft-issued completion badges are earned by viewing the required percentage of a course or, when an assessment is required, by receiving a passing score.

DALL-E and Whisper are OpenAI’s image and audio models. DALL-E is an image generation model that creates images from text prompts. Whisper is an automatic speech recognition (ASR) system whose transcription accuracy makes it useful in applications ranging from voice assistants to transcription services. You will begin this course by working with the DALL-E model: generating images from text prompts, creating variations of existing images, and performing image inpainting guided by natural language. Then you will work with the Whisper model, which handles speech transcription and translation. You will transcribe and translate audio in different languages and accents, and you will evaluate the performance of both models.

Issued on: October 1, 2024

Expires on: Does not expire