OpenAI has developed a text-to-speech platform called Voice Engine, which is available to a limited number of users. It can create a synthetic voice from a 15-second audio clip. On command, the AI-generated voice can read text prompts aloud in the same language as the speaker or in a number of other languages. OpenAI said in a blog post that these small-scale deployments are helping to inform its approach, its safeguards, and its thinking about how Voice Engine could be used for good across various industries.
Companies with access include Age of Learning, an education technology company; HeyGen, a visual storytelling platform; Livox, a maker of AI communication apps; Dimagi, a maker of frontline health software; and the health system Lifespan.
OpenAI says it began developing Voice Engine in late 2022 and that the technology already powers the preset voices in its text-to-speech API as well as ChatGPT's Read Aloud feature. Jeff Harris, a member of OpenAI's product team for Voice Engine, told TechCrunch in an interview that the model was trained on "a mix of licensed and publicly available data." OpenAI told the publication that only about 10 developers would have access to the model.
OpenAI says its partners have agreed to abide by its usage policies, which prohibit using voice generation to impersonate individuals or organizations without their consent. Partners must also obtain "explicit and informed consent" from the original speaker, must not build ways for individual users to create their own voices, and must disclose to listeners that the voices are AI-generated. OpenAI has also added watermarks to the audio clips so it can trace their origin and monitor how the audio is being used.
To limit the risks posed by tools like this one, OpenAI proposed several measures: phasing out voice-based authentication for access to bank accounts, developing policies to protect the use of people's voices in AI, raising public awareness of AI deepfakes, and building tracking systems for AI-generated content.