Close on the heels of Google’s own text-to-video announcement, OpenAI, a company known as much for its breakthroughs as for its scandals and leadership drama, introduced a similar model named Sora on Thursday. The announcement has already generated significant buzz worldwide, dividing opinion in less than a day.
An Overview of Sora
Amid the ongoing development of ChatGPT and the internal controversies that grew into major scandals, the OpenAI team appears to have reached a new milestone.
With the buzz around Sam Altman’s recent comments on GPT-5 still fresh, OpenAI introduced Sora, an AI model that converts text to video, and it has stirred the global community much as Google’s recent announcement did.
Like Google’s Lumiere, Sora will initially be available only to a limited audience, according to the announcements. The feature that has most clearly put Sora in the spotlight is its ability to produce videos up to one minute long, as highlighted during its introduction.
These moves are expected to help OpenAI stand out against competitors such as Google and Microsoft in an industry projected to reach $1.3 trillion by 2032.
In this context, OpenAI appears intent on keeping consumer attention on the company by building out AI technology that is both capable and ripe for expansion, rather than settling for ChatGPT alone.
Having made waves with both ChatGPT and Dall-E, OpenAI disclosed the conditions under which Sora will be tested. The company plans to have the model probed from different angles by experts in misinformation, hateful content, and bias, in an effort to uncover problems before any public release.
The company also plans to gather feedback from working artists, designers, and filmmakers, and appears to be preparing further steps in Sora’s development based on that input.
These varied tests are expected to help address the deepfake problems that have grown alongside the use of AI to create images and videos.
Sora’s Strong Points
One of the most significant features highlighted by the company is Sora’s ability to interpret and visualize prompts of up to 135 words.
In the posts OpenAI shared on Thursday, several examples were provided, and CEO Sam Altman went a step further by inviting users to submit their own prompts, which he then turned into videos, a move that demonstrated the team’s confidence in the model.
The influence of Dall-E and ChatGPT on Sora is also being discussed. Dall-E 3, which powerfully converts text into images, arrived in September.
Another notable aspect of Sora is its use of Dall-E 3’s recaptioning technique, which OpenAI says generates “highly descriptive captions” for the visual training data.
OpenAI also stated:
Sora can create complex scenes with multiple characters, specific types of motion, and accurate background details. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.
Meanwhile, the realism of the videos shared by OpenAI and Altman has astonished users. The company also announced that Sora can generate videos from still images and extend existing videos by filling in missing frames.
The statement continued:
Sora lays the groundwork for models that can understand and simulate the real world; we believe this capability will be a significant milestone toward achieving AGI.
Potential Weaknesses of Sora
OpenAI did not shy away from discussing the project’s current weaknesses. The company acknowledged that the current version of Sora struggles to accurately depict the physics of a complex scene and to establish cause-and-effect relationships.
An example given in the statement was as follows:
For instance, a person might take a bite of a cookie, but afterward the cookie may not show a bite mark. The model can also mix up spatial details, confusing left and right.
Another open question is Sora’s release date. The company did not say when the application will become available, tying the timing to the need to take “several important safety steps” first. The statement continued:
Despite comprehensive research and testing, we cannot predict all the beneficial ways people will use our technology or all the ways they might misuse it. Therefore, we believe that learning from real-world usage is a critical component of creating and launching increasingly safe AI systems over time.