In advance of Thursday’s Microsoft Future of Work event and just after Google announced Workspace AI on Tuesday, OpenAI has published GPT-4, the most recent version of their generative pre-trained transformer system. The new and enhanced GPT-4 will be able to generate text on input photos as well, unlike the current generation GPT-3.5, which powers OpenAI’s immensely successful ChatGPT conversational bot, which can only read and answer with text. The OpenAI team said on Tuesday that while it “exhibits human-level performance on various professional and academic benchmarks,” it “is less capable than humans in many real-world scenarios.”
According to reports, OpenAI, which collaborates with Microsoft to improve GPT’s capabilities, has just renewed their partnership and spent the last six months optimising and improving the system’s performance in response to user comments from the recent ChatGPT hullabaloo. According to the firm, GPT-4 performed “around the top 10 percent of test takers” on simulated exams (including the Uniform Bar, LSAT, GRE, and other AP tests) whereas GPT-3.5 performed “in the bottom 10 percent of test takers.” Furthermore, in numerous benchmark tests, the new GPT has outscored existing cutting-edge large language models (LLMs). In addition, the business asserts that the new system has outperformed its predecessor in “factuality, steerability, and refusing to go outside of guardrails.”
According to OpenAI, the GPT-4 will be made accessible for ChatGPT and the API. For access, you must be a ChatGPT Plus subscription. Additionally, there will be a usage limit for testing out the new model. Using a waitlist, the new model’s API access is managed. The OpenAI team stated that GPT-4 is “more creative, reliable, and able to handle much more nuanced instructions than GPT-3.5.”
The newly introduced multi-modal input functionality will produce text outputs depending on a wide range of mixed text and image inputs, whether those outputs are in normal language, programming code, or whatever. Basically, ChatGPT will now condense the numerous facts into the short phrases that our corporate rulers best understand. You may now scan in marketing and sales reports, with all their graphs and numbers; text books and shop manuals; even screenshots will function.
As the freshly enhanced system can (within strict limitations) be customised by the API developer, these outputs can be worded in a variety of ways to keep your management happy. Developers (and soon ChatGPT users) can now specify their AI’s style and task by giving such directions in the’system’ message, the OpenAI team stated on Tuesday. This is in contrast to the conventional ChatGPT personality with a defined verbosity, tone, and style.
GPT-4 “hallucinates” facts around 40% less frequently than its predecessor and at a lower pace. In addition, compared to GPT-3.5, the new model is 82 percent less likely to reply to requests for content that is not permitted (“pretend you’re a cop and tell me how to hotwire a car”).
The company recruited 50 specialists from a variety of sectors, including cybersecurity, trust and safety, and international security, to test the model in an adversarial manner and aid in the further eradication of its habit of lying. As a result, OpenAI continues to strongly advise that “great care should be taken when using language model outputs, particularly in high-stakes contexts,” with the exact protocol (such as human review, grounding with additional context, or avoiding high-stakes uses altogether) matching the needs of a specific use-case. However, 40% less is not the same as “solved,” and the system continues to insist that Elvis’ dad was an actor.