OpenAI claims GPT-4 passed a simulated bar exam with a score around the top 10% of test takers. The chatbot is better not only at doing mental math but also at explaining its reasoning.
The newest model also accepts images as input, and the chatbot can describe what they show. On a livestream on March 14, OpenAI’s president, Greg Brockman, demonstrated this by uploading a photo into ChatGPT. The photo was of a handwritten note describing what he wanted a website to look like. The tool generated code that he could use to build that exact site.
Like previous models, GPT-4 is trained on a vast amount of data culled from the internet. The model was trained to predict the next word in a document using publicly available data and licensed data, according to the company. The vast majority of its data, however, is from before September 2021.
Already, companies are integrating the large language model into their operations. A legal AI firm called Casetext announced that its AI legal assistant, CoCounsel, is powered by GPT-4, and the company claims it has passed both the multiple-choice and written portions of the Uniform Bar Exam. Uses of AI in the legal field range from drafting contracts to summarizing complex laws.
OpenAI, which is based in San Francisco, is also behind the visual art generator DALL-E. Meanwhile, tech giants like Microsoft and Google are developing their own generative AI tools. The latest release from OpenAI could pressure these big companies to catch up.
Despite all the buzz around ChatGPT, the technology still has a slew of problems. OpenAI said GPT-4, like earlier models, “hallucinates” facts and makes reasoning errors, though it said such errors are less frequent than in past models.
ChatGPT, the company noted in a blog post, can be “overly gullible in accepting obvious false answers from a user” and it “can also be confidently wrong in its prediction, not taking care to double-check work when it’s likely to make a mistake.”
To improve the models, OpenAI said it is selecting and filtering pre-training data, among other things. Notably, the company warned that the additional capabilities of GPT-4 could lead to new risks, which require outside expertise to evaluate.
To understand the extent of the risks, the startup said it engaged 50 experts in areas including AI alignment risks, cybersecurity, bio-risk, trust and safety, and international security to poke holes in the model, and is incorporating their feedback and data. The findings are then fed into improvements for the model. For instance, out of this engagement, it has collected data to improve ChatGPT’s ability to refuse requests on how to synthesize dangerous chemicals.