New Method Could Rein in Overconfident AI Models That Are Giving Wrong Answers

Large language models are used for a range of tasks, from pinpointing financial fraud to translating articles. Despite their capabilities, they sometimes generate inaccurate results. They can also be underconfident about correct answers or overconfident about wrong ones, which makes it hard for a user to know when to trust a model's answers.

A well-calibrated model shouldn't be very confident about an incorrect response, nor should it doubt a correct one. To ensure a machine-learning model's confidence level matches its accuracy, researchers often calibrate it. However, because large language models are applied to such a wide variety of tasks, conventional calibration techniques, which are typically designed around a single task, are ineffective.
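To make the idea of confidence matching accuracy concrete, here is a minimal Python sketch of expected calibration error, one common way to score the gap between the two. It is offered only as an illustration; the article does not say which metric the researchers used, and the function name, bin count and sample data are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Average gap between stated confidence and actual accuracy.

    Predictions are grouped into equal-width confidence bins, and each
    bin's gap is weighted by the share of predictions that fall in it.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical data: four answers, two of them right.
outcomes = [True, False, False, True]
overconfident = expected_calibration_error([0.95] * 4, outcomes)
calibrated = expected_calibration_error([0.5] * 4, outcomes)
print(f"overconfident model gap: {overconfident:.2f}")
print(f"calibrated model gap:    {calibrated:.2f}")
```

A model that claims 95% confidence but is right only half the time shows a large gap, while one that claims 50% confidence on the same answers shows none.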

Recently, scientists from the MIT-IBM Watson AI Lab developed a new calibration technique for large language models. Their technique, dubbed Thermometer, uses a smaller auxiliary model that runs on top of a large language model to calibrate it.

Unlike other methods, this approach requires less power-hungry computation while preserving the model's accuracy and enabling it to produce calibrated responses even on tasks it hasn't performed before. By making calibration efficient across different tasks, the technique may help users identify situations where a model is overconfident about false predictions, which could keep them from deploying that model in a situation where it may fail.

With their new technique, the scientists used temperature scaling to calibrate a large language model efficiently for a new task. In this context, temperature is a scaling parameter used to adjust a model's confidence so that it aligns with the accuracy of its predictions.
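As a rough illustration of temperature scaling in general, the Python sketch below divides a model's raw scores (logits) by a temperature before converting them to probabilities. The scores and temperature values are hypothetical, and this is not the researchers' code: Thermometer's contribution is predicting a suitable temperature, which this sketch does not attempt.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 1.0, 0.5]  # hypothetical scores for three answer choices
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: confidence in top answer = {max(probs):.3f}")
```

A temperature above 1 spreads probability across the answers and lowers the model's stated confidence, while a temperature below 1 sharpens it.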

In addition, the scientists trained an auxiliary model that runs on top of a large language model to automatically predict the temperature needed to calibrate it for a new task. According to the scientists involved, Thermometer requires little access to the inner workings of a model to predict the right temperature for a given task. They also determined that the method didn't need multiple training runs and slowed the model down only slightly.

Additionally, since temperature scaling doesn't change the predictions a model makes, only how confident it is in them, Thermometer preserves the model's accuracy.
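The reason accuracy is preserved can be checked in a few lines: dividing every score by the same positive temperature never changes which answer ranks first. The Python sketch below uses hypothetical scores and helper names, and demonstrates only this general property of temperature scaling.

```python
import math


def softmax(logits, temperature=1.0):
    """Probabilities from raw scores, scaled by a positive temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def predicted_class(probs):
    """Index of the answer the model considers most likely."""
    return max(range(len(probs)), key=probs.__getitem__)


logits = [2.0, 3.5, -1.0]  # hypothetical scores for three answer choices
baseline = predicted_class(softmax(logits))
for t in (0.25, 1.0, 4.0):
    probs = softmax(logits, t)
    # The chosen answer stays the same; only the confidence moves.
    assert predicted_class(probs) == baseline
    print(f"T={t}: answer={predicted_class(probs)}, confidence={max(probs):.3f}")
```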

The scientists are now focused on adapting their technique to even larger models and more challenging text-generation tasks. Their findings were recently presented at the International Conference on Machine Learning.

The study, led by Maohao Shen of MIT's Department of Electrical Engineering and Computer Science, was partly funded by the MIT-IBM Watson AI Lab.

With AI hardware makers such as NVIDIA Corp. (NASDAQ: NVDA) developing more advanced chips to power the AI revolution, it is likely to become easier to equip large language models with systems that check the model's confidence levels as it answers users' queries.

