
Google AI Developed a Language Model to Solve Quantitative Reasoning Problems


Jul 14, 2022 2 min read

Google AI has developed Minerva, a deep learning language model that can solve quantitative mathematical problems using step-by-step reasoning.

In the recently published paper on Minerva, researchers describe how the model was developed. They achieved state-of-the-art results by training a large language model on a dataset containing quantitative reasoning with symbolic expressions. The resulting model, Minerva, can solve quantitative mathematical problems on STEM reasoning tasks.

Minerva parses the question using natural language processing and mathematical notation processing techniques. It recalls relevant formulas and constants and produces step-by-step solutions involving numerical calculation. These solutions include symbolic manipulation and numerical computation, with no reliance on an external calculator to obtain the final answers. Because Minerva samples multiple candidate solutions for a problem, each with a different assigned probability, it uses majority voting over the final answers to select its response. The following picture shows a sample of Minerva's output for a quantitative mathematical problem.

Sample of Minerva's answer to a mathematical problem
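The article does not include implementation code, but the majority-voting step described above can be sketched in a few lines of Python. In this sketch, sample_solutions() and extract_final_answer() are hypothetical placeholders for the model's sampling and answer-extraction steps; only the voting logic is shown.

    from collections import Counter

    def majority_vote(candidate_answers):
        # Count how often each final answer appears among the sampled solutions
        # and return the most frequent one.
        counts = Counter(candidate_answers)
        answer, _ = counts.most_common(1)[0]
        return answer

    # Hypothetical usage:
    # answers = [extract_final_answer(s) for s in sample_solutions(question, k=64)]
    # print(majority_vote(answers))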

Minerva was built on the Pathways Language Model (PaLM), a 540-billion-parameter, densely activated, transformer language model, and was further trained on mathematical data such as arXiv papers and web pages containing LaTeX, MathJax, or other mathematical formats. To train the model on symbolic data, the symbolic mathematical notation is preserved in the training dataset rather than being stripped out during preprocessing. This process is shown in the following diagram.

Symbolic mathematical expressions are preserved for training Minerva
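To illustrate the idea of preserving notation, the following minimal Python sketch contrasts a typical cleaning step that discards inline LaTeX with a Minerva-style step that keeps it. The regular expression and sample text are simplified assumptions for illustration, not the actual data pipeline.

    import re

    # Inline LaTeX spans such as $b^2 - 4ac$ (a simplified pattern for illustration).
    MATH_SPAN = re.compile(r"\$[^$]+\$")

    def strip_math(text):
        # Typical web-text cleaning that discards mathematical notation.
        return MATH_SPAN.sub(" ", text)

    def keep_math(text):
        # Minerva-style preprocessing keeps the LaTeX source intact.
        return text

    sample = "The roots are real when the discriminant $b^2 - 4ac$ is non-negative."
    print(strip_math(sample))  # notation lost
    print(keep_math(sample))   # notation preserved for training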

To benchmark Minerva's performance, researchers used STEM benchmarks ranging from grade-school to graduate level: MATH (high-school math competition problems), MMLU-STEM (the portion of the massive multitask language understanding benchmark focused on STEM, covering topics like engineering, chemistry, math, and physics at high-school and college level), and GSM8k (grade-school math problems involving basic arithmetic operations, solvable by a talented middle-school student). Minerva shows significant performance on MATH and MMLU-STEM, as shown in the following graphs:

Minerva's performance
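Accuracy on these benchmarks is typically reported as the fraction of problems whose final answer matches the reference answer. The sketch below is a rough illustration of that scoring, not the published evaluation code; load_problems() and the solve callable are hypothetical placeholders.

    def accuracy(problems, solve):
        # problems: list of (question, reference_answer) pairs;
        # solve: a function returning the model's final answer as a string.
        correct = 0
        for question, reference_answer in problems:
            predicted = solve(question)  # e.g. the majority-voted answer
            if predicted.strip() == reference_answer.strip():
                correct += 1
        return correct / len(problems)

    # Hypothetical usage:
    # problems = load_problems("MATH")
    # print(f"accuracy: {accuracy(problems, my_model_solve):.1%}")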

One important limitation of Minerva is that the model's answers cannot be verified automatically. As stated in the blog post:

Our approach to quantitative reasoning is not grounded in formal mathematics. Minerva parses questions and generates answers using a mix of natural language and LaTeX mathematical expressions, with no explicit underlying mathematical structure. This approach has an important limitation, in that the model’s answers cannot be automatically verified. Even when the final answer is known and can be verified, the model can arrive at a correct final answer using incorrect reasoning steps, which cannot be automatically detected. This limitation is not present in formal methods for theorem proving (e.g., see Coq, Isabelle, HOL, Lean, Metamath, and Mizar).
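For contrast, formal theorem provers such as Lean, mentioned in the quote above, represent statements and proofs in a machine-checkable form, so every reasoning step is verified. The following tiny Lean 4 examples are included purely as an illustration of that style of checking and are unrelated to Minerva's code:

    -- Lean checks every step mechanically; an incorrect proof is rejected.
    example : 2 + 2 = 4 := rfl

    -- A slightly more general statement, closed with a standard library lemma.
    example (n : Nat) : n + 0 = n := Nat.add_zero n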

To promote NLP models for quantitative reasoning, Google AI shared an interactive sample explorer that lets the public explore Minerva's capabilities.

Using natural language processing and deep learning for mathematical reasoning is a challenging research area. There are other papers with source code in this area, such as Graph-to-Tree Learning for Solving Math Word Problems and A Goal-Driven Tree-Structured Neural Model for Math Word Problems. Papers with Code also lists other papers with source code in this domain for further reading.

About the Author

Reza Rahimi

Reza is a senior machine learning engineering manager at Dropbox, leading personalization, consumer targeting, and revenue-related ML products. He has more than 17 years of industry experience in big data management and machine learning applications across different business segments (software as a service, consumer web, cloud services, consumer electronics, and healthcare insurance). He received his Ph.D. in computer science from UC Irvine and has published about 20 books, journal articles, and conference papers (1190+ citations, Erdős number 4) in the areas of machine learning, cloud computing, and distributed systems. He is a Harvard Business Review Advisory Council member, an MIT Technology Review Global Panel member, and a senior member of the IEEE.
