[2301.13779] FLAME: A small language model for spreadsheet formulas
source link: https://arxiv.org/abs/2301.13779
[Submitted on 31 Jan 2023]
FLAME: A small language model for spreadsheet formulas
The widespread use of spreadsheet environments by billions of users presents a unique opportunity for formula-authoring assistance. Although large language models, such as Codex, can assist in general-purpose languages, they are expensive to train and challenging to deploy due to their large model sizes (up to billions of parameters). Moreover, they require hundreds of gigabytes of training data. We present FLAME, a T5-based model trained on Excel formulas that leverages domain insights to achieve competitive performance with a substantially smaller model (60M parameters) and two orders of magnitude less training data. We curate a training dataset using sketch deduplication, introduce an Excel-specific formula tokenizer for our model, and use domain-specific versions of masked span prediction and noisy auto-encoding as pretraining objectives. We evaluate FLAME on formula repair, formula auto-completion, and a novel task called syntax reconstruction. FLAME (60M) can outperform much larger models, such as Codex-Davinci (175B), Codex-Cushman (12B), and CodeT5 (220M), in 6 out of 10 settings.
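To make the "sketch deduplication" idea concrete, here is a minimal sketch in Python. The abstract does not define what a formula sketch is, so this assumes one plausible reading: tokenize the formula, replace literals and cell references with placeholders, and keep only one formula per resulting sketch. The token pattern, the placeholder names, and the functions `sketch` and `dedupe_by_sketch` are hypothetical illustrations, not FLAME's actual tokenizer or data pipeline.

```python
import re

# Hypothetical, simplified token pattern for Excel-like formulas.
# Assumption: real Excel syntax (sheet names, structured references,
# error literals, locale-specific separators) is much richer than this.
TOKEN_RE = re.compile(r'''
    (?P<STRING>"(?:[^"]|"")*")                                   # "text", with "" as an escaped quote
  | (?P<NUMBER>\d+(?:\.\d+)?)                                    # 42 or 3.14
  | (?P<REF>\$?[A-Z]{1,3}\$?\d+(?::\$?[A-Z]{1,3}\$?\d+)?(?![\w(]))  # A1, $B$2, A1:A10
  | (?P<NAME>[A-Za-z_][A-Za-z0-9_.]*)                            # function or defined name
  | (?P<OP><>|<=|>=|[-+*/^&=<>(),:;!%{}])                        # operators and punctuation
''', re.VERBOSE)

# Assumed placeholders: which token kinds are abstracted away is a design choice.
PLACEHOLDER = {"STRING": "<str>", "NUMBER": "<num>", "REF": "<ref>"}


def sketch(formula: str) -> str:
    """Return the formula with literals and references replaced by placeholders.

    Characters the pattern does not recognise (e.g. whitespace) are skipped.
    """
    parts = []
    for match in TOKEN_RE.finditer(formula):
        kind = match.lastgroup
        if kind in PLACEHOLDER:
            parts.append(PLACEHOLDER[kind])
        else:
            # Function names are case-insensitive in Excel, so normalise them.
            parts.append(match.group().upper())
    return "".join(parts)


def dedupe_by_sketch(formulas):
    """Keep the first formula seen for each distinct sketch, preserving order."""
    seen = set()
    kept = []
    for formula in formulas:
        key = sketch(formula)
        if key not in seen:
            seen.add(key)
            kept.append(formula)
    return kept


corpus = [
    "=SUM(A1:A10)+1",
    "=SUM(B2:B20)+5",          # same shape as the first formula
    '=IF(C1>0,"yes","no")',
    "=sum(D1:D3)+2",           # same shape again, different case
]
unique = dedupe_by_sketch(corpus)
```

Under these assumptions, the first, second, and fourth formulas share the sketch `=SUM(<ref>)+<num>`, so only the first is kept. Deduplicating on structure rather than exact text is one way a corpus could shrink without losing distinct formula shapes, which fits the abstract's emphasis on training with far less data.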
Subjects: Programming Languages (cs.PL); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
Cite as: arXiv:2301.13779 [cs.PL] (or arXiv:2301.13779v1 [cs.PL] for this version)
DOI: https://doi.org/10.48550/arXiv.2301.13779