Show HN: Verify LLM Generated Code with a Spreadsheet

23 points by narush 3 hours ago | 3 comments

Hey HN! Been a minute. We launched Mito here last year (https://news.ycombinator.com/item?id=32723766).

Mito is a spreadsheet that generates Python code as you edit it. We've spent the past three years trying to lower the startup cost to use Python for data work. In doing so, we’ve been thrust into the middle of many Python transition processes at larger enterprises, and we’ve seen up-close how non-technical folks interact with generated code.

The Mito AI chatbot lives inside of the Mito spreadsheet (https://www.trymito.io/). The obvious benefit of this is that you can use the chatbot to transform your data and write a repeatable Python script. The less obvious (but equally important) benefit is that by connecting a spreadsheet and chatbot, Mito helps you understand the impact of your edits and verify LLM generated code. Every time you use the chatbot, Mito highlights the changed data in the spreadsheet. You can see a quick demo here (https://www.tella.tv/video/clibtwssv00000fl65oky13nu/view).

Three main insights shaped our approach to LLM code generation:

# Consumers of generated code don't know enough Python to verify and correct the code

Mito users span the range of Python experience. For new programmers, generating code using LLMs is an easy step one. Ensuring the generated code is correct is the forgotten step two.

In practice, LLMs often generate incorrect code, or code with unexpected side effects. For example, a user prompts an LLM to calculate a total_revenue column from price and quantity columns. The LLM correctly calculates total_revenue = price * quantity but then mistakenly deletes price and quantity.
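To make that failure mode concrete, here is a minimal pandas sketch of what such generated code could look like. The dataframe, its values, and the exact code are made up for illustration; this is not actual LLM or Mito output.

```python
import pandas as pd

# Hypothetical input data (values invented for this example).
df = pd.DataFrame({"price": [2.0, 3.5, 10.0], "quantity": [3, 2, 1]})

# The kind of code an LLM might return for "add a total_revenue column".
generated = df.copy()
generated["total_revenue"] = generated["price"] * generated["quantity"]  # what was asked for
generated = generated.drop(columns=["price", "quantity"])  # side effect nobody asked for

# The calculation is right, but the source columns are gone.
print(list(df.columns))
print(list(generated.columns))
```

Nothing here raises an error, and the new column holds the right numbers, so the dropped columns are easy to miss if you only read the code.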

New programmers find it almost impossible to verify generated code by reading it alone. They need tooling designed for their skillsets.

# Not everyone knows how to use a chat interface for transformations

We were surprised to learn that many Mito users a) had no experience with ChatGPT, and b) didn’t understand the chat interface at all! Mito AI presents users with a few example prompts and an input field. A surprising number of users thought the example prompts were all they could use Mito AI for.

AI chatbots are new. We builders might be using them for natural language interactions, but users are still learning how to use them in new contexts. This stands in stark contrast to spreadsheets, where pretty much every business user has experience. Shout out 40 years of Excel dominance!

# The more context a prompt has about the user’s data + edits, the better the LLM results

For the LLM to generate code that can execute correctly, the prompt should include the names of the dataframes, the column headers, (some) dataframe values, and a few previous edits as examples. Duh.
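As a rough sketch of what assembling that context could look like, the snippet below builds a prompt from dataframe names, column headers, a few sample rows, and recent edits. The function `build_prompt`, its parameters, and the prompt wording are all hypothetical; this is not Mito's actual prompt format.

```python
import pandas as pd

def build_prompt(user_input, dataframes, previous_edits, sample_rows=3, max_edits=3):
    """Assemble an LLM prompt from a short user request plus data context.

    Hypothetical helper: `dataframes` maps a variable name to a DataFrame,
    and `previous_edits` is a list of earlier code snippets, oldest first.
    """
    parts = ["You are editing pandas dataframes. Current data:"]
    for name, df in dataframes.items():
        headers = ", ".join(str(col) for col in df.columns)
        parts.append(f"Dataframe `{name}` has columns: {headers}")
        # Include only the first few rows so the prompt stays short.
        parts.append(df.head(sample_rows).to_csv(index=False).strip())
    if previous_edits:
        parts.append("Recent edits:")
        parts.extend(previous_edits[-max_edits:])  # keep only the latest edits
    parts.append(f"Request: {user_input}")
    return "\n".join(parts)

sales = pd.DataFrame({"price": [2.0, 3.5], "quantity": [3, 2]})
prompt = build_prompt(
    "add a total_revenue column",
    {"sales": sales},
    ["sales['price'] = sales['price'].astype(float)"],
)
print(prompt)
```

The user only types the last line of the prompt; everything above it comes from state the spreadsheet already has.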

But there’s no reason users should be responsible for writing this prompt. No one loves writing long chats, and in practice Mito AI users expect to be able to write ~12 words. Spreadsheets are well-suited to building the rest of the prompt for you - they have all of your data context, and know your recent edits.

With these three insights, it became very clear to us what role a spreadsheet could play in LLM based code-gen: a spreadsheet is the prompt builder, and a spreadsheet is the code verifier.

Mito AI builds an effective prompt by supplementing your input with the context of your data and recent edits.

Mito AI then helps you to verify the LLM generated code by highlighting the added, modified, and removed data within the chat interface - and within the spreadsheet. This way, you can ensure your LLM generated code is correct.
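For a sense of what computing added, modified, and removed data involves, here is a minimal sketch that compares a dataframe before and after an edit. The function `summarize_changes` is hypothetical and assumes unique index labels; it is not how Mito implements its highlighting.

```python
import pandas as pd

def summarize_changes(before, after):
    """Report which columns and rows were added, removed, or modified.

    Hypothetical helper; assumes both dataframes have unique index labels.
    """
    added_cols = [c for c in after.columns if c not in before.columns]
    removed_cols = [c for c in before.columns if c not in after.columns]
    shared_cols = [c for c in before.columns if c in after.columns]
    shared_rows = before.index.intersection(after.index)

    modified_cols = []
    for col in shared_cols:
        old = before.loc[shared_rows, col]
        new = after.loc[shared_rows, col]
        # A cell is unchanged if the values are equal or both are missing.
        unchanged = (old == new) | (old.isna() & new.isna())
        if not unchanged.all():
            modified_cols.append(col)

    return {
        "added_columns": added_cols,
        "removed_columns": removed_cols,
        "modified_columns": modified_cols,
        "added_rows": list(after.index.difference(before.index)),
        "removed_rows": list(before.index.difference(after.index)),
    }

before = pd.DataFrame({"price": [2.0, 3.5], "quantity": [3, 2]})
after = before.copy()
after["total_revenue"] = after["price"] * after["quantity"]
after.loc[0, "price"] = 2.5
after = after.drop(columns=["quantity"])

changes = summarize_changes(before, after)
print(changes)
```

A summary like this is what lets someone who can't read the generated Python still notice that a column disappeared or that existing values changed.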

Give it a spin. Let us know what you think of the recon and how we can make it more helpful!

Also, if you like what we’re doing, we’re hiring – come help us build! (https://www.ycombinator.com/companies/mito/jobs)
