Gemma 4 for Data Analysis: Can It Replace ChatGPT for Spreadsheet Work?

Coding Liquids blog cover featuring Sagnik Bhattacharya for Gemma 4 vs ChatGPT for Data Analysis and Spreadsheet Work, with AI model comparison visuals and Excel grids.

Every time Google releases a new open-weight model, the same question lands in my inbox from students and workshop participants: "Can I use this instead of paying for ChatGPT?" With Gemma 4, that question deserves a genuinely thorough answer — because for the first time, a free, locally-run model is good enough at spreadsheet tasks to make the comparison meaningful.

I have spent the past two weeks putting Gemma 4 and ChatGPT through a structured set of real spreadsheet challenges — the same kinds of tasks I encounter in my corporate Excel training sessions. Not synthetic benchmarks, but actual messy data, actual broken formulas, and actual VBA requirements from real projects. Here is what I found.

The Setup: How I Tested Both Models

For consistency, I ran Gemma 4 (27B parameter version) locally through Ollama on my machine and tested ChatGPT using GPT-4o through the web interface. Both models received identical prompts, word for word. I used the same five spreadsheet scenarios I regularly use in my Excel + AI training workshops, so the tasks reflect genuine business use cases rather than contrived examples.

Each test was scored on three criteria: accuracy of the output (did the formula or code actually work?), quality of the explanation (did it help the user understand what was happening?), and handling of edge cases (did it account for blanks, errors, or unusual data?). Let me walk through each one.

Test 1: Data Cleaning — Messy CSV With Inconsistent Dates and Mixed Formats

The scenario: a 2,000-row CSV export from a legacy CRM system. Dates appeared in at least four formats — dd/mm/yyyy, mm-dd-yyyy, yyyy.mm.dd, and plain text like "15 March 2025". Phone numbers mixed country codes with local formats. Product names had random capitalisation and trailing spaces. This is exactly the kind of file I see in every data cleaning workshop I run.

I gave both models the same prompt: "I have a CSV with a Date column that mixes dd/mm/yyyy, mm-dd-yyyy, yyyy.mm.dd, and written dates like '15 March 2025'. I need a single formula approach to normalise everything to dd/mm/yyyy in Excel. Also suggest a strategy for cleaning phone numbers that mix +91-XXXXXXXXXX with 0XX-XXXXXXXX formats and product names with inconsistent capitalisation."

Gemma 4's response was solid. It suggested a nested approach using DATEVALUE combined with TEXT and SUBSTITUTE, and correctly identified that written dates like "15 March 2025" would need a separate handling path. It recommended a helper column strategy — parse each format with IFERROR wrappers and combine them. For phone numbers, it suggested SUBSTITUTE chains to remove hyphens and spaces, then RIGHT to extract the last 10 digits. For product names, it correctly recommended PROPER(TRIM()).
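The helper-column idea described above (try each format in turn and fall through on failure) can be sketched outside Excel as well. This is a minimal Python illustration of that strategy, not the formula either model produced. The function names are hypothetical, and it assumes the four date formats named in the prompt and English month names.

```python
import re
from datetime import datetime

# Candidate formats, tried in order. This mirrors a chain of IFERROR wrappers:
# dd/mm/yyyy, mm-dd-yyyy, yyyy.mm.dd, and written dates like "15 March 2025".
DATE_FORMATS = ("%d/%m/%Y", "%m-%d-%Y", "%Y.%m.%d", "%d %B %Y")


def normalise_date(raw):
    """Return the date as dd/mm/yyyy, or None if no known format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%d/%m/%Y")
        except ValueError:
            continue  # same role as IFERROR: fall through to the next format
    return None


def clean_phone(raw):
    """Strip every non-digit, then keep the last 10 digits (like RIGHT(..., 10))."""
    digits = re.sub(r"\D", "", raw)
    return digits[-10:] if len(digits) >= 10 else None


def clean_product(raw):
    """Rough equivalent of PROPER(TRIM(...)): collapse spaces, then title-case."""
    return " ".join(raw.split()).title()


print(normalise_date("03-15-2025"))
print(clean_phone("+91-9876543210"))
print(clean_product("  wireless   MOUSE "))
```

The sketch only works because each format in this file uses a different separator. If a file mixed dd/mm/yyyy and mm/dd/yyyy with the same separator, no formula or script could reliably tell 03/04/2025 apart without extra context.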

ChatGPT's response was more polished. It provided a single nested formula using LET to define intermediate variables, making the formula more readable. It also proactively suggested a Power Query approach as an alternative, which Gemma 4 did not mention. For the phone number cleaning, ChatGPT additionally flagged that some Indian mobile numbers starting with certain digits might be misinterpreted and suggested a validation step.

Verdict: ChatGPT wins this round, but not by a landslide. Gemma 4's approach was functional and correct — it would work in production. ChatGPT's answer was more comprehensive, better structured, and showed more awareness of real-world edge cases. If you are already comfortable with data cleaning strategies, Gemma 4 gives you enough to work with.

Test 2: Pivot Table Logic — Designing and Building Pivot Table Formulas

The scenario: a sales dataset with columns for Region, Salesperson, Product Category, Quarter, and Revenue. I asked each model to suggest an appropriate pivot table structure and then provide equivalent formulas for users who need a formula-based approach (common when the data source updates frequently and the summary should recalculate automatically instead of waiting for a pivot table refresh).

I prompted both models: "I have a sales table with Region (North/South/East/West), Salesperson (names), Product Category (Electronics/Furniture/Software), Quarter (Q1-Q4), and Revenue. Suggest a pivot table layout to analyse revenue by region and category, then give me the SUMIFS formulas to replicate this as a formula-based summary table."

Gemma 4 suggested a clean two-dimensional layout with regions as rows and categories as columns, which is exactly what most analysts would want. The SUMIFS formulas it generated were correct, with proper absolute and relative references. It also suggested adding a Grand Total row and column using SUM.
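To make the layout concrete, here is a small Python sketch of the same region-by-category grid, where each cell plays the role of one SUMIFS formula and the totals play the role of SUM. The function name and the sample rows are my own illustrative assumptions, not data from the test.

```python
from collections import defaultdict


def revenue_summary(rows):
    """Build a region x category revenue grid with grand totals.

    Each cell is the analogue of
    SUMIFS(Revenue, Region, <row label>, Product Category, <column label>).
    """
    grid = defaultdict(lambda: defaultdict(float))
    for row in rows:
        grid[row["Region"]][row["Product Category"]] += row["Revenue"]

    regions = sorted(grid)
    categories = sorted({c for cells in grid.values() for c in cells})

    table = {}
    for region in regions:
        cells = {c: grid[region].get(c, 0.0) for c in categories}
        cells["Grand Total"] = sum(cells.values())  # row total, like SUM across the row
        table[region] = cells

    # Grand Total row: sum each column across all regions.
    table["Grand Total"] = {
        col: sum(table[r][col] for r in regions)
        for col in categories + ["Grand Total"]
    }
    return table


# Made-up sample rows, for illustration only.
sample_rows = [
    {"Region": "North", "Product Category": "Electronics", "Quarter": "Q1", "Revenue": 1200.0},
    {"Region": "North", "Product Category": "Electronics", "Quarter": "Q2", "Revenue": 800.0},
    {"Region": "North", "Product Category": "Software", "Quarter": "Q1", "Revenue": 500.0},
    {"Region": "South", "Product Category": "Furniture", "Quarter": "Q1", "Revenue": 700.0},
    {"Region": "South", "Product Category": "Electronics", "Quarter": "Q3", "Revenue": 300.0},
]

for label, cells in revenue_summary(sample_rows).items():
    print(label, cells)
```

Quarter is ignored here because the prompt asked for revenue by region and category only. Adding it would mean one more criteria pair in SUMIFS, or one more key in the grid.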

ChatGPT produced the same layout but went further — it suggested using GETPIVOTDATA for users who prefer actual pivot tables, offered a SUMPRODUCT alternative for older Excel versions, and included conditional formatting suggestions to highlight the highest revenue cells. It also recommended a slicer-based approach for interactive dashboards.

Verdict: ChatGPT wins again, primarily on depth and the extra suggestions that a less experienced user would find valuable. Gemma 4's core answer was accurate and usable — the SUMIFS formulas worked perfectly. The gap here is in the "what else should I consider" territory rather than correctness.

Test 3: VBA Macro Generation — Multi-Sheet Consolidation

This is where I expected ChatGPT to pull ahead significantly, and it did — but Gemma 4 surprised me. The task: write a VBA macro that loops through all worksheets in a workbook (except a "Summary" sheet), copies data from a consistent range (A2 to the last row in column D) on each sheet, and pastes it sequentially onto the Summary sheet with a source sheet name column added.

Gemma 4 produced a working macro. It correctly used For Each ws In ThisWorkbook.Worksheets, included the If ws.Name <> "Summary" check, found the last row using Cells(Rows.Count, 1).End(xlUp).Row, and appended data to the Summary sheet. The source sheet name was added in column E. The code ran without errors on my test workbook.
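For readers who want to see the consolidation logic without opening the VBA editor, the same steps can be sketched in Python over in-memory sheets. This is not the macro Gemma 4 wrote. It is a rough model of the behaviour described above, and the workbook structure (a dict of sheet name to rows, with a header in the first row) is an assumption made for the example.

```python
def consolidate(workbook, summary_name="Summary"):
    """Stack columns A:D from every sheet except the summary sheet.

    `workbook` maps sheet name -> list of rows. Row index 0 is the header
    (row 1 in Excel), so data starts at index 1 (A2 in Excel).
    The source sheet name is appended as a fifth value (column E).
    """
    combined = []
    for sheet_name, rows in workbook.items():
        if sheet_name == summary_name:
            continue  # same role as: If ws.Name <> "Summary"

        data = rows[1:]
        # Find the last row with a value in column A,
        # like Cells(Rows.Count, 1).End(xlUp).Row.
        last = 0
        for i, row in enumerate(data, start=1):
            if row and row[0] not in ("", None):
                last = i

        for row in data[:last]:
            cells = (list(row) + [""] * 4)[:4]  # pad or trim to exactly A:D
            combined.append(cells + [sheet_name])
    return combined


# Hypothetical workbook, for illustration only.
workbook = {
    "Summary": [["Date", "Item", "Qty", "Price", "Source"]],
    "Jan": [["Date", "Item", "Qty", "Price"],
            ["01/01/2025", "Mouse", 2, 15.0],
            ["02/01/2025", "Desk", 1, 120.0]],
    "Feb": [["Date", "Item", "Qty", "Price"],
            ["03/02/2025", "Chair", 4, 60.0, "extra column ignored"],
            ["", "", "", ""]],
}

for row in consolidate(workbook):
    print(row)
```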

ChatGPT produced a more robust version. It included error handling with On Error Resume Next guards around sheet operations, added a confirmation message box at the end showing how many rows were consolidated, cleared the Summary sheet before writing (with a user prompt to confirm), and added comments explaining each section. It also suggested an Application.ScreenUpdating = False wrapper for performance.

Verdict: ChatGPT wins on production quality. Gemma 4's macro worked, which is genuinely impressive for a free local model. But ChatGPT's version is what you would actually want to deploy in a business setting — with error handling, user feedback, and performance optimisation. For anyone learning VBA through AI-assisted macro generation, both models are useful starting points.

Test 4: Chart and Visualisation Recommendations

I described a dataset to both models: monthly revenue and customer count for four product lines over two years, with a goal of presenting trends to a non-technical executive audience. I asked each model what chart types to use and how to structure the visualisation.

Gemma 4 recommended a line chart for revenue trends over time (one line per product), a clustered bar chart for comparing product lines by quarter, and a combo chart (line + bar) to show revenue against customer count on dual axes. Solid, conventional recommendations.

ChatGPT gave the same three recommendations but added a sparkline suggestion for an executive summary table, recommended a waterfall chart to show year-over-year revenue changes, and provided specific formatting advice — colour palette suggestions, axis label formatting, and a note about avoiding 3D charts for executive presentations. It also suggested a dashboard layout with the charts arranged in a logical reading order.

Verdict: ChatGPT wins on presentation awareness. If you are already experienced with charts and visualisations, Gemma 4's recommendations are perfectly adequate. ChatGPT's advantage is in the design and communication layer — the kind of advice that makes the difference between a chart that is technically correct and one that actually communicates effectively to stakeholders.

Test 5: Formula Debugging — Diagnosing a Broken Formula

I pasted this deliberately broken formula into both models and asked them to identify and fix all the issues:

=IFERROR(VLOOKUP(A2,Sheet2!B:F,5,TRUE),"Not Found")+IF(C2>"100",D2*0.1,D2*0.05)

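There are multiple problems here: the VLOOKUP match type should probably be FALSE for an exact match; the IF condition compares C2 to the text string "100" instead of the number 100 (Excel ranks any text above any number, so a numeric C2 never passes the test); and the IFERROR wraps only the VLOOKUP, so when the lookup fails the formula tries to add the text "Not Found" to a number and returns #VALUE! instead of the fallback.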
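To show what the corrected logic should do, here is a short Python model of the three fixes. It is a sketch under my own assumptions rather than either model's output: the function name is hypothetical, and the lookup table is a plain dict standing in for the key column and the fifth column of Sheet2!B:F.

```python
def lookup_plus_commission(key, lookup_table, c2, d2):
    """Model of a corrected version of the broken formula.

    lookup_table: dict mapping the lookup key to the value VLOOKUP would return.
    c2: the number compared against 100. d2: the amount the rate applies to.
    """
    # Fix 1: exact match only (VLOOKUP with FALSE). A missing key is a miss,
    # never a silently accepted near neighbour.
    if key not in lookup_table:
        # Fix 3: the fallback covers the whole expression, so the text
        # "Not Found" is never added to a number.
        return "Not Found"

    # Fix 2: compare against the number 100, not the text "100".
    rate = 0.1 if c2 > 100 else 0.05
    return lookup_table[key] + d2 * rate


prices = {"A-1": 500.0}  # hypothetical lookup data
print(lookup_plus_commission("A-1", prices, 150, 200.0))
print(lookup_plus_commission("Z-9", prices, 150, 200.0))
```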

Gemma 4 caught two of the three issues. It correctly identified the TRUE/FALSE match type problem and the text-vs-number comparison in the IF statement. It missed the incomplete IFERROR coverage.

ChatGPT caught all three issues. It rewrote the formula with the IFERROR wrapping the entire expression, changed TRUE to FALSE, removed the quotes around 100, and additionally suggested using XLOOKUP as a modern replacement if the user was on Microsoft 365. It also explained why each change was necessary.

Verdict: ChatGPT wins clearly here. Formula debugging requires the model to reason about multiple interacting parts of an expression, and ChatGPT's deeper analytical capability shows. That said, Gemma 4 catching two out of three issues is still genuinely useful — for many users, those two fixes alone would have resolved their problem. For more complex formula debugging, see my guide on advanced Excel formulas.

Results Comparison Table

| Task | Gemma 4 | ChatGPT | Winner | Notes |
|---|---|---|---|---|
| Data cleaning (messy CSV) | Good: functional formulas, correct approach | Very good: LET formula, Power Query suggestion | ChatGPT | Gemma 4 workable for experienced users |
| Pivot table logic | Good: accurate SUMIFS, clean layout | Very good: added GETPIVOTDATA, conditional formatting | ChatGPT | Core formulas identical in quality |
| VBA macro generation | Good: working macro, basic structure | Excellent: error handling, user feedback, performance | ChatGPT | Gemma 4 output runs correctly as-is |
| Chart recommendations | Good: solid standard recommendations | Very good: design advice, dashboard layout | ChatGPT | Gemma 4 sufficient for experienced analysts |
| Formula debugging | Adequate: caught 2 of 3 issues | Excellent: caught all 3, offered modern alternative | ChatGPT | Multi-layer reasoning favours ChatGPT |
| Privacy and data control | Excellent: fully local, no data leaves your machine | Adequate: cloud-based, data sent to OpenAI | Gemma 4 | Critical for sensitive business data |
| Cost | Free | Free tier limited; Plus costs $20/month | Gemma 4 | No subscription, no usage caps |
| Speed (after initial load) | Fast on GPU, slower on CPU | Consistently fast | Tie | Depends on local hardware |

When Gemma 4 Is the Better Choice for Spreadsheet AI

Despite ChatGPT winning on raw output quality across most tests, there are clear scenarios where Gemma 4 is the smarter choice:

  • Sensitive data — If you are working with financial records, employee data, client information, or anything governed by data protection policies, Gemma 4 running locally means your data never leaves your machine. This is not a minor advantage — it is often a hard requirement in enterprise settings.
  • High-volume routine tasks — If you need to generate 50 similar formulas in a session, ChatGPT's rate limits on the free tier become a real friction point. Gemma 4 has no usage caps.
  • Offline work — Gemma 4 works without an internet connection. If you travel, work on secure networks, or simply have unreliable connectivity, it is always available.
  • Budget-conscious teams — For a team of 10 analysts, ChatGPT Plus would cost $200/month. Gemma 4 costs nothing after the initial hardware investment. Over a year, that is $2,400 saved.
  • Learning and experimentation — When you are exploring prompts and iterating rapidly, having no cost per query encourages experimentation. Students in my courses often learn faster when they are not worried about burning through API credits.
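The budget arithmetic above is simple enough to check in a few lines. The sketch below reproduces the $2,400 figure and adds a break-even estimate for the hardware investment. The $1,500 hardware cost is a hypothetical placeholder, not a figure from my tests, and the function names are illustrative.

```python
import math


def annual_subscription_cost(seats, monthly_price_usd):
    """Yearly cost of a per-seat monthly subscription."""
    return seats * monthly_price_usd * 12


def months_to_break_even(hardware_cost_usd, seats, monthly_price_usd):
    """Whole months of subscription fees needed to match a one-off hardware cost."""
    return math.ceil(hardware_cost_usd / (seats * monthly_price_usd))


# 10 analysts on a $20/month plan, as in the bullet above.
print(annual_subscription_cost(10, 20))

# Hypothetical $1,500 hardware spend; replace with your own figure.
print(months_to_break_even(1500, 10, 20))
```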

If you want to get started with running Gemma 4 on your own machine, I have written a complete beginner's guide to running Gemma 4 locally that walks through the entire setup process.

When ChatGPT Still Wins

I want to be honest about where ChatGPT remains the stronger tool, because pretending otherwise would not serve you well:

  • Complex multi-step analysis — When a task requires the model to hold a large context and reason through several dependent steps (like designing a complete financial model), ChatGPT's larger context window and deeper reasoning produce noticeably better results.
  • Code Interpreter and file upload — ChatGPT can directly ingest your spreadsheet file, run Python code on it, and return results. Gemma 4 running through Ollama is text-in, text-out — you cannot upload a file to it. This is a significant workflow difference for data analysis.
  • Production-quality VBA — While Gemma 4 writes working macros, ChatGPT consistently produces code with better error handling, user prompts, and edge case coverage. For macros that will be used by others in your organisation, ChatGPT's output requires less manual refinement.
  • Explanation depth — ChatGPT generally provides more thorough explanations of why it chose a particular approach, which is valuable for learning. Gemma 4 tends to be more terse.
  • Cutting-edge Excel features — ChatGPT's training data tends to include more recent Excel feature releases, so it handles newer functions like GROUPBY, PIVOTBY, and PERCENTOF more reliably.

For a broader comparison that includes Claude and other models, see my detailed AI model comparison for 2026.

Practical Recommendations

After running these tests and using both models extensively in my training sessions, here is my practical advice:

  1. Start with Gemma 4 for routine tasks. Formula writing, basic data cleaning logic, simple VBA — Gemma 4 handles these well and costs nothing. Install it through Ollama and keep it running in the background.
  2. Escalate to ChatGPT for complex work. When you hit a task that requires deep multi-step reasoning, file analysis, or production-grade code, switch to ChatGPT. The free tier handles most of these needs; Plus is worth it if you use it daily.
  3. Use both for learning. Ask Gemma 4 for a formula, then ask ChatGPT the same question. Compare the approaches. This is one of the most effective learning strategies I recommend to my students.
  4. Default to Gemma 4 for sensitive data. If you are unsure whether your data should go to a cloud service, the answer is to use the local model. You can always re-prompt ChatGPT with anonymised or sample data if you need its extra capability.
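For the last recommendation, one simple way to prepare sample data for a cloud model is to swap identifying values for stable placeholders first. The sketch below is a minimal illustration with a hypothetical function name and made-up rows. It only pseudonymises one column at a time, so treat it as a starting point and not as a complete anonymisation process.

```python
def pseudonymise(rows, column, prefix):
    """Replace each distinct value in `column` with a stable placeholder.

    Returns the cleaned rows plus the mapping, which stays on your machine
    so you can translate the model's answer back afterwards.
    """
    mapping = {}
    cleaned = []
    for row in rows:
        value = row[column]
        if value not in mapping:
            mapping[value] = f"{prefix}_{len(mapping) + 1}"
        cleaned.append({**row, column: mapping[value]})
    return cleaned, mapping


# Made-up rows, for illustration only.
sales = [
    {"Salesperson": "Asha Rao", "Region": "North", "Revenue": 1200},
    {"Salesperson": "Ben Cole", "Region": "South", "Revenue": 700},
    {"Salesperson": "Asha Rao", "Region": "North", "Revenue": 800},
]

safe_rows, key = pseudonymise(sales, "Salesperson", "Person")
for row in safe_rows:
    print(row)
```

Numeric columns such as Revenue are left untouched here. If the amounts themselves are sensitive, scale or replace them before sharing anything.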

For a head-to-head comparison of how Gemma 4 handles Excel formulas specifically versus GPT-4o and Llama 4, see my detailed formula comparison test. And if you want practical prompts to use with either model, my collection of 60 AI prompts for Excel works well with both Gemma 4 and ChatGPT.

Frequently Asked Questions

Can Gemma 4 actually replace ChatGPT for Excel and spreadsheet work?

For many common spreadsheet tasks — formula writing, data cleaning logic, and basic VBA — Gemma 4 performs surprisingly close to ChatGPT. It handles standard Excel functions well and runs completely free on your own machine. However, ChatGPT still has an edge for very complex multi-step analysis, advanced VBA with error handling, and tasks that benefit from file upload and Code Interpreter. The best approach is to use Gemma 4 for routine spreadsheet AI tasks and switch to ChatGPT when you need its advanced capabilities.

Which Gemma 4 model size should I use for Excel and data analysis tasks?

For spreadsheet formula generation and data cleaning prompts, the 12B parameter version of Gemma 4 running through Ollama is a good balance of speed and quality. If you have a GPU with 16GB or more VRAM, the 27B version will give noticeably better results on complex analytical tasks like pivot table design and VBA macro generation. The smaller 4B version works for simple formula lookups but struggles with multi-step reasoning.

Is my spreadsheet data safe when using Gemma 4 for analysis?

Yes — this is one of Gemma 4's strongest advantages. Because Gemma 4 runs entirely on your local machine through tools like Ollama, your data never leaves your computer. Nothing is sent to external servers. This makes it ideal for sensitive financial data, proprietary business information, or any spreadsheet work where data privacy is a concern. With ChatGPT, your prompts are processed on OpenAI's servers, which may not be acceptable under certain compliance frameworks.

How do I get Gemma 4 to understand my specific spreadsheet layout?

The key is providing clear context in your prompt. Describe your column headers, the type of data in each column, and what row your data starts on. For example: "I have a spreadsheet with columns A (Date, dd/mm/yyyy format), B (Product Name, text), C (Quantity, integer), D (Unit Price, decimal). Data starts at row 2. Headers are in row 1." This level of detail helps Gemma 4 generate accurate formulas and references, just as it would with any cloud-based AI model.
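If you describe layouts often, the context sentence can be generated from a short column list instead of being typed each time. This is a small sketch with a hypothetical helper name. It only builds the text to paste into either model and does not call any API.

```python
def describe_layout(columns, header_row=1, first_data_row=2):
    """Build a layout description for a prompt.

    columns: list of (letter, name, description) tuples, one per column.
    """
    parts = [f"{letter} ({name}, {desc})" for letter, name, desc in columns]
    return (
        "I have a spreadsheet with columns " + ", ".join(parts) + ". "
        f"Data starts at row {first_data_row}. Headers are in row {header_row}."
    )


layout = describe_layout([
    ("A", "Date", "dd/mm/yyyy format"),
    ("B", "Product Name", "text"),
    ("C", "Quantity", "integer"),
    ("D", "Unit Price", "decimal"),
])
print(layout)
```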
