Excel / CSV Calculations

Hi everyone,

I’m building a chatbot-style knowledge base agent.
Part of the internal knowledge and data lives in several small-to-medium Excel sheets.
They are not perfectly structured or strongly typed (mixed types, etc.).

I’m trying to figure out the most reliable approach for performing accurate calculations from these sheets: counts, sums, filtering by conditions, and similar aggregations.

Here’s what I’ve tried so far:

  1. Datasource (Vector DB):
    Works well for text questions and general lookup, but it breaks down when the question involves calculations, especially counting or aggregating. Results are often inaccurate, especially when the relevant rows are spread across more than one chunk in the vector DB.

  2. Analyze CSV:
    Better for structured queries, but the sheets aren’t always perfectly consistent, and I’m not sure it scales well when I have several files.

  3. Hybrid approach:

    • Use Datasource for semantic/textual questions.

    • For mathematical questions:

      1. Convert natural language → SQL-like query.

      2. Run the query against a temporary in-memory DB loaded from the Excel file.
        This seems promising but feels like a lot of plumbing.
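
To make option 3 concrete, here is a minimal sketch of the "run the query against a temporary in-memory DB" step, using only Python's built-in sqlite3 module. The table name, column names, sample rows, and the generated_sql string are all made up for illustration; in a real workflow the rows would come from the Excel file and the SQL would come from the model. The SELECT-only check is a basic guard I'm assuming is needed, not a complete safety layer.

```python
import sqlite3

# Hypothetical rows standing in for one sheet; real data would be read from the Excel file.
ROWS = [
    {"region": "North", "status": "open", "amount": 120.0},
    {"region": "North", "status": "closed", "amount": 80.5},
    {"region": "South", "status": "open", "amount": 200.0},
    {"region": "South", "status": "open", "amount": None},  # blank cell
]


def load_sheet(rows):
    """Load the rows into a fresh in-memory SQLite table named 'sheet'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sheet (region TEXT, status TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sheet VALUES (:region, :status, :amount)", rows
    )
    return conn


def run_readonly_query(conn, sql):
    """Run a single SELECT statement and return all rows; reject anything else."""
    body = sql.strip().rstrip(";")
    if not body.lower().startswith("select") or ";" in body:
        raise ValueError("only a single SELECT statement is allowed")
    return conn.execute(body).fetchall()


conn = load_sheet(ROWS)

# Stand-in for the SQL the model would generate from a question like
# "How many open items are there per region, and what is their total amount?"
generated_sql = (
    "SELECT region, COUNT(*), SUM(amount) FROM sheet "
    "WHERE status = 'open' GROUP BY region ORDER BY region"
)
result = run_readonly_query(conn, generated_sql)
print(result)  # [('North', 1, 120.0), ('South', 2, 200.0)]
```

The appeal for me is that the database does the arithmetic, so the model only has to produce the query. Note that SUM skips the blank amount while COUNT(*) still counts that row, which is the kind of detail I'd want to be explicit about.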

Before I over-engineer this, I’d love to hear from the community:

What is the best practice for handling Excel-based data when you need reliable calculations inside a chatbot workflow?
Should I normalize the Excel sheets first (is that step unavoidable)? Use Analyze CSV more heavily? Combine both?
Or is there a recommended pattern or tool in MindStudio for this use case?
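
For context on the normalization question, this is roughly the kind of cleanup I mean for a mixed-type column. It is only a sketch in plain Python: the to_number helper and the sample values are hypothetical, and it assumes US-style thousands separators (commas) and that anything unparseable should be treated as missing rather than as zero.

```python
def to_number(value):
    """Best-effort conversion of a messy cell to float; returns None if not numeric."""
    if value is None or isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return float(value)
    # Assumption: commas are thousands separators, not decimal marks.
    text = str(value).strip().replace(",", "")
    try:
        return float(text)
    except ValueError:
        return None


# Hypothetical raw column as it might come out of a loosely typed sheet.
raw_amounts = [120, "80.5", " 1,200 ", "n/a", None, ""]

clean_amounts = [to_number(v) for v in raw_amounts]
numeric_only = [v for v in clean_amounts if v is not None]

print(clean_amounts)      # [120.0, 80.5, 1200.0, None, None, None]
print(sum(numeric_only))  # 1400.5
print(len(numeric_only))  # 3
```

Without a step like this, a sum over the raw column would either fail or silently skip the text cells, which is why I suspect some normalization is needed whichever approach I pick.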

Any guidance or examples would be greatly appreciated!

Thanks!