Data source query

Hi,

I have an odd problem with a Data Source. I uploaded a full PDF technical manual, and it seems that it can’t focus on the query and gives irrelevant responses. I even tried a small one-page Word document, but when I asked about “Print ribbon problem” (Error 109), I got the entire page back. See image below. Data source id - 9707ce80-17af-4e6b-b283-892abffaf9ed

Hi @Eransi, welcome to the community!

How Data Sources work:
When you upload a file to the Data Sources, the text gets extracted and split into chunks. Short documents can end up as a single chunk, so when your query matches it, the whole page comes back.
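To make the single-chunk effect concrete, here is a minimal sketch of paragraph-based chunking. It is not MindStudio's actual splitter: the function name `split_into_chunks`, the `max_chars` limit, and the sample text are all assumptions made for illustration.

```python
def split_into_chunks(text: str, max_chars: int = 1000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    Hypothetical helper: the real chunk size and splitting rules used by
    Data Sources are not known here.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        # A paragraph longer than max_chars is kept whole as its own chunk.
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


# Made-up one-page document with three short entries.
one_page = (
    "Error 108: Ribbon out.\n\n"
    "Error 109: Print ribbon problem.\n\n"
    "Error 110: Cover open."
)

print(len(split_into_chunks(one_page)))                # whole page fits in one chunk
print(len(split_into_chunks(one_page, max_chars=40)))  # smaller limit splits it up
```

With the default limit the whole page fits in one chunk, so any matching query returns all of it. Only a smaller limit (or a longer document) produces separate chunks that can be returned individually.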

The Query Data Source block only retrieves chunks and it doesn’t generate an answer on its own. It’s best to pair it with a Generate Text block and prompt the AI to answer the user’s specific question using only the retrieved content. Your workflow would then look like this:

  1. User enters a query
  2. Query Data Source block retrieves the most relevant chunks
  3. Generate Text block uses those chunks as context to answer the user’s question
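The three steps above can be sketched in plain Python. In MindStudio these are blocks rather than code, so everything here is a hypothetical stand-in: `retrieve` ranks chunks by simple word overlap (not the semantic search the real block uses), and `build_prompt` shows the kind of instruction you might give the Generate Text block.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Stand-in for step 2: rank chunks by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_words & tokenize(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, retrieved: list[str]) -> str:
    """Stand-in for step 3: wrap the retrieved chunks in an answer-only-from-context prompt."""
    context = "\n---\n".join(retrieved)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Made-up chunks for illustration.
chunks = [
    "Error 109: Print ribbon problem. Replace the ribbon cartridge.",
    "Error 200: Paper jam. Open the rear cover.",
    "Cleaning the print head.",
]
query = "Print ribbon problem"               # step 1: user enters a query
top_chunks = retrieve(query, chunks, top_k=1)  # step 2: retrieve relevant chunks
print(build_prompt(query, top_chunks))         # step 3: prompt for the model
```

The point of the prompt wording is that the model answers the specific question from the retrieved text instead of echoing the whole chunk back.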

The idea behind Data Sources is to find the most relevant pieces of text from your own files so the LLM replies are based on your data, rather than its own knowledge. You can learn more about Data Sources here:

And here’s a template you can copy to explore the setup:
https://app.mindstudio.ai/agents/data-source-query-template-2afad181/remix

Can you share a bit more about what you are building?

Thanks. Your reply helped me. Using the tutorial I understood that the problem was in the way I processed the results. Now it’s OK.

I am trying to create a smart site that stores many different manuals (same product, different sources) and answers users’ questions.

Q:

  1. Is there a way to also get the figures in the manual and combine them within the DS answer?

  2. I noticed that the results are incomplete. I asked for all error messages but got only some. Is there a way to solve this?

Hi @Eransi,

Data Sources only extract text from uploaded files, so figures and images in your manuals won’t be included in the results.

The Query Data Source block uses semantic search, which means it looks for content that’s similar in meaning to your query, rather than scanning for every mention of a specific word or phrase. Asking for “all error messages” won’t pull every instance the way a keyword search would.
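For contrast, an exhaustive listing is closer to a pattern scan over the full text than to a similarity search. The sketch below is not a MindStudio feature: it assumes you have the manual's extracted text available outside the Data Source and that codes appear in the form "Error 109". The function name and the sample text are made up.

```python
import re

# Assumed format: the word "Error" followed by a numeric code, e.g. "Error 109".
ERROR_CODE = re.compile(r"\bError\s+(\d+)\b")


def list_all_error_codes(full_text: str) -> list[int]:
    """Return every distinct error code mentioned anywhere in the text, sorted."""
    return sorted({int(code) for code in ERROR_CODE.findall(full_text)})


# Made-up manual excerpt for illustration.
manual_text = (
    "Error 109: Print ribbon problem.\n"
    "See also Error 12.\n"
    "Error 109 may follow Error 200."
)
print(list_all_error_codes(manual_text))
```

A scan like this visits every line, which is why it can be complete. A semantic query returns only the top-ranked chunks, so codes that live in lower-ranked chunks are left out.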

Could you share a bit more about what a typical user query would look like? Would a user enter an error code to have the AI walk them through troubleshooting steps from the manual?

Thanks again.

  1. Following your reply, is there a way to get images as well?
  2. Actually, I am just playing around, and right now Gemini NotebookLM solves that issue perfectly: when I ask it to list all codes, it does. I don’t know if that’s a relevant question. For now I am going to stick with Gemini.

Hi @Eransi,

Unfortunately, images aren’t supported in Data Sources; it’s text only for now.

If NotebookLM fits your use case, that’s perfect. Feel free to explore workflow automation use cases in MindStudio and good luck with the project!