How to provide good answers beyond my data sources

How do I create a chat agent that can answer questions beyond my data sources, yet still aligned with my philosophies?

For example, we teach people living in extreme poverty how to grow their own food using biointensive gardening principles. I have 100,000 words in my data sources, but if someone asks how to control aphids, that’s probably not in the data sources.

Hi @philthrive,

That’s a great question.

Data Sources help an AI ground its responses in your existing data, but if users ask questions beyond that scope, they might not be enough on their own.

Here are a few ways to improve output quality:

1. System Prompt:
Use it to define the Agent’s framework: its role, approach, and guidelines. Since it acts as an overarching prompt, it helps keep LLM behavior consistent across your workflow, including the Chat block.
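For example, a system prompt for your use case might read something like this (a rough sketch only; tune the wording to your program):

```
You are an assistant for a nonprofit that teaches people living in
extreme poverty to grow their own food using biointensive gardening
principles. Always prefer low-cost, organic, locally available
solutions. Ground your answers in the provided data sources whenever
possible, and say so when you are drawing on general knowledge instead.
```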

2. Perplexity models:
Consider using Perplexity models in the Generate Text and Chat blocks, or via the Search Perplexity block. These are currently the only models that can search the web in real time, which helps improve responses with more recent data.

3. Deep Research Agent:
Make a copy of the Deep Research Agent in the Ready to Use section of your workspace to explore how it’s structured and reuse some of its components.

4. Google Search:
Consider adopting a setup like the one below (a code sketch of the same flow follows the list):

  • Use a User Input, a Launch Variable, or a Dynamic Tool in the Chat block to collect the user’s query
  • Add a Generate Text block to refine the query and generate several Google Search–friendly versions
  • Add a Run Workflow block with a Google Search block in a sub-workflow to perform multiple searches
  • Use a Scrape URL block to scrape content from the search results URLs
  • Add a final Generate Text block to analyze findings and generate the response to the user’s query
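Here is a minimal, platform-agnostic Python sketch of that flow. `llm()` and `google_search()` are hypothetical stand-ins for the Generate Text and Google Search blocks; you would wire them to your actual providers:

```python
import requests

def llm(prompt: str) -> str:
    """Hypothetical stand-in for a Generate Text block."""
    raise NotImplementedError("connect this to your LLM provider")

def google_search(query: str) -> list[str]:
    """Hypothetical stand-in for a Google Search block; returns result URLs."""
    raise NotImplementedError("connect this to a search API")

def scrape(url: str) -> str:
    # Equivalent of the Scrape URL block: fetch the raw page content.
    return requests.get(url, timeout=10).text

def answer(user_query: str) -> str:
    # 1. Refine the query into several search-friendly variants.
    variants = llm(
        "Rewrite this question as 3 short Google search queries, one per line:\n"
        + user_query
    ).splitlines()

    # 2. Run a search for each variant and collect result URLs.
    urls = [u for q in variants for u in google_search(q)]

    # 3. Scrape the pages behind the top results (capped to keep context small).
    pages = [scrape(u) for u in urls[:5]]

    # 4. Analyze the findings and answer the original question.
    return llm(
        "Answer the question using only the sources below.\n\n"
        "Question: " + user_query + "\n\nSources:\n" + "\n---\n".join(pages)
    )
```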

These are some of the approaches I use, but I’d love to hear if anyone else has found other useful setups.

I’m back at this. Thanks for your suggestions. More simply, though: how do I have the Chat use the data sources first and then, only if necessary, fall back to the model while still adhering to my system prompt? Is there something I have to do to get it to use the model as the fallback?

Oh, and since I’ll have different Chats for different topics, do I put my guardrails in the Chats’ templates rather than the main system prompt?

Hi @philthrive,

You’re right. You can add instructions like the example below in the Template under Message Processing in the Chat block:

“Rely on the provided data to generate an output. Only use your own knowledge if there’s no relevant information in the provided data.”

That way, each time a user sends a message and relevant chunks are returned, those instructions are included in every call to the LLM.
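And since each of your Chats covers a different topic, that per-Chat Template is also a sensible place for topic-specific guardrails, with shared rules staying in the main system prompt. A sketch of how it might read for your gardening Chat (illustrative wording only):

```
Rely on the provided data to generate an output. Only use your own
knowledge if there's no relevant information in the provided data.
When you do fall back to general knowledge, recommend only low-cost,
organic methods consistent with biointensive gardening, and tell the
user the answer comes from general knowledge rather than our materials.
```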