I have a fairly complex workflow, and the biggest performance hit comes when the model does not return valid JSON and its output has to be converted to JSON before the flow can progress to the next node. I’m using Llama 3.1 8B Instant at a temperature of 0.2 with a maximum response size of 8,192 tokens.
Two questions:
- Is there a preferred model for text-only analysis that returns valid JSON most efficiently?
- Is there a specific prompt I should be including in the instructions of my nodes to remedy this?
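
To make the slow path concrete, here is a minimal Python sketch of the kind of validate-then-repair step described above. It is an illustration only, not the actual node code: `parse_model_json` is a hypothetical helper, and it assumes the node receives the model's raw response as a string and expects a single JSON object.

```python
import json
import re


def parse_model_json(raw: str):
    """Return (parsed_value, needed_repair) for a raw model response.

    Hypothetical helper for illustration; not part of any workflow tool.
    """
    # Fast path: the response is already valid JSON, so no repair is needed.
    try:
        return json.loads(raw), False
    except json.JSONDecodeError:
        pass

    # Slow path: drop a leading/trailing Markdown code fence if present,
    # then keep only the outermost {...} span and parse that.
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model response")
    return json.loads(text[start : end + 1]), True


if __name__ == "__main__":
    clean = '{"sentiment": "positive"}'
    chatty = 'Sure! Here is the JSON:\n```json\n{"sentiment": "positive"}\n```'
    print(parse_model_json(clean))
    print(parse_model_json(chatty))
```

The second return value flags how often the repair path runs, which would help measure whether a different model or prompt actually reduces the number of invalid responses.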