Remy is consuming a large amount of expensive tokens (Claude 4.7 Opus specifically): over $400 in 7 days. See attached. Is it possible to control which model Remy uses to reduce the development expense? Any recommendations on keeping this under control?
Hi @DavidE,
Thanks for the post!
You can change the models directly in the app. Here’s how:
- Go to the Processes tab
- Open the Models section
- Change the models and click Save in the bottom right corner
Please note that changing a model will reset your current chat with Remy (the app itself stays intact). Also, the prompts Remy uses have been tailored to the Claude models, so results may vary with others.
Apart from that, here are a couple of commands that help if your chat gets long:
- Type /compact to summarize your chat history so fewer tokens are used with each message
- Alternatively, type /clear to reset the conversation, then /familiarize to get Remy back up to speed on your app
Thanks for your reply, Alex. Based on your comment, I take it that it is not recommended to change the default models as it may introduce undesirable behavior.
I ran the /compact command, and that alone cost $4.36.
It just seems that every interaction is consuming an inordinate amount of money. Are there any improvements in the pipeline that may help reduce costs, or is this what should be expected?
Hi @DavidE,
Thanks for getting back to me!
The way /compact works is that it sends your entire chat history to the model for summarization, so there’s an upfront token cost, but the next message you send will be considerably cheaper than if you’d continued without it. The longer the chat, the higher that initial cost, but the more you save going forward.
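To make that trade-off concrete, here is a rough Python sketch of the arithmetic. It assumes every message resends the full chat history as input tokens, and it ignores output tokens, prompt caching, and per-model pricing. The function names and all numbers are hypothetical, not Remy's actual accounting.

```python
def cumulative_input_tokens(history_tokens, tokens_per_turn, turns):
    """Total input tokens sent over `turns` messages when each message
    resends the whole history, and the history grows every turn."""
    total = 0
    for _ in range(turns):
        total += history_tokens            # this message resends the history
        history_tokens += tokens_per_turn  # the history grows for the next one
    return total


def tokens_with_compact(history_tokens, summary_tokens, tokens_per_turn, turns):
    """Same total, but after paying once to summarize the full history
    (assumption: the summary replaces the history from then on)."""
    upfront = history_tokens  # the entire chat is sent for summarization
    return upfront + cumulative_input_tokens(summary_tokens, tokens_per_turn, turns)


def break_even_turns(history_tokens, summary_tokens, tokens_per_turn, max_turns=1000):
    """Smallest number of follow-up messages after which compacting
    has used fewer input tokens in total, or None if it never does."""
    for n in range(1, max_turns + 1):
        without = cumulative_input_tokens(history_tokens, tokens_per_turn, n)
        with_compact = tokens_with_compact(
            history_tokens, summary_tokens, tokens_per_turn, n
        )
        if with_compact < without:
            return n
    return None


# Hypothetical chat: 200k-token history, 10k-token summary, 5k tokens added per turn.
print(break_even_turns(200_000, 10_000, 5_000))  # 2
```

With these made-up numbers, the upfront cost is recovered by the second message after compacting. The longer the original history, the larger both the one-time cost and the per-message saving.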
Remy processes a large amount of context with each interaction to factor in as many aspects of your app as possible. That thoroughness drives both the quality of the output and the inference costs. I’m afraid I don’t have a roadmap to share while Remy’s in alpha, but we’re gathering all feedback. You can also submit it directly to our engineers by clicking Connected in the top left corner, then Send Feedback.
Alex, thank you.
FYI, I did change the Remy model to Claude 4.6 Sonnet, as I am now in the debugging phase. It significantly reduced the cost of fixing bugs and making small adjustments. I will send this as feedback as suggested above.

