API Speed Throttling?

I have been testing a simple page extraction agent (d0f9126b-d4bb-4356-8283-50ae866c9ee6) through an API within a Google Chrome extension. While using Llama 8B Instant, I’ve noticed a significant decline in performance. The tokens per second have dropped dramatically from an initial rate of 235.231 tokens/second to around 38.953 tokens/second now. This slowdown appears to be consistent over time, even though I haven’t made any major changes to my workflow or application. Are you throttling the performance on your end?

Hey! Let me check on my end, and I will get back to you!

Any updates here?

Hi there, we don’t throttle performance on our end in any way. This most likely has to do with the service provider APIs.
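
If it helps to track this over time, here is a minimal sketch of how the throughput figures above could be computed and compared. The function names and the idea of logging a token count plus elapsed time per request are assumptions for illustration only; they are not part of our API or the provider's. With the numbers from the original post, the drop works out to roughly sixfold.

```typescript
// Hypothetical helpers for illustration; not part of any provider's API.

// Tokens generated per second, given a completion token count and the
// wall-clock time the request took in milliseconds.
function tokensPerSecond(completionTokens: number, elapsedMs: number): number {
  if (elapsedMs <= 0) {
    throw new RangeError("elapsedMs must be positive");
  }
  return completionTokens / (elapsedMs / 1000);
}

// How many times slower the current rate is compared with the baseline.
function slowdownFactor(baselineTps: number, currentTps: number): number {
  if (currentTps <= 0) {
    throw new RangeError("currentTps must be positive");
  }
  return baselineTps / currentTps;
}

// Figures quoted in the original post.
const factor = slowdownFactor(235.231, 38.953);
console.log(`Slowdown: about ${factor.toFixed(1)}x`);
```

Logging this per request, alongside the time of day and the model name, would make it easier to show the provider whether the drop is constant or varies with load.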