Moreover, they exhibit a counter-intuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget. By comparing LRMs with their standard LLM counterparts under equivalent inference compute, we identify three performance regimes: (1