AsianFin — Alibaba Cloud’s large model service platform Bailian on Tuesday announced price cuts for context caching on certain models, further lowering costs for developers and enterprise users leveraging its AI services.
According to the notice, for the affected models, input tokens that hit the cache will now be billed at the cached_token rate of 20% of the standard input_token price, down from the previous 40%. Input tokens that miss the cache will continue to be billed at the standard rate.
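To illustrate how the split billing works out, here is a minimal sketch of the cost arithmetic. The function name, per-token price, and token counts are hypothetical placeholders for illustration only, not Bailian's published pricing:

```python
# Illustrative sketch: estimate a request's input-token cost when part of the
# prompt hits the context cache. Prices and token counts are hypothetical.

def input_cost(total_input_tokens: int,
               cached_tokens: int,
               input_token_price: float,
               cached_rate: float = 0.20) -> float:
    """Cached tokens bill at cached_rate * input_token_price;
    the remaining (uncached) tokens bill at the full price."""
    uncached_tokens = total_input_tokens - cached_tokens
    return (cached_tokens * input_token_price * cached_rate
            + uncached_tokens * input_token_price)

# Example: 10,000 input tokens, 8,000 of which hit the cache,
# at a placeholder price of 0.002 yuan per 1,000 tokens.
price_per_token = 0.002 / 1000
old = input_cost(10_000, 8_000, price_per_token, cached_rate=0.40)  # previous 40% rate
new = input_cost(10_000, 8_000, price_per_token, cached_rate=0.20)  # new 20% rate
print(f"old: {old:.6f} yuan, new: {new:.6f} yuan")
```

Under these assumed figures, the cached portion of the bill falls by half, so prompts with a high cache-hit ratio see the largest savings.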
The adjustment is aimed at improving cost-efficiency for AI application development, particularly for scenarios with frequently repeated prompts, such as chatbots, enterprise knowledge bases, and customer service solutions.
Alibaba Cloud has been positioning Bailian as a comprehensive AI development and deployment platform, offering access to its proprietary large language models alongside third-party offerings. By reducing context caching fees, the company is seeking to boost adoption and strengthen its competitive edge in China’s fast-evolving AI infrastructure market.