Moonshot AI

Beijing Moonshot AI Technology Co., Ltd.

Native name	北京月之暗面科技有限公司
Company type	Private
Industry	Information technology
Founded	March 2023
Founders	Yang Zhilin, Zhou Xinyu, Wu Yuxin
Headquarters	Beijing, China
Key people	Yang Zhilin (CEO)
Products	Kimi K1.5, Kimi-VL, Kimi-Dev-72B, Kimi K2
Number of employees	200 (2024)
Website	moonshot.cn

Background

Moonshot was founded in March 2023 by Yang Zhilin, Zhou Xinyu and Wu Yuxin. It was launched on the 50th anniversary of Pink Floyd's The Dark Side of the Moon, which was Yang's favorite album and the inspiration for the company's name.^[1]^[2]

Yang has stated that his goal in founding Moonshot AI is to build foundation models that achieve AGI.^[3] He has named three milestones: long context length, a multimodal world model, and a scalable general architecture capable of continuous self-improvement without human input.^[3]

In October 2023, the company released its chatbot, Kimi, which is capable of processing up to 200,000 Chinese characters per conversation.^[4]

In June 2024, it was reported that Moonshot was planning to enter the US market. An insider revealed Moonshot was developing products for the US market, including an AI role-playing chat application called Ohai as well as a music video generator called Noisee. In response, Moonshot stated it had no plans to develop and release overseas products.^[5]

Funding and investments

At the time of its initial $60 million funding round, Moonshot was valued at $300 million and had 40 employees.^[2]^[6]

In February 2024, Alibaba Group led a $1 billion funding round for Moonshot, which gave it a valuation of $2.5 billion.^[6] Yang and related individuals reportedly cashed out $40 million worth of shares, an amount considered unusually large for a company in its first year.^[7]

In August 2024, Tencent and Gaorong Capital joined as investors in a $300 million funding round that valued Moonshot at $3.3 billion.^[8] While several firms continued to support the company, some investors, including GSR Ventures, reduced their involvement amid concerns related to shareholder disputes and allegations of premature profit-taking.^[9] In November 2024, a group of investors filed for arbitration against the company’s co-founder and Chief Technology Officer, alleging that funding rounds were conducted without obtaining required consent from some AI-focused investors.^[9]

Products and research

Kimi

In October 2023, Moonshot launched its first AI chatbot, Kimi, which takes its name from Yang's English name. It emerged as the closest rival to Baidu's Ernie Bot.^[1]^[10]

In March 2024, Moonshot claimed Kimi could handle 2 million Chinese characters in a single prompt, a significant upgrade from the previous version, which could handle only 200,000. Owing to a surge in users, Kimi suffered a two-day outage beginning on 21 March, and Moonshot issued an apology.^[10]^[11]

On 20 January 2025, Kimi k1.5 was released. Moonshot claimed it matched the performance of OpenAI o1 in mathematics, coding, and multimodal reasoning capabilities.^[12]

In July 2025, the company released the weights for Kimi K2, a large language model with one trillion total parameters.^[13] The model uses a mixture-of-experts (MoE) architecture in which 32 billion parameters are active during inference. K2 was trained on 15.5 trillion tokens of data and was released under a modified MIT license.^[14]^[15]
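
The gap between total and active parameters follows from how mixture-of-experts layers route each token to only a few experts. The Python sketch below is a generic top-k routing illustration, not K2's actual router: the expert count, the value of k, the router weights and the toy experts are all hypothetical. Only the 32 billion and one trillion figures come from the text above.

```python
import math

# Figures from the article: 32 billion active out of one trillion total parameters.
ACTIVE_PARAMS = 32e9
TOTAL_PARAMS = 1e12
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS  # share of weights used during inference


def top_k_route(scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their gate weights."""
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    exps = [math.exp(scores[i]) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]


def moe_forward(x, experts, router_weights, k):
    """Run one token through a toy MoE layer; only the k chosen experts are evaluated."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    chosen, gates = top_k_route(scores, k)
    out = [0.0] * len(x)
    for idx, gate in zip(chosen, gates):
        y = experts[idx](x)  # experts outside `chosen` are never called
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen, gates


# Hypothetical toy setup: 4 experts, each scaling its input by a different factor.
experts = [lambda x, s=s: [s * xi for xi in x] for s in (1.0, 2.0, 3.0, 4.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]
token = [2.0, 1.0]

output, chosen, gates = moe_forward(token, experts, router_weights, k=2)
```

In this toy setup two of the four experts run for the token, so half the expert weights are touched. With the article's figures the active share is 32 billion out of one trillion, or about 3.2 percent.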

Kimi has six tiers of plans ranging from 5.2 yuan for four days to 399 yuan for a year of priority use.^[16]

Mooncake serving platform

Mooncake is the platform that serves Moonshot’s Kimi chatbot and processes 100 billion tokens daily.^[17] Moonshot was awarded the Erik Riedel Best Paper Award at the USENIX FAST conference for the paper detailing the architecture of Mooncake.^[17]

Scaling Muon optimizer

In the joint Moonshot and UCLA paper “Muon is Scalable for LLM Training”, the researchers claim to have scaled the Muon optimizer, previously known for strong results in training small language models, to train a mixture-of-experts large language model with 3 billion activated and 16 billion total parameters (3B/16B).^[18] They report that Muon is about twice as computationally efficient as the standard optimizer, AdamW, when training large models.^[18] The researchers have open-sourced their Muon implementation along with the pretrained and instruction-tuned checkpoints.^[3]
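
For readers unfamiliar with Muon, its core idea is to replace each weight matrix's momentum-averaged gradient with an approximately orthogonal matrix before applying the update. The Python sketch below illustrates that idea under stated assumptions: it uses the classical cubic Newton-Schulz iteration rather than the tuned polynomial found in published Muon implementations, the function names and hyperparameter values are hypothetical, and it does not attempt to reproduce the modifications described in the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_norm(a):
    return sum(v * v for row in a for v in row) ** 0.5


def newton_schulz_orthogonalize(g, steps=15):
    """Approximate the orthogonal polar factor of g (assumes g is nonzero).

    Classical iteration X <- 1.5 X - 0.5 X X^T X, swapped in here for the
    tuned polynomial used by real Muon implementations.
    """
    norm = frobenius_norm(g)
    x = [[v / norm for v in row] for row in g]  # singular values now at most 1
    for _ in range(steps):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(rx, rb)] for rx, rb in zip(x, xxt_x)]
    return x


def muon_like_step(weight, grad, momentum_buf, lr=0.02, beta=0.95):
    """One illustrative update: momentum, then orthogonalize, then descend."""
    momentum_buf = [[beta * m + g for m, g in zip(rm, rg)]
                    for rm, rg in zip(momentum_buf, grad)]
    update = newton_schulz_orthogonalize(momentum_buf)
    new_weight = [[w - lr * u for w, u in zip(rw, ru)]
                  for rw, ru in zip(weight, update)]
    return new_weight, momentum_buf


# Hypothetical 2x2 example: a symmetric positive-definite gradient,
# whose orthogonal polar factor is the identity matrix.
weight = [[1.0, 0.0], [0.0, 1.0]]
grad = [[3.0, 1.0], [1.0, 2.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
new_weight, buf = muon_like_step(weight, grad, zeros)
```

Because the orthogonalized update has all singular values near one, every direction of the weight matrix moves by a similar amount regardless of the raw gradient's scale, which is the property the sketch is meant to show.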

Scaling reinforcement learning with LLMs

In their technical report on the Kimi K1.5 model, Moonshot researchers outline their reinforcement learning methods, which they claim enabled the model to achieve state-of-the-art reasoning capabilities on par with OpenAI’s o1 model.^[19] The researchers note that long context scaling and improved policy optimization methods were key, without relying on complex techniques like Monte Carlo tree search, value functions, and process reward models.^[19]
