Tencent launches supercomputing cluster to aid LLM training in China - PingWest

source link: https://en.pingwest.com/w/11603

April 14, 2023 3:29 pm

On April 14, Tencent, China’s gaming and social media giant, launched a new High-Performance Computing Cluster (HCC) that aims to address the computational bottleneck in large-scale model training in China. The cluster uses Tencent's self-developed StarLake servers and features Nvidia's latest-generation H800 GPUs, making it the first in the country to be equipped with this chip. Inter-server bandwidth reaches 3.2 Tbps, providing high performance, high bandwidth, and low latency for large model training, autonomous driving, and scientific computing.

Large model training has become a vital area of research in recent years, but it requires a significant amount of computational power, so companies must connect many servers through a high-performance network to form a large-scale computing cluster. This demand drives a steep increase in hardware investment, creating a cash flow burden for many companies.

One solution is cloud computing, which provides cost-effective, scalable, and easily deployable computational resources. Cloud-based resources can also be pooled and used on demand, making it easier for businesses to access the computational power their projects require.

While using advanced chips is critical for large model training, Tencent argues that this alone is not enough. High-performance computing relies on a series of interdependent factors such as computation, storage, and network bandwidth. Therefore, any bottleneck in these factors can lead to a significant reduction in computing power.
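The bottleneck argument above can be illustrated with a minimal sketch. It assumes a simplified pipeline in which computation, storage, and network stages run concurrently, so the slowest stage sets the sustained rate. The function names and all figures are hypothetical and are not taken from Tencent's cluster.

```python
def effective_throughput(compute, storage, network):
    """Sustained samples/s when all three stages overlap:
    the slowest stage limits the whole pipeline."""
    return min(compute, storage, network)


def compute_utilization(compute, storage, network):
    """Fraction of the available compute capacity that is actually used."""
    return effective_throughput(compute, storage, network) / compute


# Hypothetical per-stage capacities in samples per second.
rate = effective_throughput(compute=1000, storage=800, network=400)
share = compute_utilization(compute=1000, storage=800, network=400)
print(rate, share)
```

In this toy model, the GPUs could process 1000 samples per second, but the network delivers only 400, so most of the compute capacity sits idle no matter how advanced the chips are.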

Tencent's new HCC cluster launches with Nvidia's H800 chips, which are based on the Hopper architecture and are designed for workloads such as deep recommendation systems. The launch comes as the industry grapples with a shortage of high-performance chips that has limited the growth of large-scale models in China.
