Large language model "Xunzi": steeped in the classics and packed with computing power

Unravel the past

Thousands of years ago, texts appeared on animal bones, bronzes, bamboo slips, and silk brocades before they were written on paper.

But now these ancient Chinese texts have a "new container" in the modern age.

A research team from Nanjing Agricultural University recently rolled out Xunzi, a large language model (LLM), and XunziChat, in association with Gulian, a leading publisher of ancient Chinese texts.

Wang Dongbo, the leader of the research team, said that the model was named after Xunzi because he was not only a prominent Confucian philosopher during the late Warring States Period (475-221 BC), but also a pioneer in presenting and explaining theories of linguistics in ancient China.

When asked why he and his partners made the large language model, Wang explained that "traditional Chinese characters, vertical layout, the absence of pausing and punctuation are all obstacles that readers have to overcome when they read traditional texts".

To create Xunzi the LLM, Wang and his partners first needed to do a lot of research.

Since 2013, his team has worked tirelessly to digitize Chinese classics like the Siku Quanshu, or the Complete Library in Four Sections.

"The hard work involves a large-scale corpus of two billion Chinese characters, which has laid a solid foundation for the large language model," said Wang.

But their efforts seem to have paid off.
