New Delhi, April 7 -- After shaking up Silicon Valley with its AI models earlier this year, Chinese startup DeepSeek is working on another innovation aimed at reducing operational costs. The company, led by Liang Wenfeng, has been collaborating with researchers at Tsinghua University to develop a new approach called generative reward modelling (GRM), which rewards an AI model for following human preferences.
The pre-print paper that first revealed the approach (via Bloomberg) describes a technique called self-principled critique tuning (SPCT), which is intended to make AI models smarter and more efficient in a self-improving way.
The Chinese startup is calling these new models DeepSeek-GRM and plans to release them on an open source basis, just like its ...