ALEXANDRIA, Va., Aug. 12 -- United States Patent No. 12,387,053, issued on Aug. 12, was assigned to International Business Machines Corp. (Armonk, N.Y.).
"Large-scale text data encoding and compression" was invented by Zhong Fang Yuan (Xi'an, China), Tong Liu (Xi'an, China), Wen Wang (Beijing), Chen Gao (Xi'an, China) and Xiang Yu Yang (Xi'an, China).
According to the abstract* released by the U.S. Patent & Trademark Office: "Embodiments of the present invention provide an approach for compressing data, and more particularly, to large-scale text data encoding and compression using absolute overfitting on pre-trained language models. Large-scale data is parsed into sentences. A unique token is generated for each sentence to form a token li...