ALEXANDRIA, Va., Oct. 28 -- United States Patent no. 12,456,012, issued on Oct. 28, was assigned to Google LLC (Mountain View, Calif.).
"Inference methods for word or wordpiece tokenization" was invented by Xinying Song (Bellevue, Wash.) and Yang Song (Bellevue, Wash.).
According to the abstract released by the U.S. Patent & Trademark Office: "Systems and methods for performing inference for word or wordpiece tokenization are disclosed using a left-to-right longest-match-first greedy process. In some examples, the vocabulary may be organized into a trie structure in which each node includes a precomputed token or token_ID and a fail link, so that the tokenizer can parse the trie in a single pass to generate a list of only those tokens or...
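
For readers unfamiliar with the technique, the left-to-right longest-match-first greedy process named in the abstract can be illustrated with a short Python sketch. This is not the patented method: it stores the vocabulary in a plain trie and restarts from the root after each match, and it omits the fail links and precomputed tokens that the abstract says allow a single pass. The "##" continuation marker, the "[UNK]" fallback and every name in the code (TrieNode, build_trie, tokenize_word) are assumptions made for illustration and do not come from the patent.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One trie node; `token` is set when a vocabulary entry ends here."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.token: Optional[str] = None


def build_trie(vocab: List[str]) -> TrieNode:
    """Insert every vocabulary entry into a character trie."""
    root = TrieNode()
    for entry in vocab:
        node = root
        for ch in entry:
            node = node.children.setdefault(ch, TrieNode())
        node.token = entry
    return root


def tokenize_word(word: str, root: TrieNode,
                  unk: str = "[UNK]", prefix: str = "##") -> List[str]:
    """Greedy longest-match-first tokenization of a single word.

    Assumption: non-initial pieces are stored in the vocabulary with a
    continuation prefix ("##"). If any position cannot be matched, the
    whole word maps to the unknown token.
    """
    tokens: List[str] = []
    start = 0
    while start < len(word):
        node: Optional[TrieNode] = root
        if start > 0:
            # Walk the continuation prefix before matching word characters.
            for ch in prefix:
                node = node.children.get(ch)
                if node is None:
                    return [unk]
        best_token: Optional[str] = None
        best_end = start
        i = start
        # Descend as far as possible, remembering the longest entry seen.
        while i < len(word):
            node = node.children.get(word[i])
            if node is None:
                break
            i += 1
            if node.token is not None:
                best_token, best_end = node.token, i
        if best_token is None:
            return [unk]
        tokens.append(best_token)
        start = best_end  # restart from the root; fail links would avoid this
    return tokens


trie = build_trie(["un", "##aff", "##able", "aff"])
print(tokenize_word("unaffable", trie))  # ['un', '##aff', '##able']
```

Because the sketch returns to the root after each match, it can rescan characters it has already examined; the fail links described in the abstract are aimed at avoiding that.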