ALEXANDRIA, Va., Dec. 16 -- United States Patent No. 12,499,144, issued on Dec. 16, was assigned to GOOGLE LLC (Mountain View, Calif.).
"LLM latency reduction via bridging multiple LLMs of differing sizes" was invented by Brett Barros (San Mateo, Calif.).
According to the abstract* released by the U.S. Patent & Trademark Office: "Implementations utilize a smaller LLM to generate content responsive to a user query and cause a portion of the generated content to be rendered as an immediate response to the user query. Implementations further utilize a larger LLM to generate content that starts with the portion of the generated content and that includes a refined portion succeeding the portion of the generated content. The refined portion can...
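The flow the abstract describes can be illustrated with a minimal Python sketch. It is not the patented implementation: the two model callables, their signatures, and the choice of the first sentence as the immediately rendered portion are all assumptions made for illustration, and the toy models are hard-coded stand-ins.

```python
from typing import Callable, Iterator

# Hypothetical stand-ins for real models; neither signature comes from the patent.
SmallLLM = Callable[[str], str]        # query -> full draft
LargeLLM = Callable[[str, str], str]   # (query, prefix) -> text that continues the prefix


def first_sentence(text: str) -> str:
    """Return the text up to and including the first period, or all of it."""
    end = text.find(".")
    return text if end == -1 else text[: end + 1]


def bridged_response(query: str, small: SmallLLM, large: LargeLLM) -> Iterator[str]:
    """Yield the small model's opening portion, then the large model's refinement."""
    portion = first_sentence(small(query))
    yield portion                 # shown to the user right away
    yield large(query, portion)   # refined portion that succeeds the shown prefix


def toy_small(query: str) -> str:
    # Assumed behaviour: a fast, rough draft.
    return f"Short answer to '{query}'. Rough details follow."


def toy_large(query: str, prefix: str) -> str:
    # Assumed behaviour: a slower continuation conditioned on the shown prefix.
    return " Here is a more careful elaboration."


chunks = list(bridged_response("why is the sky blue", toy_small, toy_large))
full_text = "".join(chunks)
```

In this sketch the final text always starts with the portion already shown, so the display never has to retract anything. A real system would presumably run the larger model while the first portion is being rendered; the sequential calls here are only for clarity.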