ALEXANDRIA, Va., Sept. 30 -- United States Patent no. 12,431,131, issued on Sept. 30, was assigned to Amazon Technologies Inc. (Seattle).
"Cache techniques for large language model processing" was invented by Kartik Balasubramaniam (Framingham, Mass.), Venkata Siva Sai Krishna Balakavi (Jersey City, N.J.) and Austin Doolittle (Roslindale, Mass.).
According to the abstract released by the U.S. Patent and Trademark Office: "Techniques for cache management for reducing latency in LLM inferencing are described. In some embodiments, a system caches encoded data of portions of a prompt so that the encoded data is available for use by the LLM across dialog turns of a dialog session. Within a dialog session, a portion of the LLM prompt may be the ...
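The abstract describes caching encoded prompt data so it can be reused across dialog turns. The general idea can be illustrated with a toy sketch; the class and method names below are hypothetical and the list-based "encoding" merely stands in for an expensive model computation, so this is not the patented implementation, only an assumption-laden illustration of prefix reuse:

```python
import hashlib


class PrefixCache:
    """Toy cache mapping a prompt prefix to its (simulated) encoded form.

    In a real LLM serving system, the cached value would be the expensive
    intermediate state produced by encoding the prefix, reused across
    dialog turns that share the same leading prompt text.
    """

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, prefix: str) -> str:
        # Hash the prefix text to get a compact, stable cache key.
        return hashlib.sha256(prefix.encode("utf-8")).hexdigest()

    def encode(self, prefix: str):
        key = self._key(prefix)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        # Stand-in for the costly model encoding step.
        encoded = [ord(c) for c in prefix]
        self._store[key] = encoded
        return encoded


cache = PrefixCache()
system_prompt = "You are a helpful assistant."

# Turn 1 of the dialog: the prefix must be encoded (a cache miss).
cache.encode(system_prompt)
# Turn 2: the same prefix is served from the cache (a hit), skipping re-encoding.
cache.encode(system_prompt)
```

In practice the payoff comes because the shared prefix (e.g. a system prompt plus earlier turns) dominates the prompt, so skipping its re-encoding on each turn cuts per-turn latency.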