CIPHER: An effective retrieval-based AI algorithm that infers user preferences by querying LLMs

Language agents built on Large Language Models (LLMs) have been developed for a wide range of applications, and ongoing advances in LLMs continue to improve them. However, these agents are not customized or personalized for a specific user and task. Users often give feedback to LLM-based agents implicitly, by editing their responses before final use. In contrast, standard fine-tuning feedback, such as the comparison-based preference feedback used in RLHF, is collected by showing model responses to annotators and asking them to rank them, which makes it a costly way to improve alignment.

The researchers study the interactive learning of language agents from the edits users make to the agent’s output. In a setting such as a writing assistant, the user asks the language agent to generate a response for a given context, then edits that response to personalize it and improve its correctness according to their latent preferences. The paper introduces PRELUDE, a framework for PREference Learning from User’s Direct Edits, which treats those edits as evidence about the user’s latent preferences. However, user preferences can be complex and subtle, and they can vary with context, which makes them hard to learn.
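The interaction described above can be pictured as a simple loop: the agent drafts a response, the user edits it, and the size of the edit is added to a running cost. The following is only a minimal sketch under stated assumptions. The names `run_interaction`, `toy_agent`, `toy_user`, and `toy_cost` are hypothetical and are not taken from the paper, and the cost used here is a crude word-count difference rather than the metric the researchers report.

```python
from typing import Callable, List, Tuple

# One past round: (context, agent response, user-edited response).
Round = Tuple[str, str, str]


def run_interaction(
    contexts: List[str],
    agent: Callable[[str, List[Round]], str],
    user: Callable[[str, str], str],
    cost: Callable[[str, str], int],
) -> Tuple[int, List[Round]]:
    """Run the draft -> edit -> record loop and return (total cost, edit history)."""
    total_cost = 0
    history: List[Round] = []
    for context in contexts:
        response = agent(context, history)      # agent drafts using past edits
        edited = user(context, response)        # user may or may not edit
        total_cost += cost(response, edited)    # cost of this round's edit
        if edited != response:                  # only real edits carry feedback
            history.append((context, response, edited))
    return total_cost, history


# Hypothetical stand-ins so the loop can be run end to end.
def toy_agent(context: str, history: List[Round]) -> str:
    draft = f"Reply to: {context}."
    # After seeing any edit, the toy agent adopts the user's sign-off.
    return draft + " Regards." if history else draft


def toy_user(context: str, response: str) -> str:
    # This toy user always wants the response to end with a sign-off.
    return response if response.endswith("Regards.") else response + " Regards."


def toy_cost(response: str, edited: str) -> int:
    # Crude assumption: cost is the difference in word counts.
    return abs(len(edited.split()) - len(response.split()))


if __name__ == "__main__":
    total, history = run_interaction(["a", "b", "c"], toy_agent, toy_user, toy_cost)
    print(total, len(history))
```

In this toy run the user edits only the first draft. Once the agent has seen that edit, later drafts need no changes, which is the behavior a preference-learning agent is meant to approach over time.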

A team of researchers from the Department of Computer Science at Cornell University and Microsoft Research New York presented CIPHER, an effective algorithm designed to handle the complexity of user preferences. CIPHER uses a large language model to infer the user’s preference for a given context from the user’s edits. For a new context, it retrieves the preferences inferred for the closest historical contexts and aggregates them to guide response generation. Compared with algorithms that directly retrieve user edits without learning descriptive preferences, or that learn context-independent preferences, CIPHER achieves the lowest edit distance cost.
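The retrieve-and-aggregate step can be sketched as follows. This is an illustration under stated assumptions, not the authors’ implementation: contexts are compared with a bag-of-words cosine similarity as a stand-in for a learned text encoder, preferences are merged by simple de-duplication rather than by an LLM, and the names `PreferenceMemory`, `aggregate`, and `build_prompt` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def bag_of_words(text: str) -> Counter:
    """Toy context representation (assumption: stands in for a text encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class PreferenceMemory:
    """Stores (context, inferred preference) pairs and retrieves the k closest."""

    def __init__(self, k: int = 5) -> None:
        self.k = k
        self.history: List[Tuple[Counter, str]] = []

    def add(self, context: str, preference: str) -> None:
        # In the described method the preference text would come from an LLM
        # that reads the user's edit; here it is supplied by the caller.
        self.history.append((bag_of_words(context), preference))

    def retrieve(self, context: str) -> List[str]:
        query = bag_of_words(context)
        ranked = sorted(
            self.history, key=lambda item: cosine(query, item[0]), reverse=True
        )
        return [preference for _, preference in ranked[: self.k]]


def aggregate(preferences: List[str]) -> str:
    """Merge retrieved preferences (assumption: order-preserving de-duplication)."""
    unique: List[str] = []
    for preference in preferences:
        if preference not in unique:
            unique.append(preference)
    return "; ".join(unique)


def build_prompt(context: str, preference: str) -> str:
    """Hypothetical prompt template that conditions the agent on the preference."""
    return f"User preference: {preference}\nContext: {context}\nResponse:"


if __name__ == "__main__":
    memory = PreferenceMemory(k=2)
    memory.add("quarterly sales report email to manager", "formal tone")
    memory.add("birthday party invitation to friends", "casual, playful")
    memory.add("email to manager about project deadline", "formal tone, concise")
    retrieved = memory.retrieve("email to manager about budget")
    print(build_prompt("email to manager about budget", aggregate(retrieved)))
```

Because the learned preferences are kept as plain text rather than as model weights, they can be inspected directly, and the base LLM needs no fine-tuning.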

The researchers used GPT-4 as the base LLM for CIPHER and all baselines. GPT-4 is not fine-tuned, and no additional parameters are added to the model. All methods use a prompt-based GPT-4 agent that generates responses with a single prompt and simple decoding, although CIPHER and the baselines can be extended to more complex language agents. CIPHER is evaluated against baselines that learn nothing, that learn only context-independent preferences, or that use previous user edits to generate responses without learning preferences.

CIPHER achieves the lowest edit distance cost, reducing edits by 31% on the summarization task and 73% on the email writing task. These results are obtained by retrieving and aggregating five preferences (k=5). CIPHER also achieves the highest preference accuracy, which suggests that the preferences it learns match the ground-truth preference more closely than they match the preferences of other document sources. It also outperforms the ICL Edit and Continual LPI baselines in cost reduction. In addition, CIPHER is inexpensive to run, efficient, and more interpretable than the other baselines.
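To make the cost figures concrete, here is a minimal sketch of how an edit cost and a percentage reduction could be computed. It assumes a Levenshtein distance over whitespace-separated tokens. The article does not spell out the tokenization or the reference point for the reported percentages, so `edit_cost` and `percent_reduction` are hypothetical helpers for illustration only.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,             # delete x
                    current[j - 1] + 1,          # insert y
                    previous[j - 1] + (x != y),  # substitute, free if equal
                )
            )
        previous = current
    return previous[-1]


def edit_cost(agent_response: str, user_edited: str) -> int:
    """Token-level cost of one user edit (assumption: whitespace tokens)."""
    return edit_distance(agent_response.split(), user_edited.split())


def percent_reduction(reference_cost: float, method_cost: float) -> float:
    """Relative reduction in cumulative edit cost against a reference method."""
    if reference_cost <= 0:
        raise ValueError("reference_cost must be positive")
    return 100.0 * (reference_cost - method_cost) / reference_cost


if __name__ == "__main__":
    print(edit_cost("the cat sat on the mat", "the cat sat on a mat"))
    print(percent_reduction(100, 69))
```

A response the user accepts unchanged costs zero, so an agent that better anticipates the user’s preferences accumulates a lower total cost over many rounds.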

In summary, the researchers introduce the PRELUDE framework, which focuses on learning preferences from user edit data and generating agent responses accordingly. To handle complex, context-dependent user preferences, they introduce CIPHER, an effective retrieval-based algorithm that infers user preferences by querying the LLM, retrieving relevant past examples, and aggregating the induced preferences to generate context-specific responses. CIPHER achieves a larger reduction in edit cost than the baselines it is compared with.

Visit the Paper. All credit for this research goes to the researchers of this project.

Sajjad Ansari is a final-year student at IIT Kharagpur. As a technology enthusiast, he explores the practical applications of AI, with a focus on understanding AI technologies and their real-world impact. His goal is to explain complex AI concepts clearly and accessibly.
