PRELUDE

Code for Aligning LLM Agents by Learning Latent Preference from User Edits, NeurIPS 2024.

Installation

This project is developed in Python 3.6. Using Conda to set up a virtual environment is recommended.
Install the required dependencies.
```
pip install -r requirements.txt
```
Install PyTorch from http://pytorch.org/.

Implementation of PRELUDE Framework

The PRELUDE implementation is built around the following main concepts: task, user, and agent.

Task

Task is the class encapsulating the following:

Access to the dataset, which is a sequence of (context, true user preference) pairs $(x_t, f^\star_t)$
The main task prompt (the prompt used to generate $y_t$ given $x_t$ and, optionally, $f_t$):

def get_task_prompt(self, input: str, preference: Optional[str] = None) -> str:
    ...

User evaluation prompts (the prompts used to generate $y'_t$):

def get_edit_prompts(self, input: str, output: str, preference: str) -> Tuple[str, str]:
    ...
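
To make the two prompt methods concrete, here is a minimal sketch of a toy task. The class name `ToySummarizationTask` and all prompt wording are hypothetical and are not taken from the repository. The text above does not say what the two strings returned by `get_edit_prompts` mean, so this sketch assumes the first asks whether the draft already satisfies the preference and the second asks for a revised draft.

```python
from typing import Optional, Tuple


class ToySummarizationTask:
    """Hypothetical stand-in for a task; not the repository's class."""

    def get_task_prompt(self, input: str, preference: Optional[str] = None) -> str:
        # Prompt the agent uses to produce y_t from x_t and, optionally, f_t.
        prompt = f"Summarize the following article.\n\nArticle: {input}\n"
        if preference is not None:
            prompt += f"\nWrite the summary in this style: {preference}\n"
        return prompt

    def get_edit_prompts(self, input: str, output: str, preference: str) -> Tuple[str, str]:
        # Assumed meaning of the pair: (acceptability check, edit request).
        check_prompt = (
            f"Article: {input}\nSummary: {output}\n"
            f"Does the summary follow this style: {preference}? Answer yes or no."
        )
        edit_prompt = (
            f"Summary: {output}\n"
            f"Rewrite the summary so that it follows this style: {preference}"
        )
        return check_prompt, edit_prompt
```

Keeping prompt construction inside the task means the simulated user and the agent only exchange plain strings, which is one plausible reason for the interface above.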

Two tasks are currently implemented: content summarization and email writing.

Task specifics can be controlled using TaskConfig, which lets you:

Change the number of examples
Choose random seed
Specify data source
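
The three options above can be pictured with a small configuration object. This is only a sketch: `ToyTaskConfig`, its field names (`num_examples`, `seed`, `data_source`), and the helper `sample_examples` are hypothetical and may not match the actual TaskConfig in the repository.

```python
import random
from typing import List, NamedTuple


class ToyTaskConfig(NamedTuple):
    """Hypothetical config; field names are assumptions, not the real TaskConfig."""
    num_examples: int = 3      # how many examples to draw
    seed: int = 0              # random seed for reproducible sampling
    data_source: str = "toy"   # label of the dataset to read from


def sample_examples(pool: List[str], config: ToyTaskConfig) -> List[str]:
    # A dedicated Random instance keeps sampling reproducible for a given seed.
    rng = random.Random(config.seed)
    return rng.sample(pool, min(config.num_examples, len(pool)))


pool = [f"document {i}" for i in range(10)]
config = ToyTaskConfig(num_examples=4, seed=123)
examples = sample_examples(pool, config)
```

Calling `sample_examples` twice with the same config returns the same examples, which is the behavior a fixed seed is meant to give.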

User

User encapsulates access to the task and to the LLM resource used to simulate user responses. Initialization requires a TaskConfig and a UserConfig (which specifies the LLM model name).

Agent

Agents are the classes responsible for accomplishing the tasks. They encapsulate access to the LLM and the learning algorithm implementations.

Reproduce Our Experiments

All agents mentioned in our paper are located in the agent folder. You can find the instructions and scripts to reproduce our experiments in the experiments folder.

Implement Your Own Agent

Every agent should inherit from the base Agent class and implement the following methods:

def complete(self, text) -> LLMOutput - task completion method that returns an LLMOutput object containing the output text and (optionally) debug token information
def learn(self, message, correction: Correction) -> Dict - learning method that takes the context text and a pair of (agent completion, user edits) as inputs. The return value is a dictionary of metrics to be logged.
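
As an illustration of this interface, the sketch below defines a trivial agent that needs no LLM. The `Agent`, `LLMOutput`, and `Correction` classes here are simplified stand-ins written for this example; their field names (`text`, `debug_info`, `completion`, `edited`) are assumptions and the real definitions in the repository may differ. `EchoAgent` and the metric names are likewise hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional


class LLMOutput(NamedTuple):
    """Stand-in: output text plus optional debug token information."""
    text: str
    debug_info: Optional[dict] = None


class Correction(NamedTuple):
    """Stand-in: pair of (agent completion, user edits)."""
    completion: str
    edited: str


class Agent:
    """Stand-in base class exposing the two required methods."""

    def complete(self, text: str) -> LLMOutput:
        raise NotImplementedError

    def learn(self, message: str, correction: Correction) -> Dict:
        raise NotImplementedError


class EchoAgent(Agent):
    """Toy agent: truncates the input and records every correction it sees."""

    def __init__(self) -> None:
        self.history: List[Correction] = []

    def complete(self, text: str) -> LLMOutput:
        # A real agent would call an LLM here; this one keeps the first 40 characters.
        return LLMOutput(text=text[:40])

    def learn(self, message: str, correction: Correction) -> Dict:
        self.history.append(correction)
        was_edited = correction.completion != correction.edited
        # Metrics to be logged; the names are made up for this example.
        return {"was_edited": float(was_edited), "num_corrections": len(self.history)}


# One round of interaction: complete, receive (simulated) user edits, learn.
agent = EchoAgent()
context = "A long article about learning latent preferences from user edits."
draft = agent.complete(context)
metrics = agent.learn(context, Correction(draft.text, draft.text + " (revised)"))
```

A learning agent would use the stored corrections to change how `complete` behaves on later contexts; the toy agent above only logs them.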

Please check the notebook example of dummy agent implementation and end-to-end experiment run here.

Citation

@inproceedings{Gao2024AligningLA,
  title={Aligning LLM Agents by Learning Latent Preference from User Edits},
  author={Ge Gao and Alexey Taymanov and Eduardo Salinas and Paul Mineiro and Dipendra Misra},
  booktitle={Conference on Neural Information Processing Systems},
  year={2024}
}
