pypbl

pypbl is a python library for preference based learning using pairwise comparisons.


Basic Usage

Suppose we want to recommend a personalised list of items to an individual.

There are four approaches we could take:

  1. Ask the individual to manually rank all items.
  2. Ask the individual to provide weights based on their preferences for different features (size, cost, weight, etc.), and calculate the weighted value of each item.
  3. Find similar people and base recommendations on what these people also like.
  4. Ask the individual to compare a small number of alternatives, and derive feature weights from those comparisons.

Option 1 quickly becomes an enormous burden on the user as the number of items increases.

Option 2 is difficult for the user to do and replicate. What exactly does it mean if the weight assigned to one feature is double the weight assigned to another?
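To make Option 2 concrete, here is a minimal sketch of a weighted-value ranking. It is not part of pypbl: the item names, feature values, min-max scaling and weights are all made up for illustration.

```python
import numpy as np

# Hypothetical items described by three features: size, cost, weight.
items = ["A", "B", "C"]
features = np.array([
    [4.0, 200.0, 1.5],
    [3.0, 150.0, 1.0],
    [5.0, 300.0, 2.0],
])

# Min-max scale each column to [0, 1] so the weights act on comparable ranges.
lo, hi = features.min(axis=0), features.max(axis=0)
scaled = (features - lo) / (hi - lo)

# Hand-picked weights: positive rewards large values, negative penalises them.
weights = np.array([1.0, -2.0, -0.5])

# The weighted value of each item is a dot product of features and weights.
scores = scaled @ weights
ranking = [items[i] for i in np.argsort(-scores)]
print(ranking)
```

The ranking depends entirely on the numbers in `weights`, which is exactly what the user finds hard to state and reproduce.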

Option 3 requires lots of data and a way to determine similarity between individuals, and may not be fully personalised.

Option 4 is enabled by preference based learning using pairwise comparisons.
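The idea behind Option 4 can be sketched without the library. The example below is an illustration only and is not how pypbl is implemented: the linear utility model, the logistic likelihood with sharpness `k`, the standard normal prior and the importance-sampling estimate are all assumptions chosen to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical items with two features already scaled to [0, 1].
items = {
    "a": np.array([0.9, 0.2]),
    "b": np.array([0.1, 0.8]),
    "c": np.array([0.7, 0.5]),
}

# Strict preferences expressed as (preferred, other) pairs.
preferences = [("a", "b"), ("c", "b")]

# Draw candidate weight vectors from a standard normal prior (an assumption).
samples = rng.normal(size=(20000, 2))

# Assumed likelihood: P(x preferred to y | w) = sigmoid(k * w.(x - y)).
k = 10.0
log_lik = np.zeros(len(samples))
for preferred, other in preferences:
    diff = items[preferred] - items[other]
    # log(sigmoid(z)) written in a numerically stable form.
    log_lik += -np.logaddexp(0.0, -k * (samples @ diff))

# Normalise the importance weights and take the posterior mean of the samples.
importance = np.exp(log_lik - log_lik.max())
importance /= importance.sum()
posterior_mean = importance @ samples
print(posterior_mean)
```

With only two comparisons the estimate already favours the first feature over the second, because both preferred items score higher on the first feature and lower on the second.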

Below is an example of using pypbl to rank the top choices of cars using very few pairwise comparisons:

import pandas as pd

from pypbl.priors import Normal, Exponential
from pypbl.elicitation import BayesPreference

data = pd.read_csv('data/mtcars.csv')
print(data)

# set index of the data frame to be the item names
data.set_index('model', inplace=True)

p = BayesPreference(data=data)
p.set_priors([
    Exponential(1),  # MPG - high miles per gallon is preferred
    Normal(),  # number of cylinders
    Normal(),  # displacement
    Exponential(2),  # horsepower - high horsepower is preferred
    Normal(),  # rear axle ratio
    Normal(),  # weight
    Exponential(-3),  # quarter mile time - low time (high acceleration) is preferred
    Normal(),  # engine type
    Normal(),  # transmission type
    Normal(),  # number of gears
    Normal()  # number of carburetors
])

# add some preferences and infer the weight for each parameter
p.add_strict_preference('Pontiac Firebird', 'Fiat 128')
p.add_strict_preference('Mazda RX4', 'Mazda RX4 Wag')
p.add_indifferent_preference('Merc 280', 'Merc 280C')
p.infer_weights(method='mean')

print('\ninferred weights')
for a, b in zip(data.columns.values.tolist(), p.weights.tolist()):
    print('{}: {}'.format(a, b))

# rank all the items and highlight the top five
print('\ntop 5 cars')
print(p.rank().head(5))

# suggest a new item to compare against the highest ranked solution - this may take some time to compute
print('\nsuggested pair to request new preference')
print(p.suggest())

See Examples for further uses.
