Bayesian Preference Estimation with Inconsistent Feedback

Header Image

Manuscript submitted for review to the DECISION Journal

This website contains access to data, code, and results for the paper “Bayesian Preference Estimation with Inconsistent Feedback,” submitted to the journal of Mathematical Psychology.

To foster reproducibility, we have made all code and results publicly available in our repository : Repository link

Datasets

There are three datasets used in this paper with the following statistics:

Dataset	Number of Users	Number of Items	Number of Interactions	Number of Categories	% Sparsity
Yelp	6,651	8,267	264,174	265	99.55%
MovieLens1M	6,040	3,260	998,538	18	94.93%
MovieLensSmall	610	8,974	100,010	19	98.17%

Sample data snapshot - from MovieLens dataset:

UserID	movieID	Like Status	Primary Genre	Secondary Genre	Movie Name
54462	288	0	Adventure	Comedy	The Princess Bride
20948	247	0	Action	Adventure	Indiana Jones
25386	401	0	Action	Adventure	Mad Max: Fury Road
48678	491	1	Comedy	Drama	The Truman Show
42717	322	1	Crime	Drama	The Godfather
52687	143	0	Adventure	Drama	Life of Pi
66403	218	0	Action	Adventure	Gladiator
41867	130	1	Drama	Thriller	Se7en
10548	69	0	Action	Drama	Inception
58166	33	0	Action	Comedy	Guardians of the Galaxy

From Yelp Dataset:

UserID	LocationID	Like Status	Type of Location	City
155	442	0	Cinema	San Francisco
1734	6273	1	Cinema	New York
231	3004	1	Restaurant	Los Angeles
787	3475	1	Shopping Center	San Francisco
2293	1857	1	Cinema	San Francisco
1643	364	0	Restaurant	Chicago
565	2669	0	Restaurant	San Francisco
5013	6758	0	Shopping Center	Los Angeles
3500	429	0	Shopping Center	Chicago
605	4223	1	Restaurant	Los Angeles

Process Data Download Instruction

The raw data for this work can be downloaded from this link, separated by train, test, and category datasets.

Three folders are included, each folder contain data for one of the datasets (folders are MovieLens1M, MovieLensSmall, Yelp)

Models Evaluated

In this research, we evaluate the performance of our proposed model by comparing it against several well-known baseline models. These models include advanced machine learning techniques and established Bayesian methods. Unlike our proposed model, these baselines do not explicitly account for user inconsistency.

1. Bayesian Updating Model

This model employs Bayesian inference to update the parameters of a Beta distribution for each user-attribute pair based on binary feedback (e.g., like/dislike).

2. Bayesian Personalized Ranking (BPR)

BPR is designed for optimizing ranking tasks with implicit feedback, such as clicks or purchases, rather than explicit ratings.

3. Collaborative Filtering Model

This model predicts user ratings by leveraging ratings from similar users and items, combining information from multiple sources.

4. Multi-Attribute Utility Model

This model estimates user preferences by calculating the utility of items based on multiple attributes.

5. Most Popular Model (MostPop)

The MostPop model recommends items based on their overall popularity, serving as a baseline for comparison.

Notebooks

For simplicity and reproduciblitiy, three easy-to-run notebooks are provided:

1. `Model_implementation.ipynb`

This notebook demonstrates the implementation of all baseline models. It creates simulated data and runs all baseline and proposed models on this simulated data. Additionally, it includes ablation studies on item scarcities, user scarcities, and users with varying degrees of inconsistencies.

2. `Results.ipynb`

This notebook utilizes the saved and uploaded results from all simulated datasets as well as the three real-world datasets. It analyzes the results by plotting different statistics and outcomes of the proposed model and baseline models.

3. `Main.ipynb`

This notebook combines the entire process, including model implementation, running the model on both simulated and real-world datasets, and performing evaluations.

Results

Results on real-data can be seen below:

Metrics are calculated based on the top-10 predictions in the test set. The best-performing model for each metric is highlighted in bold, and the second-best is italicized.

MovieLens 1M

Model	Precision@10	Recall@10	NDCG@10	F1@10
Bayesian Model	1.2434	0.6991	5.0201	0.8950
Collaborative Filtering	3.2814	1.9516	10.1953	2.4475
Multi-Attribute Utility	0.8642	0.4642	3.5332	0.6039
BPR Model	0.5315	0.2795	2.3449	0.3663
Most Popular Model	1.4520	0.6706	5.8587	0.9175
Mixture Bayesian Model	1.6803	0.9418	6.7846	1.2070

MovieLens 1M Dataset Results

Yelp

Model	Precision@10	Recall@10	NDCG@10	F1@10
Bayesian Model	0.0120	0.0872	0.0547	0.0211
Collaborative Filtering	0.0030	0.0060	0.0137	0.0040
Multi-Attribute Utility	0.0015	0.0150	0.0068	0.0027
BPR Model	0.0030	0.0105	0.0137	0.0047
Most Popular Model	0.0015	0.0021	0.0068	0.0018
Mixture Bayesian Model	0.0158	0.1142	0.0719	0.0278

Yelp Dataset Results

MovieLens Small

Model	Precision@10	Recall@10	NDCG@10	F1@10
Bayesian Model	0.2874	0.1609	1.2000	0.2063
Collaborative Filtering	0.0328	0.0086	0.1490	0.0136
Multi-Attribute Utility	0.2787	0.2283	1.2662	0.2510
BPR Model	0.1639	0.2101	0.6872	0.1842
Most Popular Model	0.3683	0.2178	1.4661	0.2738
Mixture Bayesian Model	0.3915	0.2140	1.6312	0.2767

MovieLens Small Dataset Results