This is the official GitHub repository for the lecture "Multimodal Deep Learning for Recommendation", hosted at the 2024 ACM RecSys Summer School in Bari (Italy).
With the advent of deep learning and, more recently, large models, recommender systems have become much better at profiling users' preferences and interests, which in most cases are hard to disentangle. This is especially true for recommendation algorithms that rely heavily on external side information, such as multimedia recommender systems. In specific domains like fashion, music, and movie recommendation, the multi-faceted features that characterize products and services may influence each customer on online selling platforms differently, paving the way for novel multimedia recommendation models that leverage the lessons learned from multimodal deep learning.
Building on these premises, this lecture will delve into multimodal deep learning for recommendation. Specifically, it will provide an introductory yet general overview of the current literature. First, it will outline the main rationales behind multimedia recommendation and explain why a multimodal formalization is currently needed. Then, it will present the multimodal deep learning pipeline for multimedia recommendation. This part will also include a hands-on session, where we will learn how to build such a pipeline from scratch, from multimodal dataset processing to recommender training and evaluation. Finally, it will highlight the main research challenges and how we tackled most of them in recent works.
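To give a rough feel for the stages such a pipeline goes through, here is a toy sketch in plain Python. It is not the code used in the hands-on session: the item features, the function names, and the "average of consumed items" user profile are all hypothetical stand-ins, chosen only to show feature fusion, scoring, and evaluation end to end. In a real setting, the features would come from pre-trained extractors and the user representation would be learned by a recommendation model.

```python
from typing import Dict, List

# Hypothetical pre-extracted item features (assumption: in practice these
# would be produced by pre-trained visual and textual extractors).
VISUAL: Dict[str, List[float]] = {
    "i1": [1.0, 0.0],
    "i2": [0.9, 0.1],
    "i3": [0.0, 1.0],
    "i4": [0.8, 0.2],
}
TEXTUAL: Dict[str, List[float]] = {
    "i1": [1.0, 0.0],
    "i2": [0.8, 0.2],
    "i3": [0.1, 0.9],
    "i4": [0.9, 0.1],
}


def fuse(item: str) -> List[float]:
    """Feature-level fusion: concatenate the two modality vectors."""
    return VISUAL[item] + TEXTUAL[item]


def user_profile(history: List[str]) -> List[float]:
    """Average the fused features of the items the user interacted with.

    This is a simple content-based stand-in for a trained recommender.
    """
    fused = [fuse(item) for item in history]
    return [sum(column) / len(fused) for column in zip(*fused)]


def score(profile: List[float], item: str) -> float:
    """Dot product between the user profile and the fused item features."""
    return sum(p * f for p, f in zip(profile, fuse(item)))


def recommend(history: List[str], k: int) -> List[str]:
    """Rank the items the user has not seen yet and return the top k."""
    profile = user_profile(history)
    candidates = [item for item in VISUAL if item not in history]
    ranked = sorted(candidates, key=lambda item: score(profile, item), reverse=True)
    return ranked[:k]


def recall_at_k(recommended: List[str], held_out: List[str]) -> float:
    """Fraction of held-out items that appear in the recommendation list."""
    hits = len(set(recommended) & set(held_out))
    return hits / len(held_out)


# Toy split: two training interactions and one held-out test interaction.
train_history = ["i1", "i2"]
held_out = ["i4"]

top1 = recommend(train_history, k=1)
print(top1, recall_at_k(top1, held_out))
```

Concatenation is only one possible fusion strategy; it is used here because it is the simplest to read.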
- Instructor: Daniele Malitesta (reach out to me: email)
- Date: Wednesday, October 9, 2024
- Duration: 15:30 - 16:50 CET (80 min)
- Main topics: Multimedia recommender systems, multimodal deep learning
- Part 0: I wish I had known it in advance! [5 min.]
  - 0.1: (Some) useful resources to get started
  - 0.2: Field-specific conferences/journals
- Part 1: One multimodal schema to rule them all [50 min.]
  - 1.1: Personalized (multimedia) recommendation
  - 1.2: Why do we need a schema anyway?
  - 1.3: [Theory + Practice] A unified multimodal schema
- Part 2: Open challenges and how we solved them [15 min.]
  - 2.1: Missing modalities in the input data
  - 2.2: Multimodality on user-item interactions
  - 2.3: Pre-trained feature extractors
  - 2.4: Fine-grained multimodal features
  - 2.5: An extensive and fair evaluation
- Part 3: Q & A time [+∞*]
* Just kidding :-)
Main papers
Title | Paper | Code |
---|---|---|
Ducho meets Elliot: Large-scale Benchmarks for Multimodal Recommendation | arXiv | GitHub |
Formalizing Multimedia Recommendation through Multimodal Deep Learning | TORS | GitHub |
Ducho 2.0: Towards a More Up-to-Date Unified Framework for the Extraction of Multimodal Features in Recommendation | The Web Conference | GitHub |
Ducho: A Unified Framework for the Extraction of Multimodal Features in Recommendation | ACM Multimedia | GitHub |
Side papers
Title | Paper | Code |
---|---|---|
Do We Really Need to Drop Items with Missing Modalities in Multimodal Recommendation? | CIKM | GitHub |
Reshaping Graph Recommendation with Edge Graph Collaborative Filtering and Customer Reviews | DL4SR@CIKM | GitHub |
Leveraging Content-Style Item Representation for Visual Recommendation | ECIR | GitHub |
A Study on the Relative Importance of Convolutional Neural Networks in Visually-Aware Recommender Systems | CVFAD@CVPR | GitHub |
If you attended the lecture, please leave a comment in this anonymous form. It only takes a few minutes, and your feedback would really help me improve the quality of the lecture! Thank you for your time and contribution!
None of the content of this lecture would have been possible without the great contributions of former and current collaborators and supervisors. I wish to thank them all!
Supervisors (former and current): Tommaso Di Noia (Poliba), Fragkiskos Malliaros (CentraleSupélec)
Collaborators (alphabetical order): Walter Anelli (Poliba), Matteo Attimonelli (Poliba), Giandomenico Cornacchia (IBM Europe), Danilo Danese (Poliba), Angela Di Fazio (Poliba), Antonio Ferrara (Poliba), Giuseppe Gassi (Poliba), Alberto Mancino (Poliba), Felice Merra (AWS GenAI), Claudio Pomo (Poliba), Emanuele Rossi (VantAI).