Towards a multi-modal Deep Learning Architecture for User Modeling




Deep learning has succeeded in various applications, including image classification and feature learning. However, there needs to be more research on its use in Intelligent Tutoring Systems or Serious Games, particularly in modeling user behavior during learning or gaming sessions using multi-modal data. Creating an effective user model is crucial for developing a highly adaptive system. To achieve this, it is necessary to consider all available data sources to inform the user’s current state. This study proposes a user-sensitive deep multi-modal architecture that leverages deep learning and user data to extract a rich latent representation of the user. The architecture combines a Long Short-Term Memory, a Convolutional Neural Network, and multiple Deep Neu-
ral Networks to handle the multi-modality of data. The resulting model was evaluated on a public multi-modal dataset, achieving better results than state-of-the-art algorithms for a similar task: opinion polarity detection. These findings suggest that the latent representation learned from the data is useful in discriminating behaviors. This proposed solution can be applied in various contexts where user modeling using multi-modal data is critical for improving the user experience.




How to Cite

Tato, A., & Nkambou, R. (2023). Towards a multi-modal Deep Learning Architecture for User Modeling. The International FLAIRS Conference Proceedings, 36(1).



Main Track Proceedings