Supervising topic models with Gaussian processes

Name: Supervising topic models with Gaussian processes
Author: Kandemir, Melih, Kekeç, T., Yeniterzi, Reyyan

İsim	Supervising topic models with Gaussian processes
Yazar	Kandemir, Melih, Kekeç, T., Yeniterzi, Reyyan
Basım Tarihi:	2018-05
Basım Yeri	- Elsevier
Konu	Latent Dirichlet allocation, Nonparametric Bayesian inference, Gaussian processes, Variational inference, Supervised topic models
Tür	Süreli Yayın
Dil	İngilizce
Dijital	Evet
Yazma	Hayır
Kütüphane:	Özyeğin Üniversitesi
Demirbaş Numarası	0031-3203
Kayıt Numarası	9071858b-d33c-49fc-b4b3-0df83d8d9446
Lokasyon	Computer Science
Tarih	2018-05
Notlar	Netherlands Organization for Scientific Research (NWO)
Örnek Metin	Topic modeling is a powerful approach for modeling data represented as high-dimensional histograms. While the high dimensionality of such input data is extremely beneficial in unsupervised applications including language modeling and text data exploration, it introduces difficulties in cases where class information is available to boost up prediction performance. Feeding such input directly to a classifier suffers from the curse of dimensionality. Performing dimensionality reduction and classification disjointly, on the other hand, cannot enjoy optimal performance due to information loss in the gap between these two steps unaware of each other. Existing supervised topic models introduced as a remedy to such scenarios have thus far incorporated only linear classifiers in order to keep inference tractable, causing a dramatical sacrifice from expressive power. In this paper, we propose the first Bayesian construction to perform topic modeling and non-linear classification jointly. We use the well-known Latent Dirichlet Allocation (LDA) for topic modeling and sparse Gaussian processes for non-linear classification. We combine these two components by a latent variable encoding the empirical topic distribution of each document in the corpus. We achieve a novel variational inference scheme by adapting ideas from the newly emerging deep Gaussian processes into the realm of topic modeling. We demonstrate that our model outperforms other existing approaches such as: (i) disjoint LDA and non-linear classification, (ii) joint LDA and linear classification, (iii) joint non-LDA linear subspace modeling and linear classification, and (iv) non-linear classification without topic modeling, in three benchmark data sets from two real-world applications: text categorization and image tagging.
DOI	10.1016/j.patcog.2017.12.019
Cilt	77

Kaynağa git Özyeğin Üniversitesi

Aramaya Dön

Özyeğin Üniversitesi

Kaynağa git

Supervising topic models with Gaussian processes

Yazar Kandemir, Melih, Kekeç, T., Yeniterzi, Reyyan

Basım Tarihi 2018-05

Basım Yeri - Elsevier

Konu Latent Dirichlet allocation, Nonparametric Bayesian inference, Gaussian processes, Variational inference, Supervised topic models

Tür Süreli Yayın

Dil İngilizce

Dijital Evet

Yazma Hayır

Kütüphane Özyeğin Üniversitesi

Demirbaş Numarası 0031-3203

Kayıt Numarası 9071858b-d33c-49fc-b4b3-0df83d8d9446

Lokasyon Computer Science

Tarih 2018-05

Notlar Netherlands Organization for Scientific Research (NWO)

Örnek Metin Topic modeling is a powerful approach for modeling data represented as high-dimensional histograms. While the high dimensionality of such input data is extremely beneficial in unsupervised applications including language modeling and text data exploration, it introduces difficulties in cases where class information is available to boost up prediction performance. Feeding such input directly to a classifier suffers from the curse of dimensionality. Performing dimensionality reduction and classification disjointly, on the other hand, cannot enjoy optimal performance due to information loss in the gap between these two steps unaware of each other. Existing supervised topic models introduced as a remedy to such scenarios have thus far incorporated only linear classifiers in order to keep inference tractable, causing a dramatical sacrifice from expressive power. In this paper, we propose the first Bayesian construction to perform topic modeling and non-linear classification jointly. We use the well-known Latent Dirichlet Allocation (LDA) for topic modeling and sparse Gaussian processes for non-linear classification. We combine these two components by a latent variable encoding the empirical topic distribution of each document in the corpus. We achieve a novel variational inference scheme by adapting ideas from the newly emerging deep Gaussian processes into the realm of topic modeling. We demonstrate that our model outperforms other existing approaches such as: (i) disjoint LDA and non-linear classification, (ii) joint LDA and linear classification, (iii) joint non-LDA linear subspace modeling and linear classification, and (iv) non-linear classification without topic modeling, in three benchmark data sets from two real-world applications: text categorization and image tagging.

DOI 10.1016/j.patcog.2017.12.019

Cilt 77