Package 'gsdmm' reference manual

Title:	GSDMM Short Text Clustering via Dirichlet Mixture Models
Description:	This package implements a Dirichlet Mixture Model and the accompanying Gibbs sampler for short text clustering, as proposed by Yin and Wang (2014).
Authors:	Till Tietz [aut, cre], Akiru Kato [ctb]
Maintainer:	Till Tietz <[email protected]>
License:	MIT + file LICENSE
Version:	0.0.6
Built:	2025-03-18 13:46:07 UTC
Source:	https://github.com/paithiov909/gsdmm

gsdmm: GSDMM Short Text Clustering via Dirichlet Mixture Models

Description

This package implements a Dirichlet Mixture Model and the accompanying Gibbs sampler for short text clustering, as proposed by Yin and Wang (2014).

Author(s)

Maintainer: Till Tietz [email protected]

Other contributors:

Akiru Kato [contributor]

Fit Dirichlet Mixture Model

Description

Fits a Dirichlet Mixture Model to a collection of short texts using a Gibbs sampler.

Usage

gsdmm(
  texts,
  n_iter = 30L,
  n_clust = 8L,
  alpha = 0.1,
  beta = 0.1,
  progress = TRUE
)

Arguments

`texts`	a list of character vectors
`n_iter`	integer number of iterations to run the Gibbs sampler for.
`n_clust`	integer upper bound on the number of clusters. The number of clusters returned will be less than or equal to `n_clust`.
`alpha`	double governing the probability of assigning a text to a currently empty cluster (a larger `alpha` means a higher probability).
`beta`	double governing the tradeoff between cluster size and fit. Smaller values make clustering more sensitive to the congruence between cluster-word and document-word distributions, while larger values make it more sensitive to cluster size.
`progress`	logical indicating whether to print a progress bar.
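
The roles of `alpha` and `beta` are easiest to see in the conditional probability that the collapsed Gibbs sampler of Yin and Wang (2014) uses to reassign a text d to a cluster z. The block below restates that formula in the paper's notation. None of the symbols are identifiers from this package, and it is an assumption that the package's sampler follows the formula term for term. Here K corresponds to `n_clust`, D is the number of texts, V is the vocabulary size, m_{z,¬d} is the number of texts in cluster z excluding d, n^w_{z,¬d} is the count of word w in cluster z excluding d, n_{z,¬d} is the total word count of cluster z excluding d, N_d is the length of d, and N^w_d is the count of w in d.

```latex
p(z_d = z \mid \vec{z}_{\neg d}, \vec{d}) \;\propto\;
\frac{m_{z,\neg d} + \alpha}{D - 1 + K\alpha}
\cdot
\frac{\prod_{w \in d} \prod_{j=1}^{N_d^{w}} \left( n_{z,\neg d}^{w} + \beta + j - 1 \right)}
     {\prod_{i=1}^{N_d} \left( n_{z,\neg d} + V\beta + i - 1 \right)}
```

For an empty cluster, m_{z,¬d} = 0 and the first factor reduces to α / (D − 1 + Kα), so a larger `alpha` raises the chance that a text moves to an empty cluster. The second factor measures how well the words of d match the words already in z; as `beta` grows, the observed word counts matter less in that factor, so the cluster-size term carries relatively more weight.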

Value

A list containing an integer vector of cluster assignments and a word-cluster matrix.
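
A minimal usage sketch follows, using the signature shown under Usage. The toy corpus and the whitespace tokenization are illustrative assumptions, including the assumption that each element of `texts` is one text split into word tokens. The call is guarded so that the snippet still runs when the package is not installed, and the result is inspected with `str()` rather than by element name, since the names of the returned list elements are not documented here.

```r
# Toy corpus (hypothetical data): four short texts on two rough topics.
docs <- c(
  "cheap flights to tokyo",
  "tokyo hotel deals",
  "python list comprehension",
  "sort a list in python"
)

# Build a list of character vectors, one vector of tokens per text.
# Assumption: this tokenized form is what `texts` expects.
texts <- strsplit(tolower(docs), " ", fixed = TRUE)

# Only call the package if it is available.
if (requireNamespace("gsdmm", quietly = TRUE)) {
  fit <- gsdmm::gsdmm(
    texts,
    n_iter = 30L,     # number of Gibbs sampler iterations
    n_clust = 4L,     # upper bound on the number of clusters
    alpha = 0.1,
    beta = 0.1,
    progress = FALSE  # suppress the progress bar
  )
  # Inspect the structure of the returned list instead of assuming names.
  str(fit)
}
```

Because `n_clust` is only an upper bound, the fitted model may use fewer than four clusters on a corpus this small.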
