Newest 'topic-modeling' Questions

-1 votes

1 answer

76 views

Unsupervised Topic Modeling for Short Event Descriptions

I have a dataset of approximately 750 lines, each containing a short text (fewer than 150 words). These are all event descriptions related to a single broad topic (which I cannot specify for ...

Arthur GONAY

9

asked Apr 16 at 11:17

0 votes

1 answer

90 views

MiniBatchKMeans BERTopic not returning topics for half of data

I am trying to topic-model a dataset of around 50 million tweets. Unfortunately, such a large dataset will not fit in RAM (even 128 GB) because of the embeddings. Therefore, I have been working on ...

Matthieu B

17

asked Feb 18 at 17:42

0 votes

0 answers

35 views

Calculating Topic Correlations or Co-occurrences for keyATM

I have been working extensively with the keyATM package, but unfortunately it offers no way to calculate topic correlations and co-occurrences once the model has been fitted. I already ...

dpaltra22

1

asked Feb 4 at 20:17

0 votes

1 answer

100 views

Correct topics from LDA Sequence Model in Gensim

Python's Gensim package offers a dynamic topic model called LdaSeqModel(). I have run into the same problem as in this issue from the Gensim mailing list (which has not been solved). The problem is ...

hyco

221

asked Jan 21 at 16:24

1 vote

1 answer

140 views

Inspect all probabilities of BERTopic model

Say I build a BERTopic model using `from bertopic import BERTopic`, `topic_model = BERTopic(n_gram_range=(1, 1), nr_topics=20)`, and `topics, probs = topic_model.fit_transform(docs)`. Inspecting `probs` gives me ...

coolhand

2,109

asked Dec 20, 2024 at 20:49

0 votes

0 answers

41 views

Importing util library failed

I ran pip install bertopic to install and use the BERTopic model. Here is my code: from bertopic import BERTopic topic_model = BERTopic.load("MaartenGr/BERTopic_Wikipedia&...

user4356954

asked Dec 11, 2024 at 20:12

0 votes

0 answers

85 views

Unhashable type when calling HuggingFace topic model `topic_labels_` function

If I try to follow the topic modeling tutorial at https://huggingface.co/docs/hub/en/bertopic, the first few lines give me an error: from bertopic import BERTopic topic_model = BERTopic.load("...

coolhand

2,109

asked Dec 9, 2024 at 16:59

0 votes

0 answers

25 views

PackagesNotFound error even when packages are verified as installed

I am trying to follow this tutorial for BERT topic modeling: https://jpcompartir.github.io/BertopicR/ library(reticulate) reticulate::install_miniconda() library(BertopicR) BertopicR::...

coolhand

2,109

asked Dec 9, 2024 at 3:06

0 votes

0 answers

53 views

Topic modelling outputs are gender biased?

Has anyone had this issue? My topic modelling seems to be presenting responses that are heavily dominated by male respondents. The volume of responses across three different questions is over 800 in each ...

GrBrn

3

asked Oct 29, 2024 at 11:32

0 votes

1 answer

62 views

Stopwords problem in text data preprocessing in Python

I want to do topic modeling in Python. For this reason, I used my own stop word list, a stop word list I found on GitHub, and nltk's stop word list to clean the stopwords. However, when I examined the ...

deniz

11

asked Oct 20, 2024 at 13:10

0 votes

0 answers

41 views

Cannot find AIC/BIC of my topic modelling after using "lda.collapsed.gibbs.sampler" in LDA package

I have used "lda.collapsed.gibbs.sampler" to do my topic modelling and LDA visualisation, and now I want to determine which number of topics (K) best fits my data. Then I tried to use AIC/...

Pang kalok

19

asked Oct 7, 2024 at 19:55

4 votes

1 answer

469 views

Topic modelling many documents with low memory overhead

I've been working on a topic modelling project using BERTopic 0.16.3, and the preliminary results were promising. However, as the project progressed and the requirements became apparent, I ran into a ...

Bbrk24

1,033

asked Sep 24, 2024 at 21:57

0 votes

1 answer

45 views

How to extract terms and probabilities from tmResult$terms in topic modeling?

I would like to create a separate word cloud for each of the 8 topics in my LDA model. I extracted the top 40 words for each of the 8 topics, giving an object of length 320 that contains the top words and their occurrence probabilities. I ...

NoaMi

41

asked Aug 8, 2024 at 12:16

0 votes

1 answer

100 views

How is coherence score calculated in Mallet?

I understand how the diagnostics output shows the coherence values for each topic, but my values range between -150 and -600, while other posts I have seen where Mallet was used show coherence ...

Glorifier

31

asked Jul 6, 2024 at 14:33

0 votes

1 answer

65 views

Inconsistent Results When Running Python Mallet/Gibbs Sampling as a Soft-Clustering Method to Identify Optimal Number of Topics

Sorry, but I am inexperienced with Mallet and could use some help. I am currently trying to use Mallet as a soft-clustering technique to assign group membership for a given set of terms contained ...

A Bolton

1

asked Jun 24, 2024 at 0:15
