I have data containing a mix of numeric and categorical values, and I used k-prototypes (from the kmodes package) to cluster it.

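from kmodes import kprototypes

init = 'Huang'
n_clusters = 50
max_iter = 100
verbose = 1  # assumed value; not defined in the original snippet

kproto = kprototypes.KPrototypes(n_clusters=n_clusters, init=init, n_init=5, max_iter=max_iter, verbose=verbose)

# categorical is the list of column indices that hold categorical values
clusters = kproto.fit_predict(data_cats_matrix, categorical=categoricals_indicies)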

When I run the last line, I get the following error:

ValueError: Clustering algorithm could not initialize. Consider assigning the initial clusters manually.

  • Maybe there are more clusters (50) than data points? Does it work with 2 clusters? Commented Apr 20, 2017 at 8:25
  • Thank you so much, it worked. Commented Apr 20, 2017 at 9:17

1 Answer

Your data might not warrant such a large number of clusters.
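A quick sanity check, sketched here under assumptions: count the distinct rows and compare that with the requested number of clusters. If `n_clusters` is larger, some clusters are bound to stay empty, which is a plausible cause of the initialization error. The helper name `count_distinct_rows` and the sample rows are hypothetical, not taken from the question's data.

```python
def count_distinct_rows(rows):
    """Return the number of distinct rows in a 2-D sequence of hashable values."""
    return len({tuple(row) for row in rows})


# Hypothetical mixed numeric/categorical sample, not the asker's data.
sample = [
    [1.0, "a"],
    [1.0, "a"],  # duplicate of the first row
    [2.5, "b"],
    [3.0, "a"],
]

n_clusters = 50
n_distinct = count_distinct_rows(sample)
if n_clusters > n_distinct:
    print(f"n_clusters={n_clusters} exceeds {n_distinct} distinct rows; try a smaller value")
```

The same function should work on a NumPy array such as `data_cats_matrix`, since each row converts to a tuple of hashable scalars.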

Run the algorithm for several smaller values of k and note the total cost at the end of each run. Once the cost stops decreasing noticeably, there is no need to increase k further. This is known as the elbow method.
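The elbow method can be sketched roughly as follows. This is only an illustration: the helper names `cost_curve` and `pick_elbow`, the 10% relative-drop threshold, and the synthetic cost values are all assumptions, and the kmodes wiring is shown as a comment because it depends on the question's own variables.

```python
def cost_curve(fit_cost, k_values):
    """Return [(k, cost)] for each k, skipping values where fitting fails."""
    curve = []
    for k in k_values:
        try:
            curve.append((k, fit_cost(k)))
        except ValueError:
            # e.g. "Clustering algorithm could not initialize" for too-large k
            continue
    return curve


def pick_elbow(curve, min_rel_drop=0.1):
    """Return the first k after which the cost falls by less than min_rel_drop."""
    if not curve:
        raise ValueError("no successful fits to choose from")
    for (k, cost), (_, next_cost) in zip(curve, curve[1:]):
        if cost <= 0 or (cost - next_cost) / cost < min_rel_drop:
            return k
    return curve[-1][0]


# Hypothetical wiring with kmodes (not run here; variable names from the question):
# def fit_cost(k):
#     model = kprototypes.KPrototypes(n_clusters=k, init='Huang', n_init=5)
#     model.fit(data_cats_matrix, categorical=categoricals_indicies)
#     return model.cost_

# Synthetic costs, for illustration only.
demo_curve = [(2, 100.0), (3, 60.0), (4, 40.0), (5, 38.0), (6, 37.0)]
print(pick_elbow(demo_curve))  # prints 4
```

A fixed threshold is a crude stand-in for eyeballing the curve; plotting cost against k and looking for the bend is the more common practice.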
