Newest 'jaccard-similarity' Questions

0 votes

1 answer

97 views

Compute Jaccard Index for two similar but not equal shapefiles in R [closed]

I have two distinct shapefiles that have a high degree of overlap, but aren't the same. I want to make a comparison and one of the things I would like to generate is the Jaccard Index of regions ...

BLP92

345

asked Nov 17 at 10:58

0 votes

0 answers

48 views

Evaluating Fuzzy clustering quality

Initially, I performed k-means clustering and obtained some meaningful clusters. To refine these clusters, I ran fuzzy c-means on the k-means centers using the "e1071" package. Are there any ...

Mary

221

asked Feb 6 at 15:17

0 votes

0 answers

58 views

How to optimize PySpark code to calculate Jaccard Similarity for a huge dataset

I have a huge PySpark dataframe that contains 250 million rows, with columns ItemA and ItemB. I'm trying to calculate the Jaccard Similarity M_ij in a way that runs efficiently and takes a short amount of ...

Rayne

15.2k

asked Nov 14, 2024 at 1:27

2 votes

0 answers

59 views

What's going wrong in these weighted jaccard sum calculations for comparing the pronunciation of consonant clusters? [closed]

Context: I have this code for my attempt to create a "similarity mapping" between consonants (or consonant clusters), to the same set of consonants/clusters (basically a cross product mapping)...

Lance Pollard

80.5k

asked Oct 18, 2024 at 14:47

0 votes

3 answers

88 views

R: calculating a distance matrix between two lists of strings

Please consider the reprex at the end of the post. I have two lists of dataframes. Each dataframe has a $keyword column, which is a vector of text. I am looking for a computationally efficient way to ...

larry77

1,543

asked Jun 25, 2024 at 12:18
