Questions tagged [edit-distance]
A class of problems concerned with finding or minimizing the sequence of change operations (such as insertion, deletion, substitution, or transposition) to convert one list or string into another
119 questions
3
votes
1
answer
266
views
Damerau-Levenshtein distance performance improvements
I am working on a Python C-extension to calculate Damerau-Levenshtein distance. I am not really familiar with C at all--I just know that it generally has better performance. However, I am not sure how ...
3
votes
2
answers
329
views
Find pairs of similar (by hamming distance) bit strings
I have a list of binary strings where each binary string max length is 15. I need to find list of integers which is count of similar (should differ by max 1 bit position) binary strings present in the ...
2
votes
0
answers
114
views
Simple diff utility in Java using OOP take 2
(See the previous iteration here.)
After further revising my code, I have:
com.github.coderodde.diff.Diff.java:
...
5
votes
2
answers
2k
views
Find all pairs of words that differ by one letter
I have a list of words in English. My aim is to find all pairs of words that differ by exactly one letter (i.e. edit distance is 1). For instance: PAY-PLAY, WARM-ARM, WORD-WORK.
The naive algorithm is ...
2
votes
1
answer
160
views
Simple diff utility in Java using OOP
(Previous and initial iteration lives here.)
(The next iteration is here.)
Now I was able to spare some lines by putting diff related state in an object. Also, I relied on ...
6
votes
2
answers
1k
views
Simple diff utility in Java
(See the next and second iteration here.)
I have this toy implementation of the diff utility in Java:
...
0
votes
2
answers
180
views
New String similarity algorithm (java)
Recently I have been developing a library for Java which provides utility functions for arrays, strings, etc.. While researching string similarity algorithms, I managed to write one of my own. I am ...
3
votes
0
answers
70
views
Get shortest grouped (distinct) string from Hashset of strings
Well, I just have a lot of strings that have common parts between them and not all of them. By this reason, I wanted to group them by their longest common part and take from that the minimum from each ...
2
votes
0
answers
52
views
Loop through the nearest switch and deletion neighbours of a Boolean assignment
My goal is to look for nearest neighbours in a Boolean-assigned mapping (a Python dictionary). For this purpose I'm looping through the candidates, starting from those with the closest distance. The ...
6
votes
2
answers
2k
views
Hamming distance between two strings
I wrote this module to find the Hamming distance between two strings. (It's a problem from exercism.io's Haskell track.)
As I saw it, the problem has two distinct ...
5
votes
1
answer
124
views
calculating upper bound on normalized weighted levenshtein distance
Current implementation
I am using a normalized weighted Levenshtein distance for two utf32 strings with the following costs (insertion: 1, deletion: 1, replacement: 2). The normalization is performed ...
2
votes
1
answer
760
views
An implementation of Levenshtein Distance algorithm in modern C++
Below is an implementation of Levenshtein Distance algorithm.
I am trying to use modern C++ features as much as I can, i.e. auto, no pointer / raw memory but I feel like it is a constant struggle.
...
1
vote
1
answer
202
views
"Edit Distance" algorithm
I have a piece of code that calculates the edit distance between words and works, but it's apparently not fast enough.
ClosestWords.java:
...
1
vote
1
answer
886
views
Vectorize pairwise edit distance computation [closed]
How can I vectorize this function? It takes an array of array of tuples as input and computes the distance between elements using Levenshtein's method. I found some simpler examples but they weren't ...
1
vote
1
answer
101
views
Filter string with min distance in r
I have the following sample data:
target <- "Stackoverflow"
candidates <- c("Stackflow", "Stackflow", "Stckoverfow")
I would like to filter the string from ...