1 vote
1 answer
63 views

Algorithm to find the best file match when the filename is often a substring of the search term

I'm making a Chrome extension to find lyrics for songs on YouTube. I have the lyrics stored in files in the extension folder. The files are named Name of song.txt. What I'm trying to do is to match the search term ...
Milos Stojanovic's user avatar
0 votes
1 answer
66 views

Compare 2 large tables of objects by string similarity of one of their property in Javascript [closed]

I would like to compare two large tables of objects from two different databases: an array of 2700 objects and an array of 1800 objects. Each object is a scientific publication with 30 properties. The aim ...
Didier mac cormick's user avatar
0 votes
0 answers
75 views

Rapidfuzz critical error when using workers != 1

When using the latest version (3.9.6) of rapidfuzz, I get a critical error when using workers != 1, i.e. I cannot use multiprocessing to speed up string comparisons: Process finished with exit code -...
silence_of_the_lambdas's user avatar
-1 votes
1 answer
71 views

Fuzzy name matching with conditions

I want to do a fuzzy match between two dfs. The first df has a list of names with unique IDs and other data, e.g.: ID Name A B 0 12445 John Smith a b 1 23455 Jack Smith ...
nzskra's user avatar
  • 165
4 votes
0 answers
101 views

Performing a join between dataframes with fuzzy matching without iterrows?

I have looked around a bit but have not found a similar question; forgive me if I missed something. Using pandas, I am trying to write a function to merge two dataframes: df_ref and df_to_merge. ...
arounet's user avatar
  • 41
-1 votes
1 answer
88 views

Best way to match strings from different systems [closed]

Suppose that I have a list of strings like this (the real dataset is far larger and contains other data too): List<string> modelNames = [ "XC60 Momentum Standard T6", ...
Kaine's user avatar
  • 589
0 votes
1 answer
2k views

Grouping similar text?

I have a list of landowners, and whatever is highlighted shows a similar text string. These highlighted groupings are the same landowner but with a slightly different text string. I was thinking maybe ...
nick lanta's user avatar
0 votes
1 answer
198 views

Python or R Context aware fuzzy matching

I am trying to match two string columns containing food descriptions [foods1 and foods2]. I applied an algorithm weighting the word frequency so less frequent words have more weight but it fails as it ...
User981636's user avatar
  • 3,689
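
The weighting idea mentioned in the excerpt above (rarer words count for more) can be illustrated with a small sketch. This is not the asker's algorithm, which the excerpt does not show; it is an assumed variant that uses smoothed inverse document frequency and a weighted Jaccard overlap. The names `idf_weights` and `weighted_overlap` and the sample food strings are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def idf_weights(descriptions: List[str]) -> Dict[str, float]:
    """Assign each word a weight that grows as the word gets rarer."""
    n = len(descriptions)
    # Count in how many descriptions each lowercase word occurs.
    doc_freq = Counter(
        word for text in descriptions for word in set(text.lower().split())
    )
    # Smoothed inverse document frequency (an assumed formula, always >= 1).
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in doc_freq.items()}


def weighted_overlap(a: str, b: str, weights: Dict[str, float]) -> float:
    """Weighted Jaccard similarity of the word sets of `a` and `b`."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    if not union:
        return 0.0

    def weight(word: str) -> float:
        # Words never seen when building the weights get a neutral weight.
        return weights.get(word, 1.0)

    shared = sum(weight(w) for w in words_a & words_b)
    return shared / sum(weight(w) for w in union)


# Hypothetical sample data, not taken from the question.
foods = ["apple juice", "orange juice", "apple pie"]
weights = idf_weights(foods)
score = weighted_overlap("apple juice", "orange juice", weights)
```

Because the score depends only on shared words, it cannot tell that two different words describe the same food, which is one reason a purely frequency-based weighting may need extra context.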
-2 votes
1 answer
699 views

Fuzzy comparison of strings in lists of huge length (taking into account performance)

I have two lists: The first list I get from the database is the names of various companies (can be written in uppercase, lowercase or a combination) list_from_DB = ["Reebok", "MAZDA&...
Paul's user avatar
  • 287
0 votes
0 answers
81 views

String Match using Fuzzy Lookup in Excel

I am trying to use Fuzzy Lookup to match two strings in two columns of a table that looks like below. Table1 Table2 | Column A | Column B | | -------- | -------- | | Flower.com |...
Chitwan 's user avatar
0 votes
1 answer
52 views

Fuzzification of an Excel file using ranges from a txt with simpful

I want to fuzzify this Excel file using simpful: with these fuzzy rules: In this case, for example, I'd need the Excel file to be 'FIFTIES' if the age is between 50 and 59 and EVOL to be '10to15' if ...
SVP's user avatar
  • 54
0 votes
2 answers
74 views

Best way to Join 2 Tables with columns containing SIMILAR data

I am having trouble joining two tables together: the two columns have similar data but not exactly the same data. Example: Table 1: Column 1 = "Expect rain for todays weather" Table 2: Column 2 ...
Lesego Zim's user avatar
2 votes
1 answer
230 views

Is it possible to merge two tables in Power Query Editor (Power BI) with Python fuzzy matching?

Merge two tables in Power Query Editor (Power BI) based on string similarity with Python. Consider the tables below: Table1 Table1 Name ... Apple Fruit A11 ... Banana Fruit B12 ... ... ... Table2 ...
Gustavo Schettino's user avatar
1 vote
1 answer
81 views

Use adist to determine which element only needs deletions

I've got this vector of strings (y) and a single string (x) which I want to compare and see which y fits x best if only deletions are considered. x = "PCOR1" y = c("PCor", "...
smoff's user avatar
  • 660
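
The deletions-only comparison described in the excerpt above can be sketched in Python (this is not R's `adist`; it is an assumed equivalent for the special case where only deletions are allowed). A candidate fits if its characters appear in `x` in the same order, and the cost is the number of characters left over. The function names, the case-insensitive default, and every candidate other than "PCor" are hypothetical.

```python
from typing import List, Optional


def deletions_needed(source: str, target: str) -> Optional[int]:
    """Number of deletions that turn `source` into `target`, or None if
    deletions alone cannot produce `target`."""
    remaining = iter(source)
    # `ch in remaining` consumes the iterator up to the match, so this
    # checks that `target` is a subsequence of `source`.
    if all(ch in remaining for ch in target):
        return len(source) - len(target)
    return None


def best_by_deletions(
    x: str, candidates: List[str], ignore_case: bool = True
) -> Optional[str]:
    """Return the candidate reachable from `x` with the fewest deletions."""
    best, best_cost = None, None
    for cand in candidates:
        a, b = (x.lower(), cand.lower()) if ignore_case else (x, cand)
        cost = deletions_needed(a, b)
        if cost is not None and (best_cost is None or cost < best_cost):
            best, best_cost = cand, cost
    return best


# "PCor" comes from the excerpt; the other candidates are made up.
match = best_by_deletions("PCOR1", ["PCor", "PCOR12", "COR"])
```

Ignoring case is a design choice made here so that "PCor" can match "PCOR1"; with `ignore_case=False` the same pair would not match by deletions alone.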
0 votes
1 answer
749 views

Levenshtein on dataframe column and input list

I'm new to PySpark and I need to do a fuzzy match. I found that levenshtein is a native function which can do that. I have a dataframe like this: +----------------+----------------+ | col1| ...
curios's user avatar
  • 1
