I'm working on a data manipulation program, and I'm almost done. I've got the duplicate detection algorithm working, and I have a list of all items including duplicates, and a list of duplicates.
I want to go through the list of items, filter it against the duplicates list, and remove all but the first of each set of duplicates. I've tried it like this, but it doesn't actually remove the records from the array:
const removeDupes = (list, dupes) => {
  list.forEach(listItem => {
    let filtered = dupes.filter(x =>
      (x.item1.externalId === listItem.externalId) ||
      (x.item2.externalId === listItem.externalId));
    if (filtered.length > 0) {
      for (let i = 1; i < filtered.length; i++) {
        list.splice(list.indexOf(filtered[i]));
      }
    }
  });
  return list;
}
Please keep in mind that list and dupes have slightly different schemas. list is simply an array of objects with an ID field called externalId, and dupes is an array of objects with this schema:
[{
item1: {schema from list},
item2: {schema from list},
...}]
They're not exact duplicates; they're more like records from different databases with different schemas that have been reformatted to the same schema.
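
Two things stand out in the attempt above. First, `filtered` holds entries from dupes, not from list, so `list.indexOf(filtered[i])` returns -1. Second, the inner loop starts at 1, so it never runs when an item appears in only one pair. One possible alternative is sketched below: group the IDs linked by the pairs, then build a new array that keeps the first record of each group instead of splicing during iteration. This is only a sketch under assumptions not stated in the question: every record has a unique externalId, each dupes entry links exactly two records, and "first" means first in list order. The `items` and `pairs` data are made up for illustration.

```javascript
// Sketch: returns a new array keeping only the first record of each duplicate group.
// Assumes externalId is unique per record and each dupes entry has item1 and item2.
const removeDupes = (list, dupes) => {
  // parent maps each externalId to another id in the same duplicate group.
  const parent = new Map();
  const find = (id) => {
    if (!parent.has(id)) parent.set(id, id);
    let root = id;
    while (parent.get(root) !== root) root = parent.get(root);
    return root;
  };

  // Merge the two ids of every duplicate pair into one group.
  dupes.forEach(({ item1, item2 }) => {
    parent.set(find(item1.externalId), find(item2.externalId));
  });

  // Keep a record only if no earlier record belongs to the same group.
  const seen = new Set();
  return list.filter((item) => {
    const group = find(item.externalId);
    if (seen.has(group)) return false;
    seen.add(group);
    return true;
  });
};

// Hypothetical sample data: "a" and "c" are flagged as duplicates of each other.
const items = [{ externalId: "a" }, { externalId: "b" }, { externalId: "c" }, { externalId: "d" }];
const pairs = [{ item1: { externalId: "a" }, item2: { externalId: "c" } }];
console.log(removeDupes(items, pairs).map((x) => x.externalId)); // ["a", "b", "d"]
```

Because this uses `filter`, the original list is left untouched and the caller has to use the returned array. If list must be mutated in place, the result could be copied back into it afterwards.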