3

I have a method i wrote to find the duplicates in a List. It works fine but im concerned about the complexity of using containsKey. When we use containsKey we have to compute a hash function for every key and then compare against each with our search item, right ? So wouldn't the complexity be O(n) ?

Here is the function:

public void findDup(List<String> list){

    HashMap<String,Integer> map = new HashMap<>();
    int pos=0;
    for(String s: list){
        if(map.containsKey(s)){
            Log.v("myapp","duplicate found:"+s);
        }
        else
            map.put(s,pos);
        pos++;
    }
}

and to call it i do this:

List<String>list=new ArrayList<>();

    for(int i=0;i<12;i++)
        list.add(i+"");

    //these numbers should surely be duplicates
    list.add("3");list.add("6");

    findDup(list);

//output will be 3 and 6 clearly.

update: i re-wrote the function to just use a set which makes more sense:

public void findDup(List<Integer> list){

        HashSet<Integer> set = new HashSet<>();
        for(Integer num: list){
            if(!set.add(num)){
                Log.v("myapp","duplicate found:"+num);
            }

        }
    }
0

4 Answers 4

7

It is specified in the Javadoc as O(1).

The complexity of your algorithm is therefore O(N).

But it would be that even without the containsKey() call, which is actually unnecessary. All you have to do is test whether put() returns a non-null value, which indicates a duplicate.

When we use containsKey we have to compute a hash function for every key and then compare against each with our search item, right?

Wrong. We compute the hash value of the search key and check if that bucket is occupied by an equal key.

So wouldn't the complexity be O(n) ?

No.

Sign up to request clarification or add additional context in comments.

Comments

3

So wouldn't the complexity be O(n)?

Yes, the complexity for the entire list is going to be O(n). However, you do not need to use HashMap<K,V> because HashSet<Key> would be sufficient for finding duplicates:

Set<String> seen = new HashSet<>();
for(String s: list){
    if(!seen.add(s)){
        Log.v("myapp","duplicate found: "+s);
    }
}

Comments

3

The answer is specified in the documentation.

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among >the buckets.

Since containsKey() is just a get() that throws away the retrieved value, it's O(1) (assuming the hash function works properly).

FYI containsValue is O(n), because without the key it doesn't know where it is and the algorithm has to go over all the values stored in the map.

Comments

1

When we use containsKey we have to compute a hash function for every key and then compare against each with our search item, right ?

Wrong.

When the hash function gives a value, this value is used to index into the table directly. There is no further comparisons, unless collisions occur in which case it may or may not even compare depending on the collision resolution method that is used.

Since you are simply checking for duplicates, a hashset is a better data structure to use

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.