32

I can't find any example of deleting documents from Elasticsearch in Python. All I've found so far are the definitions of the delete and delete_by_query functions, but the documentation doesn't give even a tiny example of how to use them. A bare list of parameters doesn't tell me much if I don't know how to pass them correctly in the function call. So, let's say I've just inserted one new doc like so:

doc = {'name':'Jacobian'}
db.index(index="reestr",doc_type="some_type",body=doc)

Does anyone know how I can now delete this document using delete and delete_by_query?

4 Answers

58

Since you don't pass a document id when indexing your document, you have to take the auto-generated id from the return value and delete by that id. Alternatively, you can define the id yourself:

 db.index(index="reestr",doc_type="some_type",id=1919, body=doc)

 db.delete(index="reestr",doc_type="some_type",id=1919)

Otherwise, you need to read the id from the return value:

 r = db.index(index="reestr",doc_type="some_type", body=doc)
 # r = {u'_type': u'some_type', u'_id': u'AU36zuFq-fzpr_HkJSkT', u'created': True, u'_version': 1, u'_index': u'reestr'}

 db.delete(index="reestr",doc_type="some_type",id=r['_id'])

Here is an example for delete_by_query. Suppose you have added several documents with name='Jacobian'; the following deletes all of them. Note that q takes a Lucene query string, not a dict:

 db.delete_by_query(index='reestr',doc_type='some_type', q='name:Jacobian')
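
For client versions that take a request body rather than `q` (as a comment below points out), the call could be wrapped roughly like this. It is only a sketch: `build_delete_query` and `delete_matching` are hypothetical helper names, and `client` is assumed to be an `elasticsearch.Elasticsearch` instance whose `delete_by_query` accepts `index` and `body` keyword arguments (some newer releases may favour a `query` keyword instead). A `match` query is used here instead of `term` because `term` does not analyze its input and may miss values stored in an analyzed text field.

```python
def build_delete_query(field, value):
    """Build a request body matching documents whose `field` matches `value`."""
    return {"query": {"match": {field: value}}}


def delete_matching(client, index, field, value):
    """Delete every matching document and return how many were removed.

    Assumption: `client.delete_by_query` accepts `index` and `body`
    keyword arguments and returns a mapping with a "deleted" count.
    """
    response = client.delete_by_query(
        index=index, body=build_delete_query(field, value))
    return response["deleted"]
```

With the question's data this would be called as `delete_matching(db, "reestr", "name", "Jacobian")`. Keep in mind that `match` also hits documents where the field merely contains the word.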

6 Comments

Can you, please, elaborate a bit more? Especially, I'm interested in the case when I do not know id (id=1919 in your example) beforehand. I ask this because I need to do bulk deletion.
And, as I asked, an example of delete_by_query will also be greatly appreciated by me and other newbies.
Two tiny examples of how to delete all docs where doc_type="some_type" and where doc.name = "John" will be worth +100500 of reputation =)
Then you need to take a look into the return value from .index() method. I have updated the answer to cover this case. It's quite simple, r['_id'] is the auto-generated id number you're looking for. :)
Instead of the "q" parameter, you now have to provide the "body" parameter: body={ 'query': { 'term': { 'name': 'Jacobian' } } }
13

The Delete-By-Query API was removed from the ES core in version 2 for several reasons. This function became a plugin. You can look for more details here:

Why Delete-By-Query is a plugin

Delete By Query Plugin

Because I didn't want to add another dependency (I need this to run in a Docker image later), I wrote my own function to solve this problem. My solution is to search for all documents with the specified index and type, and then remove them using the Bulk API:

import elasticsearch


def delete_es_type(es, index, type_):
    """Delete every document of type `type_` from `index` via the Bulk API."""
    try:
        count = es.count(index=index, doc_type=type_)['count']
        response = es.search(
            index=index,
            filter_path=["hits.hits._id"],
            body={"size": count, "query": {"filtered": {"filter": {
                  "type": {"value": type_}}}}})
        # With filter_path, "hits" may be absent when nothing matches
        ids = [x["_id"] for x in response.get("hits", {}).get("hits", [])]
        if not ids:
            return  # nothing to delete
        bulk_body = [
            '{{"delete": {{"_index": "{}", "_type": "{}", "_id": "{}"}}}}'
            .format(index, type_, x) for x in ids]
        # The Bulk API expects a newline after the last action as well
        es.bulk('\n'.join(bulk_body) + '\n')
        # es.indices.flush_synced([index])
    except elasticsearch.exceptions.TransportError as ex:
        print("Elasticsearch error: " + str(ex.error))
        raise
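
If the ids can contain quotes or other special characters, formatting the bulk lines by hand can produce invalid JSON. A slightly safer way to build the same payload might look like this. It is a sketch, not part of the original answer: `build_bulk_delete_body` is a hypothetical helper, and `doc_type` is optional on the assumption that newer clusters no longer use mapping types.

```python
import json


def build_bulk_delete_body(index, ids, doc_type=None):
    """Return an NDJSON string with one Bulk API delete action per id."""
    lines = []
    for doc_id in ids:
        meta = {"_index": index, "_id": doc_id}
        if doc_type is not None:
            # Only older clusters still use mapping types.
            meta["_type"] = doc_type
        # json.dumps takes care of escaping quotes inside the id.
        lines.append(json.dumps({"delete": meta}))
    # Every action line, including the last one, ends with a newline.
    return "".join(line + "\n" for line in lines)
```

The result could then be passed to `es.bulk(...)` in place of the `'\n'.join(bulk_body)` string above.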

I hope that helps future googlers ;)

1 Comment

Looks like delete by query has been reinstated, and the plugin has now been killed off. elastic.co/guide/en/elasticsearch/reference/5.6/…
5

One can also do something like this:

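from pprint import pprint
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local node; adjust for your cluster

def delete_by_ids(index, ids):
    query = {"query": {"terms": {"_id": ids}}}
    res = es.delete_by_query(index=index, body=query)
    pprint(res)

# Pass the index and the list of ids that you want to delete.
delete_by_ids('my_index', ['test1', 'test2', 'test3'])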

This deletes all of the listed documents in a single request.
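
For very long id lists, a single terms query may hit the cluster's limit on the number of terms (as far as I know, the `index.max_terms_count` setting, 65,536 by default in recent versions). One way around that is to split the list into batches. This is a sketch with hypothetical helper names (`chunked`, `delete_ids_in_batches`), and it assumes `client.delete_by_query` accepts `index` and `body` and returns a mapping with a "deleted" count.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_ids_in_batches(client, index, ids, batch_size=1000):
    """Run one delete_by_query per batch of ids; return the total deleted."""
    deleted = 0
    for batch in chunked(list(ids), batch_size):
        body = {"query": {"terms": {"_id": batch}}}
        response = client.delete_by_query(index=index, body=body)
        deleted += response["deleted"]
    return deleted
```

The batch size of 1000 is an arbitrary default; pick whatever suits your cluster.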


0

I came across this post while searching for a way to delete a document on ElasticSearch using their Python library, ElasticSearch-DSL.

In case it helps anyone, this part of their documentation describes the document lifecycle. https://elasticsearch-dsl.readthedocs.io/en/latest/persistence.html#document-life-cycle

And at the end of it, it details how to delete a document:

To delete a document just call its delete method:

first = Post.get(id=42)
first.delete()

Hope that helps 🤞

