Skip to main content
Finally come up with a DRY remove()
Source Link
greybeard
  • 7.7k
  • 3
  • 21
  • 56

(still in progress)

Follow the Style Guide for Python Code.
Add docstrings, at least for "everything to be used externally".
Add tests(doc)tests. Or, at least, a start for tinkering:

Python naming style is CapWords for classes, only: that would be new_node.
The left = None statement is there for the comment.
The paranoid prefer is None/is not None over just using what should be a reference in a boolean context. Above rendition tries to keep DRY and avoids some of what looks repetitious in the insert() you presented.
But, wait, doesn't that descend left or right thing look just the same in lookup() and remove()?
If I had a method

    def _find(self, value):
        """ Find node with value, if any.
            Return this node and setter for (re-)placing it in this tree.
        """
        pass

, insert() was simple, and lookup() trivial (if questionable: wording
• lookup: returns the value if it exists in the BST else returns None
, implementation returns a Node):

    def insert(self, value):
        """ Insert value into this tree.
            ToDo: document behaviour with identical keys
        """
        existing, place = self._find(value)
        if existing is not None:
            pass
        place(Node(value))
        return self
    
    def lookup(self, value):
        """ Return node with value, if any. """
        return self._find(value)[0]

Remove can profit from a modified successor method

    @staticmethod
    def leftmost(child, parent=None):
        """ Return the leftmost node of the tree rooted at child,
            and its parent, if any. """
        if child is not None:
            while child.left is not None:
                child, parent = child.left, child
        return child, parent
    
    def deleted(self, node):  # node None?
        """ Return root of a tree with node's value deleted. """
        successor, parent = BinarySearchTree.leftmost(node.right, node)
        if successor is None:
            return node.left
        node.value = successor.value
        if node is not parent:
            parent.left = successor.right
        else:
            node.right = successor.right
        return node
    
    def remove(self, value):
        """ Remove value.
            Return 'tree contained value'. """
        node, place = self._find(value)
        if node is None:
            return False
        place(self.deleted(node))
        return True

My current rendition of _find() adds in Node

    def _set_left(self, v):
        self.left = v
    
    def _set_right(self, v):
        self.right = v

, in BinarySearchTree

    def _set_root(self, v):
        self.root = v
    
    def _find(self, value):
        """ Find node with value, if any.
            Return this node and setter for (re-)placing it in this tree.
        """
        if self.root is None or self.root.value == value:
            return self.root, self._set_root
        child = self.root
        while True:
            node = child
            smaller = value < child.value
            child = child.left if smaller else child.right
            if child is None or child.value == value:
                return child, node._set_left if smaller else node._set_right

(TBC)

(still in progress)

Follow the Style Guide for Python Code.
Add docstrings, at least for "everything to be used externally".
Add tests. Or, at least, a start for tinkering:

Python naming style is CapWords for classes, only: that would be new_node.
The left = None statement is there for the comment.
The paranoid prefer is None/is not None over just using what should be a reference in a boolean context. Above rendition tries to keep DRY and avoids some of what looks repetitious in the insert() you presented.
But, wait, doesn't that descend left or right thing look just the same in lookup() and remove()?

(TBC)

Follow the Style Guide for Python Code.
Add docstrings, at least for "everything to be used externally".
Add (doc)tests. Or, at least, a start for tinkering:

Python naming style is CapWords for classes, only: that would be new_node.
The left = None statement is there for the comment.
The paranoid prefer is None/is not None over just using what should be a reference in a boolean context. Above rendition tries to keep DRY and avoids some of what looks repetitious in the insert() you presented.
But, wait, doesn't that descend left or right thing look just the same in lookup() and remove()?
If I had a method

    def _find(self, value):
        """ Find node with value, if any.
            Return this node and setter for (re-)placing it in this tree.
        """
        pass

, insert() was simple, and lookup() trivial (if questionable: wording
• lookup: returns the value if it exists in the BST else returns None
, implementation returns a Node):

    def insert(self, value):
        """ Insert value into this tree.
            ToDo: document behaviour with identical keys
        """
        existing, place = self._find(value)
        if existing is not None:
            pass
        place(Node(value))
        return self
    
    def lookup(self, value):
        """ Return node with value, if any. """
        return self._find(value)[0]

Remove can profit from a modified successor method

    @staticmethod
    def leftmost(child, parent=None):
        """ Return the leftmost node of the tree rooted at child,
            and its parent, if any. """
        if child is not None:
            while child.left is not None:
                child, parent = child.left, child
        return child, parent
    
    def deleted(self, node):  # node None?
        """ Return root of a tree with node's value deleted. """
        successor, parent = BinarySearchTree.leftmost(node.right, node)
        if successor is None:
            return node.left
        node.value = successor.value
        if node is not parent:
            parent.left = successor.right
        else:
            node.right = successor.right
        return node
    
    def remove(self, value):
        """ Remove value.
            Return 'tree contained value'. """
        node, place = self._find(value)
        if node is None:
            return False
        place(self.deleted(node))
        return True

My current rendition of _find() adds in Node

    def _set_left(self, v):
        self.left = v
    
    def _set_right(self, v):
        self.right = v

, in BinarySearchTree

    def _set_root(self, v):
        self.root = v
    
    def _find(self, value):
        """ Find node with value, if any.
            Return this node and setter for (re-)placing it in this tree.
        """
        if self.root is None or self.root.value == value:
            return self.root, self._set_root
        child = self.root
        while True:
            node = child
            smaller = value < child.value
            child = child.left if smaller else child.right
            if child is None or child.value == value:
                return child, node._set_left if smaller else node._set_right
iterable using generator `__iter__`s
Source Link
greybeard
  • 7.7k
  • 3
  • 21
  • 56

(still in progress)

Python naming style is CapWords for classes, only: that would be new_node.
The left = None statement is there for the comment.
The paranoid prefer is None/is not None over just using what should be a reference in a boolean context. Above rendition tries to keep DRY and avoids some of what looks repetitious in the insert() you presented.
But, wait, doesn't that descend left or right thing look just the same in lookup() and remove()?

Don't spend effort there before measurements suggesting it's a bottleneck.


Guide-line for easy-to-use types: Don't make me think.

You define inorder() to print the values in order.
Wouldn't it be cute to be able to use, for a tree ascending
for item in ascending: …? While it was possible to not touch Node, add

    def __iter__(self):
        """ Yield each node in the tree rooted here in order. """
        if self.left is not None:
            yield from self.left
        yield self.value
        if self.right is not None:
            yield from self.right

there, and in the tree

    def __iter__(self):
        """ Yield each node in the tree in order. """
        if self.root is not None:
            yield from self.root

, enabling print(*ascending)

Python naming style is CapWords for classes, only: that would be new_node.
The left = None statement is there for the comment.
The paranoid prefer is None/is not None over just using what should be a reference in a boolean context. Above rendition tries to keep DRY and avoids some of what looks repetitious in the insert() you presented.

Don't spend effort there before measurements suggesting it's a bottleneck.

(still in progress)

Python naming style is CapWords for classes, only: that would be new_node.
The left = None statement is there for the comment.
The paranoid prefer is None/is not None over just using what should be a reference in a boolean context. Above rendition tries to keep DRY and avoids some of what looks repetitious in the insert() you presented.
But, wait, doesn't that descend left or right thing look just the same in lookup() and remove()?

Don't spend effort there before measurements suggesting it's a bottleneck.


Guide-line for easy-to-use types: Don't make me think.

You define inorder() to print the values in order.
Wouldn't it be cute to be able to use, for a tree ascending
for item in ascending: …? While it was possible to not touch Node, add

    def __iter__(self):
        """ Yield each node in the tree rooted here in order. """
        if self.left is not None:
            yield from self.left
        yield self.value
        if self.right is not None:
            yield from self.right

there, and in the tree

    def __iter__(self):
        """ Yield each node in the tree in order. """
        if self.root is not None:
            yield from self.root

, enabling print(*ascending)

review insert()
Source Link
greybeard
  • 7.7k
  • 3
  • 21
  • 56

Keep classes and functions short.
Say, 24 lines for "heads", 48 for a function body, 96 for a class.

Where not impeding readability, keep nesting level low.
This includes early outs, continues, breaks & else in loops; returns

    def insert(self, value):
        """ Insert value into this tree.
            ToDo: document behaviour with identical keys
                  (what about lookup()?)
        """
        latest = Node(value)
        current = self.root
        previous = None     # the parent for latest, if any
        left = None         # the child latest is going to be in parent
        while current is not None:
            previous = current
            left = value < current.value
            current = current.left if left else current.right
        if previous is None:
            self.root = latest
        elif left:
            previous.left = latest
        else:
            previous.right = latest
        return self

(what an awful example for changing entirely too many things at once)

Python naming style is CapWords for classes, only: that would be new_node.
The left = None statement is there for the comment.
The paranoid prefer is None/is not None over just using what should be a reference in a boolean context. Above rendition tries to keep DRY and avoids some of what looks repetitious in the insert() you presented.


How can I improve the speed/efficiency?

Don't spend effort there before measurements suggesting it's a bottleneck.

(TBC)

(TBC)

Keep classes and functions short.
Say, 24 lines for "heads", 48 for a function body, 96 for a class.

Where not impeding readability, keep nesting level low.
This includes early outs, continues, breaks & else in loops; returns

    def insert(self, value):
        """ Insert value into this tree.
            ToDo: document behaviour with identical keys
                  (what about lookup()?)
        """
        latest = Node(value)
        current = self.root
        previous = None     # the parent for latest, if any
        left = None         # the child latest is going to be in parent
        while current is not None:
            previous = current
            left = value < current.value
            current = current.left if left else current.right
        if previous is None:
            self.root = latest
        elif left:
            previous.left = latest
        else:
            previous.right = latest
        return self

(what an awful example for changing entirely too many things at once)

Python naming style is CapWords for classes, only: that would be new_node.
The left = None statement is there for the comment.
The paranoid prefer is None/is not None over just using what should be a reference in a boolean context. Above rendition tries to keep DRY and avoids some of what looks repetitious in the insert() you presented.


How can I improve the speed/efficiency?

Don't spend effort there before measurements suggesting it's a bottleneck.

(TBC)

Source Link
greybeard
  • 7.7k
  • 3
  • 21
  • 56
Loading