1
\$\begingroup\$

I've written a python function to split a netmask or an ip address. The code is about to become part of a custom router api.

def split_ip_or_netmask(ip_address: str) -> list:
    """ Splits an IP-address or network mask into the parts between dots. 
        Returns a list of parts. """
    
    if not isinstance(ip_address, str):
        raise ValueError("IP must be specified as string")

    ip_split = ip_address.split(".")

    if not len(ip_split) == 4:
        raise ValueError("Given string is not IP address.")

    return ip_split

Sample output:

split_ip_or_netmask(100)
Traceback (most recent call last):
  File "/home/user/pycharm-2023.2.1/plugins/python/helpers/pydev/pydevconsole.py", line 364, in runcode
    coro = func()
  File "<input>", line 1, in <module>
  File "<input>", line 3, in split_ip_or_netmask
ValueError: IP must be specified as string
split_ip_or_netmask("")
Traceback (most recent call last):
  File "/home/user/pycharm-2023.2.1/plugins/python/helaradigm drypers/pydev/pydevconsole.py", line 364, in runcode
    coro = func()
  File "<input>", line 1, in <module>
  File "<input>", line 6, in split_ip_or_netmask
ValueError: Given string is not IP address.
split_ip_or_netmask("1.2.3.4")
['1', '2', '3', '4']

Please review in terms of patterns/anti patterns.

E.g. the code allows splitting a netmask and IP address which is DRY, because we can use the very same code for both. However, it violates the SRP because the function is called split_ip_OR_netmask. In order to fix the SRP issue, I though about renaming the function with some umbrella term which includes both.

I tried to catch the invalid inputs and I also think, the code is very self documenting given the variable names.

What would you say?

\$\endgroup\$
7
  • 2
    \$\begingroup\$ Your function name is confusing split_ip_or_netmask. I'd expect the code to extract using IP/CIDR notation -- 1.2.3.4 and 1.2.3.4/24. \$\endgroup\$ Commented Nov 17, 2023 at 9:10
  • \$\begingroup\$ The term netmask should refer to e.g. a string of the form "255.255.255.0". Now that you say it, I find it confusing, too. \$\endgroup\$ Commented Nov 17, 2023 at 9:12
  • \$\begingroup\$ FWIW 255.255.255.0 = /24 CIDR \$\endgroup\$ Commented Nov 17, 2023 at 9:13
  • \$\begingroup\$ Not exactly. The IP and netmask should be split in two invokations. First invokation: "192.168.178.X" -> [192, 168, 178, X]. Second invokation: "255.255.255.0" -> [255,255,255,0]. For the algorithm this does not matter. But in the second invokation the variable names do not match. It would be very tedious name the variables ip_address_or_netmask, splitted_ip_address_or_netmask. So I'd go with ip_address at first. However, that's not really correct. Your suggestion implies a function invokation given the string "192.168.178.X/24" and it would return both ip and netmask as separate arrays. \$\endgroup\$ Commented Nov 17, 2023 at 9:34
  • \$\begingroup\$ tuple(map(int, x.packed)) can be simplified to tuple(x.packed), since iterating over bytes yields ints. \$\endgroup\$ Commented Nov 21, 2023 at 15:40

2 Answers 2

2
\$\begingroup\$

There's two things to consider here: is it correct? and is it pythonic?


Currently it's 2023, we've moved on a bit from 1981. "IP" means something, and it's not what your code is working with.

Somewhere, probably at module level, you need to mention in a docstring that IPv6 is out-of-scope for your business use case, so we know that in your context "IP" denotes "IPv4".

A convenient and unobtrusive way to point out such restrictions is to use identifiers like dotted_quad.

It wouldn't hurt to throw in a # comment explaining why the well-tested ipaddress module proved unsuitable for your use case. I note in passing that it has already solved IPv6 support, as well as some other commonly encountered problems.

A bitwise mask tends to be a pretty low level implementation detail, which "escaped" by showing up in ifconfig output / options and so became part of a sysad's user interface. Regrettably there are 2 ** 32 possible mask values, but just 32 of them are valid, and code seldom sanity checks that. Most folks would prefer to see a small integer masklen instead, e.g. 24 corresponds to a traditional class-C. When designing your Public API, consider passing around lengths instead of bit masks.

On the topic of terminology, a 32-bit netmask looks just like a valid "zeros broadcast" address. So it's common practice to call such a value "an IP" rather than putting lots of "ip_or_mask" quibbling into function names. A function that can parse or display one can also handle the other. If you want to be very correct, just call it a dotted_quad.


def split_ip_or_netmask(ip_address: str) -> ...

Thank you for the optional type annotation, that's lovely, very helpful. There are a bunch of plausible representations and you've done a great job of explaining which is expected.

The def ... ) -> list: annotation could be improved in two ways.

  1. When modeling something with a fixed-length sequence, essentially a C struct, pythonistas prefer tuple over list -- better to return a 4-tuple here.
  2. It would have been helpful to promise to return a list[str].

So go with def ... ) -> tuple[str, str, str, str]:

The docstring twice mentions "parts", where "octets" is how the RFCs refer to them.

    if not isinstance(ip_address, str):
        raise ValueError("IP must be specified as string")

Explicitly verifying the pre-condition is maybe a good thing? If callers often pass in the wrong type and have trouble deciphering the stack trace, then this is a terrific diagnostic, very helpful. (For extra credit include the errant value: f"IP ... as string, got: {ip_address}".)

But this is a little weird and it's likely better to delete it. mypy linting should have already told us about "dumb caller" who's trying to do the Wrong Thing by passing in bytes or int. (Or let beartype do runtime enforcement if you're paranoid.) As written we're prohibiting caller from exploiting duck typing. Probably not a big deal here, but you may as well habitually embrace duck typing sooner than later. If caller passed in e.g. an int, trying to .split() an integer would immediately produce an informative diagnostic, and caller would stop attempting such foolishness pretty quick. Raising fatal error from split or from your explicit check is pretty much six or half dozen, assuming developers who call this have no trouble identifying the obvious thing that went wrong. If they do have trouble, then yes your explicit check is a definite win.

Then we get to the meat of it, a one liner. Gosh, what a lot of ceremony for a split() call! Maybe it was worth it?

The value-add of this function could be a little bigger. For example, you seem keen to validate addresses. We might verify that only dots and digits come in. We might additionally verify that octets fit within eight bits, that caller didn't try to sneak "10.9.8.300" past us. Of course, such validation responsibility might sensibly go into a class IPv4:, something touched on by the mention of ipaddress above.

Kudos on explicitly verifying the post-condition, the promise that you made, that's brilliant. Again, in the event of lossage, a maintenance engineer would thank you for including the errant value at the end of that diagnostic message.

Feel free to include unit tests with a code submission.


This code accomplishes its design goals.

I would be willing to delegate or accept maintenance tasks on it.

\$\endgroup\$
3
  • \$\begingroup\$ Above I added an updated version of the topic function. I tried to address some comments. Mainly, I removed the need to pass IP and netmask separately. I use the ipaddress package, now. I return both IP and netmask as two 4-tuples. \$\endgroup\$ Commented Nov 21, 2023 at 15:19
  • \$\begingroup\$ Sooo, it's worth noticing that there's two different IPv4 concepts we might pass up and down the call stack: 10.9.8.7 is an IP adddress (a /32 that describes a host's interface), while 10.9.8.0/23 is a CIDR prefix (an entry in a route table or blacklist, or the name of a layer-3 network segment). Take care to give appropriate name to whichever concept you're passing around: address or prefix. \$\endgroup\$ Commented Nov 21, 2023 at 16:33
  • \$\begingroup\$ Two recent cases I needed to support were "192.168.178.1/24" and "192.168.178.1/16". Currently, I dont't know if I need sth like "192.168.178.X/32" respectively "192.168.178.X". However, the function should be able to deal with that. I think, as >The code is about to become part of a custom router api. , the context is clear. \$\endgroup\$ Commented Nov 23, 2023 at 12:38
2
\$\begingroup\$

Faulty documentation

Your docstring states:

Splits an IPv4 interface address including an optional network mask into it's octets. [...]

This is false, snce the network mask is not optional according to the code. It is always retrieved from the parsed IPv4Interface and returned as octets according to the return type annotation and code.
And... it's its.

Useless comments

Your comments don't add any information that your code does not already provide. Remove them.

Questionable exception shadowing

In case of ipaddress-exceptions you raise a custom exception from those. This is imho bad practice. You can just let the caller handle those exceptions, since your custom exception does not add any meaningful information and just shadows the real reason of what went wrong.

Useless map()ing

You can directly convert bytes to octet tuples, since iterating over bytes will yield ints.

Naming

... is one of the two hardest problems in programming, as we all know. I suggest you change the function's name to something more descriptive like addr_and_mask_octets, since it returns the IP address and mask octets.

Type aliases

Consider using a type alias for the rather long return type annotation.

Suggested change:

import ipaddress

Ipv4Octets = tuple[int, int, int, int]

def addr_and_mask_octets(ip_if_addr: str) -> (Ipv4Octets, Ipv4Octets):
    """ Splits an IPv4 interface address including a network mask into their octets."""

    if_addr = ipaddress.IPv4Interface(ip_if_addr)
    return (
        tuple(if_addr.ip.packed),
        tuple(if_addr.netmask.packed)
    )
\$\endgroup\$
1
  • \$\begingroup\$ The netmask is optional in the sense, that it is allowed to pass "1.2.3.4" as well as "1.2.3.4/24". However, the netmask is always returned. If it is not given, the default netmask (/32) is returned. \$\endgroup\$ Commented Nov 21, 2023 at 16:01

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.