There's no documentation. What does this code do and how am I supposed to use it?
There are no test cases. Since the
hashlibhashlibmodule has asha1function, it would be easy to write a test case that generates some random data and hashes it with both algorithms. For example:import hashlib from random import randrange from unittest import TestCase class TestSha1(TestCase): def test_sha1(self): for l in range(129): data = ''.join(chrbytearray(randrange(256)) for _ in range(l)) self.assertEqual(hashlib.sha1(data).hexdigest(), sha1(data))You start by converting the message to a string containing its representation in binary. But then later on you convert it back to integers again. It would be simpler and faster to work with bytes throughout. Python provides the
bytearraybytearraytype for manipulating sequences of bytes and thestructstructmodule for interpreting sequences of bytes as binary data. So you could initialize and pad the message like this:message = bytearray(data) # Append a 1-bit. message.append(b'\x80'0x80) # Pad with zeroes until message length in bytes is 56 (mod 64). message.extend(b'\x00'[0] * (63 - (len(message) + 7) % 64)) # Append the original length (big-endian, 64 bits). message.extend(struct.pack('>Q', len(data) * 8))
and then convert the chunks to 32-bit integers using struct.unpack like this:
for chunk in range(0, len(message), 64):
w = list(struct.unpack('>16L', message[chunk: chunk+64]))
for i in range(16, 80):
w.append(rol((w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16]), 1))
I would write the conditions in the main loop as
0 <= i < 20,20 <= i < 40and so on, to match the half-open intervals that Python uses elsewhere (in list slices,rangeand so on).I would replace
elif 60 <= i <= 79:byelse:to make it clear that this must match.:elif 60 <= i <= 79:
by
else:
assert 60 <= i < 80
to make it clear that this case must match.
You can avoid the need for
tempby using a tuple assignment:a, b, c, d, e = ((rol(a, 5) + f + e + k + w[i]) & 0xffffffff, a, rol(b, 30), c, d)