I want to compute a SHA1 hash in several steps using TransformBlock()/TransformFinalBlock():

byte[] block1 = Encoding.ASCII.GetBytes("This");
byte[] block2 = Encoding.ASCII.GetBytes("is");
byte[] block3 = Encoding.ASCII.GetBytes("Sparta");

SHA1 sha = new SHA1Managed();
sha.TransformBlock(block1, 0, block1.Length, block1, 0);
sha.TransformBlock(block2, 0, block2.Length, block2, 0);
sha.TransformFinalBlock(block3, 0, block3.Length);

byte[] result = sha.Hash;

I know there are other ways to compute SHA1 (e.g. HashAlgorithm.ComputeHash() or CryptoStream). The code above is a simplified version of more complex code.

What is totally unclear to me is what to pass for the outputBuffer array (the fourth parameter of the TransformBlock method):

int TransformBlock(byte[] inputBuffer, int inputOffset, int inputCount, 
                   byte[] outputBuffer, int outputOffset);

The MSDN page says:

Computes the hash value for the specified region of the input byte array and copies the specified region of the input byte array to the specified region of the output byte array

What if I don't need that array copy? Should I pass null, to avoid the input array being copied each time?

Is there a typical use for this?

Similarly, it seems TransformFinalBlock() also copies the input array to an output array. AFAIK, this copy is what the method returns:

byte[] TransformFinalBlock(byte[] inputBuffer, int inputOffset, int inputCount);
  • Personally I don't think this interface is all that good a match for a hash function. Actually, I think it is a horrible design anyway because a block is something that has been explicitly specified internally for block ciphers and hashes. In that definition, "This" is not a block. However, instead of offering a low level interface it offers a high level interface that doesn't do what you would expect, confusing the casual reader. I'd rather use the streaming interface instead. The original designer should hide in shame for this interface definition. Commented Nov 25, 2018 at 15:10
  • Agreed. This API is horrible. For most of my calls to TransformBlock(), only one of the 5 arguments is useful. Designer should hang their head in shame. Commented Jul 25, 2024 at 6:43

1 Answer

The page and the example you linked are quite clear:

Calling the TransformBlock method with different input and output arrays results in an IOException.

and even the example is clear on the use:

offset += sha.TransformBlock(input, offset, size, input, offset);

SHA1 doesn't really need that parameter, but it implements the ICryptoTransform interface, whose TransformBlock method has this signature. So SHA1.TransformBlock() carries that (useless) parameter. Note that you can pass null as the output buffer (undocumented, but it works).

Note that in HashAlgorithm (the base class of SHA1, and the class that actually implements ICryptoTransform), TransformBlock contains a line like:

if ((outputBuffer != null) && ((inputBuffer != outputBuffer) || (inputOffset != outputOffset)))
    Buffer.BlockCopy(inputBuffer, inputOffset, outputBuffer, outputOffset, inputCount);

So if you pass null as the output buffer, or pass the same array and the same offset as the input, nothing is copied.
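
As a minimal sketch of the null-output approach, assuming the null check quoted above is present in the runtime you target (it is undocumented behaviour): the helper name HashInBlocks and the class name are made up for illustration. It feeds each block with a null output buffer, finishes with an empty final block, and compares the result with a one-shot ComputeHash over the concatenated input.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class IncrementalSha1Demo
{
    // Hypothetical helper: hashes the blocks in order, as if they were concatenated.
    public static byte[] HashInBlocks(params byte[][] blocks)
    {
        using (SHA1 sha = SHA1.Create())
        {
            foreach (byte[] block in blocks)
            {
                // Assumption: a null outputBuffer makes TransformBlock skip the copy.
                sha.TransformBlock(block, 0, block.Length, null, 0);
            }

            // An empty final block only finalizes the hash; no more data is added.
            sha.TransformFinalBlock(new byte[0], 0, 0);
            return sha.Hash;
        }
    }

    public static void Main()
    {
        byte[] incremental = HashInBlocks(
            Encoding.ASCII.GetBytes("This"),
            Encoding.ASCII.GetBytes("is"),
            Encoding.ASCII.GetBytes("Sparta"));

        byte[] oneShot;
        using (SHA1 sha = SHA1.Create())
        {
            oneShot = sha.ComputeHash(Encoding.ASCII.GetBytes("ThisisSparta"));
        }

        // Both approaches hash the same byte sequence, so the digests should match.
        Console.WriteLine(incremental.SequenceEqual(oneShot));
    }
}
```

Passing the input array and input offset again as the output arguments, as in the documentation example, should give the same digest; null just makes the intent more obvious.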

3 Comments

Seems passing null is also perfectly valid (at least for SHA1 and MD5 implementations) : stackoverflow.com/questions/623159/…
In my opinion, it is a horrible design decision of the .NET team. Clearly they had this idea that because of the already incorrect ICryptoTransform design (again conflating input and output) they had to do something with the outputBuffer. However, copying input can lead to leakage of information. And since a hash function doesn't change the input anyway, then why would you copy it? It fails clean design principles: 1. doing two things at once and more importantly 2. the least surprise principle. Use the streaming interface, at least it doesn't require you to handle non-existing output.
Design is terrible, and documentation totally insufficient. I had to read the source code to see that TransformBlock() could be called with a null output. The API is counter-intuitive and looks wrong on the face of it. This should not be considered good enough for security-related uses.
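
For comparison, the streaming interface recommended in the comments might look like the sketch below. It assumes a CryptoStream in write mode wrapped around Stream.Null, so the bytes the hash transform passes through are simply discarded. The helper name HashViaStream and the class name are hypothetical.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class StreamingSha1Demo
{
    // Hypothetical helper: writes each block through a CryptoStream that hashes it.
    public static byte[] HashViaStream(params byte[][] blocks)
    {
        using (SHA1 sha = SHA1.Create())
        using (var stream = new CryptoStream(Stream.Null, sha, CryptoStreamMode.Write))
        {
            foreach (byte[] block in blocks)
            {
                stream.Write(block, 0, block.Length);
            }

            // Finalizes the hash; the Hash property is only valid after this call.
            stream.FlushFinalBlock();
            return sha.Hash;
        }
    }

    public static void Main()
    {
        byte[] hash = HashViaStream(
            Encoding.ASCII.GetBytes("This"),
            Encoding.ASCII.GetBytes("is"),
            Encoding.ASCII.GetBytes("Sparta"));

        // Print the digest as hex for inspection.
        Console.WriteLine(BitConverter.ToString(hash).Replace("-", ""));
    }
}
```

The caller never has to deal with an output buffer here, which is the point the comments make about this interface.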
