You use the value of (@bucketSize + @bufferSize) inside of a where clause and to calculate a value in a select statement, I think if you took that and created it's own variable, that it might give you some performance increase. as well as some of the other arithmetic that could be done from the start and not calculated during the actual query
len(@values)/(@bucketSize+@bufferSize)+1
could be declared from the beginning
DECLARE maxID
SET maxID = LEN(@values) / (@bucketSize + @bufferSize) + 1
After you do that it is a toss up whether or not to Declare another Variable for the Addition of the @bucketSize + @bufferSize, but it might be worth a try.
Also please use some spaces in your equations, you use plenty of white space everywhere else.
I have a minute or two so I will also say this...
since this is going over so many records and that simple Variable freed up a good 10 seconds, I think that it might be worth it to move as much static calculation out of the query as possible, I know this is less about what SQL was meant for, but I also think this is more about being Dynamic as well.
Maybe I am going a little too far here, but this should still do the same thing as the original
create function [dbo].[BucketString] (
@values varchar(max),
@bucketSize int,
@bufferSize int = 1,
@offset int = 0)
returns @result table (id int, item varchar(max))
begin
DECLARE @bucketAndBufferSize
DECLARE @maxID
DECLARE @bufferMinusBufferMinusOne
DECLARE @bufferMinusOne
SET @bucketAndBufferSize = @bucketSize+@bufferSize
SET @maxID = LEN(@values) / (@bucketAndBufferSize) + 1
SET @bufferMinusBufferMinusOne = @bufferSize - (@bufferSize - 1)
SET @bufferMinusOne = @bufferSize - 1
with buckets as
(
select 1 id
union all
select t.id + 1
from buckets t
where id = t.id
and t.id < @maxID
)
insert into @result
select
id,
substring(@values, @offset + ((id - 1) * @bucketAndBufferSize + (case when @bufferMinusOne = 0 then 1 else @bufferMinusBufferMinusOne end)), @bucketSize) string
from buckets
option (maxrecursion 0)
return;
end
I apologize for the mismatched casing, it's habit to capitalized those words...
When bufferSize is unchanged you want it to be 1 so let's just do that and take out the extra call to the variable.
I took out all the arithmetic that wasn't reliant on information from the query and made them their own variable, this isn't going to be a huge difference, but I think it will be faster than doing the arithmetic inside the query itself, it's like being distracted by that red ball all the time, it distracts you momentarily but you get the job done.