Kudos for figuring out the correct algorithm.
However you can streamline it by not using a
vvector:You correctly treated an operation
a, b, kas a pair of operations: addkfromato the end, and subtractkfromb+1to the end. Now, instead of storing them inv, collect decoupled operations in a vector of their own. Sort it by index.std::partial_sumit, and find the maximum in the resulting array.This will drive the space complexity down from \$O(n)\$ to \$O(m)\$, and change the time complexity from \$O(n+m)\$ to \$O(m\log m)\$. According to constraints, the time complexity seems to be better. One should also keep in mind that accesses to
vcould be all over the place with no particular order, and a well crafted sequence of operations may incur too many cache misses. I didn't profile though.It is possible that spelling the loop out (rather than using
for_eachand lambda) would improve readability.The algorithm would fail if
kwas allowed to be negative. Even it is not the case, it still is a good habit to initializemaxandxtov[0], and start the loop atv.begin() + 1.