I'm looking to do what this question and answer is doing using gdal_translate / gdal_warp, but in Rasterio. Does anyone know of a way? I can't find anything in the rasterio.warp.reproject module that seemed to do what I was after. I am aware you can do it like this using a larger window and boundless=True, but I would prefer to do it using the raw numpy array only if possible (plus the requisite transforms etc.)
2 Answers
Reading a window would be easy, and you can create a dataset in memory if you don't already have one that contains what's in your numpy array.
But if you want to stick closer to numpy, you can certainly just initialize a larger array then use the bounds and pixel sizes to work out where to insert the original.
import rasterio
import numpy as np
from rasterio import transform, dtypes, windows
def main():
np.random.seed(0)
src_w = src_h = 10
src_arr = np.random.randint(0, 255, src_w * src_h, np.uint8)
src_arr = src_arr.reshape((src_h, src_w))
src_bounds = (100, 1000, 200, 1100)
dst_bounds = (50, 900, 250, 1200)
src_xform = transform.from_bounds(*src_bounds, src_w, src_h)
# Option A: create an in-memory dataset from the numpy array, then read windows
with rasterio.MemoryFile() as mf:
with mf.open(driver='GTiff', width=10, height=10, count=1, dtype=dtypes.uint8, transform=src_xform) as src_ds:
src_ds.write(src_arr, 1)
with mf.open() as src_ds:
window = rasterio.windows.from_bounds(*dst_bounds, transform=src_ds.transform)
dst_arr_a = src_ds.read(window=window, boundless=True)
# Option B: create your own target array and figure out where to insert the original
dst_w = int((dst_bounds[2] - dst_bounds[0]) / src_ds.res[0])
dst_h = int((dst_bounds[3] - dst_bounds[1]) / src_ds.res[1])
offset_y = int((dst_bounds[3] - src_bounds[3]) / src_ds.res[1])
offset_x = int((src_bounds[0] - dst_bounds[0]) / src_ds.res[0])
dst_arr_b = np.zeros((dst_h, dst_w), dtype=np.uint8)
dst_arr_b[offset_y:offset_y+src_h, offset_x:offset_x+src_w] = src_arr
# Ensure these two methods agree
assert np.all(dst_arr_a == dst_arr_b)
if __name__ == '__main__':
main()
There is a way to do it purely with Rasterio APIs - thanks to Sean Gillies, the maintainer of the library, for helping me out with this:
(where bounds is a simple dataclass and the self object holds transform and crs properties, and the data member is an numpy ndarray object)
extent=Bounds(xmin=20, ymin=56, xmax=38, ymax=70)
dst_transform = Affine(self.transform.a, 0, extent.xmin, 0, self.transform.e, extent.ymax)
dst_width = max(int(round((extent.xmax - extent.xmin) / self.transform.a)), 1)
dst_height = max(int(round((extent.ymax - extent.ymin) / -self.transform.e)), 1)
new_data, transform = warp.reproject(
source=self.data,
destination=np.zeros(shape=(dst_height, dst_width)),
src_transform=self.transform,
dst_transform=dst_transform,
src_crs=self.crs,
dst_crs=self.crs,
src_nodata=np.nan,
dst_nodata=np.nan,
)
I think the pure numpy way (Option B above) is probably quicker though, thank you @mikewatt