REPRODUCIBLE SETUP
In a Google Colab notebook, I install rasterio:
!pip3 install rasterio
Import modules:
import rasterio
import numpy as np
import time
from rasterio.crs import CRS
from rasterio import warp
from rasterio import transform
Define a URL pointing to a TIF file:
url = 'https://copernicus-dem-30m.s3.amazonaws.com/Copernicus_DSM_COG_10_N59_00_E010_00_DEM/Copernicus_DSM_COG_10_N59_00_E010_00_DEM.tif'
and define row-column pairs:
row = np.argwhere(np.ones(shape=(1000,1000)))[:,0] # array([ 0, 0, 0, ..., 999, 999, 999])
col = np.argwhere(np.ones(shape=(1000,1000)))[:,1] # array([ 0, 1, 2, ..., 997, 998, 999])
PROBLEM
I would like to calculate the latitude-longitude coordinates of these row-column pairs fast. In my real world application, I have a square of (roughly a few tens of millions) pixels defined somewhere on this TIF, and I need lat-lon pairs for each of them.
MY SOLUTION
Partly based on this anwer, I do the following, while timing parts of the code:
with rasterio.Env(GDAL_DISABLE_READDIR_ON_OPEN=True, CPL_VSIL_CURL_ALLOWED_EXTENSIONS="tif"):
with rasterio.open(url) as src:
times=[]
times.append(time.time()) # TIMING HERE
dst_crs = src.crs
src_crs = CRS.from_epsg(4326)
x, y = warp.transform(src_crs, dst_crs, row, col)
times.append(time.time()) # TIMING HERE
lons, lats = transform.xy(src.transform, x, y)
times.append(time.time()) # TIMING HERE
The results of the timing:
[finishtime-starttime for (finishtime, starttime) in zip(times[1:],times[:-1])]
Output:
[0.28756046295166016, 2.810262441635132]
Ie the last line of code, lons, lats = transform.xy(src.transform, x, y) takes ~2.81 seconds, an order of magnitude more than the previous lines.
QUESTION
How could I speed up the lons, lats calculation, currently done by the lons, lats = transform.xy(src.transform, x, y) line?
I am open to any solution. Also, if there is a faster than Python way of doing this in C/C++, then I'm curious about that as well.