5

REPRODUCIBLE SETUP

In a Google Colab notebook, I install rasterio:

!pip3 install rasterio

Import modules:

import rasterio
import numpy as np
import time

from rasterio.crs import CRS
from rasterio import warp
from rasterio import transform

Define a URL pointing to a TIF file:

url = 'https://copernicus-dem-30m.s3.amazonaws.com/Copernicus_DSM_COG_10_N59_00_E010_00_DEM/Copernicus_DSM_COG_10_N59_00_E010_00_DEM.tif'

and define row-column pairs:

row = np.argwhere(np.ones(shape=(1000,1000)))[:,0] # array([  0,   0,   0, ..., 999, 999, 999])
col = np.argwhere(np.ones(shape=(1000,1000)))[:,1] # array([  0,   1,   2, ..., 997, 998, 999])

PROBLEM

I would like to calculate the latitude-longitude coordinates of these row-column pairs fast. In my real world application, I have a square of (roughly a few tens of millions) pixels defined somewhere on this TIF, and I need lat-lon pairs for each of them.


MY SOLUTION

Partly based on this anwer, I do the following, while timing parts of the code:

with rasterio.Env(GDAL_DISABLE_READDIR_ON_OPEN=True, CPL_VSIL_CURL_ALLOWED_EXTENSIONS="tif"):
    with rasterio.open(url) as src:   
        times=[]
        times.append(time.time()) # TIMING HERE
        dst_crs = src.crs
        src_crs = CRS.from_epsg(4326)
        x, y = warp.transform(src_crs, dst_crs, row, col)
        times.append(time.time()) # TIMING HERE
        lons, lats = transform.xy(src.transform, x, y)
        times.append(time.time()) # TIMING HERE

The results of the timing:

[finishtime-starttime for (finishtime, starttime) in zip(times[1:],times[:-1])]

Output:

[0.28756046295166016, 2.810262441635132]

Ie the last line of code, lons, lats = transform.xy(src.transform, x, y) takes ~2.81 seconds, an order of magnitude more than the previous lines.


QUESTION

How could I speed up the lons, lats calculation, currently done by the lons, lats = transform.xy(src.transform, x, y) line?

I am open to any solution. Also, if there is a faster than Python way of doing this in C/C++, then I'm curious about that as well.

0

1 Answer 1

2

It appears like the authors of rasterio know about the slow conversions and have a WIP branch to address this issue. You could implement xy method yourself as the example in the first link demonstrates.

def to_numpy2(transform):
    return np.array([transform.a, 
    transform.b, 
    transform.c, 
    transform.d, 
    transform.e, 
    transform.f, 0, 0, 1], dtype='float64').reshape((3,3))

def xy_np(transform, rows, cols, offset='center'):
    if isinstance(rows, int) and isinstance(cols, int):
        pts = np.array([[rows, cols, 1]]).T
    else:
        assert len(rows) == len(cols)
        pts = np.ones((3, len(rows)), dtype=int)
        pts[0] = rows
        pts[1] = cols

    if offset == 'center':
        coff, roff = (0.5, 0.5)
    elif offset == 'ul':
        coff, roff = (0, 0)
    elif offset == 'ur':
        coff, roff = (1, 0)
    elif offset == 'll':
        coff, roff = (0, 1)
    elif offset == 'lr':
        coff, roff = (1, 1)
    else:
        raise ValueError("Invalid offset")

    _transnp = to_numpy2(transform)
    _translt = to_numpy2(transform.translation(coff, roff))
    locs = _transnp @ _translt @ pts
    return locs[0].tolist(), locs[1].tolist()

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.