6

I have a raster (geotiff) and I would like simply to know some parcentile values, specifically the 80th, 85th, 90th and 95th, because I need to highlight all values above these thresholds in QGIS.

First I thought that in Properties --> Style --> Cumulative count cut I had simply to write the percent value in Min or Max boxes (80 and 95 respectively, for example). But I discovered that if I write 80 in the first box or in the second one (next to the cumulative count cut), the value will be different.

enter image description here

So I tried to convert the raster in an Ascii grid, I opened the Ascii with text editor, I copied the cell values and I pasted in an Excel file. I deleted all the no data values (-9999 in my case) and I used the Percentile function in Excel to extract from the resulting matrix my values.

I thought that it would the right way to find the values, but I wanted to be sure, so I used the Max function in Excel just to see if the value was the same to the one I can find in Propreties-->Metadata-->Maximum value.... and the to values were different, so I don't trust in what I did in Excel.

Any help?

2 Answers 2

9

You can use PyQGIS to convert your raster to numpy Array then find percentiles with numpy.percentile():

import gdal
import numpy as np
from osgeo import gdal_array

rasterfile = r"C:\Test\nh_66_5.tif" #Change

percentiles = [50,80,90] #Change

rasterArray = gdal_array.LoadFile(rasterfile) #Read raster as numpy array

for p in percentiles:
    print('{0}th percentile is: {1}'.format(p, np.percentile(rasterArray,p)))

enter image description here

If your raster is multiband, np.percentile will flatten the Array (=include all bands). If you want percentiles for specific band, use indexing, for example np.percentile(rasterArray[0],50)

0
1

To get this to work with nodata pixels, which in this example is values less than 0, recommend using nanpercentile:

import numpy as np
from osgeo import gdal
from osgeo import gdal_array

raster_file = r"/my/path/image.tif"
percentiles = [20,40,80,100] 
raster_array = gdal_array.LoadFile(raster_file)

masked_data = np.ma.masked_where(raster_array[0] < 0, raster_array[0])
masked_data = np.ma.filled(masked_data, np.nan)

for p in percentiles:
    print('{0}th percentile is: {1}'.format(p, np.nanpercentile(masked_data,p)))

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.