I am testing geofileops.makevalid. To do so, I create a GPKG file containing a number of invalid geometries to fix.

First, I create this Python script, test.py:

import geofileops as gfo
import geopandas as gpd
import shapely.geometry
import numpy as np

np.random.seed(42)

r = np.random.rand
p=2

polygons = [shapely.geometry.Polygon([[r(), r()], [r(), r()], [r(), r()], [r(), r()]]) for _ in range(10**p)]
df = gpd.GeoSeries(polygons)

df.to_file(f"example_{p}.gpkg")
gfo.makevalid(input_path=f"example_{p}.gpkg",output_path=f"madevalid_{p}.gpkg")

I make sure there is nothing else in my working directory. When I run the script, it produces example_2.gpkg and madevalid_2.gpkg, as expected. However, if I change the number of geometries I am trying to fix from 100 to 10000 (i.e. I change p from 2 to 4):

import geofileops as gfo
import geopandas as gpd
import shapely.geometry
import numpy as np

np.random.seed(42)

r = np.random.rand
p=4

polygons = [shapely.geometry.Polygon([[r(), r()], [r(), r()], [r(), r()], [r(), r()]]) for _ in range(10**p)]
df = gpd.GeoSeries(polygons)

df.to_file(f"example_{p}.gpkg")
gfo.makevalid(input_path=f"example_{p}.gpkg",output_path=f"madevalid_{p}.gpkg")

I reliably get this error:

Traceback (most recent call last):
  File "fiona/ogrext.pyx", line 1238, in fiona.ogrext.WritingSession.start
  File "fiona/ogrext.pyx", line 1239, in fiona.ogrext.WritingSession.start
  File "fiona/_err.pyx", line 291, in fiona._err.exc_wrap_pointer
fiona._err.CPLE_AppDefinedError: Layer example_4 already exists, CreateLayer failed. Use the layer creation option OVERWRITE=YES to replace it.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/username/glint_repos/data-layers/GBR/parcels/test/test.py", line 14, in <module>
    df.to_file(f"example_{p}.gpkg")
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/geopandas/geoseries.py", line 606, in to_file
    data.to_file(filename, driver, index=index, **kwargs)
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/geopandas/geodataframe.py", line 1203, in to_file
    _to_file(self, filename, driver, schema, index, **kwargs)
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/geopandas/io/file.py", line 545, in _to_file
    _to_file_fiona(df, filename, driver, schema, crs, mode, **kwargs)
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/geopandas/io/file.py", line 572, in _to_file_fiona
    with fiona.open(
         ^^^^^^^^^^^
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/fiona/env.py", line 457, in wrapper
    return f(*args, **kwds)
           ^^^^^^^^^^^^^^^^
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/fiona/__init__.py", line 346, in open
    colxn = Collection(
            ^^^^^^^^^^^
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/fiona/collection.py", line 237, in __init__
    self.session.start(self, **kwargs)
  File "fiona/ogrext.pyx", line 1247, in fiona.ogrext.WritingSession.start
fiona.errors.DriverIOError: Layer example_4 already exists, CreateLayer failed. Use the layer creation option OVERWRITE=YES to replace it.
Error <A process in the process pool was terminated abruptly while the future was running or pending.> executing {'layer': 'madevalid_4', 'tmp_partial_output_path': PosixPath('/var/folders/jz/34xs15sd3ng6dpxkv78hh1lr0000gn/T/geofileops/makevalid_000001/madevalid_4_0.gpkg'), 'sql_stmt': '\n                SELECT * FROM\n                    ( \n        SELECT \n            IIF(ST_IsValid(geom) = 1,\n                geom,\n                GEOSMakeValid(geom, 0)\n            ) AS geom\n              ,fid\n          FROM "example_4" layer\n         WHERE 1=1\n           AND (layer.rowid >= 1 AND layer.rowid <= 1000) \n    \n                       LIMIT -1 OFFSET 0\n                    )\n                 WHERE geom IS NOT NULL\n            '}
Traceback (most recent call last):
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/geofileops/util/_geoops_sql.py", line 771, in _single_layer_vector_operation
    _ = future.result()
        ^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
Traceback (most recent call last):
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/geofileops/util/_geoops_sql.py", line 771, in _single_layer_vector_operation
    _ = future.result()
        ^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/username/glint_repos/data-layers/GBR/parcels/test/test.py", line 15, in <module>
    gfo.makevalid(input_path=f"example_{p}.gpkg",output_path=f"madevalid_{p}.gpkg")
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/geofileops/geoops.py", line 1273, in makevalid
    _geoops_sql.makevalid(
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/geofileops/util/_geoops_sql.py", line 341, in makevalid
    _single_layer_vector_operation(
  File "/Users/username/Library/Caches/pypoetry/virtualenvs/gleam-3xfTBcJj-py3.11/lib/python3.11/site-packages/geofileops/util/_geoops_sql.py", line 777, in _single_layer_vector_operation
    raise Exception(message) from ex
Exception: Error <A process in the process pool was terminated abruptly while the future was running or pending.> executing {'layer': 'madevalid_4', 'tmp_partial_output_path': PosixPath('/var/folders/jz/34xs15sd3ng6dpxkv78hh1lr0000gn/T/geofileops/makevalid_000001/madevalid_4_0.gpkg'), 'sql_stmt': '\n                SELECT * FROM\n                    ( \n        SELECT \n            IIF(ST_IsValid(geom) = 1,\n                geom,\n                GEOSMakeValid(geom, 0)\n            ) AS geom\n              ,fid\n          FROM "example_4" layer\n         WHERE 1=1\n           AND (layer.rowid >= 1 AND layer.rowid <= 1000) \n    \n                       LIMIT -1 OFFSET 0\n                    )\n                 WHERE geom IS NOT NULL\n            '}

I made sure to remove everything but the script from my working directory before running it. (As a sanity check, I also ran the script without the last line, and it ran without error, so I'm confident the issue is related to geofileops.)

The trace shows that the first error raised is fiona._err.CPLE_AppDefinedError.

How can I handle fiona._err.CPLE_AppDefinedError when using geofileops.makevalid on large files?

1 Answer

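For larger files, geofileops processes the data in multiple parallel worker processes under the hood. On platforms where Python starts workers with the spawn method (the default on macOS and Windows), each worker re-imports the main script, so any unguarded top-level code runs again in every worker. Your traceback shows exactly this: a spawned worker re-runs df.to_file, finds that the example_4 layer already exists, and crashes, which in turn breaks the process pool. To avoid this kind of issue, the script body needs to be wrapped in the if __name__ == "__main__": construct (source).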

Something like this:

import geofileops as gfo
import geopandas as gpd
import shapely.geometry
import numpy as np

if __name__ == "__main__":
    np.random.seed(42)

    r = np.random.rand
    p = 4

    polygons = [
        shapely.geometry.Polygon([[r(), r()], [r(), r()], [r(), r()], [r(), r()]])
        for _ in range(10**p)
    ]
    df = gpd.GeoSeries(polygons)

    df.to_file(f"example_{p}.gpkg")
    gfo.makevalid(input_path=f"example_{p}.gpkg", output_path=f"madevalid_{p}.gpkg")
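
To see why the guard helps, here is a minimal sketch that needs neither geofileops nor a real process pool. It imitates what the spawn machinery does in the traceback (runpy.run_path on the main script), under the assumption that a spawned worker sees the module name "__mp_main__" rather than "__main__". The script content, SCRIPT and run_as are hypothetical names made up for this illustration:

```python
import os
import runpy
import tempfile
import textwrap

# Hypothetical stand-in for test.py: one unguarded and one guarded statement.
SCRIPT = textwrap.dedent(
    '''
    executed = ["top-level"]          # runs every time the script is executed
    if __name__ == "__main__":
        executed.append("guarded")    # runs only when launched directly
    '''
)


def run_as(run_name):
    """Execute SCRIPT under the given module name and report what ran."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo_script.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        # run_path returns the globals of the executed script.
        return runpy.run_path(path, run_name=run_name)["executed"]


direct = run_as("__main__")     # like running `python test.py`
worker = run_as("__mp_main__")  # like a spawned worker re-importing the script

print("direct run:", direct)
print("worker run:", worker)
```

The unguarded statement runs in both cases, while the guarded one runs only for the direct launch. In your script, df.to_file plays the role of the unguarded statement, which is why moving it under the guard should stop the workers from trying to recreate the layer.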
