I want to group the points by userid then create a LineString for each userid. Here's the GeoDataframe:

[image: screenshot of the GeoDataFrame]

My goal is to store the results in a GeoDataFrame. Here's the code, which I found in a question on this forum several years ago:

line_strings = df.groupby("userid")["geometry"].apply(lambda x: LineString(x.tolist()) if x.size > 1 else x.tolist())

However, the output is a Series instead.

[image: screenshot of the resulting Series]

When I try to convert it back to a GeoDataFrame, it fails with TypeError: Input must be valid geometry objects: [<POINT (900800.126 7174374.539)>]

Where is my mistake?

P.S.: I would also like to know how to exclude user IDs that have only one point, since a LineString can't be created for them.

  • Do you want a geodataframe with both lines and points in it, or do you just want the user_ids that can be grouped into linestrings? Commented Jan 8 at 12:58
  • A geodataframe with only lines (grouped according to userid) in it @Bera Commented Jan 9 at 6:31

1 Answer

The typical way to "group by" a GeoDataFrame is to use dissolve. You group by one or more columns, and the geometry column of the resulting GeoDataFrame will contain a union of all grouped values.

Next, you can convert the unioned points (or rather multipoints) to linestrings.

Code sample:

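import geopandas as gpd
import shapely
from shapely import Point

# Sample data: user 1 has a single point, users 2 and 3 have two points each.
points = gpd.GeoDataFrame(
    {
        "userid": [1, 2, 2, 3, 3],
        "geometry": [
            Point(0, 0), Point(1, 0), Point(1, 1), Point(2, 0), Point(2, 1)
        ],
    },
    crs="EPSG:31370",
)

# Union the points per userid: a single point stays a Point,
# several points become a MultiPoint.
diss = points.dissolve(by="userid")
print(diss)

# Keep only the users with more than one point.
diss_multi = diss.loc[diss.geometry.geom_type == "MultiPoint"]

# Build a LineString from the coordinates of each MultiPoint.
lines = diss_multi.set_geometry(
    diss_multi.geometry.apply(lambda x: shapely.LineString(shapely.get_coordinates(x)))
)

print(lines)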

Result:

userid
2       LINESTRING (1 0, 1 1)
3       LINESTRING (2 0, 2 1)
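
As for the TypeError in the question: the lambda returns a plain list for groups with a single point, so the resulting Series mixes LineStrings with lists, and a list is not a valid geometry. Also keep in mind that dissolve unions the points, which merges duplicate points and may not keep the original point order. If the order matters, a rough alternative is to build the lines directly from each group and skip single-point users while doing so. This is only a sketch: the sample data is made up, and it assumes the rows are already in the order in which the points should be connected.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point

# Made-up sample data; user 1 has only one point.
points = gpd.GeoDataFrame(
    {
        "userid": [1, 2, 2, 3, 3],
        "geometry": [
            Point(0, 0), Point(1, 0), Point(1, 1), Point(2, 0), Point(2, 1)
        ],
    },
    crs="EPSG:31370",
)

# One record per userid, joining the points in row order.
# Groups with fewer than two points are skipped, because a
# LineString needs at least two coordinates.
records = [
    {"userid": uid, "geometry": LineString(grp["geometry"].tolist())}
    for uid, grp in points.groupby("userid")
    if len(grp) > 1
]

# Build the result as a GeoDataFrame and carry over the CRS of the input.
lines = gpd.GeoDataFrame(records, geometry="geometry", crs=points.crs)

print(lines)
```

If the data has a timestamp or sequence column, sorting on it before grouping would make the point order explicit.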

PS: It is always easier to help if you include a minimal reproducible example, with runnable code and input data. Since there was none, I created runnable sample data based on how I understood your question.

  • The geometry of my geodataframe after dissolve consists of Points and MultiPoints. Somehow, I can't run the code to produce LineString. The error stated: "TypeError: One of the arguments is of incorrect type. Please provide only Geometry objects." Commented Jan 9 at 6:51
  • Odd. Maybe you are using old versions of pandas and/or shapely? You can run geopandas.show_versions() to print all relevant version information. Commented Jan 9 at 7:32
  • Geopandas version is 0.14.2. Also if you want to try out the problem, here's the link (I'm currently learning how to use Python in GIS context using this course): [link] (github.com/Automating-GIS-processes-II-2024/exercise-2/blob/…) Commented Jan 9 at 8:45
  • Is there a specific reason you are using an old version of geopandas? If not, please update to geopandas 1.0.1 Commented Jan 9 at 8:56
  • I simply didn't know that GeoPandas had been updated, because I only recently started relearning Python after abandoning it for almost a year. Commented Jan 9 at 9:46
