Revisions to How to reduce close points

deleted 10 characters in body

Source Link

edited Jan 25, 2014 at 15:27

220.3k
35
410
623

You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each g in groups where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Note that this algorithm does not guarantee that each pair of points of a group has a distance <DELTA< DELTA, only that each found group cannot be divided into two smaller groups with distance >= DELTA. You have to decide for yourself if that is sufficient, but as @DieterLücking has pointed out, don't expect to find an algorithm to solve the problem exactly the way you have described it originally.

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

Concerning running time: the running time depends heavily of calculating the distance function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.

You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each g in groups where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Note that this algorithm does not guarantee that each pair of points of a group has a distance <DELTA, only that each found group cannot be divided into two smaller groups with distance >= DELTA.

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

Concerning running time: the running time depends heavily of calculating the distance function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.

You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each g in groups where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Note that this algorithm does not guarantee that each pair of points of a group has a distance < DELTA, only that each found group cannot be divided into two smaller groups with distance >= DELTA. You have to decide for yourself if that is sufficient, but as @DieterLücking has pointed out, don't expect to find an algorithm to solve the problem exactly the way you have described it originally.

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

Concerning running time: the running time depends heavily of calculating the distance function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.

deleted 10 characters in body

Source Link

edited Jan 25, 2014 at 15:22

Doc Brown

220.3k
35
410
623

You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each groups g in grouplistgroups where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Note that this algorithm does not guarantee that each pair of points of a group has a distance <DELTA, only that each found group cannot be divided into two smaller groups with distance >= DELTA.

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

Concerning running time: the running time depends heavily of calculating the distance function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.

You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each groups g in grouplist where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

Concerning running time: the running time depends heavily of calculating the distance function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.

You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each g in groups where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Note that this algorithm does not guarantee that each pair of points of a group has a distance <DELTA, only that each found group cannot be divided into two smaller groups with distance >= DELTA.

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

Concerning running time: the running time depends heavily of calculating the distance function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.

added 486 characters in body

Source Link

edited Jan 25, 2014 at 14:37

Doc Brown

220.3k
35
410
623

You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s.

Here Now here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each groups g in grouplist where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

Concerning running time: the running time depends heavily of calculating the distance function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s.

Here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each groups g in grouplist where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each groups g in grouplist where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

Concerning running time: the running time depends heavily of calculating the distance function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.

Source Link

answered Jan 25, 2014 at 14:27

Doc Brown

220.3k
35
410
623

Loading

Stack Exchange Network

Return to Answer