Fast distance calculation for Starcraft2 bot

I am coding a bot for Starcraft 2, in which many distances have to be calculated every frame.

Here is the part of the library that is being used and I want to improve: https://github.com/Dentosal/python-sc2/blob/develop/sc2/position.py

I built a new class Points that inherits from np.ndarray.
It is not yet connected to the rest of the library, but the functions are done. I removed the functions furthest_to, further_than and so on because the closer-versions are basically the same execpt a -1 or <.

Are these functions implemented in most efficient way? Is there a way to improve the parts that look like this:

find = np.where(np.any(M < distance, axis=1)) selection = np.array([self[i] for i in find[0]])

Any other comments or suggestions are also welcome :)

from typing import Any, Dict, List, Optional, Set, Tuple, Union  # for mypy type checking



import numpy as np

from scipy.spatial.distance import cdist



from position import Point2





class Points(np.ndarray):

    def __new__(cls, units_or_points):

        obj = np.asarray(units_or_points).view(cls)

        return obj



    def closest_to(self, point: Point2) -> Point2:

        """Returns the point of self that is closest to another point."""

        if point in self:

            return Point2(tuple(point))

        deltas = self - point

        distances = np.einsum("ij,ij->i", deltas, deltas)

        result = self[np.argmin(distances)]

        return Point2(tuple(result))



    def closer_than(self, point: Point2, distance: Union[int, float]) -> "Points":

        """Returns a new Points object with all points of self that

        are closer than distance to point."""

        position = np.array([point])

        M = cdist(self, position)

        find = np.where(np.all(M < distance, axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def in_distance_between(

        self, point: Point2, distance1: Union[int, float], distance2: Union[int, float]

    ) -> "Points":

        """Returns a new Points object with all points of self

        that are between distance1 and distance2 away from point."""

        p = np.array([point])

        M = cdist(self, p)

        find = np.where(np.any(np.logical_and(distance1 < M, M < distance2), axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def sort_by_distance_to(self, point: Point2, reverse: bool = False) -> "Points":

        """Returns a new Points object with all points of self sorted by distance to point.

        Ordered from smallest to biggest distance. Reverse order with keyword reverse=True."""

        deltas = self - point

        distances = (1 if reverse else -1) * np.einsum("ij,ij->i", deltas, deltas)

        result = self[distances.argsort()[::-1]]

        return Points(result)



    def closest_n_points(self, point: Point2, n: int) -> "Points":

        """Returns a new Points object with the n points of self that are closest to point."""

        deltas = self - point

        distances = np.einsum("ij,ij->i", deltas, deltas)

        result = (self[distances.argsort()[::-1]])[-n:]

        return Points(result)



    def in_distance_of_points(self, points: "Points", distance: Union[int, float]) -> "Points":

        """Returns a new Points object with every point of self that

        is in distance of any point in points."""

        M = cdist(self, points)

        find = np.where(np.any(M < distance, axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def n_closest_to_distance(self, point: Point2, distance: Union[int, float], n: int) -> "Points":

        """Returns a new Points object with the n points of self

        which calculated distance to point is closest to distance."""

        deltas = self - point

        distances = np.absolute(distance - np.einsum("ij,ij->i", deltas, deltas))

        result = (self[distances.argsort()[::-1]])[-n:]

        return Points(result)

asked Dec 25 at 10:07

Tweakimp

2841213

1

What is a typical size for Points, so that timings are meaningful?
– Graipher
Dec 25 at 18:23

Not all of above functions are currently in the library.I tracked size of the calls for a single game for the function closest: average n of 'closest' call: 13.8 min n of 'closest' call: 2 max n of 'closest' call: 128 So i think if you test with up to 300 for a really late game situation with both players maxed out, it would make a meaningful test.
– Tweakimp
Dec 25 at 18:29

Also, what is Point2. Is it also a subclass of numpy.ndarray (since it seems to support subtraction)?
– Graipher
Dec 26 at 10:57

1

You can see it in the link I provided. It is just an object with x and y coordinates.
– Tweakimp
Dec 26 at 10:58

add a comment |

I am coding a bot for Starcraft 2, in which many distances have to be calculated every frame.

Here is the part of the library that is being used and I want to improve: https://github.com/Dentosal/python-sc2/blob/develop/sc2/position.py

Are these functions implemented in most efficient way? Is there a way to improve the parts that look like this:

find = np.where(np.any(M < distance, axis=1)) selection = np.array([self[i] for i in find[0]])

Any other comments or suggestions are also welcome :)

from typing import Any, Dict, List, Optional, Set, Tuple, Union  # for mypy type checking



import numpy as np

from scipy.spatial.distance import cdist



from position import Point2





class Points(np.ndarray):

    def __new__(cls, units_or_points):

        obj = np.asarray(units_or_points).view(cls)

        return obj



    def closest_to(self, point: Point2) -> Point2:

        """Returns the point of self that is closest to another point."""

        if point in self:

            return Point2(tuple(point))

        deltas = self - point

        distances = np.einsum("ij,ij->i", deltas, deltas)

        result = self[np.argmin(distances)]

        return Point2(tuple(result))



    def closer_than(self, point: Point2, distance: Union[int, float]) -> "Points":

        """Returns a new Points object with all points of self that

        are closer than distance to point."""

        position = np.array([point])

        M = cdist(self, position)

        find = np.where(np.all(M < distance, axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def in_distance_between(

        self, point: Point2, distance1: Union[int, float], distance2: Union[int, float]

    ) -> "Points":

        """Returns a new Points object with all points of self

        that are between distance1 and distance2 away from point."""

        p = np.array([point])

        M = cdist(self, p)

        find = np.where(np.any(np.logical_and(distance1 < M, M < distance2), axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def sort_by_distance_to(self, point: Point2, reverse: bool = False) -> "Points":

        """Returns a new Points object with all points of self sorted by distance to point.

        Ordered from smallest to biggest distance. Reverse order with keyword reverse=True."""

        deltas = self - point

        distances = (1 if reverse else -1) * np.einsum("ij,ij->i", deltas, deltas)

        result = self[distances.argsort()[::-1]]

        return Points(result)



    def closest_n_points(self, point: Point2, n: int) -> "Points":

        """Returns a new Points object with the n points of self that are closest to point."""

        deltas = self - point

        distances = np.einsum("ij,ij->i", deltas, deltas)

        result = (self[distances.argsort()[::-1]])[-n:]

        return Points(result)



    def in_distance_of_points(self, points: "Points", distance: Union[int, float]) -> "Points":

        """Returns a new Points object with every point of self that

        is in distance of any point in points."""

        M = cdist(self, points)

        find = np.where(np.any(M < distance, axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def n_closest_to_distance(self, point: Point2, distance: Union[int, float], n: int) -> "Points":

        """Returns a new Points object with the n points of self

        which calculated distance to point is closest to distance."""

        deltas = self - point

        distances = np.absolute(distance - np.einsum("ij,ij->i", deltas, deltas))

        result = (self[distances.argsort()[::-1]])[-n:]

        return Points(result)

asked Dec 25 at 10:07

Tweakimp

2841213

1

What is a typical size for Points, so that timings are meaningful?
– Graipher
Dec 25 at 18:23

Not all of above functions are currently in the library.I tracked size of the calls for a single game for the function closest: average n of 'closest' call: 13.8 min n of 'closest' call: 2 max n of 'closest' call: 128 So i think if you test with up to 300 for a really late game situation with both players maxed out, it would make a meaningful test.
– Tweakimp
Dec 25 at 18:29

Also, what is Point2. Is it also a subclass of numpy.ndarray (since it seems to support subtraction)?
– Graipher
Dec 26 at 10:57

1

You can see it in the link I provided. It is just an object with x and y coordinates.
– Tweakimp
Dec 26 at 10:58

add a comment |

I am coding a bot for Starcraft 2, in which many distances have to be calculated every frame.

Here is the part of the library that is being used and I want to improve: https://github.com/Dentosal/python-sc2/blob/develop/sc2/position.py

Are these functions implemented in most efficient way? Is there a way to improve the parts that look like this:

find = np.where(np.any(M < distance, axis=1)) selection = np.array([self[i] for i in find[0]])

Any other comments or suggestions are also welcome :)

from typing import Any, Dict, List, Optional, Set, Tuple, Union  # for mypy type checking



import numpy as np

from scipy.spatial.distance import cdist



from position import Point2





class Points(np.ndarray):

    def __new__(cls, units_or_points):

        obj = np.asarray(units_or_points).view(cls)

        return obj



    def closest_to(self, point: Point2) -> Point2:

        """Returns the point of self that is closest to another point."""

        if point in self:

            return Point2(tuple(point))

        deltas = self - point

        distances = np.einsum("ij,ij->i", deltas, deltas)

        result = self[np.argmin(distances)]

        return Point2(tuple(result))



    def closer_than(self, point: Point2, distance: Union[int, float]) -> "Points":

        """Returns a new Points object with all points of self that

        are closer than distance to point."""

        position = np.array([point])

        M = cdist(self, position)

        find = np.where(np.all(M < distance, axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def in_distance_between(

        self, point: Point2, distance1: Union[int, float], distance2: Union[int, float]

    ) -> "Points":

        """Returns a new Points object with all points of self

        that are between distance1 and distance2 away from point."""

        p = np.array([point])

        M = cdist(self, p)

        find = np.where(np.any(np.logical_and(distance1 < M, M < distance2), axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def sort_by_distance_to(self, point: Point2, reverse: bool = False) -> "Points":

        """Returns a new Points object with all points of self sorted by distance to point.

        Ordered from smallest to biggest distance. Reverse order with keyword reverse=True."""

        deltas = self - point

        distances = (1 if reverse else -1) * np.einsum("ij,ij->i", deltas, deltas)

        result = self[distances.argsort()[::-1]]

        return Points(result)



    def closest_n_points(self, point: Point2, n: int) -> "Points":

        """Returns a new Points object with the n points of self that are closest to point."""

        deltas = self - point

        distances = np.einsum("ij,ij->i", deltas, deltas)

        result = (self[distances.argsort()[::-1]])[-n:]

        return Points(result)



    def in_distance_of_points(self, points: "Points", distance: Union[int, float]) -> "Points":

        """Returns a new Points object with every point of self that

        is in distance of any point in points."""

        M = cdist(self, points)

        find = np.where(np.any(M < distance, axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def n_closest_to_distance(self, point: Point2, distance: Union[int, float], n: int) -> "Points":

        """Returns a new Points object with the n points of self

        which calculated distance to point is closest to distance."""

        deltas = self - point

        distances = np.absolute(distance - np.einsum("ij,ij->i", deltas, deltas))

        result = (self[distances.argsort()[::-1]])[-n:]

        return Points(result)

asked Dec 25 at 10:07

Tweakimp

2841213

I am coding a bot for Starcraft 2, in which many distances have to be calculated every frame.

Here is the part of the library that is being used and I want to improve: https://github.com/Dentosal/python-sc2/blob/develop/sc2/position.py

Are these functions implemented in most efficient way? Is there a way to improve the parts that look like this:

find = np.where(np.any(M < distance, axis=1)) selection = np.array([self[i] for i in find[0]])

Any other comments or suggestions are also welcome :)

from typing import Any, Dict, List, Optional, Set, Tuple, Union  # for mypy type checking



import numpy as np

from scipy.spatial.distance import cdist



from position import Point2





class Points(np.ndarray):

    def __new__(cls, units_or_points):

        obj = np.asarray(units_or_points).view(cls)

        return obj



    def closest_to(self, point: Point2) -> Point2:

        """Returns the point of self that is closest to another point."""

        if point in self:

            return Point2(tuple(point))

        deltas = self - point

        distances = np.einsum("ij,ij->i", deltas, deltas)

        result = self[np.argmin(distances)]

        return Point2(tuple(result))



    def closer_than(self, point: Point2, distance: Union[int, float]) -> "Points":

        """Returns a new Points object with all points of self that

        are closer than distance to point."""

        position = np.array([point])

        M = cdist(self, position)

        find = np.where(np.all(M < distance, axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def in_distance_between(

        self, point: Point2, distance1: Union[int, float], distance2: Union[int, float]

    ) -> "Points":

        """Returns a new Points object with all points of self

        that are between distance1 and distance2 away from point."""

        p = np.array([point])

        M = cdist(self, p)

        find = np.where(np.any(np.logical_and(distance1 < M, M < distance2), axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def sort_by_distance_to(self, point: Point2, reverse: bool = False) -> "Points":

        """Returns a new Points object with all points of self sorted by distance to point.

        Ordered from smallest to biggest distance. Reverse order with keyword reverse=True."""

        deltas = self - point

        distances = (1 if reverse else -1) * np.einsum("ij,ij->i", deltas, deltas)

        result = self[distances.argsort()[::-1]]

        return Points(result)



    def closest_n_points(self, point: Point2, n: int) -> "Points":

        """Returns a new Points object with the n points of self that are closest to point."""

        deltas = self - point

        distances = np.einsum("ij,ij->i", deltas, deltas)

        result = (self[distances.argsort()[::-1]])[-n:]

        return Points(result)



    def in_distance_of_points(self, points: "Points", distance: Union[int, float]) -> "Points":

        """Returns a new Points object with every point of self that

        is in distance of any point in points."""

        M = cdist(self, points)

        find = np.where(np.any(M < distance, axis=1))

        selection = np.array([self[i] for i in find[0]])

        return Points(selection)



    def n_closest_to_distance(self, point: Point2, distance: Union[int, float], n: int) -> "Points":

        """Returns a new Points object with the n points of self

        which calculated distance to point is closest to distance."""

        deltas = self - point

        distances = np.absolute(distance - np.einsum("ij,ij->i", deltas, deltas))

        result = (self[distances.argsort()[::-1]])[-n:]

        return Points(result)

python python-3.x numpy scipy

asked Dec 25 at 10:07

Tweakimp

2841213

asked Dec 25 at 10:07

Tweakimp

2841213

asked Dec 25 at 10:07

Tweakimp

2841213

asked Dec 25 at 10:07

Tweakimp

2841213

asked Dec 25 at 10:07

Tweakimp

2841213

1

What is a typical size for Points, so that timings are meaningful?
– Graipher
Dec 25 at 18:23

Not all of above functions are currently in the library.I tracked size of the calls for a single game for the function closest: average n of 'closest' call: 13.8 min n of 'closest' call: 2 max n of 'closest' call: 128 So i think if you test with up to 300 for a really late game situation with both players maxed out, it would make a meaningful test.
– Tweakimp
Dec 25 at 18:29

Also, what is Point2. Is it also a subclass of numpy.ndarray (since it seems to support subtraction)?
– Graipher
Dec 26 at 10:57

1

You can see it in the link I provided. It is just an object with x and y coordinates.
– Tweakimp
Dec 26 at 10:58

add a comment |

1

What is a typical size for Points, so that timings are meaningful?
– Graipher
Dec 25 at 18:23

Not all of above functions are currently in the library.I tracked size of the calls for a single game for the function closest: average n of 'closest' call: 13.8 min n of 'closest' call: 2 max n of 'closest' call: 128 So i think if you test with up to 300 for a really late game situation with both players maxed out, it would make a meaningful test.
– Tweakimp
Dec 25 at 18:29

Also, what is Point2. Is it also a subclass of numpy.ndarray (since it seems to support subtraction)?
– Graipher
Dec 26 at 10:57

1

You can see it in the link I provided. It is just an object with x and y coordinates.
– Tweakimp
Dec 26 at 10:58

What is a typical size for Points, so that timings are meaningful?
– Graipher
Dec 25 at 18:23

Not all of above functions are currently in the library.I tracked size of the calls for a single game for the function closest: average n of 'closest' call: 13.8 min n of 'closest' call: 2 max n of 'closest' call: 128 So i think if you test with up to 300 for a really late game situation with both players maxed out, it would make a meaningful test.
– Tweakimp
Dec 25 at 18:29

Also, what is Point2. Is it also a subclass of numpy.ndarray (since it seems to support subtraction)?
– Graipher
Dec 26 at 10:57

You can see it in the link I provided. It is just an object with x and y coordinates.
– Tweakimp
Dec 26 at 10:58

add a comment |

1 Answer
1

active

oldest

votes

You should probably have another look at the scipy.spatial module. It provides (hopefully) faster methods for most of these checks using a k-d-tree.

from scipy.spatial import cKDTree



class Points(np.ndarray):

    def __new__(cls, units_or_points):

        obj = np.asarray(units_or_points).view(cls)

        obj.kd_tree = cKDTree(obj)

        return obj



    def closest_to(self, point: Point2) -> Point2:

        """Returns the point of self that is closest to another point."""

        _, i = self.kd_tree.query([[point.x, point.y]])

        return Point2(self[i][0])



    def closer_than(self, point: Point2, distance: Union[int, float]) -> "Points":

        """Returns a new Points object with all points of self that

        are closer than distance to point."""

        selection = self.kd_tree.query_ball_point([point.x, point.y], distance)

        return self[selection]



    def in_distance_between(

        self, point: Point2, distance1: Union[int, float], distance2: Union[int, float]

    ) -> "Points":

        """Returns a new Points object with all points of self

        that are between distance1 and distance2 away from point."""

        selection_close = self.kd_tree.query_ball_point([point.x, point.y], distance1)

        selection_far = self.kd_tree.query_ball_point([point.x, point.y], distance2)

        selection = list(set(selection_far) - set(selection_close))

        return self[selection]



    def closest_n_points(self, point: Point2, n: int) -> "Points":

        """Returns a new Points object with the n points of self that are closest to point."""

        _, indices = self.kd_tree.query([[point.x, point.y]], k=n)

        return self[indices]



    def in_distance_of_points(self, points: "Points", distance: Union[int, float]) -> "Points":

        """Returns a new Points object with every point of self that

        is in distance of any point in points."""

        pairs = self.kd_tree.query_ball_tree(points.kd_tree, distance)

        return points[[i for closest in pairs for i in closest]]

These are all the ones I could quickly find a way for using the tree. Not included are sort_by_distance_to, n_closest_to_distance and n_closest_to_distance.

In order to test if this is really faster, here are some tests, with the following setup:

np.random.seed(42)

points = np.random.rand(300, 2)

points_graipher = Points(points)

points_op = PointsOP(points)

point = Point2(np.random.rand(2))

points2 = np.random.rand(10, 2)

points2_graipher = Points(points2)

Here PointsOP is you class and Points is the class defined in this answer.

%timeit points_op.closest_to(point)

# 38.3 µs ± 1.35 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_to(point)

# 43.7 µs ± 249 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closer_than(point, 0.1)

# 39.5 µs ± 238 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)    

%timeit points_graipher.closer_than(point, 0.1)

# 11 µs ± 26 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)



%timeit points_op.in_distance_between(point, 0.1, 0.2)

# 52.9 µs ± 275 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_between(point, 0.1, 0.2)

# 21.9 µs ± 180 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closest_n_points(point, 10)

# 29.5 µs ± 359 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_n_points(point, 10)

# 41.7 µs ± 287 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.in_distance_of_points(points2, 0.1)

# 116 µs ± 727 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_of_points(points2_graipher, 0.1)

# 89.2 µs ± 500 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

As you can see for $N = 300$ points there are some methods which are faster with the KDTree (up to four times), some that are basically the same and some that are slower (by up to two times).

To get a feeling how the different approaches, scale, here are some plots. The only thing changing is the number of points. The steps are 30, 300, 3000, 30000.

enter image description here

To summarize, you should check this for some actual cases you have. Depending on the size of points your implementation or this implementation is faster.

edited Dec 26 at 16:28

answered Dec 26 at 11:31

Graipher

23.5k53585

Thank you for your answer. In the first comment i said n around 300 will be the highest values, most n will be at 20 or so.
– Tweakimp
Dec 26 at 12:19

@Tweakimp: Yes, which is why I took 300 as the first case. The larger case was mostly to see if and when the KDTree becomes much better. Will add some plots including also some lower numbers.
– Graipher
Dec 26 at 12:22

1

@Tweakimp: Added those plots.
– Graipher
Dec 26 at 12:59

1

Thank you for all the work you put into this. :) Interesting to see how closest n points is constant...
– Tweakimp
Dec 26 at 13:07

@Tweakimp: Yeah, I was a bit surprised by this as well. Apparently those cuts in the tree are very efficient :)
– Graipher
Dec 26 at 13:08

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f210306%2ffast-distance-calculation-for-starcraft2-bot%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

You should probably have another look at the scipy.spatial module. It provides (hopefully) faster methods for most of these checks using a k-d-tree.

from scipy.spatial import cKDTree



class Points(np.ndarray):

    def __new__(cls, units_or_points):

        obj = np.asarray(units_or_points).view(cls)

        obj.kd_tree = cKDTree(obj)

        return obj



    def closest_to(self, point: Point2) -> Point2:

        """Returns the point of self that is closest to another point."""

        _, i = self.kd_tree.query([[point.x, point.y]])

        return Point2(self[i][0])



    def closer_than(self, point: Point2, distance: Union[int, float]) -> "Points":

        """Returns a new Points object with all points of self that

        are closer than distance to point."""

        selection = self.kd_tree.query_ball_point([point.x, point.y], distance)

        return self[selection]



    def in_distance_between(

        self, point: Point2, distance1: Union[int, float], distance2: Union[int, float]

    ) -> "Points":

        """Returns a new Points object with all points of self

        that are between distance1 and distance2 away from point."""

        selection_close = self.kd_tree.query_ball_point([point.x, point.y], distance1)

        selection_far = self.kd_tree.query_ball_point([point.x, point.y], distance2)

        selection = list(set(selection_far) - set(selection_close))

        return self[selection]



    def closest_n_points(self, point: Point2, n: int) -> "Points":

        """Returns a new Points object with the n points of self that are closest to point."""

        _, indices = self.kd_tree.query([[point.x, point.y]], k=n)

        return self[indices]



    def in_distance_of_points(self, points: "Points", distance: Union[int, float]) -> "Points":

        """Returns a new Points object with every point of self that

        is in distance of any point in points."""

        pairs = self.kd_tree.query_ball_tree(points.kd_tree, distance)

        return points[[i for closest in pairs for i in closest]]

These are all the ones I could quickly find a way for using the tree. Not included are sort_by_distance_to, n_closest_to_distance and n_closest_to_distance.

In order to test if this is really faster, here are some tests, with the following setup:

np.random.seed(42)

points = np.random.rand(300, 2)

points_graipher = Points(points)

points_op = PointsOP(points)

point = Point2(np.random.rand(2))

points2 = np.random.rand(10, 2)

points2_graipher = Points(points2)

Here PointsOP is you class and Points is the class defined in this answer.

%timeit points_op.closest_to(point)

# 38.3 µs ± 1.35 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_to(point)

# 43.7 µs ± 249 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closer_than(point, 0.1)

# 39.5 µs ± 238 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)    

%timeit points_graipher.closer_than(point, 0.1)

# 11 µs ± 26 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)



%timeit points_op.in_distance_between(point, 0.1, 0.2)

# 52.9 µs ± 275 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_between(point, 0.1, 0.2)

# 21.9 µs ± 180 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closest_n_points(point, 10)

# 29.5 µs ± 359 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_n_points(point, 10)

# 41.7 µs ± 287 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.in_distance_of_points(points2, 0.1)

# 116 µs ± 727 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_of_points(points2_graipher, 0.1)

# 89.2 µs ± 500 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

As you can see for $N = 300$ points there are some methods which are faster with the KDTree (up to four times), some that are basically the same and some that are slower (by up to two times).

To get a feeling how the different approaches, scale, here are some plots. The only thing changing is the number of points. The steps are 30, 300, 3000, 30000.

enter image description here

To summarize, you should check this for some actual cases you have. Depending on the size of points your implementation or this implementation is faster.

edited Dec 26 at 16:28

answered Dec 26 at 11:31

Graipher

23.5k53585

Thank you for your answer. In the first comment i said n around 300 will be the highest values, most n will be at 20 or so.
– Tweakimp
Dec 26 at 12:19

@Tweakimp: Yes, which is why I took 300 as the first case. The larger case was mostly to see if and when the KDTree becomes much better. Will add some plots including also some lower numbers.
– Graipher
Dec 26 at 12:22

1

@Tweakimp: Added those plots.
– Graipher
Dec 26 at 12:59

1

Thank you for all the work you put into this. :) Interesting to see how closest n points is constant...
– Tweakimp
Dec 26 at 13:07

@Tweakimp: Yeah, I was a bit surprised by this as well. Apparently those cuts in the tree are very efficient :)
– Graipher
Dec 26 at 13:08

add a comment |

You should probably have another look at the scipy.spatial module. It provides (hopefully) faster methods for most of these checks using a k-d-tree.

from scipy.spatial import cKDTree



class Points(np.ndarray):

    def __new__(cls, units_or_points):

        obj = np.asarray(units_or_points).view(cls)

        obj.kd_tree = cKDTree(obj)

        return obj



    def closest_to(self, point: Point2) -> Point2:

        """Returns the point of self that is closest to another point."""

        _, i = self.kd_tree.query([[point.x, point.y]])

        return Point2(self[i][0])



    def closer_than(self, point: Point2, distance: Union[int, float]) -> "Points":

        """Returns a new Points object with all points of self that

        are closer than distance to point."""

        selection = self.kd_tree.query_ball_point([point.x, point.y], distance)

        return self[selection]



    def in_distance_between(

        self, point: Point2, distance1: Union[int, float], distance2: Union[int, float]

    ) -> "Points":

        """Returns a new Points object with all points of self

        that are between distance1 and distance2 away from point."""

        selection_close = self.kd_tree.query_ball_point([point.x, point.y], distance1)

        selection_far = self.kd_tree.query_ball_point([point.x, point.y], distance2)

        selection = list(set(selection_far) - set(selection_close))

        return self[selection]



    def closest_n_points(self, point: Point2, n: int) -> "Points":

        """Returns a new Points object with the n points of self that are closest to point."""

        _, indices = self.kd_tree.query([[point.x, point.y]], k=n)

        return self[indices]



    def in_distance_of_points(self, points: "Points", distance: Union[int, float]) -> "Points":

        """Returns a new Points object with every point of self that

        is in distance of any point in points."""

        pairs = self.kd_tree.query_ball_tree(points.kd_tree, distance)

        return points[[i for closest in pairs for i in closest]]

These are all the ones I could quickly find a way for using the tree. Not included are sort_by_distance_to, n_closest_to_distance and n_closest_to_distance.

In order to test if this is really faster, here are some tests, with the following setup:

np.random.seed(42)

points = np.random.rand(300, 2)

points_graipher = Points(points)

points_op = PointsOP(points)

point = Point2(np.random.rand(2))

points2 = np.random.rand(10, 2)

points2_graipher = Points(points2)

Here PointsOP is you class and Points is the class defined in this answer.

%timeit points_op.closest_to(point)

# 38.3 µs ± 1.35 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_to(point)

# 43.7 µs ± 249 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closer_than(point, 0.1)

# 39.5 µs ± 238 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)    

%timeit points_graipher.closer_than(point, 0.1)

# 11 µs ± 26 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)



%timeit points_op.in_distance_between(point, 0.1, 0.2)

# 52.9 µs ± 275 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_between(point, 0.1, 0.2)

# 21.9 µs ± 180 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closest_n_points(point, 10)

# 29.5 µs ± 359 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_n_points(point, 10)

# 41.7 µs ± 287 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.in_distance_of_points(points2, 0.1)

# 116 µs ± 727 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_of_points(points2_graipher, 0.1)

# 89.2 µs ± 500 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

As you can see for $N = 300$ points there are some methods which are faster with the KDTree (up to four times), some that are basically the same and some that are slower (by up to two times).

To get a feeling how the different approaches, scale, here are some plots. The only thing changing is the number of points. The steps are 30, 300, 3000, 30000.

enter image description here

To summarize, you should check this for some actual cases you have. Depending on the size of points your implementation or this implementation is faster.

edited Dec 26 at 16:28

answered Dec 26 at 11:31

Graipher

23.5k53585

Thank you for your answer. In the first comment i said n around 300 will be the highest values, most n will be at 20 or so.
– Tweakimp
Dec 26 at 12:19

@Tweakimp: Yes, which is why I took 300 as the first case. The larger case was mostly to see if and when the KDTree becomes much better. Will add some plots including also some lower numbers.
– Graipher
Dec 26 at 12:22

1

@Tweakimp: Added those plots.
– Graipher
Dec 26 at 12:59

1

Thank you for all the work you put into this. :) Interesting to see how closest n points is constant...
– Tweakimp
Dec 26 at 13:07

@Tweakimp: Yeah, I was a bit surprised by this as well. Apparently those cuts in the tree are very efficient :)
– Graipher
Dec 26 at 13:08

add a comment |

You should probably have another look at the scipy.spatial module. It provides (hopefully) faster methods for most of these checks using a k-d-tree.

from scipy.spatial import cKDTree



class Points(np.ndarray):

    def __new__(cls, units_or_points):

        obj = np.asarray(units_or_points).view(cls)

        obj.kd_tree = cKDTree(obj)

        return obj



    def closest_to(self, point: Point2) -> Point2:

        """Returns the point of self that is closest to another point."""

        _, i = self.kd_tree.query([[point.x, point.y]])

        return Point2(self[i][0])



    def closer_than(self, point: Point2, distance: Union[int, float]) -> "Points":

        """Returns a new Points object with all points of self that

        are closer than distance to point."""

        selection = self.kd_tree.query_ball_point([point.x, point.y], distance)

        return self[selection]



    def in_distance_between(

        self, point: Point2, distance1: Union[int, float], distance2: Union[int, float]

    ) -> "Points":

        """Returns a new Points object with all points of self

        that are between distance1 and distance2 away from point."""

        selection_close = self.kd_tree.query_ball_point([point.x, point.y], distance1)

        selection_far = self.kd_tree.query_ball_point([point.x, point.y], distance2)

        selection = list(set(selection_far) - set(selection_close))

        return self[selection]



    def closest_n_points(self, point: Point2, n: int) -> "Points":

        """Returns a new Points object with the n points of self that are closest to point."""

        _, indices = self.kd_tree.query([[point.x, point.y]], k=n)

        return self[indices]



    def in_distance_of_points(self, points: "Points", distance: Union[int, float]) -> "Points":

        """Returns a new Points object with every point of self that

        is in distance of any point in points."""

        pairs = self.kd_tree.query_ball_tree(points.kd_tree, distance)

        return points[[i for closest in pairs for i in closest]]

These are all the ones I could quickly find a way for using the tree. Not included are sort_by_distance_to, n_closest_to_distance and n_closest_to_distance.

In order to test if this is really faster, here are some tests, with the following setup:

np.random.seed(42)

points = np.random.rand(300, 2)

points_graipher = Points(points)

points_op = PointsOP(points)

point = Point2(np.random.rand(2))

points2 = np.random.rand(10, 2)

points2_graipher = Points(points2)

Here PointsOP is you class and Points is the class defined in this answer.

%timeit points_op.closest_to(point)

# 38.3 µs ± 1.35 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_to(point)

# 43.7 µs ± 249 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closer_than(point, 0.1)

# 39.5 µs ± 238 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)    

%timeit points_graipher.closer_than(point, 0.1)

# 11 µs ± 26 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)



%timeit points_op.in_distance_between(point, 0.1, 0.2)

# 52.9 µs ± 275 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_between(point, 0.1, 0.2)

# 21.9 µs ± 180 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closest_n_points(point, 10)

# 29.5 µs ± 359 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_n_points(point, 10)

# 41.7 µs ± 287 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.in_distance_of_points(points2, 0.1)

# 116 µs ± 727 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_of_points(points2_graipher, 0.1)

# 89.2 µs ± 500 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

As you can see for $N = 300$ points there are some methods which are faster with the KDTree (up to four times), some that are basically the same and some that are slower (by up to two times).

To get a feeling how the different approaches, scale, here are some plots. The only thing changing is the number of points. The steps are 30, 300, 3000, 30000.

enter image description here

To summarize, you should check this for some actual cases you have. Depending on the size of points your implementation or this implementation is faster.

edited Dec 26 at 16:28

answered Dec 26 at 11:31

Graipher

23.5k53585

You should probably have another look at the scipy.spatial module. It provides (hopefully) faster methods for most of these checks using a k-d-tree.

from scipy.spatial import cKDTree



class Points(np.ndarray):

    def __new__(cls, units_or_points):

        obj = np.asarray(units_or_points).view(cls)

        obj.kd_tree = cKDTree(obj)

        return obj



    def closest_to(self, point: Point2) -> Point2:

        """Returns the point of self that is closest to another point."""

        _, i = self.kd_tree.query([[point.x, point.y]])

        return Point2(self[i][0])



    def closer_than(self, point: Point2, distance: Union[int, float]) -> "Points":

        """Returns a new Points object with all points of self that

        are closer than distance to point."""

        selection = self.kd_tree.query_ball_point([point.x, point.y], distance)

        return self[selection]



    def in_distance_between(

        self, point: Point2, distance1: Union[int, float], distance2: Union[int, float]

    ) -> "Points":

        """Returns a new Points object with all points of self

        that are between distance1 and distance2 away from point."""

        selection_close = self.kd_tree.query_ball_point([point.x, point.y], distance1)

        selection_far = self.kd_tree.query_ball_point([point.x, point.y], distance2)

        selection = list(set(selection_far) - set(selection_close))

        return self[selection]



    def closest_n_points(self, point: Point2, n: int) -> "Points":

        """Returns a new Points object with the n points of self that are closest to point."""

        _, indices = self.kd_tree.query([[point.x, point.y]], k=n)

        return self[indices]



    def in_distance_of_points(self, points: "Points", distance: Union[int, float]) -> "Points":

        """Returns a new Points object with every point of self that

        is in distance of any point in points."""

        pairs = self.kd_tree.query_ball_tree(points.kd_tree, distance)

        return points[[i for closest in pairs for i in closest]]

These are all the ones I could quickly find a way for using the tree. Not included are sort_by_distance_to, n_closest_to_distance and n_closest_to_distance.

In order to test if this is really faster, here are some tests, with the following setup:

np.random.seed(42)

points = np.random.rand(300, 2)

points_graipher = Points(points)

points_op = PointsOP(points)

point = Point2(np.random.rand(2))

points2 = np.random.rand(10, 2)

points2_graipher = Points(points2)

Here PointsOP is you class and Points is the class defined in this answer.

%timeit points_op.closest_to(point)

# 38.3 µs ± 1.35 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_to(point)

# 43.7 µs ± 249 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closer_than(point, 0.1)

# 39.5 µs ± 238 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)    

%timeit points_graipher.closer_than(point, 0.1)

# 11 µs ± 26 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)



%timeit points_op.in_distance_between(point, 0.1, 0.2)

# 52.9 µs ± 275 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_between(point, 0.1, 0.2)

# 21.9 µs ± 180 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.closest_n_points(point, 10)

# 29.5 µs ± 359 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.closest_n_points(point, 10)

# 41.7 µs ± 287 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)



%timeit points_op.in_distance_of_points(points2, 0.1)

# 116 µs ± 727 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit points_graipher.in_distance_of_points(points2_graipher, 0.1)

# 89.2 µs ± 500 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

As you can see for $N = 300$ points there are some methods which are faster with the KDTree (up to four times), some that are basically the same and some that are slower (by up to two times).

To get a feeling how the different approaches, scale, here are some plots. The only thing changing is the number of points. The steps are 30, 300, 3000, 30000.

enter image description here

To summarize, you should check this for some actual cases you have. Depending on the size of points your implementation or this implementation is faster.

edited Dec 26 at 16:28

answered Dec 26 at 11:31

Graipher

23.5k53585

edited Dec 26 at 16:28

answered Dec 26 at 11:31

Graipher

23.5k53585

answered Dec 26 at 11:31

Graipher

23.5k53585

answered Dec 26 at 11:31

Graipher

23.5k53585

Thank you for your answer. In the first comment i said n around 300 will be the highest values, most n will be at 20 or so.
– Tweakimp
Dec 26 at 12:19

@Tweakimp: Yes, which is why I took 300 as the first case. The larger case was mostly to see if and when the KDTree becomes much better. Will add some plots including also some lower numbers.
– Graipher
Dec 26 at 12:22

1

@Tweakimp: Added those plots.
– Graipher
Dec 26 at 12:59

1

Thank you for all the work you put into this. :) Interesting to see how closest n points is constant...
– Tweakimp
Dec 26 at 13:07

@Tweakimp: Yeah, I was a bit surprised by this as well. Apparently those cuts in the tree are very efficient :)
– Graipher
Dec 26 at 13:08

add a comment |

Thank you for your answer. In the first comment i said n around 300 will be the highest values, most n will be at 20 or so.
– Tweakimp
Dec 26 at 12:19

@Tweakimp: Yes, which is why I took 300 as the first case. The larger case was mostly to see if and when the KDTree becomes much better. Will add some plots including also some lower numbers.
– Graipher
Dec 26 at 12:22

1

@Tweakimp: Added those plots.
– Graipher
Dec 26 at 12:59

1

Thank you for all the work you put into this. :) Interesting to see how closest n points is constant...
– Tweakimp
Dec 26 at 13:07

@Tweakimp: Yeah, I was a bit surprised by this as well. Apparently those cuts in the tree are very efficient :)
– Graipher
Dec 26 at 13:08

Thank you for your answer. In the first comment i said n around 300 will be the highest values, most n will be at 20 or so.
– Tweakimp
Dec 26 at 12:19

@Tweakimp: Yes, which is why I took 300 as the first case. The larger case was mostly to see if and when the KDTree becomes much better. Will add some plots including also some lower numbers.
– Graipher
Dec 26 at 12:22

@Tweakimp: Added those plots.
– Graipher
Dec 26 at 12:59

Thank you for all the work you put into this. :) Interesting to see how closest n points is constant...
– Tweakimp
Dec 26 at 13:07

@Tweakimp: Yeah, I was a bit surprised by this as well. Apparently those cuts in the tree are very efficient :)
– Graipher
Dec 26 at 13:08

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Code Review Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Gfrktyl