Timeline for Why is the worst case for this function O(n^2)?

Current License: CC BY-SA 4.0

23 events

when toggle format	what		by	license	comment
Oct 2, 2021 at 19:34	comment	added	Eric Duminil		@sdenham: I didn't even realize you typed "sort" twice, and indeed understood it as binary search. I edited the answer and tried to integrate your comments. Thanks for the feedback.
Oct 2, 2021 at 19:33	history	edited	Eric Duminil	CC BY-SA 4.0	added 282 characters in body
Oct 2, 2021 at 15:22	comment	added	sdenham		@EricDuminil A correction to my previous example - it should be "binary search", not "binary sort" - though I imagine you already recognized it... Please do update your answer, as doing so will unlock my downvote - which I now think was undeserved in the first place - so I can undo it.
Oct 2, 2021 at 13:10	comment	added	Eric Duminil		@sdenham Excellent explanation, thanks. I'll update my answer.
Oct 2, 2021 at 13:00	comment	added	sdenham		@EricDuminil ...Partial functions are perfectly reasonable and even ubiquitous in math and comp. sci. - e.g. the binary sort, defined only for sorted collections (even though a Python implementation could be applied to unsorted ones.) It is certainly something we use and can justifiably analyze, even though the code could give wrong answers when applied outside of its domain... I hope this goes some way to answering your question "why?''
Oct 2, 2021 at 12:59	comment	added	sdenham		@EricDuminil This code, if considered as a function from three arbitrary collections to a boolean, is O(n³). However, by stipulation, it is being analyzed as a function from three sets to a boolean, which is O(n²) for a non-obvious (and therefore pedagogically useful) reason. In situations where it is being used as the latter, it would be pointless to analyze it as the former. If it were written in a more strongly-typed language, it might not even be possible to create a program in which it is applied to arbitrary collections...
Oct 1, 2021 at 7:17	comment	added	Eric Duminil		@sdenham: I just stumbled upon this question again. Just to be sure : do you agree that, in the general case (e.g. with duplicate elements in the sequences), the above algorithm is O(n**3)?
Sep 9, 2019 at 22:04	comment	added	sdenham		Even as an answer to what you thought the question was, note that there are a great many different curves that pass through two points - and anything you offer as explanation here is missing from that answer.
Sep 9, 2019 at 20:12	comment	added	sdenham		Furthermore, in your last reply, you also presented an answer to a question that was not posed - I wrote "counting the number of steps in one example is not an explanation."
Sep 9, 2019 at 20:05	review	Low quality posts
Sep 10, 2019 at 2:25
Sep 9, 2019 at 19:58	comment	added	sdenham		You are missing the point that you are not answering the question that was posed.
Sep 9, 2019 at 17:22	comment	added	sdenham		I think I see the issue here: when the authors say "we will assume that no individual sequence contains duplicate values", that is not a step in answering the question; it is, rather, a precondition under which the question is going to be addressed. For pedagogical purposes, this turns an uninteresting problem into one that challenges peoples' intuitions about big-O - and it seems to have been successful at that, judging by the number of people who have strongly insisted that O(n²) must be wrong... Also, while it is moot here, counting the number of steps in one example is not an explanation.
Sep 9, 2019 at 16:28	comment	added	sdenham		While a discussion of other functions to solve this problem may be informative in its own right (though there seems to be some confusion here), it does not answer the question asked. There is a proof (without quotes) that this function is O(n²), and it is not really an answer if one is left unsure why it is not O(n³).
Sep 9, 2019 at 14:18	comment	added	Baldrickk		@JanDorniak I think sets and dicts are hash tables in python as opposed to the red-black trees in C++. So the absolute worst case is worse, up to 0(n) for a search, but the average case is O(1). As opposed to O(log n) for C++ wiki.python.org/moin/TimeComplexity. Given that it's a python question, and that the domain of the problem leads to a high likelyhood of average case performance, I don't think the O(1) claim is a poor one.
Sep 9, 2019 at 14:12	comment	added	jaskij		@Baldrickk that much I know but do Python sets or dicts have guaranteed big O? Most containers in C++ standard library have guarantees on big O that's why I even mentioned it here
Sep 9, 2019 at 12:56	comment	added	Baldrickk		@JanDorniak Python has sets and dicts. In fact using dicts in python is more common and easier than using map in C++
Sep 9, 2019 at 10:22	comment	added	jaskij		I don't know if Python has such data structures, but C++'s `std::set` and `std::map` work that way if memory serves. Also ping me once you edit so that I can remove that downvote.
Sep 9, 2019 at 10:18	comment	added	jaskij		@EricDuminil hash sets indeed have bad worst-case performance, but on average work well. A balanced binary tree, such as a red-black tree, guarantees an O(log n) lookup, getting worst case to O(n log n). The downside is that it requires an order on the elements (or their keys).
Sep 9, 2019 at 2:50	comment	added	jaskij		Unless you get a perfect hash with at least one bucket per input element you cannot test inclusion in O(1). A sorted set usually has O(log n) lookup. Unless you are talking about average cost, but that's not what the question is about. Still, having a balanced binary set getting hard O(n log n) is trivial.
Sep 9, 2019 at 2:44	comment	added	SteveJ		Your last comment is quite interesting, hadn't thought of that. Are you suggesting that is due to your being able to do three O(n) operations in series?
Sep 8, 2019 at 13:52	history	edited	Eric Duminil	CC BY-SA 4.0	added 275 characters in body
Sep 8, 2019 at 12:40	review	First posts
Sep 10, 2019 at 0:46
Sep 8, 2019 at 12:38	history	answered	Eric Duminil	CC BY-SA 4.0

toggle format