So in this video and the next, we're going to study a very cool divide and conquer algorithm for the closest pair problem.
this is a problem where you're given n points in the plane and you want to figure out which pair of points are closest to each other.
So this would be the first taste we get of an application in computational geometry, which is the part of algorithms which studies how to reason and manipulate geometric objects.
So those algorithms are important in, among other areas robotics, computer vision and computer graphics.
So this is relatively advanced material, it's a bit more difficult than the other applications of divide and conquer that we've seen.
The algorithm's a little bit tricky and it has a quite nontrivial proof of correctness, so just be ready for that and also be warned that because it's more advanced I'm going to talk about the material in at a slightly faster pace tha I do in most of the other videos.
So let's begin now by defining the problem formally, so we're given as imput endpoints in the plane, so each one just define by its x coordinate and ist y coordinate.
And when we talk about the distance between two points in this problem, we're going to focus on Euclidean distance.
So, let me remind you what that is briefly, but we're going to introduce some simple notation for that, which we'll use for the rest of the lecture.
So we're just going to note the Euclidean distance between two points, pi and pj, by d of pi pj.
So in terms of the x and y coordinates of these two points, we just look at the squared differences in each coordinate, sum them up, and take the square root.
And now as the name of the problem would suggest, the goal is to identify among all pairs of points that pair which has the smallest distance between them.
Next, let's start getting a feel for the problem by making some preliminary observations.
First, I want to make an assumption purely for convenience that there's no ties.
So that is I'm going to assume all endpoints have distinct x coordinat es, and also all endpoints have distinct y coordinates.
It's not difficult to extend the algorithm to accommodate ties.
I'll leave it to you to think about how to do that.
So next, let's draw some parallels with the problem of counting inversions, which was a earlier application of divide and conquer that we saw.
The first parallel I want, want to out is that, if we're comfortable with the quadratic time algorithm, then this is not a hard problem, we can simply solve this by brute-force search.
And again, by brute-force search, I just mean we set up a double for loop, which iterates over all distinct pairs of points.
We compute the distance for each such pair and we remember the smallest.
That's clearly a correct algorithm, it has to iterate over a quadratic number of pairs, so its running time is going to be theta of n squared.
And, as always, the question is can we apply some algorithmic ingenuity to do better? Can we have a better algorithm than this naive one which iterates over all pairs of points? You might have a, an initial instinct that because the problem asks about a quadratic number of different objects, perhaps we fundamentally need to do quadratic work.
But again, recall back in counting inversions, using divide and conquer, we were able to get an n log n algorithm despite the fact that there might be as many as a quadratic number of inversions in an array.
So the question is, can we do something similar here for the closest pair problem? Now, one of the keys to getting an n log n time algorithm for counting inversions was to leverage a sorting subroutine.
Recall that we piggybacked on merge sort to count the number of inversions in n log n time.
So the question is, here, with the closest pair problem, perhaps, sorting again can be useful in some way to beat the quadratic barrier.
So, to develop some evidence that sorting will indeed help us compute the closest pair of points embedded in quadratic time, let's look at a special case of the problem, really, an easier version of t he problem, which is when the points are just in one dimension, so on the line rather that in two dimensions in the plane.
So in the 1D version, all the points just lie on a line like this one, and we're given the points in some arbitrary order not necessarily in sorted order.
So, a way to solve the closest pair problem in one dimension, is to simply sort the points, and then of course, the closest pair better be adjacent in this ordering, so you just iterate through the n minus 1 consecutive pairs and see which one is closest to each other So, more formally, here's how you solve the one-dimensional version of the problem.
You sort the points according to their only coordinate, because you're going to remember, this is one dimension.
So as we've seen, using merge sort, we can sort the points in n log n time and then we just do a scan through the points, so this takes linear time.
And for each consecutive pair, we compute their distance and we remember the smallest of those consecutive pairs and we return that.
That's gotta be the closest pair.
So, in this picture here on the right, I'm just going to circle here in green the closest pair of points.
So this is something we discover by sorting and then doing a linear scan.
Now, needless to say, this isn't directly useful, this is not the problem I started out with.
We wanted to find out the closest pair among of points in the plane not points in the line.
But, I want to point out that, this, even in the line, there are a quadratic number of different pairs, so brute-force search is still a quadratic time algorythm even in the 1D case.
So at least, with one dimension, we can use sorting, piggyback on it, to beat the naive brute-force search bound and solve the problem in n log n time.
So our goal for this lecture is going to be to devise an equally good algorithm for the two-dimensional case, so we want to solve closest pair of points in the plane, again, in n log n, n time.
So we will succeed in this goal.
I'm going to show you an n log n time algo rithm for 2D closest pair.
It's going to take us a couple steps.
So let me begin with a high level approach.
Alright.
So the first I need to try is just to copy what worked for us in the one-dimensional case.
So the one-dimensional case, we first sorted the points by their coordinate and that was really useful.
Now, in the 2D case, points have two coordinates, x coordinates and y coordinates, so there's two ways to sort them.
So let's just sort them both ways, that is, the first step of our algorithm, which you should really think of as a preprocessing step.
We're going to take the input points.
We invoke merge sort once to sort them according to x coordinate, that's one copy of the points.
And then we make a second copy of the points where they're sorted by y coordinates.
So we're going to call those copies of points px, that's an array of the points sorted by x coordinate, and py for them sorted by y coordinate.
Now, we know merge short takes n log n times, so this preprocessing step only takes o of n log n time.
And again, given that we're shooting for an algorithm with running time big O of n log n, why not sort the points? We don't even know how we're going to use this fact right now, but it's sort of harmless.
It's not going to effect our goal of getting a big of O n log n time algorithm.
And indeed, this illustrates a broader point, which is one of the themes of this course.
So recall, I hope one of the things you take away from this course is a sense for what are the four free primitives, what are manipulations or operations you can do on data which basically are costless.
Meaning that if your data set fits in the main memory of your computer, you can basically invoke the primitive and it's just going to run blazingly fast, and you can just do it even if you don't know why.
And again, sorting is the canonical for free primitive, although, we'll see some more later in the course and so, here, we're using exactly that principle.
So we don't even understand why yet we might wa nt the points to be sorted.
It just seems like it's probably going to be useful, motivated by the 1D case, so let's go ahead and make assorted copies of the points by x and y coordinate upfront.
So reasoning by analogy with the 1D suggests that sorting the points might be useful, but we can't carry this analogy too far.
So in particular, we're not going to be able to get away with just a simple linear scan through these arrays to identify the closest pair of points.
So, to see that, consider the following example.
So we're going to look at a point set which has six points.
There's going to be two points, which I'll put in blue which are very close in x coordinate, but very far away in y coordinate.
And then there's going to be another pair of points which I'll do in green, which are very close in y coordinate, but very far away in x coordinate.
And then there's going to be a red pair of points, which are not too far away in either the x coordinate or the y coordinate.
So in this set of six points, the closest pair is the pair of red points.
They're not even going to show up consecutively on either of the two arrays, right? So in the array that's sorted by x coordinate, this blue point here is going to be wedged in between the two red points, they won't be consecutive.
And similarly in the, in py, which is sort of by y coordinate, this green coordinate is going to be wedged between the two red points.
So you won't even notice these red point if you just do a linear scan if your px and py, or py look at the consecutive pairs of points.
So, following our preprocessing step where we just invert, invoke merge sort twice we're going to do a quite nontrivial divide and conquer algorithm to compute the closest pair.
So really, in this algorithm, we're applying the divide and conquer algorithm twice.
First, internal to the sorting subroutine, assuming that we use the merge sort algorithm to sort.
Divide and conquer is being used there to get an n log n running time in this preprocessing step, and the n, we're going to use it again on sorted arrays in a new way and that's what I'm going to tell you about next.
So let's just briefly review the divide and conquer algorithm design paradigm before we apply it to the closest pair problem.
So, as usual, the first step is to figure out a way to divide your problem into smaller subproblems.
Sometimes this has a reasonable amount of ingenuity, but it's not going to.
Here in the closest pair problem, we're going to proceed exactly as we did in the merge sort and counting inversions problems, where we took the array and broke it into its left and right half.
So here, we're going to take the input point set, and again, just recurse on the left half of the points, and recurse on the right half of the points.
So here, by left and right, I mean with respect to the points x coordinates.
There's pretty much never any ingenuity in the conquer step, that just means you take the sub-problems you identified in the first step, and you solve them recursively.
That's what we'll do here, we'll recursively complete the closest pair in the left half of the points, and the closest pair in the right half of the points.
So where all the creativity in divide and conquer algorithms is in the combined step.
Given the solutions to your sub problems, how do you somehow recover a solution to the original problem? The one that you actually care about.
So for closest pair, the questionis going to be, given that you've computed the closest pair on the left half of the points, and the closest pair on the right half of the points, how do you then quickly recover the closest pair for the whole point set? That's a tricky problem, that's what we're going to spend most of our time on.
So let's make this divide and conquer approach for closest pair a little bit more precise, so let's now actually start spelling out our closest pair algorithm.
The input we're given, it's, this follows the preprocessing steps or recall that we invoke, merge sort, we get our two sorted copies of the poin t set Px, sorted by x coordinate, and py sorted by y coordinate.
So the first dividend is the division step.
So given that we have a copy of the points px sorted by x coordinate, it's easy to identify the leftmost half of the points, those with the, those n over two smallest x coordinates and in the right half, those were the n over two largest x coordinates.
We're going to call those Q and R respectively.
One thing I'm skipping over is the base case.
I'm not going to bother writing that down, so base case omitted, but it's what you would think it would be.
So basically once you have a small number point, say two points or three points, then you can just solve the problem in constant time by a brute-force search.
You just look at all the pairs and you return the closest pair.
So think of it being at least four points in the input.
Now, in order to recurse, to call clo pair again, in the left and right halves, we need sorted version of Q and R, both by x coordinate and by y coordinate, so we're just going to form those by doing suitable linear scans through px and py.
And so one thing I encourage you to think through carefully or maybe even code up after the video is how would you form Qx, Qy, Rx and Ry given that you already have Px and Py.
And if you think about it, because Px and Py are already sorted just producing these sorted sublists takes linear time.
It's in some sense the opposite of the merge subroutine used in merge sort.
Here, we're sort of splitting rather than merging.
But again, this can be done in linear time, that's something you should think through carefully later.
So that's the division step, now we just conquer, meaning we recursively call closest pair line on each of the two subproblems, so when we invoke closest pair on the left half of the points on Q we're going to get back what are indeed, the closest pair of points amongst those in Q.
So we're going to call those P1 and Pq, So among all pairs of points that both lie in Q, P1 and Q1 minimize the distance between them.
Similarly, we're going to call Q2Q2 the results of the second recursive call, that is, P2 and Q2 are amongst all pairs of points that both lie in R, the pair that has the minimum Euclidean distance.
Now, conceptually, there's two cases.
There's a lucky case and there's an unlucky case.
In the original point set P, if we're lucky, the closest pair of points in all of P, actually, both of them lie in Q or both of them lie in R.
In this lucky case, we'd already be done if the closest pair in the entire point set they happen to both lie in Q, then this first recursive call is going to recover them and we just have them in our hands P1Q1.
Similarly, if both of the closest pair of points in all of P lies on the right side in R, then they get handed to us on a silver platter by the second recursive call that just operate on R.
So in the unlucky case, the closest pair of point in P happens to be split.
That is, one of the points lies in the left half, in Q, and the other point lies in the right half, in R.
Notice, if the closest pair of points in all of P is split, is half in Q and half in R, neither recursive call is going to find it.
Okay? The pair of points is not passed to either of the two recursive calls, so there's no way it's going to be returned to us.
Okay? So we have not identified the closest pair after these two recursive calls, if the closest pair happens to be split.
This is exactly analagous to what happened when we were counting inversions.
The recursive call on the left half of the array counted the left inversions.
The recursive call on the right half of the array counted the right inversions.
But we still had to count the split inversions, so in this closest pair algorithm, we still need a special purpose subroutine that computes the closest pair for the case in which it is split, in which there is one point in Q and one point in R.
So just like in counting inversions, I'm going to write down that subroutine and I'm going to leave it unimplemented for now, we'll figur e out how to implement it quickly in the rest of the lecture.
Now, if we have a correct implementation of closest split pair, so that takes us input the original point set sort of the x and y coordinate, and returns the smallest pair that's split or one points in Q and one points in R, then we're done.
So then, the split, then the closest pair has to either be on the lef or onm the right or it has to be split.
Steps two through four compute the closest pair in each of those categories, so those are the only possible candidates for the closest pair and we just returned the best of them.
So that's an argument for y, if we have a correct implementation of the closest split para subroutine, then that implies a correct implementation of closest pair.
Now, what about the running time? So the running time of the closest para algorithm is going to be in part determined by the running time of closest split pair.
So in the next quiz, I want you to think about what kind of running time we should be shooting for with a closest split pair subroutine.
因此,在此視頻及下一個視頻中,我們將研究一種非??岬姆侄沃惴?,用于解決最接近的配對問題。
這是一個問題,您在平面中獲得了n個點,并且想找出哪對點最接近。
因此,這將是我們在計算幾何中應(yīng)用的第一個嘗試,這是研究如何推理和操縱幾何對象的算法的一部分。
因此,這些算法在機(jī)器人技術(shù),計算機(jī)視覺和計算機(jī)圖形學(xué)等領(lǐng)域都很重要。
因此,這是相對高級的材料,它比我們已經(jīng)看到的其他“分而治之”應(yīng)用程序要困難一些。
該算法有點棘手,并且具有相當(dāng)不平凡的正確性證明,因此請為此做好準(zhǔn)備,并警告一下,由于它的高級性,我將以稍微快一點的速度談?wù)撛摬牧?。其他大多?shù)視頻。
因此,讓我們從形式上定義問題開始,因此,我們將其指定為平面中的歸因端點,因此每個端點僅由其x坐標(biāo)和ist y坐標(biāo)定義。
當(dāng)我們討論此問題中兩點之間的距離時,我們將重點關(guān)注歐幾里得距離。
因此,讓我提醒您一下這是簡短的內(nèi)容,但是我們將為此引入一些簡單的表示法,我們將在后續(xù)的講座中使用。
因此,我們僅要注意pi pj的d點,即pi和pj兩點之間的歐幾里得距離。
因此,就這兩點的x和y坐標(biāo)而言,我們只需查看每個坐標(biāo)的平方差,將它們求和,然后求平方根即可。
現(xiàn)在,正如問題的名稱所暗示的那樣,目標(biāo)是在所有成對的點之間識別距離最小的那對點。
接下來,讓我們通過做一些初步的觀察來開始了解這個問題。
首先,我僅出于方便起見就假設(shè)沒有聯(lián)系。
因此,我將假設(shè)所有端點都具有不同的x坐標(biāo),并且所有端點都具有不同的y坐標(biāo)。
擴(kuò)展算法以適應(yīng)聯(lián)系并不困難。
我留給您考慮如何做。
因此,接下來,讓我們與計算反轉(zhuǎn)的問題進(jìn)行一些比較,這是我們所看到的分而治之的早期應(yīng)用。
我想做的第一個并行操作是,如果我們對二次時間算法感到滿意,那么這不是一個難題,我們可以通過蠻力搜索簡單地解決這個問題。
同樣,通過蠻力搜索,我只是意味著我們設(shè)置了一個double for循環(huán),該循環(huán)遍歷所有不同的點對。
我們?yōu)槊總€這樣的對計算距離,并記住最小的距離。
顯然,這是一種正確的算法,它必須迭代二次對,因此其運(yùn)行時間將是n平方的θ。
而且,像往常一樣,問題是我們可以運(yùn)用某些算法的獨創(chuàng)性來做得更好嗎?我們可以有一個比在所有點對上迭代的幼稚算法更好的算法嗎?最初,您可能會有一個直覺,因為問題詢問的是不同數(shù)量的對象的二次方,也許我們從根本上需要進(jìn)行二次工作。
但是,再次回顧一下,使用除法和征服計數(shù)倒數(shù)的方法,盡管數(shù)組中倒數(shù)的數(shù)量可能是二次數(shù),但我們?nèi)匀豢梢缘玫絥 log n算法。
所以問題是,對于最接近的配對問題,我們可以在這里做類似的事情嗎?現(xiàn)在,獲取n log n時間算法來計算反轉(zhuǎn)的關(guān)鍵之一是利用排序子例程。
回想一下,我們背負(fù)了歸并排序,以n次n次記錄反轉(zhuǎn)次數(shù)。
因此,問題是,在這里,對于最接近的配對問題,也許可以用某種方式再次進(jìn)行排序以克服二次障礙。
因此,為了獲得一些證據(jù)證明排序確實可以幫助我們計算二次時間中嵌入的最接近的點對,讓我們看一下問題的一種特殊情況,實際上是問題的更簡單版本,即當(dāng)這些點只是在一維上,所以在直線上,而不是平面上的二維。
因此,在1D版本中,所有點都位于這樣的一條線上,并且我們可以按任意順序(不一定按排序順序)獲得這些點。
因此,一種解決一維最接近對問題的方法是簡單地對點進(jìn)行排序,然后,當(dāng)然,最接近對最好按此順序相鄰,因此您只需遍歷n個負(fù)1個連續(xù)對,然后查看哪個因此,更正式地講,這是解決問題的一維版本的方法。
您可以根據(jù)點的唯一坐標(biāo)對其進(jìn)行排序,因為您要記住,這是一個維度。
因此,如我們所見,使用合并排序,我們可以按n log n個時間對點進(jìn)行排序,然后只掃描這些點,所以這需要線性時間。
對于每個連續(xù)對,我們計算它們的距離,并記住那些連續(xù)對中的最小對,然后將其返回。
那一定是最接近的一對。
因此,在這張右圖中,我將在此處以綠色圈出最近的一對點。
因此,這是我們通過排序然后進(jìn)行線性掃描發(fā)現(xiàn)的。
現(xiàn)在,不用說,這不是直接有用的,這不是我剛開始遇到的問題。
我們想找出平面上的點中最接近的點,而不是直線中的點。
但是,我想指出的是,即使在一行中,也存在二次數(shù)的不同對,因此,即使在一維情況下,蠻力搜索仍然是二次時間算法。
因此,至少在一個維度上,我們可以使用排序,搭載在上面,以克服樸素的蠻力搜索界限,并在n log n時間內(nèi)解決問題。
因此,本次演講的目標(biāo)是針對二維情況設(shè)計一個同樣好的算法,因此,我們希望在n log n,n時間內(nèi)再次求解平面中最接近的點對。
因此,我們將成功實現(xiàn)這一目標(biāo)。
我將向您展示2D最接近對的n次算法。
這需要我們走幾步。
因此,讓我從一個高層次的方法開始。
好的。
因此,我首先要嘗試的是復(fù)制一維情況下對我們有用的內(nèi)容。
因此,在一維情況下,我們首先通過點的坐標(biāo)對其進(jìn)行排序,這確實很有用。
現(xiàn)在,在2D情況下,點具有兩個坐標(biāo),即x坐標(biāo)和y坐標(biāo),因此有兩種對它們進(jìn)行排序的方法。
因此,讓我們以兩種方式對它們進(jìn)行排序,即算法的第一步,您應(yīng)該將其真正視為預(yù)處理步驟。
我們將獲取輸入點。
我們調(diào)用一次歸并排序以根據(jù)x坐標(biāo)對它們進(jìn)行排序,這就是這些點的一個副本。
然后,我們對按y坐標(biāo)對它們進(jìn)行排序的點進(jìn)行第二次復(fù)制。
因此,我們將這些點稱為px,即按x坐標(biāo)排序的點的數(shù)組,對于py則按y坐標(biāo)排序的點的數(shù)組。
現(xiàn)在,我們知道m(xù)erge short需要n log n次,因此此預(yù)處理步驟僅需要n log n次的o。
再一次,假設(shè)我們正在為運(yùn)行時間為O O n log n的算法進(jìn)行射擊,為什么不對點排序?我們甚至不知道我們現(xiàn)在將如何使用這個事實,但這是無害的。
這不會影響我們獲得大量O n log n時間算法的目標(biāo)。
實際上,這說明了一個更廣泛的觀點,這是本課程的主題之一。
因此,回想一下,我希望您從本課程中學(xué)到的東西之一是對四個免費(fèi)原語的理解,對基本無成本的數(shù)據(jù)可以進(jìn)行的操作或操作是什么。
這意味著,如果您的數(shù)據(jù)集適合您計算機(jī)的主內(nèi)存,則您基本上可以調(diào)用該原語,并且它將以極快的速度運(yùn)行,即使您不知道為什么也可以這樣做。
同樣,排序是免費(fèi)原始語言的規(guī)范,盡管,我們將在本課程的后面部分看到更多內(nèi)容,因此,在這里,我們正使用該原理。
因此,我們什至不理解為什么我們還不知道要排序的點。
似乎受一維情況的啟發(fā),它可能會很有用,所以讓我們繼續(xù)制作x和y坐標(biāo)的點的各種副本。
因此,通過與一維類比進(jìn)行推理表明,對點進(jìn)行排序可能很有用,但我們不能將此類類推得太遠(yuǎn)。
因此,尤其是,僅通過這些數(shù)組進(jìn)行簡單的線性掃描以識別最接近的點對就無法擺脫困境。
因此,請看下面的示例。
因此,我們將看一個包含六個點的點集。
將有兩個點,我將用藍(lán)色表示,在x坐標(biāo)上非常接近,但在y坐標(biāo)上非常遙遠(yuǎn)。
然后我將用綠色做另外一對點,它們在y坐標(biāo)上非常接近,而在x坐標(biāo)上非常遙遠(yuǎn)。
然后會有一對紅色的點,它們在x坐標(biāo)或y坐標(biāo)上都不太遠(yuǎn)。
因此,在這套六個點中,最接近的一對是紅色點對。
它們甚至不會連續(xù)出現(xiàn)在兩個陣列中的任何一個上,對嗎?因此,在按x坐標(biāo)排序的數(shù)組中,此藍(lán)色點將被楔入兩個紅色點之間,它們將不會連續(xù)。
同樣,在以y坐標(biāo)表示的py中,該綠色坐標(biāo)將被楔入兩個紅色點之間。
因此,如果您只進(jìn)行線性掃描(如果px和py或py看著連續(xù)的點對),您甚至都不會注意到這些紅點。
因此,在預(yù)處理步驟中,我們只需進(jìn)行反轉(zhuǎn),然后調(diào)用合并排序兩次,我們將做一個非常平凡的分而治之算法來計算最接近的對。
因此,實際上,在此算法中,我們兩次應(yīng)用了分治法。
首先,在排序子例程的內(nèi)部,假設(shè)我們使用合并排序算法進(jìn)行排序。
在此預(yù)處理步驟中,使用分治法來獲得n log n的運(yùn)行時間,而n我們將以一種新的方式在排序數(shù)組上再次使用它,這就是我要告訴您的下一個。
因此,在將其應(yīng)用到最接近的對問題之前,讓我們簡要地回顧一下分而治之算法設(shè)計范例。
因此,像往常一樣,第一步是想出一種將問題分解為較小的子問題的方法。
有時,這有一定的創(chuàng)造力,但事實并非如此。
在最接近的對問題中,我們將像在合并排序和計數(shù)反演問題中一樣,繼續(xù)進(jìn)行操作,在這里我們將數(shù)組分成左右兩半。
因此,在這里,我們將采用輸入點集,然后再次遞歸于點的左半部分,并遞歸于點的右半部分。
所以在這里,我指的是關(guān)于點x坐標(biāo)。
在征服步驟中幾乎沒有任何聰明才智,這僅意味著您要解決第一步中確定的子問題,然后遞歸地解決它們。
這就是我們要做的,我們將在點的左半部分中遞歸地完成最接近的對,在點的右半部分中遞歸地完成最接近的對。
因此,分治法中所有創(chuàng)造力都在結(jié)合步驟中。
給定您的子問題的解決方案,您如何以某種方式恢復(fù)原始問題的解決方案?您真正關(guān)心的那個。
因此,對于最接近的對,問題就在于,假設(shè)您已計算出點的左半部分的最接近對,而對點的右半部分則計算了最接近對,那么您如何快速恢復(fù)以下點對呢?整個定點?這是一個棘手的問題,這就是我們將要花費(fèi)大部分時間的原因。
因此,讓我們對最接近對的分治法更加精確一些,讓我們現(xiàn)在開始實際闡明最接近對的算法。
我們得到的輸入是按照預(yù)處理步驟執(zhí)行的,或者我們調(diào)用了調(diào)用,合并排序,得到了點集px的兩個排序副本,分別按x坐標(biāo)排序,而py按y坐標(biāo)排序。
因此,第一筆紅利是除法步驟。
因此,假設(shè)我們擁有按x坐標(biāo)排序的點px的副本,則很容易識別出點的最左半部分,即那些在最小的x坐標(biāo)上的n個點,以及在兩個最小的x坐標(biāo)上的n個點的最右半部分。最大x坐標(biāo)。
我們將分別稱為Q和R。
我要跳過的一件事是基本情況。
我不會費(fèi)心寫下來,因此省略了基本情況,但這就是您認(rèn)為的那樣。
因此,基本上,一旦您有一個小數(shù)點,例如兩點或三點,您就可以通過蠻力搜索在恒定時間內(nèi)解決問題。
您只需查看所有對,然后返回最接近的對。
因此,可以認(rèn)為輸入中至少有四個點。
現(xiàn)在,為了遞歸,在左右兩半再次調(diào)用clo對,我們需要按x坐標(biāo)和y坐標(biāo)對Q和R進(jìn)行排序,所以我們將通過做適當(dāng)?shù)倪x擇來形成Q和R。通過px和py進(jìn)行線性掃描。
因此,我鼓勵您仔細(xì)考慮一遍,甚至在視頻播放后進(jìn)行編碼,考慮到已經(jīng)擁有Px和Py,您將如何形成Qx,Qy,Rx和Ry。
而且,如果考慮一下,因為Px和Py已經(jīng)排序,只生成這些排序的子列表將花費(fèi)線性時間。
從某種意義上講,這與合并排序中使用的合并子例程相反。
在這里,我們有點分裂而不是合并。
但是同樣,這可以在線性時間內(nèi)完成,這是您以后應(yīng)該仔細(xì)考慮的事情。
這就是除法步驟,現(xiàn)在我們就征服了,這意味著我們遞歸地在兩個子問題中的每一個上調(diào)用最接近的對線,因此,當(dāng)我們在Q上點的左半部分調(diào)用最接近的對時,我們將得到確實是,是Q中最接近的一對點。
因此,我們將它們稱為P1和Pq,因此在都位于Q的所有成對的點之間,P1和Q1將它們之間的距離最小化。
類似地,我們將第二個遞歸調(diào)用的結(jié)果稱為Q2Q2,即P2和Q2都位于R中,這對點具有最小的歐幾里德距離,這對所有點都在其中。
現(xiàn)在,從概念上講,有兩種情況。
有一個幸運(yùn)的情況,有一個不幸運(yùn)的情況。
如果幸運(yùn)的話,在原始點集P中,所有P中最接近的一對點實際上都位于Q或都位于R中。
在這種幸運(yùn)的情況下,如果它們在整個點集中最接近的一對都碰巧都位于Q上,那么我們已經(jīng)做完了,那么第一個遞歸調(diào)用將恢復(fù)它們,而我們只需將它們放在我們的手中。
類似地,如果所有P中最接近的兩個點對都位于R的右側(cè),那么第二個遞歸調(diào)用(它們僅在R上)會將它們放在銀盤上交給我們。
因此,在不幸的情況下,P中最接近的點對恰好被分割了。
也就是說,其中一個點位于Q的左半部分,另一點位于R的右半部分。
請注意,如果將所有P中最接近的點分開,則Q的一半,R的一半,則兩個遞歸調(diào)用都不會找到它。
好的?這對點不會傳遞給兩個遞歸調(diào)用中的任何一個,因此不可能將其返回給我們。
好的?因此,如果最接近的一對碰巧被拆分,則在這兩個遞歸調(diào)用之后,我們尚未確定最接近的一對。
這恰好與我們計算反演時發(fā)生的情況類似。
數(shù)組左半部分的遞歸調(diào)用計算了左反轉(zhuǎn)。
數(shù)組右半部分的遞歸調(diào)用計算了正確的反轉(zhuǎn)。
但是我們?nèi)匀槐仨氂嬎惴至训牡箶?shù),因此在這種最接近的對算法中,我們?nèi)匀恍枰粋€特殊用途的子例程,該子例程針對分裂的情況計算最接近的對,其中Q中有一個點,而在Q中有一個點。 R.
因此,就像計算反轉(zhuǎn)一樣,我將寫下該子例程,并且現(xiàn)在暫時不執(zhí)行它,我們將在本教程的其余部分中弄清楚如何快速實現(xiàn)它。
現(xiàn)在,如果我們正確地實現(xiàn)了最接近的拆分對,那么就需要我們輸入x和y坐標(biāo)的原始點集,然后返回拆分的最小對或Q中的一個點和R中的一個點,那么我們'重做。
因此,然后進(jìn)行分割,然后最接近的一對必須位于左側(cè)或右側(cè),否則必須對其進(jìn)行分割。
第2步到第4步將計算每個類別中最接近的對,因此這些是最接近對的唯一可能的候選者,我們只返回了其中的最好的對。
因此,這是y的參數(shù),如果我們對最接近的split para子例程有正確的實現(xiàn),則意味著對最接近的對的正確實現(xiàn)。
現(xiàn)在,運(yùn)行時間如何?因此,最接近對等算法的運(yùn)行時間將部分由最接近分裂對的運(yùn)行時間確定。
因此,在下一個測驗中,我希望您考慮使用最接近的拆分對子例程應(yīng)該拍攝什么樣的運(yùn)行時間。
O(n log n) Algorithm for Closest Pair I [Advanced - Optional] - Question 1
Suppose we can correctly implement the ClosestSplitPair subroutine in???(??)?time.
What will be the overall running time of the ClosestPair algorithm ? (Choose the smallest upper bound that applies)
A. ??(??)
B. ??(??log(??))
C. ??(??(log(??))^2)
D. ??(??^2)