Python中的堆問題

Heap in python

Heap,即堆,也就是優(yōu)先隊列。我們可以在這里找到[]維基百科](https://en.wikipedia.org/wiki/Heap_(data_structure))

堆(英語:Heap)是計算機科學(xué)中一類特殊的數(shù)據(jù)結(jié)構(gòu)的統(tǒng)稱。堆通常是一個可以被看做一棵樹的數(shù)組對象。在隊列中,調(diào)度程序反復(fù)提取隊列中第一個作業(yè)并運行,因為實際情況中某些時間較短的任務(wù)將等待很長時間才能結(jié)束,或者某些不短小,但具有重要性的作業(yè),同樣應(yīng)當(dāng)具有優(yōu)先權(quán)。堆即為解決此類問題設(shè)計的一種數(shù)據(jù)結(jié)構(gòu)。

邏輯定義:

n個元素序列{k1,k2...ki...kn},當(dāng)且僅當(dāng)滿足下列關(guān)系時稱之為堆:
(ki <= k2i,ki <= k2i+1)或者(ki >= k2i,ki >= k2i+1), (i = 1,2,3,4...n/2)

性質(zhì):

堆的實現(xiàn)通過構(gòu)造二叉堆(binary heap),實為二叉樹的一種;由于其應(yīng)用的普遍性,當(dāng)不加限定時,均指該數(shù)據(jù)結(jié)構(gòu)的這種實現(xiàn)。這種數(shù)據(jù)結(jié)構(gòu)具有以下性質(zhì)。

  • 任意節(jié)點小于(或大于)它的所有后裔,最小元(或最大元)在堆的根上(堆序性)。
  • 堆總是一棵完全樹。即除了最底層,其他層的節(jié)點都被元素填滿,且最底層盡可能地從左到右填入。

將根節(jié)點最大的堆叫做最大堆或大根堆,根節(jié)點最小的堆叫做最小堆或小根堆。常見的堆有二叉堆、斐波那契堆等。

支持的基本操作[編輯]

操作 描述 時間復(fù)雜度
build 創(chuàng)建一個空堆 O(n)
insert 向堆中插入一個新元素 O(log n)
update 將新元素提升使其匹配堆的性質(zhì)
get 獲取當(dāng)前堆頂元素的值 O(1)
delete 刪除堆頂元素 O(log n)
heapify 使刪除堆頂元素的堆再次成為堆

某些堆實現(xiàn)還支持其他的一些操作,如斐波那契堆支持檢查一個堆中是否存在某個元素。

示例程序:

為將元素X插入堆中,找到空閑位置,創(chuàng)建一個空穴,若滿足堆序性(英文:heap order),則插入完成;否則將父節(jié)點元素裝入空穴,刪除該父節(jié)點元素,完成空穴上移。直至滿足堆序性。這種策略叫做上濾(percolate up)。[1]

void Insert( ElementType X, PriorityQueue H )
{
    int i;

    if( IsFull(H) )
    {
        printf( "Queue is full.\n" );
        return;
    }

    for( i = ++H->Size; H->Element[i/2] > X; i /= 2 )
        H->Elements[i] = H->Elements[i/2];
    H->Elements[i] = X;
}

以上是插入到一個二叉堆的過程。
DeleteMin,刪除最小元,即二叉樹的根或父節(jié)點。刪除該節(jié)點元素后,隊列最后一個元素必須移動到堆得某個位置,使得堆仍然滿足堆序性質(zhì)。這種向下替換元素的過程叫作下濾。

ElementType
DeleteMin( PriorityQueue H )
{
    int i, Child;
    ElementType MinElement, LastElement;

    if( IsEmpty( H ) )
    {
        printf( "Queue is empty.\n" );
        return H->Elements[0];
    }
    MinElement = H->Elements[1];
    LastElement = H->Elements[H->Size--];

    for( i = 1; i*2 <= H->Size; i = Child )
    {
        // Find smaller child.
        Child = i*2;
        if( Child != H->Size && H->Elements[Child+1]
                             <  H->Elements[Child] )
            Child++;

        // Percolate one level.
        if( LastElement > H->Elements[Child] )
            H->Elements[i] = H->Elements[Child];
        else
            break;
    }
    H->Elements[i] = LastElement;
    return MinElement;
}

應(yīng)用

堆排序,或者運用堆的排序以選擇優(yōu)先

Python中的heapq模塊 -- 堆排序算法

Purpose:

目的:

The heapq implements a min-heap sort algorithm suitable for use with Python’s lists.
heapq模塊執(zhí)行了一個適用于Python列表的最小堆排序算法。

A heap is a tree-like data structure where the child nodes have a sort-order relationship with the parents. Binary heaps can be represented using a list or array organized so that the children of element N are at positions 2N+1 and 2N+2 (for zero-based indexes). This layout makes it possible to rearrange heaps in place, so it is not necessary to reallocate as much memory when adding or removing items.
堆,是一個類似于樹的數(shù)據(jù)結(jié)構(gòu),其中子節(jié)點們與其父節(jié)點有一個排序的關(guān)系。二叉堆能夠表示為使用一個列表或者數(shù)組來組織,因此元素N的子節(jié)點位于2N+1和2N+2(對于基于0的索引)。這樣的布局允許在原來位置重置堆,因此沒有必要在添加和刪除元素的時候重置過多的內(nèi)存空間。

A max-heap ensures that the parent is larger than or equal to both of its children. A min-heap requires that the parent be less than or equal to its children. Python’s heapq module implements a min-heap.
一個最大堆可以確保父節(jié)點大于或者等于其兩個子節(jié)點。最小堆需要父節(jié)點小于或者等于其子節(jié)點。Python的heapq模塊使用最小堆。

Example Data

示例數(shù)據(jù)

The examples in this section use the data in heapq_heapdata.py.
本節(jié)的這個示例使用heapq_heapdata.py中的data。

heapq_heapdata.py

# This data was generated with the random module.

data = [19, 9, 4, 10, 11]

The heap output is printed using heapq_showtree.py:
heapq_showtree.py使用heap的輸出:

heapq_showtree.py

import math
from io import StringIO


def show_tree(tree, total_width=36, fill=' '):
    """Pretty-print a tree."""
    output = StringIO()
    last_row = -1
    for i, n in enumerate(tree):
        if i:
            row = int(math.floor(math.log(i + 1, 2)))
        else:
            row = 0
        if row != last_row:
            output.write('\n')
        columns = 2 ** row
        col_width = int(math.floor(total_width / columns))
        output.write(str(n).center(col_width, fill))
        last_row = row
    print(output.getvalue())
    print('-' * total_width)
    print()

Creating a Heap

創(chuàng)建一個堆

There are two basic ways to create a heap, heappush() and heapify().
想要創(chuàng)建一個堆,有兩種基本方法:heappush() 和 heapify()。

heapq_heappush.py

import heapq
from heapq_showtree import show_tree
from heapq_heapdata import data

heap = []
print('random :', data)
print()

for n in data:
    print('add {:>3}:'.format(n))
    heapq.heappush(heap, n)
    show_tree(heap)

Using heappush(), the heap sort order of the elements is maintained as new items are added from a data source.
使用heappush(),當(dāng)來自于數(shù)據(jù)源的新元素添加到堆中時,堆排序算法將維護元素的順序。
$ python3 heapq_heappush.py

random : [19, 9, 4, 10, 11]

add 19:

             19

add 9:

             9
    19

add 4:

             4
    19                9

add 10:

             4
    10                9
19

add 11:

             4
    10                9
19       11

If the data is already in memory, it is more efficient to use heapify() to rearrange the items of the list in place.
如果data已經(jīng)在內(nèi)存之中,使用heapify()來在列表內(nèi)部重置元素將會更高效。

heapq_heapify.py

import heapq
from heapq_showtree import show_tree
from heapq_heapdata import data

print('random    :', data)
heapq.heapify(data)
print('heapified :')
show_tree(data)

The result of building a list in heap order one item at a time is the same as building it unordered and then calling heapify().
按照堆的順序,一次一個元素構(gòu)建一個列表的結(jié)果,與構(gòu)建一個構(gòu)建一個未排序的列表一致,直接使用heapify():

$ python3 heapq_heapify.py

random : [19, 9, 4, 10, 11]
heapified :

             4
    9                 19
10       11

Accessing Contents of a Heap

訪問堆中的內(nèi)容

Once the heap is organized correctly, use heappop() to remove the element with the lowest value.
一旦堆正確的組織,就可以使用heappop()來移除最小的元素值。

heapq_heappop.py

import heapq
from heapq_showtree import show_tree
from heapq_heapdata import data

print('random    :', data)
heapq.heapify(data)
print('heapified :')
show_tree(data)
print

for i in range(2):
    smallest = heapq.heappop(data)
    print('pop    {:>3}:'.format(smallest))
    show_tree(data)

In this example, adapted from the stdlib documentation, heapify() and heappop() are used to sort a list of numbers.
在本示例中,參考標(biāo)準(zhǔn)庫的文檔,heapify() 和 heappop()用于對于一個數(shù)字列表進(jìn)行排序。
$ python3 heapq_heappop.py

random : [19, 9, 4, 10, 11]
heapified :

             4
    9                 19
10       11

pop 4:

             9
    10                19
11

pop 9:

             10
    11                19

To remove existing elements and replace them with new values in a single operation, use heapreplace().
想要用一次操作移除一個已經(jīng)存在的元素,然后使用一個新的元素替代它,可以使用 heapreplace():

heapq_heapreplace.py
import heapq
from heapq_showtree import show_tree
from heapq_heapdata import data

heapq.heapify(data)
print('start:')
show_tree(data)

for n in [0, 13]:
    smallest = heapq.heapreplace(data, n)
    print('replace {:>2} with {:>2}:'.format(smallest, n))
    show_tree(data)

Replacing elements in place makes it possible to maintain a fixed size heap, such as a queue of jobs ordered by priority.
在原來的位置替換元素,這樣就可以維護一個固定大小的堆,例如按照優(yōu)先級排列的任務(wù)隊列。

$ python3 heapq_heapreplace.py

start:

             4
    9                 19
10       11

replace 4 with 0:

             0
    9                 19
10       11

replace 0 with 13:

             9
    10                19
13       11

Data Extremes From a Heap

堆中兩端的數(shù)據(jù)

heapq also includes two functions to examine an iterable to find a range of the largest or smallest values it contains.
heapq包含兩個函數(shù)用與迭代查找堆中一定范圍內(nèi)最大或者最小的值。

heapq_extremes.py

import heapq
from heapq_heapdata import data

print('all       :', data)
print('3 largest :', heapq.nlargest(3, data))
print('from sort :', list(reversed(sorted(data)[-3:])))
print('3 smallest:', heapq.nsmallest(3, data))
print('from sort :', sorted(data)[:3])

Using nlargest() and nsmallest() are only efficient for relatively small values of n > 1, but can still come in handy in a few cases.
使用 nlargest() 和 nsmallest() 這兩個函數(shù)對于查找 n > 1的較小數(shù)值會顯得更高效,但是在一些場景下更靈活。

$ python3 heapq_extremes.py

all : [19, 9, 4, 10, 11]
3 largest : [19, 11, 10]
from sort : [19, 11, 10]
3 smallest: [4, 9, 10]
from sort : [4, 9, 10]

Efficiently Merging Sorted Sequences

高效的合并已排序的序列

Combining several sorted sequences into one new sequence is easy for small data sets.
使用如下所示的方法,對于較小的數(shù)據(jù)集,合并一些已經(jīng)排序的序列到一個新的序列會變的很容易。

list(sorted(itertools.chain(*data)))

For larger data sets, this technique can use a considerable amount of memory. Instead of sorting the entire combined sequence, merge() uses a heap to generate a new sequence one item at a time, and determine the next item using a fixed amount of memory.
對于較大的數(shù)據(jù)集,使用如上所示的代碼將會占用大量的內(nèi)存。與對整個已經(jīng)組合的序列進(jìn)行排序相比,merge()使用了堆來一次性生成新的序列,并且判斷新生成的序列使用固定數(shù)量的內(nèi)存。

heapq_merge.py

import heapq
import random

random.seed(2016)

data = []
for i in range(4):
    new_data = list(random.sample(range(1, 101), 5))
    new_data.sort()
    data.append(new_data)

for i, d in enumerate(data):
    print('{}: {}'.format(i, d))

print('\nMerged:')
for i in heapq.merge(*data):
    print(i, end=' ')
print()

Because the implementation of merge() uses a heap, it consumes memory based on the number of sequences being merged, rather than the number of items in those sequences.
由于在堆中執(zhí)行merge()函數(shù),所消耗的內(nèi)存取決于所合并的序列數(shù)量,而不是這些序列中的元素數(shù)量。

$ python3 heapq_merge.py

0: [33, 58, 71, 88, 95]
1: [10, 11, 17, 38, 91]
2: [13, 18, 39, 61, 63]
3: [20, 27, 31, 42, 45]

Merged:
10 11 13 17 18 20 27 31 33 38 39 42 45 58 61 63 71 88 91 95

最讓我眼前一亮的應(yīng)用,就是leetcode中的第二十三題目:merge k sorted list,我們可以將問題轉(zhuǎn)換為合并queue:

class ListNode(object):
    def __init__(self, x):
        self.val = x
        self.next = None

def mergeKLists(lists):
    current = dummy = ListNode(0)

    heap = []
    for sorted_list in lists:
        if sorted_list:
            heapq.heappush(heap, (sorted_list.val, sorted_list))

    while heap:
        smallest = heapq.heappop(heap)[1]
        current.next = smallest
        current = current.next
        if smallest.next:
            heapq.heappush(heap, (smallest.next.val, smallest.next))

    return dummy.next

將鏈表的問題轉(zhuǎn)換成堆排序的問題。

1

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時請結(jié)合常識與多方信息審慎甄別。
平臺聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點,簡書系信息發(fā)布平臺,僅提供信息存儲服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容