SQL On Big Data

The query engine parses the query, verifies its semantics against the catalog store,
and generates a logical query plan. The query optimizer then turns that logical plan into
the optimal physical execution plan, based on the estimated CPU, I/O, and network costs
of executing the query. The storage engine uses the final query plan to retrieve the
underlying data from various data structures, either on disk or in memory. It is in the
storage engine that components such as the Lock Manager, Index Manager, and Buffer
Manager fetch or update the data, as requested by the query.
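
The validation and cost-based selection steps described above can be sketched roughly as
follows. This is a minimal illustration, not how any particular RDBMS implements them: the
catalog contents, the `PhysicalPlan` class, the plan names, and every cost figure are
hypothetical, and a real optimizer derives its cost estimates from table statistics rather
than from hard-coded numbers.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Hypothetical catalog store: table name -> set of column names.
CATALOG: Dict[str, Set[str]] = {"orders": {"id", "customer_id", "total"}}


@dataclass(frozen=True)
class PhysicalPlan:
    """A candidate execution plan with made-up cost estimates."""
    name: str
    cpu_cost: float
    io_cost: float
    network_cost: float

    @property
    def total_cost(self) -> float:
        # Assumption: the three cost components are simply summed.
        return self.cpu_cost + self.io_cost + self.network_cost


def validate(table: str, columns: List[str]) -> None:
    """Semantic check: the table and every column must exist in the catalog."""
    if table not in CATALOG:
        raise ValueError(f"unknown table: {table}")
    missing = [c for c in columns if c not in CATALOG[table]]
    if missing:
        raise ValueError(f"unknown columns in {table}: {missing}")


def choose_plan(candidates: List[PhysicalPlan]) -> PhysicalPlan:
    """Pick the candidate with the lowest estimated total cost."""
    return min(candidates, key=lambda plan: plan.total_cost)


# Verify the query's table and columns against the catalog.
validate("orders", ["id", "total"])

# Two hypothetical physical plans for the same logical plan.
candidates = [
    PhysicalPlan("full_table_scan", cpu_cost=40.0, io_cost=500.0, network_cost=0.0),
    PhysicalPlan("index_lookup", cpu_cost=10.0, io_cost=30.0, network_cost=0.0),
]
best = choose_plan(candidates)
print(best.name)  # index_lookup
```

With these illustrative numbers the index lookup wins because its estimated I/O cost is far
lower. Different estimates would change the choice.
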

Most RDBMSs are built on symmetric multiprocessing (SMP) architectures. Traditional
database platforms operate by reading data off disk, bringing it across an I/O
interconnect, and loading it into memory for further processing. An SMP-based system
consists of multiple processors, each with its own memory cache. Main memory and the I/O
subsystem are shared by all of the processors. The SMP architecture is limited in its
ability to move large amounts of data, as required in data warehousing and large-scale
data processing workloads.

The major drawback of this architecture is that it moves data across backplanes and
I/O channels. This approach neither scales nor performs well when large data sets are
queried, and even less so when the queries involve complex joins that require multiple
phases of processing. A huge inefficiency lies in delivering large volumes of data off
disk, across the network, and into memory for processing by the database management
system (DBMS).
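
A back-of-the-envelope calculation gives a feel for why data movement dominates. The sketch
below assumes, as a deliberate simplification, that a single shared I/O channel is the only
bottleneck and that the processors split both the data and the channel bandwidth evenly. The
data size (10 TB) and channel bandwidth (2 GB/s) are illustrative assumptions, not
measurements, and the function names are hypothetical.

```python
TB = 10**12  # bytes in a decimal terabyte
GB = 10**9   # bytes in a decimal gigabyte


def transfer_seconds(data_bytes: float, channel_bytes_per_s: float) -> float:
    """Time to move data_bytes across a channel of the given bandwidth."""
    if channel_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return data_bytes / channel_bytes_per_s


def smp_scan_seconds(data_bytes: float, channel_bytes_per_s: float,
                     processors: int) -> float:
    """Scan time when every processor pulls its share through one shared channel.

    Each processor handles 1/processors of the data but also receives only
    1/processors of the channel bandwidth, so the two factors cancel out.
    """
    if processors < 1:
        raise ValueError("need at least one processor")
    per_cpu_data = data_bytes / processors
    per_cpu_bandwidth = channel_bytes_per_s / processors
    return transfer_seconds(per_cpu_data, per_cpu_bandwidth)


# Assumed workload: scan 10 TB over a 2 GB/s shared channel.
print(transfer_seconds(10 * TB, 2 * GB))  # 5000.0 seconds, roughly 83 minutes

# Under this model, adding processors does not shorten the scan.
for n in (1, 4, 16):
    print(n, smp_scan_seconds(10 * TB, 2 * GB, n))
```

In this simplified model, the scan time stays the same no matter how many processors are
added, because all of the data still has to cross the same channel.
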
