ETL is dead, long live streams

--reorganized from the InfoQ talk "ETL is dead, long live streams"

Data and data systems have changed a lot over the past decade. In the past, databases and data warehouses were the main homes for our data, and most of them were relational.

Recent data trends include:

#1 Single-server databases are being replaced by a myriad of distributed data platforms that operate at company-wide scale. A medium-to-large company may have more than one data centre in different locations.

#2 There is much more non-transactional data: logs, images, sensor readings, etc. NoSQL databases have appeared, and blending data has become easier.

#3 Stream data is increasingly ubiquitous, and faster processing is needed.

As a result, traditional Extract, Transform and Load (ETL) has become a giant mess. Its shortcomings are as follows:

#1 It needs a global schema.

#2 Data cleaning and curation are manual and fundamentally error-prone.

#3 The operational cost of ETL is high, and it is resource-intensive.

#4 ETL tools were normally built with a narrow focus: connecting to databases and data warehouses in a batch fashion.
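
The batch orientation in #4 can be contrasted with record-at-a-time processing in a small sketch. Everything here is hypothetical and for illustration only: the `Order` record shape, `batch_totals`, and `StreamingTotals` are made-up names, not part of any ETL tool or streaming product. The batch function needs a full extract before it can answer, while the streaming class keeps running state that is current after every record.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record shape: (customer, amount)
Order = Tuple[str, float]


def batch_totals(snapshot: Iterable[Order]) -> Dict[str, float]:
    """Batch style: recompute totals from a full extract on every run."""
    totals: Dict[str, float] = defaultdict(float)
    for customer, amount in snapshot:
        totals[customer] += amount
    return dict(totals)


class StreamingTotals:
    """Streaming style: keep state and update it as each record arrives."""

    def __init__(self) -> None:
        self.totals: Dict[str, float] = defaultdict(float)

    def on_event(self, order: Order) -> None:
        customer, amount = order
        self.totals[customer] += amount


orders = [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)]

# Batch: the result exists only after the whole extract has been processed.
nightly = batch_totals(orders)

# Streaming: the result is up to date after each individual record.
live = StreamingTotals()
for order in orders:
    live.on_event(order)
```

Both approaches end at the same totals; the difference is when the answer becomes available and how much data must be re-read to get it.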

By comparison, a streaming platform has its strong points:

#1 It can process high-volume, highly diverse data.

#2 It can provide real-time processed data from the ground up.

#3 It enables a forward-compatible data architecture.
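
One way to read "forward-compatible" in #3 is that new applications can be added later and still process the same data, including events that arrived before they existed. Below is a minimal in-memory sketch of that idea, assuming an append-only log where each consumer tracks its own offset. `EventLog` and `Consumer` are hypothetical names invented for this example; they are not the API of Kafka or any other real platform.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class EventLog:
    """Hypothetical in-memory stand-in for a platform's append-only log."""
    events: List[Dict[str, Any]] = field(default_factory=list)

    def append(self, event: Dict[str, Any]) -> int:
        """Append an event and return its offset in the log."""
        self.events.append(event)
        return len(self.events) - 1


@dataclass
class Consumer:
    """Reads the log from its own offset, independently of other consumers."""
    name: str
    handler: Callable[[Dict[str, Any]], None]
    offset: int = 0

    def poll(self, log: EventLog) -> int:
        """Process every event not yet seen; return how many were handled."""
        new_events = log.events[self.offset:]
        for event in new_events:
            self.handler(event)
        self.offset += len(new_events)
        return len(new_events)


log = EventLog()

# First application: count page views.
page_views: Dict[str, int] = {}


def count_view(event: Dict[str, Any]) -> None:
    page_views[event["page"]] = page_views.get(event["page"], 0) + 1


counter = Consumer("view-counter", count_view)

log.append({"user": "a", "page": "/home"})
log.append({"user": "b", "page": "/home"})
counter.poll(log)

# Second application, added later: it starts at offset 0 and replays
# the events that were written before it existed.
seen_users: Set[str] = set()
auditor = Consumer("user-auditor", lambda e: seen_users.add(e["user"]))

log.append({"user": "a", "page": "/pricing"})
counter.poll(log)
auditor.poll(log)
```

The producer side never changes when `auditor` is added, which is the property the sketch is meant to show. A real platform would also need durable storage, retention limits, and partitioning, none of which are modeled here.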

