011 Big Data And Cloud Computing – A Comprehensive Guide

1. Objective

This tutorial on Big Data and cloud computing explains how the two fit together. It covers what cloud storage is, Big Data in the cloud, the characteristics of cloud computing, cloud services and hosting, cloud data storage and deployment models, cloud computing companies and service providers, cloud infrastructure, and the advantages of and issues with cloud computing.

2. Introduction to Big data and Cloud Computing

Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network, typically the Internet. It is built on virtualization.
Resources such as storage and compute are available on demand, and the cloud follows a pay-per-use model: you are charged only for the resources you actually use.
For example, if you want to give a client a demo on a cluster of more than 100 machines and do not have that many machines available, you can rent them from a cloud provider for the duration of the demo.
The cloud plays an important role in the Big Data world by providing horizontally scalable, optimized infrastructure that supports practical Big Data implementations.
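
The pay-per-use idea can be made concrete with a little arithmetic. The following is a minimal sketch, not any provider's actual pricing: the hourly rate, the demo length, and the rule of billing per started hour are all assumptions chosen for illustration.

```python
import math


def on_demand_cost(machines, hours, hourly_rate):
    """Cost of renting `machines` instances for `hours`.

    Assumes (hypothetically) that every started hour is billed in full.
    """
    billed_hours = math.ceil(hours)
    return machines * billed_hours * hourly_rate


# Hypothetical figures: a 3.5-hour client demo on 100 machines
# at 0.10 currency units per machine-hour.
demo_cost = on_demand_cost(machines=100, hours=3.5, hourly_rate=0.10)
print(f"Demo cost: {demo_cost:.2f}")
```

The point of the sketch is that the bill depends only on how many machines are used and for how long, so a short demo on a large cluster does not require buying the cluster.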

3. Cloud Computing and Big Data

In cloud computing, data is held in data centers and delivered to end users over the network. Automatic backup and recovery, which support business continuity, are also provided as cloud resources. Users do not know the exact physical location of the resources provided to them; all they need is a thin client such as a desktop, laptop, or phone, and an Internet connection.
There are three main ways to access the cloud:

  1. Software as a service (SaaS), for example Salesforce.com, Dropbox, and Google Drive
  2. Platform as a service (PaaS)
  3. Infrastructure as a service (IaaS)

4. Features of Cloud Computing

Let us look at a few features of cloud computing:

a. Scalability

Scalability is provided through distributed computing: capacity grows by adding machines rather than by upgrading a single one.

b. Elasticity

Customers use, and pay for, only as much of a resource as they need at the moment. In cloud computing, elasticity is defined as the degree to which a system is able to adapt to workload changes in an autonomic manner, so that at any time the available resources match the current demand as closely as possible.
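
Elasticity can be illustrated with a simple scaling rule. The sketch below is an assumption-laden illustration rather than any provider's autoscaler: the function name `desired_instances`, the per-instance capacity of 200 requests per second, and the instance limits are all hypothetical.

```python
import math


def desired_instances(current_load, capacity_per_instance,
                      min_instances=1, max_instances=100):
    """Return how many instances are needed to match the current load.

    The result is clamped to the range [min_instances, max_instances].
    """
    needed = math.ceil(current_load / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Hypothetical workload in requests per second, sampled over time.
for load in [120, 950, 4000, 300]:
    count = desired_instances(load, capacity_per_instance=200)
    print(f"{load} requests/s -> {count} instances")
```

As the load rises and falls, the number of instances follows it, which is what keeps the available resources close to the current demand.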

c. Resource Pooling

The same resources can be shared by multiple organizations. The provider's computing resources are pooled to serve many consumers through a multi-tenant model, with different resources dynamically assigned and reassigned according to consumer demand.
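
A toy model may help to picture pooling. The class below is a hypothetical sketch (the name `ResourcePool` and the notion of abstract "units" are assumptions made for illustration); real providers schedule CPU, memory, and storage in far more sophisticated ways.

```python
class ResourcePool:
    """A shared pool of identical resource units serving several tenants."""

    def __init__(self, total_units):
        self.total_units = total_units
        self.allocations = {}  # tenant name -> units currently held

    @property
    def free_units(self):
        return self.total_units - sum(self.allocations.values())

    def assign(self, tenant, units):
        """Give `units` to `tenant` if the pool has enough free capacity."""
        if units > self.free_units:
            return False
        self.allocations[tenant] = self.allocations.get(tenant, 0) + units
        return True

    def release(self, tenant, units):
        """Return up to `units` from `tenant` to the pool."""
        remaining = max(0, self.allocations.get(tenant, 0) - units)
        if remaining:
            self.allocations[tenant] = remaining
        else:
            self.allocations.pop(tenant, None)


pool = ResourcePool(total_units=10)
print(pool.assign("tenant_a", 6))  # fits in the empty pool
print(pool.assign("tenant_b", 6))  # refused: only 4 units are free
pool.release("tenant_a", 3)        # tenant_a's demand drops
print(pool.assign("tenant_b", 6))  # the freed units are reassigned
```

The same ten units serve both tenants over time, which is the essence of dynamic assignment and reassignment.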

d. Self service

Customers are given an easy-to-use interface through which they can choose the services they want. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed, without requiring human interaction with the service provider.

e. Low Costs

You are charged only for the computing resources you use, so you need not buy expensive infrastructure. Pricing follows the utility computing model, that is, it is usage-based, and fewer IT skills are required for implementation.

f. Fault Tolerance

The system can recover when a part of the cloud fails to respond.
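
One common way to achieve this kind of recovery is failover between replicas. The sketch below is only an illustration under stated assumptions: replicas are modelled as plain Python callables, and an unresponsive replica is simulated by raising `ConnectionError`. The names `call_with_failover`, `failing_replica`, and `healthy_replica` are hypothetical.

```python
def call_with_failover(replicas, request):
    """Try each replica in turn and return the first successful response."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            # This replica did not respond; remember why and try the next one.
            errors.append(exc)
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


def failing_replica(request):
    raise ConnectionError("replica not responding")


def healthy_replica(request):
    return f"handled {request}"


print(call_with_failover([failing_replica, healthy_replica], "job-1"))
```

The caller gets an answer even though the first replica is down; an error surfaces only if every replica fails.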

5. Cloud Deployment Models

There are two main cloud deployment models:

  • Public cloud – A cloud is called a “public cloud” when its services are offered over a network for public use.

  • Private cloud – A private cloud is operated solely for a single organization, whether it is managed internally or by a third party, and hosted internally or externally.

A third option, the hybrid cloud, combines the two.

6. Cloud Delivery Models

Cloud services are categorized as follows:

  1. Infrastructure as a service (IaaS): The complete infrastructure is provided to you. The cloud provider takes care of maintaining it, and you use it as your requirements dictate. IaaS can be offered in both public and private clouds. Examples of IaaS are virtual machines, load balancers, and network-attached storage.
  2. Platform as a service (PaaS): Here the cloud provider supplies object storage, queuing, databases, runtimes, and so on. The provider gives us these resources, but configuring and using them, including connecting to our database and similar activities, is our responsibility. Examples of PaaS are Windows Azure (now Microsoft Azure) and Google App Engine (GAE).
  3. Software as a service (SaaS), for example Salesforce.com, Dropbox, and Google Drive: Here we have no operational responsibility. We simply use an application that runs in the cloud, and all infrastructure setup is the responsibility of the service provider. For SaaS to work, the infrastructure (IaaS) and the platform (PaaS) must be in place.
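
The division of responsibility across the three models can be summarized as a small lookup. This is a simplified sketch: the three-layer stack and the names `LAYERS` and `customer_responsibilities` are assumptions made for illustration, and real offerings draw the line in more detail.

```python
# Simplified stack, from the bottom layer up.
LAYERS = ["infrastructure", "platform", "application"]

# How many layers, counted from the bottom, the provider manages per model.
PROVIDER_MANAGED_LAYERS = {"IaaS": 1, "PaaS": 2, "SaaS": 3}


def customer_responsibilities(model):
    """Return the layers that remain the customer's responsibility."""
    return LAYERS[PROVIDER_MANAGED_LAYERS[model]:]


for model in ("IaaS", "PaaS", "SaaS"):
    print(model, "->", customer_responsibilities(model) or "nothing to manage")
```

Each model builds on the one below it, which is why SaaS presupposes that the infrastructure and platform layers are already in place.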

7. Cloud for Big Data

Below are some examples of how cloud services are used for Big Data:
**IaaS in a public cloud:** Using a cloud provider’s infrastructure for Big Data services gives access to almost limitless storage and compute power. Enterprise customers can use IaaS to build cost-effective, easily scalable IT solutions in which the cloud provider bears the complexity and expense of managing the underlying hardware. If the scale of a business’s operations fluctuates, or the business is looking to expand, it can tap into cloud resources as and when it needs them rather than purchasing, installing, and integrating hardware itself.
**PaaS in a private cloud:** PaaS vendors are beginning to incorporate Big Data technologies such as Hadoop and MapReduce into their offerings, which removes the need to manage individual software and hardware elements. For example, web developers can use separate PaaS environments at every stage of developing, testing, and ultimately hosting their websites. Businesses that develop their own internal software can also use PaaS, particularly to create distinct, ring-fenced development and testing environments.
**SaaS in a hybrid cloud:** Many organizations feel the need to analyse the voice of the customer, especially on social media. SaaS vendors provide the platform for the analysis as well as the social media data. Office software is a common example of businesses using SaaS: tasks related to accounting, sales, invoicing, and planning can all be performed through SaaS. A business may use one piece of software that performs all of these tasks or several that each perform a different task. The software is subscribed to over the Internet and then accessed online from any computer in the office with a username and password. If needed, the business can switch to software that meets its requirements better. Everyone who needs access to a particular piece of software can be set up as a user, whether that is one or two people or every employee in a corporation that employs hundreds.
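
For readers unfamiliar with MapReduce, the programming model mentioned above can be sketched in a few lines of plain Python. This is an in-memory illustration of the map, shuffle, and reduce steps only; it does not use Hadoop or any PaaS API, and the function names and sample documents are made up for the example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


documents = ["big data in the cloud", "cloud storage for big data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle(pairs))
print(word_counts)
```

In a real cluster the map and reduce steps run in parallel on many machines, which is the part a managed platform takes off the developer's hands.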

8. Providers in the Big Data Cloud Market

Cloud computing companies come in all shapes and sizes. All large software vendors either already have offerings in the cloud space or are in the process of launching one. In addition, many startups have interesting cloud products. The major vendors are listed here. Cloud providers include Google, Citrix, Netmagic, Red Hat, and Rackspace. Amazon Web Services (AWS) is the leading cloud provider, and Microsoft offers its cloud services under the name Azure.

Infrastructure as a Service cloud computing companies:

  • Amazon’s offerings include S3 (Data storage/file system), SimpleDB (non-relational database) and EC2 (computing servers).

  • Rackspace’s offerings include Cloud Drive (data storage/file system), Cloud Sites (website hosting in the cloud) and Cloud Servers (computing servers).

  • IBM’s offerings include Smart Business Storage Cloud and Computing on Demand (CoD).

  • AT&T provides Synaptic Storage and Synaptic Compute as a service.

Platform as a Service cloud computing companies

  • Google App Engine is a development platform that supports Python and Java.

  • Salesforce.com provides a development platform based on Apex.

  • Microsoft Azure provides a development platform based on .NET.

Software as a Service companies

  • Google’s SaaS offerings include Google Docs, Gmail, Google Calendar and Picasa.

  • IBM provides LotusLive iNotes, a web-based email service that gives business users messaging and calendaring capabilities.

  • Zoho provides online products similar to the Microsoft Office suite.

9. Issues in Using Cloud Services

Some important issues with cloud services are listed below:

a. Data Security

Organizations must make sure that their agreement with the cloud service provider guarantees data security. Handing over private data to a third party worries some people, and corporate executives might hesitate to adopt a cloud computing system because they can no longer keep their company’s information under their own lock and key.

b. Performance

Cloud performance parameters must be specified in the agreement and quantified wherever possible, with any exceptions clearly noted. The Service-Level Agreement (SLA) should clearly state all the terms and conditions between the service user and the service provider to ensure proper performance.
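
Quantifying a performance parameter can be as simple as turning an availability percentage into permitted downtime. The sketch below assumes a 30-day billing period, and the availability targets are illustrative figures, not terms from any real SLA.

```python
def allowed_downtime_minutes(availability_percent, period_days=30):
    """Minutes of downtime permitted by an availability target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# Illustrative availability targets for a 30-day period.
for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows {minutes:.2f} minutes of downtime")
```

Writing the target as a number of minutes makes it easier for both parties to check whether the agreement was met.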

c. Compliance

Cloud services must be compatible with the compliance needs of the business. Some companies are also concerned about regulatory issues. Market observers say that around 50 percent of people worry about being tied to a single cloud storage provider.

d. Legal Issues

Organizations must ensure that the physical location of the cloud resources does not create legal issues. Data stored in multiple locations in the cloud raises a number of legal challenges around privacy and increases the risk of confidentiality and privacy breaches.

e. Costs

Organizations should be aware of all the costs involved in using the cloud and use its services in a controlled manner, because the cloud bills on a pay-per-use basis and uncontrolled usage translates directly into cost.

https://data-flair.training/blogs/big-data-and-cloud-computing-comprehensive-guide
