這門課主要是講web上的data模型，側(cè)重在xml上

1.Semi-Structured Data (SSD)

由于自然語言和二進制所寫的數(shù)據(jù)總會造成對于人類或者機器閱讀的不友好，所以需要一種對于human-machine 都readable 的語言。這樣就不用專門寫解釋器了。

我們可以利用tree來表示SSD這種結(jié)構(gòu)

屬于該結(jié)構(gòu)下的有：XML，JSON，Lore，Object Exchange Model

2.XML

xml （eXtensible Markup Language），是ssd的一種用來存儲數(shù)據(jù)的data model，并不是文檔的布局工具。

3.XML：Basics

可利用tree來表示，data中的一小塊都可以稱為是element，不存在二義性的層次結(jié)構(gòu)，有一個root節(jié)點來包含所有的element，不需要解釋器。

<?xml version="1.0" encoding="utf-8" ?>

<!DOCTYPE book SYSTEM "book.dtd"> ／／指明該文檔為book類型，且可由book.dtd對其進行校驗

element用tags來修飾<book>,</book> or <book/>, 對于element的名字是區(qū)分大小寫的

element內(nèi)部可以存在attribute <boook title="jshkaf"/> ／／attribute不能定義兩次?

XML：well-formed

exactly one root element

tags inside <> are correct

tags are nested

attribute are unique and quoted

no comments inside the tags

或者簡單一點，可以用tree來表示這個xml文檔

XML：Valid

可以利用DTD，XML Schema，Relax NG來對XML中的數(shù)據(jù)進行校驗

DTD：利用正則表達式來表達允許出現(xiàn)的文檔格式

<!ELEMENT elem_name elem_regexp>

element regular expression: *,+,?, [,], EMPTY, ANY, #PCDATA(text)

mixed element could be (#PCDATA|sub_element)

<!ATTLIST elem_name?

? ? ? ? ? ? ? ? ? att_name att_type att_values>

attribute type: ID, IDREF(FK), CDATA, v1|v2|..., vn

attribute values: #REQUIRED,? #IMPLIED, #FIXED v

4. VALID in xml:

利用dtd來進行validation的過程，講正則表達式轉(zhuǎn)化為一個自動機讓它進行識別的過程。

自動機有兩種DFA和NFA，DFA指的就是每個狀態(tài)根據(jù)action只能到達下一個狀態(tài)，而NFA在于每個狀態(tài)根據(jù)相應(yīng)的action可能可以到達多個狀態(tài)。每一個NFA都可以轉(zhuǎn)化為DFA，但是需要指數(shù)級的空間，evalution時可能并不是很有效。W3C要求使用喬姆斯基automaton，需要1-unambigous的形式

所以dtd利用FA來進行validation時采用top-down的形式：

1.對于每個<!ELEMENT...>開始的建立它自己相對應(yīng)的automaton A

2. 對于Document D中的每個元素對它進行匹配相應(yīng)的自動機

3. 若出現(xiàn)不匹配項則return no valid

4. 若全部match 則document valid

缺點：

dtd不是xml形式的。。。。

利用XML SChema進行匹配

<xsd: element name="name"/>

<xsd: element name = "x", minOccurs="1",maxOccurs="unbounded",type="xsd:date">

type: string, boolean, number, float, time, duration, base64binary, AnyURI...

利用tree automata 進行xml校驗

Tree 分為兩種：

rank: every node has bounded number of children

unranked: doesn't have that rule

unranked => rank tree using first-child next-sibling

bottom-up non-deterministic tree automata:

A=(alphabet, states, accept_states, transition)

從葉子節(jié)點向上推導，直至root節(jié)點，若root節(jié)點的狀態(tài)屬于終態(tài)集中，則將之稱為accepting

能被這樣一個樹形自動機識別的語言被稱為：regular tree language

Top-down tree automata

A=(alphabet, states, initial_states, accept_states, transition)

從root節(jié)點向下推導直至所有葉子節(jié)點都屬于終態(tài)集則accept

regular tree language= accept by non-deterministic bottom-up = deterministic bottom-up= non-deterministic top-down

f{epsilon}-> Qf

d{Qf}->Qd

a{Qb*,QC+}->QA

5. Regular Tree grammars

G=(N,T,S,P)

N: non-terminal symbols

T: terminal symbols

S: start symbols?

P: production rules

Dir-> directory[Person*]

Perosn->person{epsilon}

RelaxNG

Competing NON-terminals:

AB are non-terminals, A->x[shdjfk], B->x[sdfdsg]

then call AB competing non-terminals

Local tree gramma:

there is no competing non-terminals

(dtd)

Sing-type tree gramma:

the competing non-terminals should not show in the same content.

(XML Schema)

Restrained Competition Gramma:

the competing non terminals could show in the same content but without the same pre..

there should not have UAV and UBW show in the same time

simple example:

Person→student[DirA])

Person→professor[DirB]

DirA→direction[Name.Number?.Add?]

DirB→direction[Name.Add?.Phone?]

the gramma is STG but not RCG not LTG

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

Web Data model

Web Data model

1.Semi-Structured Data (SSD)

2.XML

3.XML：Basics

XML：well-formed

XML：Valid

4. VALID in xml:

利用dtd來進行validation的過程，講正則表達式轉(zhuǎn)化為一個自動機讓它進行識別的過程。

利用XML SChema進行匹配

利用tree automata 進行xml校驗

bottom-up non-deterministic tree automata:

Top-down tree automata

5. Regular Tree grammars

Competing NON-terminals:

Local tree gramma:

Sing-type tree gramma:

Restrained Competition Gramma:

simple example:

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

Web Data model

1.Semi-Structured Data (SSD)

2.XML

3.XML：Basics

XML：well-formed

XML：Valid

4. VALID in xml:

利用dtd來進行validation的過程，講正則表達式轉(zhuǎn)化為一個自動機讓它進行識別的過程。

利用XML SChema進行匹配

利用tree automata 進行xml校驗

bottom-up non-deterministic tree automata:

Top-down tree automata

5. Regular Tree grammars

Competing NON-terminals:

Local tree gramma:

Sing-type tree gramma:

Restrained Competition Gramma:

simple example:

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

利用dtd來進行validation的過程，講正則表達式轉(zhuǎn)化為一個自動機讓它進行識別的過程。