A Quick Reminder About HTTP

Translated from:
http://www.haproxy.org/#doc1.4

1. Quick reminder about HTTP


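When HAProxy runs in HTTP mode, both the request and the response are fully analyzed and indexed, so almost anything found in an HTTP message can be matched.

Understanding how HTTP requests and responses are built makes it easier to write correct rules in the configuration.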

1.1. The HTTP transaction model


The HTTP protocol is transaction-driven: each request leads to one and only one response. Traditionally, the client establishes a connection to the server, sends an HTTP request, the server replies with a response, and the connection is closed. A new request requires a new connection:

[CON1] [REQ1] ... [RESP1] [CLO1] [CON2] [REQ2] ... [RESP2] [CLO2] ...

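This mode is called "HTTP close" mode, and there are as many established connections as there are HTTP transactions. Since the server closes the connection after sending the response, the client does not need to know the content length.

Due to the transactional nature of the protocol, this could be improved so that the connection is not closed between two consecutive transactions: the server does not close the connection right after the first response.

In this mode, however, the server must indicate the length of the response content to the client so that the client does not wait indefinitely. A special header is used for this: "Content-length". This mode is called "keep-alive" mode: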

[CON] [REQ1] ... [RESP1] [REQ2] ... [RESP2] [CLO] ...

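This mode reduces the latency between transactions and the work the server spends on setting up and tearing down connections. It is generally better than "HTTP close" mode, but not always, because clients often limit their number of concurrent connections to a small value.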
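To illustrate why keep-alive depends on "Content-length", the sketch below splits a byte stream that holds two back-to-back responses. The function name `split_responses` and the sample data are made up for this example, and other framing methods (such as chunked encoding) are deliberately left out.

```python
def split_responses(stream: bytes) -> list[tuple[bytes, bytes]]:
    """Split a keep-alive byte stream into (head, body) pairs via Content-length."""
    responses = []
    while stream:
        # The head ends at the first empty line (CRLF CRLF in this sample).
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("incomplete header block")
        length = None
        for line in head.split(b"\r\n")[1:]:  # skip the response line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if length is None:
            # Without a length, only a connection close could mark the end.
            raise ValueError("no Content-length header")
        if len(rest) < length:
            raise ValueError("truncated body")
        responses.append((head, rest[:length]))
        stream = rest[length:]  # whatever remains belongs to the next response
    return responses


stream = (
    b"HTTP/1.1 200 OK\r\nContent-length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-length: 3\r\n\r\nbye"
)
for head, body in split_responses(stream):
    print(body)
```

Without the length, the reader could not tell where "hello" ends and the next response line begins.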

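The last improvement is "pipelining" mode. It still uses keep-alive, but the client does not wait for the first response before sending the second request. This is useful for fetching the large number of images that make up a page: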

[CON] [REQ1] [REQ2] ... [RESP1] [RESP2] [CLO] ...

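The performance benefit is obvious, since the network latency between one request and the next is eliminated. Many HTTP agents do not correctly support pipelining, because HTTP offers no way to associate a response with the corresponding request. For this reason, the server must send its responses in exactly the same order as the requests were received.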
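The difference between the three modes can be sketched with a deliberately simplified cost model. The assumptions are ours, not the HAProxy documentation's: opening a connection costs one round trip, each request/response exchange costs one round trip, and transfer time, server processing and connection close are ignored. `total_round_trips` is a hypothetical helper name.

```python
def total_round_trips(n_requests: int, mode: str) -> int:
    """Count network round trips for n_requests under the simplified model."""
    if n_requests <= 0:
        return 0
    if mode == "close":
        # One connection setup plus one exchange per transaction.
        return 2 * n_requests
    if mode == "keep-alive":
        # One connection setup, then one exchange per transaction.
        return 1 + n_requests
    if mode == "pipelining":
        # One connection setup, then all requests sent back to back.
        return 1 + 1
    raise ValueError(f"unknown mode: {mode}")


for mode in ("close", "keep-alive", "pipelining"):
    print(mode, total_round_trips(10, mode))
```

For 10 requests this model gives 20, 11 and 2 round trips respectively. Real figures depend on response sizes, server processing time and how many connections the client opens in parallel.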

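By default, HAProxy operates in a "tunnel-like" mode with regard to persistent connections: for each connection, it processes the first request and forwards everything that follows (including additional requests) to the selected server. Once established, the connection is persistent on both the client and server sides.

With "option http-server-close", the connection stays persistent on the client side, and each incoming request is processed individually and dispatched to a backend server, while the server side works in "HTTP close" mode.

With "option httpclose", both the client and server sides work in "HTTP close" mode.

If a server misbehaves in "HTTP close" mode, "option forceclose" or "option http-pretend-keepalive" may help.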

1.2. HTTP request


First, let's consider this HTTP request:

  Line     Contents
  number
     1     GET /serv/login.php?lang=en&profile=2 HTTP/1.1
     2     Host: www.mydomain.com
     3     User-agent: my small browser
     4     Accept: image/jpeg, image/gif
     5     Accept: image/png

1.2.1. The Request line


Line 1 is the "request line". It is always composed of three fields, usually separated by spaces (LWS):

  • a METHOD : GET
  • a URI : /serv/login.php?lang=en&profile=2
  • a version tag : HTTP/1.1

This structure is easy to parse, and HAProxy parses it by itself, so there is no need to write complex regular expressions to extract the fields.

Note: LWS (linear white spaces) are commonly spaces, but can also be tabs or line feeds/carriage returns followed by spaces/tabs.
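As a small illustration of the three-field structure (this is not HAProxy's own parser, and `parse_request_line` is a name made up for this example), the request line can be split on whitespace:

```python
def parse_request_line(line: str) -> tuple[str, str, str]:
    """Split a request line into (method, uri, version).

    str.split() without arguments splits on any run of whitespace,
    so both spaces and tabs are accepted as separators.
    """
    method, uri, version = line.split()  # raises ValueError if not 3 fields
    return method, uri, version


print(parse_request_line("GET /serv/login.php?lang=en&profile=2 HTTP/1.1"))
```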

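The URI can take several forms:

  • A "relative URI":

    /serv/login.php?lang=en&profile=2

    This is a complete URL without the host part. It is generally what servers, reverse proxies and transparent proxies receive.

  • An "absolute URI", also called a "URL":

    http://192.168.0.12:8080/serv/login.php?lang=en&profile=2

    It is composed of:
        scheme:       the protocol name followed by "://"
        host:         a host name or IP address
        port:         optional, in the form ":PORT"
        relative URI: begins with "/" and follows the address part

    This is generally what proxies receive, but a server supporting HTTP/1.1 must accept this form too.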
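
  • a star ('*') :

    This form is only accepted in association with the OPTIONS method and is not relayable. It is used to query the capabilities of the next hop.

  • an address:port combination : 192.168.0.12:80

    This form is only accepted with the CONNECT method. It is used to establish TCP tunnels through HTTP proxies, generally for HTTPS, but sometimes for other protocols too.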

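The relative URI /serv/login.php?lang=en&profile=2 has two sub-parts.

/serv/login.php is the "path", typically the relative path to an object on the server.

lang=en&profile=2 is the "query string". It is mostly used with GET requests sent to dynamic scripts, and its meaning is specific to the language, framework or application in use.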
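The components described above can be inspected with Python's standard `urllib.parse` module. This only illustrates the terminology and says nothing about how HAProxy itself splits a URI.

```python
from urllib.parse import parse_qs, urlsplit

# Relative URI: only a path and a query string.
relative = urlsplit("/serv/login.php?lang=en&profile=2")
print(relative.path)             # /serv/login.php
print(relative.query)            # lang=en&profile=2
print(parse_qs(relative.query))  # {'lang': ['en'], 'profile': ['2']}

# Absolute URI: scheme, host and optional port come before the relative part.
absolute = urlsplit("http://192.168.0.12:8080/serv/login.php?lang=en&profile=2")
print(absolute.scheme, absolute.hostname, absolute.port)  # http 192.168.0.12 8080
```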

1.2.2. The request headers

The headers start at the second line. They are composed of a name at the
beginning of the line, immediately followed by a colon (':'). Traditionally,
an LWS is added after the colon but that's not required. Then come the values.
Multiple identical headers may be folded into one single line, delimiting the
values with commas, provided that their order is respected. This is commonly
encountered in the "Cookie:" field. A header may span over multiple lines if
the subsequent lines begin with an LWS. In the example in 1.2, lines 4 and 5
define a total of 3 values for the "Accept:" header.

In the example, the headers start at line 2, each in the form "header_name: value":

 2     Host: www.mydomain.com
 3     User-agent: my small browser
 4     Accept: image/jpeg, image/gif
 5     Accept: image/png
 <empty line>

Lines 4 and 5 can be folded into a single line:

Accept: image/jpeg, image/gif, image/png

Contrary to a common misconception, header names are not case-sensitive, and
their values are not either if they refer to other header names (such as the
"Connection:" header).

In short, header names are case-insensitive.
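The folding rule and the case-insensitivity of header names can be sketched together. `merge_headers` is a hypothetical helper written for this example. It lowercases names so that "Accept" and "ACCEPT" land in the same entry, and joins repeated headers with commas in their original order.

```python
def merge_headers(lines: list[str]) -> dict[str, str]:
    """Fold repeated headers into one comma-separated value per name."""
    merged: dict[str, str] = {}
    for line in lines:
        name, _, value = line.partition(":")
        key = name.strip().lower()  # header names are not case-sensitive
        value = value.strip()       # the LWS after the colon is optional
        merged[key] = f"{merged[key]}, {value}" if key in merged else value
    return merged


headers = merge_headers([
    "Host: www.mydomain.com",
    "User-agent: my small browser",
    "Accept: image/jpeg, image/gif",
    "ACCEPT: image/png",
])
print(headers["accept"])  # image/jpeg, image/gif, image/png
```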

The end of the headers is indicated by the first empty line. People often say
that it's a double line feed, which is not exact, even if a double line feed
is one valid form of empty line.

In short, the headers end with an empty line. A double line feed (LF LF) is one valid form of empty line, but not the only one.
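A minimal sketch of locating the end of the headers, assuming each line terminator may be either CRLF or a bare LF (`split_head_and_body` is a name chosen for this example):

```python
import re

# An empty line is one line terminator immediately followed by another;
# this sketch accepts CRLF or a bare LF for each of them.
END_OF_HEADERS = re.compile(rb"\r?\n\r?\n")


def split_head_and_body(message: bytes) -> tuple[bytes, bytes]:
    """Return (head, body), split at the first empty line."""
    match = END_OF_HEADERS.search(message)
    if match is None:
        raise ValueError("no empty line found: headers are incomplete")
    return message[:match.start()], message[match.end():]


print(split_head_and_body(b"GET / HTTP/1.1\r\nHost: a\r\n\r\nbody"))
print(split_head_and_body(b"GET / HTTP/1.1\nHost: a\n\nbody"))
```

Both calls return the same body, even though only the second message ends its headers with a literal double line feed.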

Fortunately, HAProxy takes care of all these complex combinations when indexing
headers, checking values and counting them, so there is no reason to worry
about the way they could be written, but it is important not to accuse an
application of being buggy if it does unusual, valid things.

In short, HAProxy parses all of these forms correctly.

Important note:
As suggested by RFC2616, HAProxy normalizes headers by replacing line breaks
in the middle of headers by LWS in order to join multi-line headers. This
is necessary for proper analysis and helps less capable HTTP parsers to work
correctly and not to be fooled by such complex constructs.
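The normalization described in the note can be approximated as follows. This is a sketch of the idea only, not HAProxy's implementation: `unfold` is a hypothetical helper that replaces each line break followed by spaces or tabs with a single space.

```python
import re


def unfold(header_block: str) -> list[str]:
    """Join continuation lines (starting with space or tab) to their header."""
    # A line break followed by LWS marks a continuation of the previous line.
    joined = re.sub(r"\r?\n[ \t]+", " ", header_block)
    return [line for line in joined.splitlines() if line]


block = "Accept: image/jpeg,\r\n    image/gif\r\nHost: www.mydomain.com\r\n"
print(unfold(block))
# ['Accept: image/jpeg, image/gif', 'Host: www.mydomain.com']
```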

1.3. HTTP response

Here is an HTTP response:

  Line     Contents
  number
     1     HTTP/1.1 200 OK
     2     Content-length: 350
     3     Content-Type: text/html

As a special case, HTTP supports so called "Informational responses" as status
codes 1xx. These messages are special in that they don't convey any part of the
response, they're just used as sort of a signaling message to ask a client to
continue to post its request for instance. In the case of a status 100 response
the requested information will be carried by the next non-100 response message
following the informational one. This implies that multiple responses may be
sent to a single request, and that this only works when keep-alive is enabled
(1xx messages are HTTP/1.1 only). HAProxy handles these messages and is able to
correctly forward and skip them, and only process the next non-100 response. As
such, these messages are neither logged nor transformed, unless explicitly
stated otherwise. Status 101 messages indicate that the protocol is changing
over the same connection and that haproxy must switch to tunnel mode, just as
if a CONNECT had occurred. Then the Upgrade header would contain additional
information about the type of protocol the connection is switching to.
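The handling of informational responses can be sketched on a plain list of status codes, in the order they arrive for one request. `final_status` is a name invented for this example. It skips the 1xx messages that merely ask the client to continue, but stops at 101 because the connection then switches protocol.

```python
def final_status(status_codes: list[int]) -> int:
    """Return the status that ends the exchange for a single request."""
    for code in status_codes:
        if code == 101:
            # Protocol switch: what follows is no longer HTTP responses.
            return code
        if 100 <= code < 200:
            # Informational message: carries no part of the response.
            continue
        return code
    raise ValueError("no final response received")


print(final_status([100, 200]))  # 200
print(final_status([101]))       # 101
```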

1.3.1. The Response line

Line 1 is the "response line". It is always composed of 3 fields :

  • a version tag : HTTP/1.1
  • a status code : 200
  • a reason : OK

The status code is always 3-digit. The first digit indicates a general status :

  • 1xx = informational message to be skipped (eg: 100, 101)
  • 2xx = OK, content is following (eg: 200, 206)
  • 3xx = OK, no content following (eg: 302, 304)
  • 4xx = error caused by the client (eg: 401, 403, 404)
  • 5xx = error caused by the server (eg: 500, 502, 503)

Please refer to RFC2616 for the detailed meaning of all such codes. The
"reason" field is just a hint, but is not parsed by clients. Anything can be
found there, but it's a common practice to respect the well-established
messages. It can be composed of one or multiple words, such as "OK", "Found",
or "Authentication Required".
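A short sketch of reading a response line and classifying the status code by its first digit. The class descriptions are copied from the list above, and the function names are made up for this example.

```python
STATUS_CLASSES = {
    1: "informational message to be skipped",
    2: "OK, content is following",
    3: "OK, no content following",
    4: "error caused by the client",
    5: "error caused by the server",
}


def parse_response_line(line: str) -> tuple[str, int, str]:
    """Split a response line into (version, status code, reason)."""
    parts = line.split(maxsplit=2)  # the reason may contain several words
    if len(parts) < 2 or len(parts[1]) != 3 or not parts[1].isdigit():
        raise ValueError("malformed response line")
    reason = parts[2] if len(parts) == 3 else ""
    return parts[0], int(parts[1]), reason


def status_class(code: int) -> str:
    """Describe a status code by its first digit."""
    return STATUS_CLASSES[code // 100]


version, code, reason = parse_response_line("HTTP/1.1 401 Authentication Required")
print(version, code, reason, "->", status_class(code))
```

Only the numeric code drives the classification. The reason text is carried along but, as noted above, is just a hint.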

HAProxy may emit the following status codes by itself:

  Code  When / reason
   200  access to stats page, and when replying to monitoring requests
   301  when performing a redirection, depending on the configured code
   302  when performing a redirection, depending on the configured code
   303  when performing a redirection, depending on the configured code
   307  when performing a redirection, depending on the configured code
   308  when performing a redirection, depending on the configured code
   400  for an invalid or too large request
   401  when an authentication is required to perform the action (when
        accessing the stats page)
   403  when a request is forbidden by a "block" ACL or "reqdeny" filter
   408  when the request timeout strikes before the request is complete
   500  when haproxy encounters an unrecoverable internal error, such as a
        memory allocation failure, which should never happen
   502  when the server returns an empty, invalid or incomplete response, or
        when an "rspdeny" filter blocks the response.
   503  when no server was available to handle the request, or in response to
        monitoring requests which match the "monitor fail" condition
   504  when the response timeout strikes before the server responds

The 4xx and 5xx error codes above may be customized (see "errorloc" in section
4.2).

1.3.2. The response headers

Response headers work exactly like request headers, and as such, HAProxy uses
the same parsing function for both. Please refer to paragraph 1.2.2 for more
details.
