Translation: HTTP/1.1 Chunked-Encoding

Keywords: HTTP, chunked encoding

2009-02-14 12:35:35

Reference: RFC 2068. Translated by Hendy.

3.6 Transfer Codings

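Transfer coding values indicate an encoding transformation that is applied to an entity-body to ensure its safe transport through the network. This differs from a content coding in that the transfer coding is a property of the message, not of the original entity. (Translator's note by Hendy: in other words, a transfer coding describes how the message body is packaged for transport, while a content coding describes the entity itself.)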

All transfer coding values are case-insensitive. HTTP/1.1 uses transfer coding values in the Transfer-Encoding header field (see section 14.40).
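Since the coding values are case-insensitive, a receiver has to normalize the header value before comparing it. The following is a minimal Python sketch, assuming the headers have already been parsed into a dict of name to raw value; the function name `uses_chunked` and that dict layout are illustrative and do not come from the RFC.

```python
def uses_chunked(headers):
    """Return True if a Transfer-Encoding header lists the "chunked" coding.

    `headers` is assumed to map header names to their raw string values.
    Both the header name and the coding value are compared case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() != "transfer-encoding":
            continue
        for coding in value.split(","):
            # Drop any ";parameter" part and surrounding whitespace.
            if coding.split(";", 1)[0].strip().lower() == "chunked":
                return True
    return False
```

With this helper, `uses_chunked({"Transfer-Encoding": "Chunked"})` and `uses_chunked({"transfer-encoding": "CHUNKED"})` are both true.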

Transfer codings are analogous to the Content-Transfer-Encoding values of MIME, which were designed to ....

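The chunked encoding modifies the body of a message so that it is transferred as a series of chunks, each with its own size indicator, followed by an optional footer containing entity-header fields. This allows dynamically produced content to be sent together with the information the recipient needs to verify that it has received the full message.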

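Structure of a chunked body:
Chunked-Body = *chunk
               "0" CRLF
               footer
               CRLF

Here CRLF is \r\n (CR is \r, LF is \n). The "0" line is the last chunk, whose size is zero; it is the single character "0" (0x30), not padding.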

where each chunk is:
chunk = chunk-size [ chunk-ext ] CRLF
        chunk-data CRLF

Square brackets mark an optional element.

hex-no-zero = <HEX excluding "0">
chunk-size = hex-no-zero *HEX

chunk-ext = *(";" chunk-ext-name [ "=" chunk-ext-value ] )
chunk-ext-name = token
chunk-ext-value = token | quoted-string
chunk-data = chunk-size(OCTET)

footer = *entity-header

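The chunked encoding ends with a zero-size chunk followed by the footer, which is terminated by an empty line. The purpose of the footer is to provide an efficient way to supply information about an entity that is generated dynamically. Applications must not send header fields in the footer unless they are explicitly defined as appropriate for the footer, such as Content-MD5 or future HTTP extensions for digital signatures or other facilities.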
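To make the wire format concrete, here is a minimal Python sketch of a sender that follows the grammar above. The function name `encode_chunked` and its parameters are hypothetical; chunk extensions are not emitted, and the footer is passed in as plain (name, value) pairs without any check that the fields are allowed there.

```python
def encode_chunked(pieces, footer=None):
    """Encode an iterable of byte strings as a chunked message body.

    `footer` is an optional list of (name, value) string pairs that are
    written as entity-header lines after the zero-size chunk.
    """
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # an empty piece would look like the terminating chunk
        out += b"%x\r\n" % len(piece)  # chunk-size in hex, no leading zeros
        out += piece + b"\r\n"         # chunk-data followed by CRLF
    out += b"0\r\n"                    # zero-size chunk ends the chunk list
    for name, value in footer or []:
        out += f"{name}: {value}\r\n".encode("latin-1")
    out += b"\r\n"                     # empty line terminates the footer
    return bytes(out)
```

For example, `encode_chunked([b"hello", b" world!"])` returns `b"5\r\nhello\r\n7\r\n world!\r\n0\r\n\r\n"`.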

An example process for decoding a Chunked-Body is given in section 19.4.6.

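All HTTP/1.1 applications must be able to receive and decode the "chunked" transfer coding, and must ignore transfer coding extensions they do not understand. A server that receives an entity-body with a transfer coding it does not understand should return 501 (Unimplemented) to the client and close the connection. A server must not send transfer codings to an HTTP/1.0 client.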
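The server-side rules in this paragraph can be sketched as two small checks. This is only an illustration in Python: the names `SUPPORTED_CODINGS`, `status_for_request` and `may_send_chunked` are made up here, and the sketch assumes a server that implements no transfer coding other than "chunked".

```python
SUPPORTED_CODINGS = {"chunked"}  # assumption: the only coding this server implements


def status_for_request(transfer_encoding):
    """Return 501 if the request uses an unknown transfer coding, else None.

    `transfer_encoding` is the raw Transfer-Encoding value, or None if absent.
    """
    if transfer_encoding is None:
        return None
    for coding in transfer_encoding.split(","):
        name = coding.split(";", 1)[0].strip().lower()
        if name and name not in SUPPORTED_CODINGS:
            return 501  # Unimplemented; the caller should then close the connection
    return None


def may_send_chunked(request_version):
    """Transfer codings may only be sent to HTTP/1.1 (or later) clients.

    `request_version` is a (major, minor) tuple, e.g. (1, 0) for HTTP/1.0.
    """
    return tuple(request_version) >= (1, 1)
```

A server built this way would fall back to Content-Length, or to closing the connection at the end of the body, when `may_send_chunked((1, 0))` returns false.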

19.4.6 Pseudocode for decoding the chunked encoding

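length = 0;  // running total of decoded bytes; ends up as the length of the downloaded entity
// read chunk-size, chunk-ext (if any) and \r\n
read chunk-size, chunk-ext (if any) and CRLF
while (chunk-size > 0)  // a chunk-size of zero marks the last chunk
{
    read chunk-data and CRLF  // read the chunk data and its trailing \r\n
    append chunk-data to entity-body  // add the data to the entity (the downloaded file)
    length = length + chunk-size;  // update the length
    read chunk-size and CRLF  // read the size of the next chunk
}
// read the footer: append each entity-header to the existing header fields
read entity-header
while (entity-header not empty)
{
    append entity-header to existing header fields
    read entity-header
}
Content-Length = length  // the decoded message now has a known length
remove "chunked" from Transfer-Encoding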
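The pseudocode translates fairly directly into Python. The sketch below assumes a binary file-like object whose `read(n)` returns n bytes unless the stream ends (for example `io.BytesIO`, or a buffered reader wrapped around a socket); the function name `decode_chunked` and its return shape are illustrative choices, not part of the RFC.

```python
import io


def decode_chunked(stream):
    """Decode a chunked body read from a binary file-like object.

    Returns (entity_body, footer_lines, length), mirroring the pseudocode.
    """
    body = bytearray()
    length = 0
    while True:
        line = stream.readline()  # chunk-size [chunk-ext] CRLF
        if not line:
            raise ValueError("stream ended before the zero-size chunk")
        # Ignore any chunk-ext after ";" and parse the size as hexadecimal.
        size = int(line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk-data")
        if stream.read(2) != b"\r\n":
            raise ValueError("chunk-data not followed by CRLF")
        body += data
        length += size
    footer = []
    while True:
        line = stream.readline()  # entity-header lines until an empty line
        if line in (b"\r\n", b"\n", b""):
            break
        footer.append(line.rstrip(b"\r\n").decode("latin-1"))
    return bytes(body), footer, length


if __name__ == "__main__":
    raw = b"5\r\nhello\r\n7\r\n world!\r\n0\r\n\r\n"
    print(decode_chunked(io.BytesIO(raw)))
```

The footer lines are returned separately so that the caller can decide how to merge them into the existing header fields.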

As I understand it, chunk-size is a hexadecimal number written as an ASCII string and terminated by \r\n. For example, the bytes 61 62 63 0d 0a give chunk-size = (HEX) abc = (DEC) 2748.
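The arithmetic in this example can be checked with a few lines of Python. The helper name `parse_chunk_size` is hypothetical; it simply strips the trailing CRLF and any chunk-ext, then parses the remaining digits as base 16.

```python
def parse_chunk_size(line):
    """Parse one chunk-size line (hex digits, optional chunk-ext, CRLF)."""
    if not line.endswith(b"\r\n"):
        raise ValueError("chunk-size line must end with CRLF")
    digits = line[:-2].split(b";", 1)[0].strip()
    return int(digits.decode("ascii"), 16)


raw = bytes.fromhex("61 62 63 0d 0a")  # the five bytes from the example: "abc" + CRLF
print(parse_chunk_size(raw))  # prints 2748
```

The bytes 61 62 63 are the ASCII characters "abc", and 0xabc = 10 * 256 + 11 * 16 + 12 = 2748.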

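Reference PHP source:
$header = "";
$response = "";
// $post must hold the complete raw HTTP request; the original snippet does not define it.

// Read exactly $length bytes; a single fread() on a socket may return fewer.
function read_exact($stream, $length) {
    $data = "";
    while (strlen($data) < $length && !feof($stream)) {
        $part = fread($stream, $length - strlen($data));
        if ($part === false || $part === "") break;
        $data .= $part;
    }
    return $data;
}

// connect
if (!($request = fsockopen('whatever.com', 80, $errno, $errstr))) exit($errstr);
stream_set_timeout($request, 10);
// send request
fwrite($request, $post);
// read the response header, up to and including the empty line
do {
    $byte = fread($request, 1);
    if ($byte === false || $byte === "") break; // timeout or connection closed
    $header .= $byte;
} while (!preg_match('/\r\n\r\n$/', $header));

// check for chunked encoding (header names and coding values are case-insensitive)
if (preg_match('/^Transfer-Encoding:\s*chunked\s*$/mi', $header)) {
    while (true) {
        // chunk-size line: hex digits, optional ";chunk-ext", then CRLF
        $line = fgets($request);
        if ($line === false) break;
        $chunk_size = hexdec(strtok(trim($line), ';')); // convert to a real number
        if ($chunk_size == 0) break; // the zero-size chunk is the end marker
        $response .= read_exact($request, $chunk_size);
        fread($request, 2); // ditch the CRLF that trails the chunk
    }
    // the footer, if any, is not read because the connection is closed below
} elseif (preg_match('/^Content-Length:\s*([0-9]+)\s*$/mi', $header, $matches)) {
    // a content length was specified
    $response = read_exact($request, (int)$matches[1]);
} else {
    // no length information: read until the server closes the connection
    while (!feof($request)) $response .= fread($request, 4096);
}
// close connection
fclose($request);

// do something useful with the response
print($header);
print($response);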