
Analysis of a strange behavior: actively closing a TCP connection without sending a FIN packet

 2 years ago
source link: https://blogread.cn/it/article/4096?f=hot1


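Problem description:
While working on a network framework, Duolong noticed that when a TCP connection was closed with close, the peer received ECONNRESET instead of a normal FIN. A packet capture showed that at the moment of the close system call our side sent an RST segment rather than the usual FIN. This is an interesting problem, so let's reproduce it: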

$ erl
Erlang R14B03 (erts-5.8.4) 1 [64-bit] [smp:16:16] [rq:16] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.8.4  (abort with ^G)
1> {ok,Sock} = gen_tcp:connect("baidu.com", 80, [{active,false}]).
{ok,#Port<0.582>}
2> gen_tcp:send(Sock, "GET / HTTP/1.1\r\n\r\n").
ok
3> gen_tcp:close(Sock).
ok

We sent an HTTP request for Baidu's home page, and Baidu sends a response back. Right after send we call close, without reading that response.

Then we run tcpdump in another terminal to confirm:

$ sudo tcpdump port 80 -i bond0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 96 bytes
17:22:38.246507 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: S 2228211568:2228211568(0) win 5840 <mss 1460,sackOK,timestamp 2607833238 0,nop,wscale 7>
17:22:38.284602 IP 220.181.111.86.http > my031089.sqa.cm4.tbsite.net.19500: S 3250338304:3250338304(0) ack 2228211569 win 8190 <mss 1436>
17:22:38.284624 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: . ack 1 win 5840
17:22:52.748468 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: P 1:19(18) ack 1 win 5840
17:22:52.786855 IP 220.181.111.86.http > my031089.sqa.cm4.tbsite.net.19500: . ack 19 win 5840
17:22:52.787194 IP 220.181.111.86.http > my031089.sqa.cm4.tbsite.net.19500: P 1:179(178) ack 19 win 5840
17:22:52.787203 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: . ack 179 win 6432
17:22:52.787209 IP 220.181.111.86.http > my031089.sqa.cm4.tbsite.net.19500: P 179:486(307) ack 19 win 5840
17:22:52.787214 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: . ack 486 win 7504
17:23:01.564358 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: R 19:19(0) ack 486 win 7504
...

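We can clearly see the line "R 19:19(0) ack 486 win 7504": an RST was sent. strace also confirms that Erlang really did invoke the close system call.
So why does this happen? @淘宝雕梁, a TCP protocol stack expert, answered the question: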

Around net/ipv4/tcp.c:1900:

...
/* As outlined in RFC 2525, section 2.17, we send a RST here because
 * data was lost. To witness the awful effects of the old behavior of
 * always doing a FIN, run an older 2.1.x kernel or 2.0.x, start a bulk
 * GET in an FTP client, suspend the process, wait for the client to
 * advertise a zero window, then kill -9 the FTP client, wheee...
 * Note: timeout is always zero in such a case.
 */
if (data_was_unread) {
    /* Unread data was tossed, zap the connection. */
    NET_INC_STATS_USER(sock_net(sk), LINUX_MIB_TCPABORTONCLOSE);
    tcp_set_state(sk, TCP_CLOSE);
    tcp_send_active_reset(sk, sk->sk_allocation);
..

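The code is explicit: if the receive buffer still holds unread data when close is called, the protocol stack sends an RST instead of a FIN.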
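The kernel excerpt also increments the LINUX_MIB_TCPABORTONCLOSE counter on this path. As a rough sketch, assuming a Linux host where that counter is exposed as the TCPAbortOnClose field of the TcpExt lines in /proc/net/netstat, it can be read like this. The function names parse_netstat and tcp_abort_on_close are made up for illustration.

```python
def parse_netstat(text: str) -> dict:
    """Parse /proc/net/netstat style text into a flat {name: value} dict."""
    counters = {}
    lines = text.splitlines()
    # Lines come in pairs: a header line with field names, then a line with values.
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        for name, number in zip(names.split(), numbers.split()):
            counters[prefix + name] = int(number)
    return counters


def tcp_abort_on_close(path: str = "/proc/net/netstat") -> int:
    """Return the number of connections reset because close() found unread data."""
    with open(path) as f:
        return parse_netstat(f.read())["TcpExtTCPAbortOnClose"]
```

Comparing tcp_abort_on_close() before and after a suspicious close should show whether this code path was taken, provided no other connection on the host hits it in between.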
Let's verify this once more:

$ erl
Erlang R14B03 (erts-5.8.4) 1 [64-bit] [smp:16:16] [rq:16] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.8.4  (abort with ^G)
1> {ok,Sock} = gen_tcp:connect("baidu.com", 80, [{active,false}]).
{ok,#Port<0.582>}
2> gen_tcp:send(Sock, "GET / HTTP/1.1\r\n\r\n").
ok
3> gen_tcp:recv(Sock,0).
{ok,"HTTP/1.1 400 Bad Request\r\nDate: Fri, 01 Jul 2011 09:24:37 GMT\r\nServer: Apache\r\nConnection: Keep-Alive\r\nTransfer-Encoding: chunked\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n127\r\n<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<HTML><HEAD>\n<TITLE>400 Bad Request</TITLE>\n</HEAD><BODY>\n<H1>Bad Request</H1>\nYour browser sent a request that this server could not understand.<P>\nclient sent HTTP/1.1 request without hostname (see RFC2616 section 14.23): /<P>\n</BODY></HTML>\n\r\n0\r\n\r\n"}
4> gen_tcp:close(Sock).
ok
5>

This time we drained the receive buffer completely.

Now look at tcpdump again:

...
17:36:07.236627 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: S 3086473299:3086473299(0) win 5840 <mss 1460,sackOK,timestamp 2608642228 0,nop,wscale 7>
17:36:07.274661 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: S 738551248:738551248(0) ack 3086473300 win 8190 <mss 1436>
17:36:07.274685 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: . ack 1 win 5840
17:36:10.295795 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: P 1:19(18) ack 1 win 5840
17:36:10.334280 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: . ack 19 win 5840
17:36:10.334547 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: P 1:179(178) ack 19 win 5840
17:36:10.334554 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: . ack 179 win 6432
17:36:10.334563 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: P 179:486(307) ack 19 win 5840
17:36:10.334566 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: . ack 486 win 7504
17:36:19.671374 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: F 19:19(0) ack 486 win 7504
17:36:19.709619 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: . ack 20 win 5840
17:36:19.709643 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: F 486:486(0) ack 20 win 5840
17:36:19.709652 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: . ack 487 win 7504
...

This time a FIN was sent.
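The two experiments can also be reproduced without an external server. The following is a minimal sketch, assuming a Linux host and a loopback connection. The helper name close_and_observe and the payload are invented for illustration, and the short sleep is only there so the data has reached the client's receive buffer before close is called.

```python
import socket
import time


def close_and_observe(read_before_close: bool) -> str:
    """Close the client side and report what the peer observes: 'rst' or 'fin'."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()
    peer.settimeout(2.0)

    payload = b"response the client may or may not read"
    peer.sendall(payload)
    time.sleep(0.2)                       # let the bytes land in the client's receive buffer

    if read_before_close:
        remaining = len(payload)
        while remaining > 0:              # drain everything the peer sent
            remaining -= len(client.recv(remaining))
    client.close()                        # unread data left behind leads to RST, otherwise FIN

    try:
        data = peer.recv(4096)
        return "fin" if data == b"" else "data"
    except ConnectionResetError:          # ECONNRESET: the peer received an RST
        return "rst"
    finally:
        peer.close()
        listener.close()


if __name__ == "__main__":
    print("close with unread data:", close_and_observe(read_before_close=False))
    print("close after draining:  ", close_and_observe(read_before_close=True))
```

On Linux the first call is expected to report an RST and the second a FIN, matching the two tcpdump captures. Other TCP stacks may behave differently.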

Duolong then went a step further and dug up a snippet in the squid client code that he had never been able to understand:
client_side.c

...
/* prevent those nasty RST packets */
{
    char buf[SQUID_TCP_SO_RCVBUF];
    while (FD_READ_METHOD(fd, buf, SQUID_TCP_SO_RCVBUF) > 0);
}
...

Now that snippet finally makes sense!
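The same drain-before-close idea can be sketched in Python. This is an illustration of the technique, not squid's implementation: the name drain_and_close and the chunk size are made up, and the socket is switched to non-blocking mode so the loop never waits for data that has not arrived yet.

```python
import socket


def drain_and_close(sock: socket.socket, chunk_size: int = 65536) -> int:
    """Discard whatever is already queued in the receive buffer, then close.

    Returns the number of bytes thrown away.
    """
    discarded = 0
    sock.setblocking(False)               # only read what is already buffered
    try:
        while True:
            chunk = sock.recv(chunk_size)
            if not chunk:                 # peer already sent FIN, nothing left to read
                break
            discarded += len(chunk)
    except OSError:
        # EAGAIN (buffer is empty) or any other error: stop draining,
        # just as the "> 0" loop condition in the C snippet does.
        pass
    finally:
        sock.close()
    return discarded
```

Draining only narrows the window: if more data arrives between the last read and close, the kernel can still send an RST. A stricter variant would call shutdown(SHUT_WR) first and keep reading until the peer closes its side.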

Summary: studying the protocol stack carefully really matters.
