Every resource we have read says that TCP is a connection-oriented service: applications that communicate over TCP must first establish a connection before exchanging data, and must remember to close it when they are done.
But what is the TCP connection itself?
In short, the connection is really a data structure in the operating system kernel called the TCP control block (TCB); on Linux it is the tcp_sock structure. And not only connections: packets, too, are managed through a kernel data structure, which on Linux is the sk_buff structure.
The state of a TCP connection is simply the state of the corresponding TCP socket (LISTEN, ESTABLISHED, TIME_WAIT, and so on).
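For reference, the standard TCP states (as defined in RFC 793, and the same names reported by tools like netstat or ss) can be sketched as a small enum; this is purely illustrative:

```python
from enum import Enum

class TcpState(Enum):
    """The standard TCP connection states (RFC 793)."""
    LISTEN = "LISTEN"               # server waiting for an incoming SYN
    SYN_SENT = "SYN_SENT"           # client sent SYN, awaiting SYN+ACK
    SYN_RECEIVED = "SYN_RECEIVED"   # server got SYN, sent SYN+ACK, awaiting ACK
    ESTABLISHED = "ESTABLISHED"     # handshake complete, data may flow
    FIN_WAIT_1 = "FIN_WAIT_1"       # we sent FIN, awaiting its ACK (or peer's FIN)
    FIN_WAIT_2 = "FIN_WAIT_2"       # our FIN acked, awaiting the peer's FIN
    CLOSE_WAIT = "CLOSE_WAIT"       # peer sent FIN, waiting for the local close()
    CLOSING = "CLOSING"             # both sides sent FIN at about the same time
    LAST_ACK = "LAST_ACK"           # we sent FIN after CLOSE_WAIT, awaiting ACK
    TIME_WAIT = "TIME_WAIT"         # waiting out 2*MSL before the tuple is reused
    CLOSED = "CLOSED"               # no connection at all
```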
Why do we need the TCB (TCP control block)?
When an application writes data, the data is not handed directly to the NIC driver. It is first placed in a buffer, and only later, according to some policy (after enough data has accumulated, or after a flush call), is it passed to the NIC. (Strictly speaking, the NIC actively copies the data out of the buffer itself, but that does not affect the mental model.)
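That buffering behavior can be modeled with a few lines of Python; the class and threshold here are invented for illustration, not how the kernel actually names things:

```python
class BufferedSender:
    """Toy model: writes accumulate in a send buffer and are only handed
    to the 'NIC' once a threshold is reached or flush() is called."""

    def __init__(self, threshold=8):
        self.buffer = bytearray()
        self.threshold = threshold
        self.sent = []  # what the "NIC" has picked up, in order

    def write(self, data: bytes):
        self.buffer.extend(data)
        if len(self.buffer) >= self.threshold:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sent.append(bytes(self.buffer))  # NIC copies the data out
            self.buffer.clear()

s = BufferedSender(threshold=8)
s.write(b"abc")    # stays in the buffer (3 bytes < 8)
s.write(b"defgh")  # buffer reaches 8 bytes -> flushed to the "NIC"
s.write(b"xy")
s.flush()          # explicit flush sends the remainder
```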
When the data is received by the NIC, the packet first goes through the following steps:
- The network card first verifies that the packet is intact.
- The data link layer hands the packet to the appropriate upper-layer protocol (IP or something else) based on the type field in the frame header, and strips the link-layer header.
- The IP layer performs its own verification, selects the next protocol (TCP or UDP) based on the IP header, strips the IP header, and passes the remaining data to the appropriate handler (TCP or UDP).
- At the TCP layer, the handler selects a socket based on the port numbers in the TCP header and copies the payload into it.
So by now we can see that each socket must have its own independent send buffer and receive buffer, along with other control and flag fields; together these make up the TCB. Without it, the stack would simply not know where to deliver received data.
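Putting those pieces together, a TCB can be imagined as roughly the following structure. This is a drastic simplification of Linux's tcp_sock; every field name here is illustrative, not the kernel's actual layout:

```python
from dataclasses import dataclass, field

@dataclass
class Tcb:
    """Toy TCP control block: one per connection."""
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int
    state: str = "ESTABLISHED"
    # Per-socket buffers: the heart of why the TCB must exist.
    send_buffer: bytearray = field(default_factory=bytearray)  # waiting to go out
    recv_buffer: bytearray = field(default_factory=bytearray)  # waiting to be read
    snd_nxt: int = 0  # next sequence number to send (illustrative)
    rcv_nxt: int = 0  # next sequence number expected (illustrative)

tcb = Tcb("10.0.0.1", 12345, "93.184.216.34", 80)
tcb.recv_buffer.extend(b"HTTP/1.1 200 OK")  # TCP delivers payload here
```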
One detail not mentioned above: what the network card receives is just a raw stream of bytes. Early in the receive path, the driver wraps those bytes into a data structure the operating system can recognize (on Linux, the sk_buff mentioned earlier), which makes the rest of the processing much more convenient.
Why does the four-tuple uniquely identify a connection?
We have probably seen many times that a TCP connection is uniquely identified by a connection four-tuple, defined as <source IP, source port, destination IP, destination port>.
We won't dwell on why all four elements are needed: if any one of them were missing, two different connections could collide. So how is the four-tuple actually used on the receive path?
After reading the last part, you may have the impression that the NIC passes the data upward step by step. Strictly speaking, this is not the case. When the NIC receives data and verifies that it is error-free, it writes the data directly into a memory buffer via DMA, and then raises an interrupt to notify the CPU (and thus the operating system) that a packet has arrived.
Note: this memory buffer is not the socket receive buffer. It is a piece of memory that the NIC driver requests from the operating system in advance; the driver tells the NIC its (physical) address and size ahead of time. If no such buffer is available, the NIC simply drops the data.
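The drop-on-full behavior can be sketched as a bounded receive ring. The slot count and method names below are invented for illustration; real drivers use DMA descriptor rings, but the accounting is the same in spirit:

```python
from collections import deque

class RxRing:
    """Toy receive ring: the driver pre-allocates a fixed number of slots
    and tells the 'NIC' about them; with no free slot, the NIC drops."""

    def __init__(self, slots=4):
        self.slots = slots
        self.ring = deque()
        self.dropped = 0

    def nic_dma_write(self, frame: bytes):
        if len(self.ring) >= self.slots:
            self.dropped += 1       # no free buffer: frame is dropped
            return False
        self.ring.append(frame)     # "DMA" copy into a pre-allocated buffer
        return True

    def driver_poll(self):
        """Driver consumes a frame, freeing its slot for the NIC."""
        return self.ring.popleft() if self.ring else None

rx = RxRing(slots=4)
for i in range(6):                  # NIC receives 6 frames but has 4 slots
    rx.nic_dma_write(b"frame%d" % i)
```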
Once the operating system learns that a packet has arrived, its interrupt handlers parse the packet layer by layer until it reaches the TCP layer, at which point TCP must decide which socket's receive buffer the packet should go to.
How does it find the right one? TCP uses the connection four-tuple as a key into a hash table. The lookup yields a pointer to the corresponding socket structure, through which TCP finds that socket's receive buffer and copies the payload into it.
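The lookup itself can be sketched with an ordinary dictionary keyed by the four-tuple. The real kernel uses its own hash tables and locking, but the idea is the same; all names here are made up:

```python
# Toy demultiplexer: four-tuple -> socket receive buffer.
sockets = {}  # key: (src_ip, src_port, dst_ip, dst_port) -> recv buffer

def register(four_tuple):
    """Connection established: create a receive buffer for it."""
    sockets[four_tuple] = bytearray()

def deliver(src_ip, src_port, dst_ip, dst_port, payload: bytes):
    """What TCP does on receive: look up the four-tuple, find the socket,
    copy the payload into its receive buffer; drop it if nothing matches."""
    key = (src_ip, src_port, dst_ip, dst_port)
    buf = sockets.get(key)
    if buf is None:
        return False  # no such connection: packet is dropped
    buf.extend(payload)
    return True

register(("1.2.3.4", 5555, "10.0.0.1", 80))
ok = deliver("1.2.3.4", 5555, "10.0.0.1", 80, b"GET / HTTP/1.1\r\n")
dropped = deliver("9.9.9.9", 1234, "10.0.0.1", 80, b"junk")  # unknown tuple
```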
So if you try to attack a server simply by filling in its IP address and the port of a service it runs and blasting data at it, the attack is ineffective beyond congesting the network: no such four-tuple exists in the table, so TCP does not know which buffer to deliver the payload to and drops it.
A common attack that does work is the SYN flood. Each SYN makes the server allocate a connection structure, but the attacker's final ACK never arrives, so these half-open connections pile up and consume memory. With thousands of such SYN requests, the server soon runs out of memory and goes down.
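The resource-exhaustion idea behind a SYN flood can be sketched as a bounded half-open queue. The limit, class, and method names are all invented; real kernels add defenses such as SYN cookies that this toy model ignores:

```python
class Listener:
    """Toy listener: each SYN allocates a half-open entry, which is only
    freed when the handshake's final ACK arrives."""

    def __init__(self, max_half_open=128):
        self.max_half_open = max_half_open
        self.half_open = set()  # four-tuples still awaiting the final ACK

    def on_syn(self, four_tuple):
        if len(self.half_open) >= self.max_half_open:
            return False        # queue full: further SYNs are dropped
        self.half_open.add(four_tuple)
        return True

    def on_ack(self, four_tuple):
        self.half_open.discard(four_tuple)  # handshake done, entry freed

srv = Listener(max_half_open=128)
# The attacker sends SYNs from many (spoofed) source ports and never ACKs:
results = [srv.on_syn(("6.6.6.6", port, "10.0.0.1", 80)) for port in range(200)]
```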
HTTP short and long connections
Once you understand the above, you can really understand what short connections and long connections (persistent connections) are. There is nothing technically complex about them.
Under HTTP/1.0, the default is a short connection: the server closes the connection after sending the last byte of the response, i.e., it reclaims the tcp_sock structure, so any further data the client sends will be discarded. Even if the client still holds its own structure, the connection is said to be closed or broken.
Does the client know when the server closes the connection? Not necessarily: either side can close its end at any time, and the peer may only notice the next time it uses the socket. For a short connection, though, there is no point in notifying the other side anyway.
The downside of short connections, as you may already know, is that if you want to send multiple requests to a server in a row, you must establish a new connection for each request.
To reduce the time spent establishing connections, HTTP/1.1 introduced long connections and made them the default. What is a long connection? Simply that the socket structure is not reclaimed once a request has been served. As long as the socket structure exists, data sent by the client can still be received by the server; that is what we call a long connection.
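A minimal sketch of the idea, using a local socketpair to stand in for a real client/server TCP connection: two requests travel over the same socket with no reconnect in between. The request bytes are illustrative:

```python
import socket

# socketpair() gives two connected stream sockets: "client" and "server".
client, server = socket.socketpair()

# Persistent connection: both requests reuse the same socket structure.
for path in (b"/a", b"/b"):
    client.sendall(b"GET " + path + b" HTTP/1.1\r\n"
                   b"Connection: keep-alive\r\n\r\n")

# The server reads from one continuous byte stream; both requests may
# even arrive in a single recv() call.
first = server.recv(4096)

client.close()
server.close()
```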
There is nothing fundamentally new about long connections compared to short ones, except that the socket structure is kept around longer. In that sense, an HTTP long connection is really just a long-lived TCP connection.
Another variant is the pipelined connection. What is that? Pipelined connections are a special form of long connection. A long connection saves connection-setup time, but for N consecutive requests we still have to wait for each response before sending the next request; on a timeline diagram this looks like a sawtooth. Network transfer time dominates, so if we issue N requests and the one-way client-to-server transfer time is t, the total time is N * t * 2.
How can this time be shortened? A natural idea: could the client simply send the next request without waiting for the previous response to come back? On a timeline diagram it looks like a pipe: requests are sent continuously, regardless of what the server has gotten to.
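The savings can be made concrete with back-of-the-envelope arithmetic. Sequentially, each request costs a full round trip (2t); pipelined, all N requests go out back to back and the responses stream home, so, ignoring serialization and server processing time, the total collapses to roughly one round trip:

```python
def sequential_time(n_requests, t):
    """Each request waits for its response: one round trip (2*t) per request."""
    return n_requests * 2 * t

def pipelined_time(n_requests, t):
    """All requests sent back to back; ignoring serialization and server
    processing time, everything overlaps into about one round trip."""
    return 2 * t

t = 0.05                         # 50 ms one-way transfer time (assumed)
seq = sequential_time(10, t)     # 10 round trips: 1.0 s
pipe = pipelined_time(10, t)     # ~1 round trip: 0.1 s
```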
Pipelined connections do reduce total transfer time, but they can introduce new problems. Since the client does not know when the server will close the connection, the server may process only some of the requests in the pipe before closing, and the client cannot tell which ones were handled. Its only option is to re-establish the connection and retransmit the requests. If they all fetch static data, that is fine; but if they are POST requests with side effects, such as placing an order, then unfortunately the server has already processed the request once and will process the retransmitted copy again. That is a far cry from what the user intended.
For this reason, pipelined connections should not be used lightly.