I recently started taking a proper class in networking, and while I was deeply fascinated by the subject, I found TCP (Transmission Control Protocol) a bit difficult to understand.
Here are the basic steps we will work through:
- Opening a network socket that allows us to send TCP packets
- Sending an HTTP request to google.com
- Getting and reading the reply we get
Also note that proper error handling is left out of the code here.
The first thing we need to do is complete a handshake with Google. Here's how a TCP handshake works:
Think of a two-syllable word, index, broken down into IN-DEX.
The user sending the HTTP request starts with: IN (the SYN packet)
Google, accepting the request, replies with: INDEX (the SYN-ACK packet)
The user then finishes with: DEX (the ACK packet)
In simple code, this looks like:
    from scapy.all import IP, TCP, send, sr1

    # My local network IP
    src_ip = "192.168.0.11"
    # Google's IP
    dest_ip = "188.8.131.52"
    # IP header: this is coming from me, and going to Google
    ip_header = IP(dst=dest_ip, src=src_ip)
    # Specify a large random port number for myself (59333),
    # and port 80 for Google. The "S" flag means this is
    # a SYN packet
    src_port = 59333
    syn = TCP(dport=80, sport=src_port, seq=0, ack=0, flags="S")
    # Send the SYN packet to Google and wait for one reply (the SYN-ACK)
    # scapy uses '/' to combine packets with headers
    response = sr1(ip_header / syn)
    # Acknowledge Google's sequence number. The SYN uses up one
    # sequence number on each side, hence the + 1
    ack = TCP(dport=80, sport=src_port, seq=syn.seq + 1, ack=response.seq + 1, flags="A")
    # Reply with the ACK
    send(ip_header / ack)
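To see which numbers travel in each step of the handshake without touching the network, here is a minimal simulation. The function name `three_way_handshake` and the starting sequence numbers 1000 and 5000 are made up for illustration; real stacks pick their initial sequence numbers unpredictably.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    # Step 1: the client opens with a SYN carrying its initial sequence number.
    syn = ("client", "SYN", client_isn, 0)
    # Step 2: the server answers with its own SYN and acknowledges the
    # client's SYN, which uses up one sequence number.
    syn_ack = ("server", "SYN-ACK", server_isn, client_isn + 1)
    # Step 3: the client acknowledges the server's SYN in the same way.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


# Hypothetical initial sequence numbers, chosen only to keep the output readable.
for sender, flags, seq, ack in three_way_handshake(1000, 5000):
    print(f"{sender:6} {flags:7} seq={seq} ack={ack}")
```

The last line of output is the segment the scapy code sends as its final ACK: its acknowledgement number is the server's sequence number plus one.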
The idea of TCP is to make sure we can resend packets when some go missing. Sequence numbers are a way to check whether we have missed any packets. Say Google sends 3 packets with sizes of 100, 200, and 300 bytes, and the initial sequence number is 0. These packets will then have sequence numbers 0, 100, and 300, and the next packet after them would start at 600.
A TCP sequence number is a 4-byte field in the TCP header that identifies the first byte of the outgoing segment within the stream. It also keeps track of how much data has been transferred and received. The sequence number field is always set.
For example, say the sequence number for a packet is X and its length is Y. If the packet reaches the other side successfully, the sequence number for the next packet is X+Y.
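The X+Y rule can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the three packets of 100, 200, and 300 bytes from the example; the helper name `sequence_numbers` is made up here.

```python
def sequence_numbers(initial_seq, sizes):
    """Return (sequence number of each segment, sequence number that comes next)."""
    seqs = []
    seq = initial_seq
    for size in sizes:
        seqs.append(seq)
        # The field is 4 bytes wide, so it wraps around at 2**32.
        seq = (seq + size) % 2**32
    return seqs, seq


seqs, next_seq = sequence_numbers(0, [100, 200, 300])
print(seqs, next_seq)  # [0, 100, 300] 600
```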
When packets go missing, how does Google know it needs to resend them? Every time we receive a packet from Google, we need to send back an ACK saying we got the packet with that sequence number. (An ACK packet is any TCP packet that acknowledges receiving a message or series of packets.) If the server notices that a packet hasn't been ACKed, it will resend it.
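As a rough sketch of the sender's side of this, the function below works out which segments still need to be resent. It assumes the ACK number is the next byte the receiver expects, which is how TCP acknowledgements are numbered, and it ignores sequence number wraparound for simplicity. The name `segments_to_resend` is hypothetical.

```python
def segments_to_resend(sent, acked_up_to):
    """Return the segments that have not been fully acknowledged.

    sent: list of (seq, length) tuples for every segment sent so far.
    acked_up_to: highest ACK number received, i.e. the next byte the
    receiver says it is waiting for.
    """
    return [(seq, length) for seq, length in sent if seq + length > acked_up_to]


sent = [(0, 100), (100, 200), (300, 300)]
# The receiver has acknowledged everything before byte 300.
print(segments_to_resend(sent, 300))  # [(300, 300)]
```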
If you run the code above, you'll notice an error: an unexpected extra packet shows up in the exchange. The kernel never opened this connection itself, so it rejects Google's reply. What is happening is:
    Python program: IN
    Google: INDEX
    Kernel: I didn't ask for this!
    Python program: ...
So how do we get around the kernel? One way is ARP spoofing: we act as though we have a different IP address, one the kernel doesn't know about.
The exchange now looks like:
    me: router, please send packets for 192.168.0.129 to my MAC address
    router: goes through with it
    my Python program: IN (from 192.168.0.129)
    google: INDEX
    kernel: this isn't my IP address! <ignore>
    my Python program: DEX (the ACK)
As you can see, this works: we can now send packets and get our responses without the kernel getting in the way.
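For a feel of what the message to the router contains, here is a sketch that only builds the 28 bytes of an ARP reply and sends nothing. The field layout is the standard ARP format for Ethernet and IPv4. The MAC addresses and the router IP 192.168.0.1 are invented for illustration, and `build_arp_reply` is a hypothetical helper, not part of the original program.

```python
import socket
import struct


def mac_bytes(mac):
    """Turn 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_reply(claimed_ip, my_mac, router_ip, router_mac):
    """Build an ARP reply saying 'claimed_ip is at my_mac', addressed to the router."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6,        # hardware address length
        4,        # protocol address length
        2,        # operation: reply
        mac_bytes(my_mac), socket.inet_aton(claimed_ip),     # sender MAC and IP
        mac_bytes(router_mac), socket.inet_aton(router_ip),  # target MAC and IP
    )


# All addresses except 192.168.0.129 are made up for this example.
payload = build_arp_reply(
    "192.168.0.129", "02:00:00:00:00:01", "192.168.0.1", "02:00:00:00:00:02"
)
print(len(payload), payload.hex())
```

To actually go on the wire, these bytes would still need an Ethernet header in front of them, and this should only ever be tried on a network you control.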
To get Google to actually send the HTML for google.com, we need to take care of the following:
- Putting together a packet with an HTTP GET request
- Making sure we can listen for many packets, not just a single packet
- Fixing bugs with sequence numbers
- Closing the connection properly
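The first two items in this list can be sketched without any networking. The code below builds the bytes of a GET request and puts received payloads back in order by sequence number. Both helper names are hypothetical, the `Connection: close` header is an assumption of this sketch, and sequence number wraparound and partially overlapping segments are ignored.

```python
def build_get_request(host, path="/"):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def reassemble(segments, first_seq):
    """Join payloads in sequence order, stopping at the first gap.

    segments: iterable of (seq, payload) pairs, possibly out of order
    or duplicated. Returns (data, next expected sequence number).
    """
    by_seq = {}
    for seq, payload in segments:
        if payload:  # pure ACKs carry no data
            by_seq.setdefault(seq, payload)  # keep the first copy of a duplicate
    data = b""
    next_seq = first_seq
    while next_seq in by_seq:
        data += by_seq[next_seq]
        next_seq += len(by_seq[next_seq])
    return data, next_seq


print(build_get_request("google.com"))
print(reassemble([(105, b"world"), (100, b"hello"), (100, b"hello")], 100))
```

The second value returned by `reassemble` is also the number to put in the next ACK, which is where many sequence number bugs tend to hide.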
Once everything was working, looking at the packets in Wireshark showed this:
    User/google: <tcp handshake>
    User: GET google.com
    google: 100 packets
    User: 3 ACKs
    google: <starts resending packets>
    User: a few more ACKs
    google: <reset connection>
In the scenario above, Google sends packets faster than the Python program can handle them and send ACKs. Google's server then assumes that network problems are keeping the user from ACKing the packets.
Google then resets the connection because it has decided there are connection problems. But we know the connection is fine and the program was responding correctly. The Python program was simply too slow to acknowledge the packets.
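A toy model makes this failure mode easier to see. Everything in it is an assumption made for illustration: the server sends all packets at once, the client ACKs a fixed number per tick, and the server retransmits on a fixed timer and gives up after a fixed number of retries. Real TCP timers are adaptive, so this is not how Google's servers actually decide.

```python
def simulate(packets, acks_per_tick, timeout_ticks, max_retries):
    """Return ("done", tick) or ("reset", tick) for a toy sender and slow receiver."""
    retries = [0] * packets
    acked = 0
    tick = 0
    while acked < packets:
        tick += 1
        # The client acknowledges a few more packets, in order.
        acked = min(packets, acked + acks_per_tick)
        if tick % timeout_ticks == 0:
            # The timer fires: every packet still unacknowledged is resent.
            for i in range(acked, packets):
                retries[i] += 1
                if retries[i] > max_retries:
                    return ("reset", tick)
    return ("done", tick)


print(simulate(100, 1, 5, 3))   # slow client: ('reset', 20)
print(simulate(100, 50, 5, 3))  # fast client: ('done', 2)
```

With one ACK per tick the server runs out of patience long before the client catches up, even though no packet was ever lost.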
One of the setbacks we ran into was how slow the Python program was. It is also important to properly understand the main concepts behind TCP, how it works, and how to handle requests, as this will help you understand TCP in depth and solve bugs when you run into them.