TCP stands for Transmission Control Protocol, a communications standard that enables application programs and computing devices to exchange messages over a network. It is designed to send packets across the internet and to ensure the successful delivery of data and messages.
The Transmission Control Protocol was originally documented in RFC 761 and RFC 793; the current specification is RFC 9293. As opposed to UDP, TCP is a connection-based protocol. In such a protocol, no data is sent until the server and the client have performed an initial exchange of control packets. This exchange is called a handshake. It establishes a connection, and from then on data can be sent. Each data packet that is received is acknowledged by the receiving party, which does so by sending a packet called an ACK. As such, TCP always requires that packets include a source port number, because it depends on the continual two-way exchange of messages.
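The connection setup described above can be seen with Python's standard socket module. The following is a minimal sketch, not a production server: it assumes a loopback address and an OS-chosen port, and the function names run_echo_server and tcp_echo are made up for this illustration. The handshake and the ACKs are handled by the operating system; the program only sees connect(), accept(), and the data.

```python
import socket
import threading


def run_echo_server(server_sock):
    """Accept one connection and echo back whatever is received."""
    conn, _addr = server_sock.accept()  # returns once a client has connected
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # empty bytes: the client has finished sending
                break
            conn.sendall(data)


def tcp_echo(message):
    """Send message over a loopback TCP connection and return the echoed bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]

        worker = threading.Thread(target=run_echo_server, args=(server_sock,))
        worker.start()

        # No data can be sent until this connection has been established.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # tell the server we are done sending
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        worker.join()
    return b"".join(chunks)


if __name__ == "__main__":
    print(tcp_echo(b"hello over TCP"))
```

Note that both ends have a port number: the server's is chosen at bind time and the client's source port is assigned automatically when it connects.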
TCP/IP is specified in documents called Requests for Comments (RFCs), which are published by the Internet Engineering Task Force (IETF). RFCs cover a wide range of standards, and TCP/IP is just one of these. It is one of the most commonly used protocol suites in digital network communications and ensures end-to-end data delivery.
TCP organizes data so that it can be transmitted between a server and a client. It guarantees the integrity of the data being communicated over a network. Before it transmits data, TCP establishes a connection between a source and its destination, which it keeps live until communication ends. It then breaks large amounts of data into smaller packets, while ensuring data integrity is in place throughout the process.
TCP provides the following services:
• In-order delivery
• Receipt acknowledgment
• Error detection
• Flow and congestion control
As a result, many high-level protocols that need reliable data transmission use TCP. Examples include file transfer and remote access protocols like File Transfer Protocol (FTP), Secure Shell (SSH), and Telnet. It is also used to send and receive email through Internet Message Access Protocol (IMAP), Post Office Protocol (POP), and Simple Mail Transfer Protocol (SMTP), and for web access through the Hypertext Transfer Protocol (HTTP).
UDP versus TCP
An alternative to TCP is the User Datagram Protocol (UDP), a connectionless protocol that is used for low-latency communication between applications and to decrease transmission time. TCP can be an expensive network tool because it retransmits absent or corrupted packets and protects data delivery with controls like acknowledgments, connection startup, and flow control.
Connectionless protocols are useful where minimum transfer overhead is required, and where the occasional dropped packet is not a big deal. TCP's reliability and congestion control come at the cost of additional packets and round-trips, and of deliberate delays that are introduced when packets are lost in order to prevent congestion.
UDP is also the main protocol that is used for DNS, which is interesting because most DNS queries fit inside a single packet, so TCP's streaming abilities aren't generally needed. DNS is also usually configured such that it does not depend upon a reliable connection. Most devices are configured with multiple DNS servers, and it's usually quicker to resend a query to a second server after a short timeout rather than wait for a TCP back-off period to expire.
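For contrast with TCP, here is a small sketch of a UDP exchange using the standard socket module. It assumes a loopback address and an OS-chosen port, and udp_roundtrip is a name invented for this example. There is no connect/accept step and no acknowledgment: the sender simply addresses a datagram to the receiver. On loopback the datagram will normally arrive, but UDP itself gives no such guarantee, which is why the receiver sets a timeout.

```python
import socket


def udp_roundtrip(message):
    """Send one datagram to a local receiver and return what it received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # give up if the datagram never arrives

        # No handshake: the datagram is sent straight to the address.
        sender.sendto(message, receiver.getsockname())

        data, _sender_addr = receiver.recvfrom(1024)
        return data


if __name__ == "__main__":
    print(udp_roundtrip(b"single-packet query"))
```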
IP addresses can be assigned to a device by a network administrator in one of two ways: statically, where the device's operating system is manually configured with the IP address, or dynamically, where the device's operating system is configured by using the Dynamic Host Configuration Protocol (DHCP).
Dynamic Host Configuration Protocol is a client/server protocol that automatically provides an Internet Protocol (IP) host with its IP address and other related configuration information such as the subnet mask and default gateway. DHCP manages the provisioning of all the nodes or devices added to or dropped from the network, and a DHCP server ensures that each host's IP address is unique. Whenever a client, node, or device that is configured to work with DHCP connects to a network, it sends a request to the DHCP server. The server responds by providing an IP address to the client.
DHCP runs at the application layer of the TCP/IP protocol stack to dynamically assign IP addresses to DHCP clients and to allocate TCP/IP configuration information to them. This information includes the subnet mask, the default gateway, IP addresses, and Domain Name System (DNS) server addresses.
IP addresses on the Internet
The IP address space is managed by an organization called the Internet Assigned Numbers Authority (IANA). IANA decides the global allocation of the IP address ranges and assigns blocks of addresses to Regional Internet Registries (RIRs) worldwide, who then allocate address blocks to countries and organizations. The receiving organizations have the freedom to allocate the addresses from their assigned blocks as they like within their own networks.
There are some special IP address ranges. IANA has defined ranges of private addresses. These ranges will never be assigned to any organization, and as such these are available for anyone to use for their networks. Wherever a network using private addresses needs to communicate with the public Internet, a technique called Network Address Translation (NAT) is used, which essentially makes the traffic from the private network appear to be coming from a single valid public Internet address, and this effectively hides the private addresses from the Internet.
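The private IPv4 ranges are 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 (defined in RFC 1918). As a quick sketch, Python's standard ipaddress module can tell whether an address falls in a private range. The helper name is_private is made up for this example; note that the module's is_private attribute also covers some other reserved ranges, such as loopback, so it is slightly broader than the three ranges listed here.

```python
import ipaddress


def is_private(address):
    """Return True if the address is in a range reserved for private use."""
    return ipaddress.ip_address(address).is_private


if __name__ == "__main__":
    for addr in ("10.1.2.3", "172.16.0.5", "192.168.1.10", "8.8.8.8"):
        print(addr, is_private(addr))
```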
Network Address Translation (NAT) is a process in which one or more local IP addresses are translated into one or more global IP addresses, and vice versa, in order to provide Internet access to the local hosts. It also translates port numbers, that is, it masks the port number of the host with another port number in the packet that will be routed to the destination. It then records the corresponding IP address and port number in the NAT table. NAT generally operates on a router or firewall.
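The NAT table can be pictured as a pair of lookups. The following toy model is only an illustration of the bookkeeping, not how any particular router implements it: the class name ToyNat, the starting port 40000, and the sample addresses (203.0.113.7 is from a range reserved for documentation) are all assumptions made for this sketch.

```python
from itertools import count


class ToyNat:
    """A toy port-translating NAT table; it does not touch real packets."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = count(first_port)
        self._outbound = {}  # (private_ip, private_port) -> public_port
        self._inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Map a private source address to the shared public address."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            public_port = next(self._next_port)
            self._outbound[key] = public_port
            self._inbound[public_port] = key  # remember how to route replies
        return self.public_ip, self._outbound[key]

    def translate_inbound(self, public_port):
        """Find the private host a reply belongs to, or None if unknown."""
        return self._inbound.get(public_port)


if __name__ == "__main__":
    nat = ToyNat("203.0.113.7")
    print(nat.translate_outbound("192.168.1.10", 51000))
    print(nat.translate_outbound("192.168.1.11", 51000))
    print(nat.translate_inbound(40000))
```

Two private hosts using the same source port get different public ports, which is how replies can be routed back to the right host.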
If you inspect the output of ip addr or ipconfig /all on your home network, then you will find that your devices are using private range addresses, which would have been assigned to them by your broadband router through DHCP.
A network is a discrete collection of computers, servers, mainframes, network devices, peripherals, or other devices connected to allow data sharing.
Routing with IP
Routers are able to route traffic toward a destination network, and they do this by using IP addresses and routing tables.
One perhaps obvious method for routers to determine the correct router to forward traffic to would be to program every router's routing table with a route for every IP address. This is impractical given the number of addresses, so routes are kept per network instead. An IP address can be interpreted as being made up of two logical parts: a network prefix and a host identifier. The network prefix uniquely identifies the network a device is on, and the device can use this to determine how to handle traffic that it generates, or receives for forwarding. The network prefix is the first n bits of the IP address when it's written out in binary (remember that an IPv4 address is really just a 32-bit number). The value of n is supplied by the network administrator as a part of a device's network configuration at the same time that it is given its IP address.
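The split between network prefix and host identifier can be worked through with the standard ipaddress module. This is a small sketch: the address 192.168.1.37 with n = 24 is an arbitrary example, and split_address and on_same_network are helper names invented here.

```python
import ipaddress


def split_address(cidr):
    """Return (network address, prefix length n, host identifier) for 'a.b.c.d/n'."""
    iface = ipaddress.ip_interface(cidr)
    network = iface.network
    # The host identifier is whatever remains after masking off the first n bits.
    host_id = int(iface.ip) & int(network.hostmask)
    return str(network.network_address), network.prefixlen, host_id


def on_same_network(cidr, other_address):
    """Return True if other_address shares the network prefix of cidr."""
    return ipaddress.ip_address(other_address) in ipaddress.ip_interface(cidr).network


if __name__ == "__main__":
    print(split_address("192.168.1.37/24"))
    print(on_same_network("192.168.1.37/24", "192.168.1.200"))
    print(on_same_network("192.168.1.37/24", "192.168.2.1"))
```

A device can use a check like on_same_network to decide whether a destination is on its own network or must be sent to a router.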
DNS is also a protocol, which devices use to query DNS servers for resolving hostnames to IP addresses (and vice versa). DNS distributes the work of looking up hostnames by using a hierarchical system of caching servers. When connecting to a network, your network device will be given a local DNS server through either DHCP or manual configuration, and it will query this local server when doing DNS lookups. If that server doesn't know the IP address, then it will query its own configured higher-tier server, and so on until an answer can be found. ISPs run their own DNS caching servers, and broadband routers often act as caching servers as well.
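From an application's point of view, a lookup is usually a single call to the operating system's resolver. The sketch below uses socket.getaddrinfo from the standard library; resolve is a helper name made up for this example. The resolver may answer from a local hosts file or a cache rather than by querying a DNS server, so treat this as an illustration of hostname resolution in general.

```python
import socket


def resolve(hostname):
    """Return the distinct IP addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # first item of the socket address is the IP string
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    print(resolve("localhost"))
```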
The application layer
Application layer protocols include HTTP, SMTP, IMAP, DNS, and FTP. Protocols may even become their own layers, where an application protocol is built on top of another application protocol. An example of this is the Simple Object Access Protocol (SOAP), which defines an XML-based protocol that can be used over almost any transport, including HTTP and SMTP.
Programming for TCP/IP networks
To round up, we're going to look at a few frequently encountered aspects of TCP/IP networks that can cause a lot of head-scratching for application developers who haven't encountered them before. These are: firewalls, Network Address Translation, and some of the differences between IPv4 and IPv6.
A firewall is a piece of hardware or software that inspects the network packets that flow through it and, based on the packet's properties, it filters what it lets through. It is a security mechanism for preventing unwanted traffic from moving from one part of a network to another. Firewalls can sit at network boundaries or can be run as applications on network clients and servers.
The filtering rules can be based on any property of the network traffic. The commonly used properties are: the transport layer protocol (that is, whether traffic uses TCP or UDP), the source and destination IP addresses, and the source and destination port numbers.
Firewalls can also block outbound traffic. This may be done to stop malicious software that finds its way onto internal network devices from calling home or sending spam e-mail.
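The filtering described above can be sketched as a list of rules checked against each packet's properties. This is a toy model only: the Packet and Rule classes, the first-match-wins ordering, the default of "deny", and the sample rules are all assumptions made for illustration, and real firewalls offer far more.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    protocol: str  # "tcp" or "udp"
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def _in_network(address, network):
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


@dataclass(frozen=True)
class Rule:
    action: str  # "allow" or "deny"
    protocol: Optional[str] = None  # None means "match anything"
    src_net: Optional[str] = None
    dst_net: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, packet):
        if self.protocol is not None and packet.protocol != self.protocol:
            return False
        if self.src_net is not None and not _in_network(packet.src_ip, self.src_net):
            return False
        if self.dst_net is not None and not _in_network(packet.dst_ip, self.dst_net):
            return False
        if self.dst_port is not None and packet.dst_port != self.dst_port:
            return False
        return True


def decide(rules, packet, default="deny"):
    """Return the action of the first matching rule, or the default."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action
    return default


# Hypothetical rule set: block outbound SMTP, allow HTTPS and DNS from the LAN.
RULES = [
    Rule("deny", protocol="tcp", dst_port=25),
    Rule("allow", protocol="tcp", src_net="192.168.1.0/24", dst_port=443),
    Rule("allow", protocol="udp", src_net="192.168.1.0/24", dst_port=53),
]

if __name__ == "__main__":
    print(decide(RULES, Packet("tcp", "192.168.1.10", "203.0.113.5", 50000, 443)))
    print(decide(RULES, Packet("tcp", "192.168.1.10", "203.0.113.5", 50001, 25)))
```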
APIs in Action
A web API is a type of API that you interact with through HTTP. Nowadays, many web services provide a set of HTTP calls that are designed to be used programmatically by clients, that is, they are meant to be used by machines rather than by humans. Through these interfaces it's possible to automate interaction with the services and to perform tasks such as extracting data, configuring the service in some way, and uploading your own content into the service.
There are hundreds of services that offer web APIs. A quite comprehensive and ever-growing list of these services can be found at http://www.programmableweb.com.
The Extensible Markup Language (XML) is a way of representing hierarchical data in a standard text format. When working with XML-based web APIs, we'll be creating XML documents and sending them as the bodies of HTTP requests and receiving XML documents as the bodies of responses.
XML is a markup-based format. It is from the same family of languages as HTML. The data is structured in a hierarchy formed by elements. Unlike HTML, XML is designed such that we can define our own tags and create our own data formats. Also, unlike HTML, the XML syntax is always strictly enforced.
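To make the request/response pattern concrete, the sketch below runs a tiny XML web API on the local machine and calls it, using only the standard library. Everything about the service is hypothetical: the /greet path, the greet, name, greeting, and message tags, and the names EchoNameHandler and greet_via_api were invented for this example and do not correspond to any real service.

```python
import threading
import urllib.request
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoNameHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: reads <greet><name>..</name></greet>, replies in XML."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request_root = ET.fromstring(self.rfile.read(length))
        name = request_root.findtext("name", default="stranger")

        response_root = ET.Element("greeting")
        ET.SubElement(response_root, "message").text = f"Hello, {name}"
        body = ET.tostring(response_root, encoding="utf-8")

        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def greet_via_api(name):
    """POST an XML document to the local server and return the reply's message."""
    server = HTTPServer(("127.0.0.1", 0), EchoNameHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        request_root = ET.Element("greet")
        ET.SubElement(request_root, "name").text = name
        request = urllib.request.Request(
            f"http://127.0.0.1:{server.server_port}/greet",
            data=ET.tostring(request_root, encoding="utf-8"),  # XML as request body
            headers={"Content-Type": "application/xml"},
            method="POST",
        )
        # An empty ProxyHandler keeps the call local even if a proxy is configured.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            reply_root = ET.fromstring(response.read())  # XML as response body
        return reply_root.findtext("message")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(greet_via_api("Ada"))
```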
The XML APIs
There are two main approaches to working with XML data:
• Reading in a whole document and creating an object-based representation of it, then manipulating it by using an object-oriented API
• Processing the document from start to end, and performing actions as specific tags are encountered
We're going to focus on the object-based approach by using a Python XML API called ElementTree. The second, so-called pull or event-based approach (also often called SAX, as SAX is one of the most popular APIs in this category) is more complicated to set up, and is only needed for processing large XML files.
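For comparison, here is a brief sketch of start-to-end processing. It uses ElementTree's own iterparse function rather than SAX itself, as it is the closest equivalent within the same module. The sample document and the total_quantity helper are invented for this example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this example.
SAMPLE = b"""<inventory>
  <item sku="a1"><name>widget</name><qty>3</qty></item>
  <item sku="b2"><name>gadget</name><qty>5</qty></item>
</inventory>"""


def total_quantity(xml_bytes):
    """Sum the <qty> values, handling each <item> as soon as it has been read."""
    total = 0
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "item":  # fires once the closing </item> tag is reached
            total += int(elem.findtext("qty"))
            elem.clear()  # drop the item's children to keep memory use low
    return total


if __name__ == "__main__":
    print(total_quantity(SAMPLE))
```

Because each item is discarded after it is handled, the whole document never needs to be held in memory at once, which is the reason this style suits large files.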
The basics of ElementTree
The XML tree structure makes navigation, modification, and removal relatively simple programmatically. Python has a built-in library, ElementTree, that has functions to read and manipulate XML documents (and other similarly structured files). The xml.etree.ElementTree module implements a simple and efficient API for parsing and creating XML data.
(The xml.etree.ElementTree module is not secure against maliciously constructed data. If you need to parse untrusted or unauthenticated data, see XML vulnerabilities.)
import xml.etree.ElementTree as ET
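As a first taste of the object-based approach, the sketch below parses a small document, reads values from it, then adds and removes elements and serializes the result. The hosts document and the helper names host_addresses and add_and_remove are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this example.
HOSTS_XML = """<hosts>
  <host name="web1"><ip>192.168.1.10</ip></host>
  <host name="db1"><ip>192.168.1.20</ip></host>
</hosts>"""


def host_addresses(xml_text):
    """Return a {name: ip} mapping built from the <host> elements."""
    root = ET.fromstring(xml_text)  # parse the whole document into a tree
    return {host.get("name"): host.findtext("ip") for host in root.findall("host")}


def add_and_remove(xml_text):
    """Add a cache1 host, remove db1, and return the modified XML as text."""
    root = ET.fromstring(xml_text)

    new_host = ET.SubElement(root, "host", name="cache1")
    ET.SubElement(new_host, "ip").text = "192.168.1.30"

    root.remove(root.find("host[@name='db1']"))  # look up by attribute value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(host_addresses(HOSTS_XML))
    print(add_and_remove(HOSTS_XML))
```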
The requests-aws4auth package provides Amazon Web Services version 4 authentication for the Python Requests library. Its features include:
• Requests authentication for all AWS services that support AWS auth v4
• Independent signing key objects
• Automatic regeneration of keys when scope date boundary is passed
• Support for STS temporary credentials
This is a companion library for the Requests module that automatically handles signature generation for us. It's available on PyPI, so install it from the command line with pip:
$ pip install requests-aws4auth