Our lives are completely immersed in the Web. We use the Internet for almost everything, but of all the technologies and applications that the network of networks makes possible, the Web is without a doubt the most important: search engines, information and leisure sites, travel, mobile applications that pull their data from the Web, and applications of all kinds that rely on Web protocols under the hood.
One thing that not everyone is clear about is that the Internet and the Web are not the same. The Web is only one part of the Internet, which also includes many other services such as email, file transfer, peer-to-peer protocols, or the hidden Tor network, to name just a few.
In this article I wanted to gather 5 concepts that, in my experience, many people are unclear about or do not even know, but that are fundamental to understanding the Web. In fact, a few more that are also important are mixed in along the way. I have tried to keep the descriptions simple and concise, but with enough information to cover the minimum that should be known about each one.
In my opinion, all these concepts should at least sound familiar to ordinary people, the users of the WWW. But all programmers should know them well, even those who do not program specifically for the Web. I think they should be considered indispensable “basic general culture” for anyone who works in some way in the world of Information Technology.
1.- HTTP
HTTP stands for HyperText Transfer Protocol. It is the protocol used by browsers and web servers to communicate, and it runs on top of the lower-level TCP/IP protocols.
Its operation is based on a request and response scheme, similar to what we will see later in the “DNS” concept (see below). Like many other protocols that date from the beginning of the Internet (such as SMTP, POP or TELNET), a particularity of HTTP is that it is text based: the request line, the status line and the headers exchanged between the client and the server are plain text. This is because it was initially intended to transmit text pages, and it also allows messages to be inspected directly to debug errors. Other types of content, such as images or videos, travel in the body of the message labeled with a MIME type (the Content-Type header), so that the browser or client that receives them knows how to interpret them. Where a binary file has to be embedded in a text-only context, it is first encoded with a scheme such as Quoted-Printable or Base64 and decoded by the receiver.
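To make the “text based” idea concrete, here is a minimal sketch in Python that builds a request by hand and parses a response. It does not touch the network: the host `www.example.com`, the path `/index.html` and the canned response are assumptions made up for illustration, and the helper names `build_request` and `parse_response` are hypothetical.

```python
def build_request(method: str, path: str, host: str) -> bytes:
    """Assemble a minimal HTTP/1.1 request as plain text."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: verb, resource, version
        f"Host: {host}",              # HTTP/1.1 requires the Host header
        "Connection: close",
        "",                           # an empty line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw response into (status code, headers, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


# Hypothetical request, only built here, never sent.
request = build_request("GET", "/index.html", "www.example.com")

# Hand-written response standing in for what a server might answer.
canned = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)

print(request.decode("ascii"))
print(parse_response(canned))
```

Printing `request` shows that everything up to the empty line is readable text, which is what makes it possible to debug HTTP traffic by simply looking at it.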
Its first documented version, 0.9, appeared in 1991, but version 1.0 was not approved until 1996. In January 1997, version 1.1 was released, which is the one still used today (although it was last revised in 2014). Version 2.0 of the protocol is currently under development and is close to completion: a final version was proposed on February 17, 2015 and is awaiting approval.
This new version is already supported by almost all browsers, so its adoption is expected to be very fast. It has also been designed to be compatible with what currently exists, so developers should not need to make great efforts to adapt. Its main advantage will be speed: among other things, it avoids opening many independent connections to the server and implements header compression, parallel downloading of resources, data multiplexing and automatic prioritization of requests.
2.- Query String
The query string is the part of a URL (web address) that contains additional data for the requested page that could not be included in the path itself. For example, in this URL:
https://www.google.com/?q=campusmvp
The part at the end, after the question mark, is the query string. In this specific case, it indicates a parameter called “q” with the value “campusmvp”, which is passed along with the rest of the address. The page that receives this information on the server is able to read and interpret these parameters to do something with them, in this case search for “campusMVP” on Google.
It is possible to send more than one parameter in the query string, in which case each key and value pair is separated from the others using the “&” symbol (called “ampersand”). For example:
https://www.google.com/?q=campusmvp&tbm=vid
In this case, two parameters are passed, the previous one with the search term, and an additional one called “tbm” whose value is “vid” and which in this particular case tells Google that it should search for videos related to the previous term.
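As an illustration of how a program on the receiving end might read these parameters, the following sketch uses Python's standard `urllib.parse` module on the URL from the example. The variable names are arbitrary.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://www.google.com/?q=campusmvp&tbm=vid"

parts = urlsplit(url)           # splits scheme, host, path, query, fragment
params = parse_qs(parts.query)  # maps each key to a list of its values

print(parts.query)  # q=campusmvp&tbm=vid
print(params)       # {'q': ['campusmvp'], 'tbm': ['vid']}
```

Each value comes back as a list because the same key is allowed to appear more than once in a query string.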
The query string can carry any textual information, but keep in mind that some characters cannot be represented in their normal form. For example, a space must be changed to a “+” or a “%20”, characters with special meaning in a URL such as the hash (#) or the ampersand (&) must also be encoded, and there are other reserved characters that must be transformed. The process of encoding these special characters is called “URL encoding”.
The maximum length of data that can be included in the query string is not limited by the standard, which in fact recommends that implementations support at least 8,000 octets (that is, just under 8 KB of data). However, keep in mind that servers or browsers themselves may have other limits. For example, old versions of Internet Explorer supported a maximum of about 2,048 characters (2 KB). In general it is neither practical nor advisable to use strings longer than this. If we need them, it would be better to use another method to send the information to the server (for example, the POST method, see next section).
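A short sketch of URL encoding with Python's standard library may help here. The sample text is an assumption chosen only because it contains a hash, an ampersand and spaces.

```python
from urllib.parse import quote, quote_plus, unquote_plus, urlencode

text = "C# & .NET courses"  # illustrative value with reserved characters

# Percent-encode everything that is not an unreserved character.
print(quote(text, safe=""))   # C%23%20%26%20.NET%20courses

# Same idea, but spaces become "+" as is usual inside query strings.
print(quote_plus(text))       # C%23+%26+.NET+courses

# Build a whole query string from key/value pairs in one call.
query = urlencode({"q": text, "tbm": "vid"})
print(query)                  # q=C%23+%26+.NET+courses&tbm=vid

# Decoding restores the original text.
print(unquote_plus("C%23+%26+.NET+courses"))  # C# & .NET courses
```

In practice it is usually safer to let a function like `urlencode` assemble the query string than to concatenate the pieces by hand.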
3.- HTTP verbs
Also known as HTTP methods. The verb is the first instruction sent to the server in a request made with the HTTP protocol (the protocol used to serve web pages), and it states the action we want to perform on the server. Requests are made by specifying the verb and the identifier of the resource. The most common verb is “GET”, which indicates that you want to obtain the specified resource, for example the page of a product in our store (requests are sent to a specific server, see DNS below).
The method used is important because each one implies a different action and can even determine how the information is sent to the server (for example with the POST method).
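The sketch below shows, under assumptions, how the verb changes where the data travels. It only builds request objects with Python's `urllib.request.Request` and never sends them; the shop URL and the `id` parameter are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

base = "https://shop.example.com/products"  # hypothetical store URL
params = urlencode({"id": "42"})            # hypothetical parameter

# GET: the data travels in the query string, as part of the URL.
get_req = Request(f"{base}?{params}")

# POST: the same data travels in the request body instead.
post_req = Request(base, data=params.encode("ascii"))

# Any other verb can be requested explicitly.
head_req = Request(base, method="HEAD")

for req in (get_req, post_req, head_req):
    print(req.get_method(), req.selector, req.data)
# GET /products?id=42 None
# POST /products b'id=42'
# HEAD /products None
```

Note that `Request` picks POST on its own as soon as a body is supplied, which reflects the point made above: the method and the way the information is sent go hand in hand.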
The methods defined in HTTP version 1.0 were:
GET: get information.
POST: send information.
HEAD: get the same as with GET but only the headers, without the body. It saves bandwidth when you only want basic information about the page.
Version 1.1 of the protocol added 5 additional methods:
PUT: used to indicate that the information sent must be stored.
DELETE: indicates that the specified resource must be deleted.
OPTIONS: asks the server to report which HTTP verbs it supports for the indicated URL. If a “*” is given as the resource, the server returns the verbs it supports in general rather than for one specific resource.
TRACE: used to diagnose the request and verify that it has not been modified by the intermediate nodes through which it travels until it reaches the server.
CONNECT: converts the connection into a transparent TCP/IP tunnel, usually to carry SSL/TLS-encrypted communications.
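To see a few of these verbs in action, the following sketch starts a throwaway server on the local machine with Python's `http.server` and queries it with `http.client`. Everything in it is an assumption for illustration: the page content, the `DemoHandler` class and the `try_methods` helper are hypothetical, and only GET, HEAD and OPTIONS are handled.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<p>Hello</p>"  # made-up page content


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):      # headers plus body
        self._send_headers()
        self.wfile.write(PAGE)

    def do_HEAD(self):     # same headers, no body
        self._send_headers()

    def do_OPTIONS(self):  # report the verbs this handler supports
        self.send_response(204)
        self.send_header("Allow", "GET, HEAD, OPTIONS")
        self.end_headers()

    def log_message(self, format, *args):
        pass               # keep the demo output quiet


def try_methods():
    """Send GET, HEAD and OPTIONS to a local server and collect the replies."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for method in ("GET", "HEAD", "OPTIONS"):
            conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
            conn.request(method, "/")
            resp = conn.getresponse()
            results[method] = (resp.status, resp.getheader("Allow"), resp.read())
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results


for verb, reply in try_methods().items():
    print(verb, reply)
```

The HEAD reply carries the same status as GET but an empty body, and the OPTIONS reply lists the supported verbs in its `Allow` header.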
4.- DNS
DNS stands for Domain Name System. It is the name of both the protocol and the servers responsible for translating the domain names of web pages into the IP addresses of the servers that serve them. Thus, for example, www.google.com can correspond to the IPv6 address “2a00:1450:4002:802::1013”, which uniquely identifies the search engine’s servers on the Internet.
Let’s see exactly what happens from the moment an address is typed into the browser bar until the desired page is displayed, and at what point the name servers intervene.
The following diagram gives a bird’s-eye view of what happens when a user requests a page in their browser:
As we can see, the basic steps involved are the following:
The user types into the browser the address of the page they wish to download or visit, or is directed to it through a link on another page (for example, from a search engine).
The browser queries a DNS server for the IP address of the server we want to communicate with. Each machine connected to the Internet has a unique identifier (similar in concept to a telephone number) that distinguishes it from all others. The browser must know that identification “number” in order to connect and request the resource we are interested in. The DNS server therefore returns to the browser the unique IP address of the server to which we want to connect.
The browser asks the server at the address obtained in the previous step for the resource it wants to download, for example Google’s default page, hosted on one of the company’s servers (technical note: as a rule, port 80 is used to connect to the server over HTTP, and port 443 over HTTPS).
The web server returns the content of the desired page to the browser (or an error if it is not available). The protocol used to download the resources that are part of a web page is called HTTP, the acronym for HyperText Transfer Protocol. The browser receives that content using HTTP and processes it so that it can be viewed on the user’s screen.
Obviously the real process is more complicated, and many low-level details have been left out on purpose, but it is useful to know this basic sequence if we want to understand how the Web works.
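The name lookup step described above can be sketched in Python with the standard `socket` module, which asks the operating system's resolver to do the work. The `resolve` helper is a hypothetical name, and the example resolves `localhost` so that it runs without Internet access; with a connection, a public name such as www.google.com could be passed instead.

```python
import socket


def resolve(hostname: str, port: int = 80) -> list[str]:
    """Return the distinct IP addresses the system resolver gives for a name."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # first element is the address for IPv4 and IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses


print(resolve("localhost"))
```

A name can resolve to several addresses, both IPv4 and IPv6, which is why the function returns a list rather than a single value.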
5.- CDN
A CDN is a Content Delivery Network, also called a content distribution network. When a user requests a page or any other resource from a web server, a connection is established between the browser and the server to receive the data. However fast this connection is, and even assuming the maximum possible speed is achieved, there is a physical limit on how quickly that information can travel from one point to the other. Thus, if our server is in Spain and someone connects from Japan, the information has to travel thousands of kilometers from one end of the connection to the other, and the quality of the intercontinental Internet links will also have an influence.
For this reason, the ideal is always to have the server as close as possible to the users. If our user base is in Spain, it is best to have the server located in Europe (there may also be legal data protection requirements to do so). If we have a strong market in Asia, then it is better to locate the server in that area.
But what happens if our website has users all over the world and we must serve them as quickly as possible? Where do we place the server?
The answer is provided by Content Delivery Networks. A CDN is a service offered by some large Internet companies such as Amazon, Microsoft or Rackspace, among others, which keeps our content synchronized on servers located all over the world.
Thus, when a user connects to our website, the CDN detects their location and serves the content from the nearest server, greatly speeding up the download and providing a better user experience.
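The idea of “the nearest server” can be illustrated with a toy calculation in Python. This is only a sketch under strong assumptions: the three server locations and their coordinates are invented, and distance is measured in a straight line over the globe, whereas real CDNs typically rely on DNS or network routing and take network conditions into account.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical server locations as (latitude, longitude) in degrees.
EDGES = {
    "madrid": (40.42, -3.70),
    "tokyo": (35.68, 139.69),
    "virginia": (38.95, -77.45),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(h))  # 6371 km: mean radius of the Earth


def nearest_edge(user):
    """Pick the server with the smallest distance to the user."""
    return min(EDGES, key=lambda name: haversine_km(user, EDGES[name]))


print(nearest_edge((34.69, 135.50)))  # a user in Osaka     -> tokyo
print(nearest_edge((41.39, 2.17)))    # a user in Barcelona -> madrid
```

Even this simplified version shows the benefit: the user in Osaka is served from a few hundred kilometers away instead of from the other side of the world.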