- When the only computers were people
- The Turing machine
- When computers were as big as a house
- From batch processing to time sharing
- From the ARPANET to the chaotic web
- The World Wide Web
- Search engines take over
- What Web are you looking for through Google?
- What is the indexable web?
- What is the visible web?
- What is the deep web?
- Summary and conclusions
Today we all take the internet for granted. That is not surprising, considering that over 3.5 billion people are connected and that the number keeps growing, faster and faster. Like a rocket heading toward space unopposed, the web is reaching larger and larger shares of the world population.
Indeed, as of today, around 46% of the world population has access to the internet. To put that into context, in 1995 less than 1% of the world population was connected! This means that over the last two decades the web's user base has grown, on average, at a double-digit rate each year.
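As a rough check of that growth claim, the average yearly rate can be estimated with the compound growth formula. The sketch below assumes a starting share of 1% in 1995 (the text says less than 1%, so the real rate would be somewhat higher) and a span of 21 years to reach 46%. The span and the function name are assumptions made for illustration, not figures from a data source.

```python
def compound_annual_growth(start_share, end_share, years):
    """Average yearly growth rate that turns start_share into end_share."""
    return (end_share / start_share) ** (1 / years) - 1


# Assumed inputs: 1% connected in 1995, 46% connected about 21 years later.
rate = compound_annual_growth(0.01, 0.46, 21)
print(f"Average growth of the connected share: {rate:.1%} per year")
```

Under these assumptions the result comes out at roughly 20% per year, which is consistent with a double-digit average.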
And although that rocket, which we call the “web,” will sooner or later slow down, the question that pops into the mind of any modern human is “how long will it take for the web to be used by 100% of the world population?”
Even though this question is legitimate, it is not the most important one. We tend to make a case for the benefits that the web offers to its users. We may be almost horrified when we hear that China’s government is censoring the internet. We may also find it ridiculous when the French government states that workers have the right to disconnect outside working hours.
Yet, instead of thinking about the web from a quantitative standpoint (how many users will join?), the real, most crucial question is “how will the web work in the future?” In other words, we want to make a qualitative assessment. So far we have managed to create web technologies that, although they bettered our lives, also made us more stressed, confused and dehumanized. Is this where we are heading? Hopefully not. In this article I want to give you a very quick summary of the history of the web. This is not a complete history, just my point of view.
When the only computers were people
We all take for granted that a computer is a square device that sits on your desk and is able to perform a variety of tasks. From computation to coding and video editing, computers nowadays are indispensable. Yet the term “computer” has been in use since the early 17th century. How is that possible?
Before the machines that sit on our desks were invented, all the tasks related to computing were delegated to humans. In fact, what may seem a trivial task has had a crucial role in human societies for thousands of years. From Roman numerals to the invention of logarithms (by the Scottish mathematician John Napier, see “Computing before computers”), computing was made easier and easier. For a long time computing was mainly an accounting task, which made possible the birth and rise of empires and powerhouses (like the Medici family in Florence and the Rothschild family in Europe); eventually it became more and more important in the scientific and military fields.
The first real attempt at replacing a human with a machine to perform computational tasks was made by Charles Babbage in 1828. The seed was planted.
The Turing machine
What we want is a machine that can learn from experience. Alan Turing, London 1947
Even though the quote above dates from 1947, it still sounds revolutionary! It comes from Alan Turing, the father of modern computation. Even today it is hard to think of a machine as “intelligent,” yet Alan Turing thought deeply about machine intelligence more than three-quarters of a century ago. His idea was simple but extremely powerful. Turing thought that if we made machines able to learn from experience and to solve problems by using rule-of-thumb principles fed to them, those machines could become extremely useful to humanity. In short, by working through a heuristic, a shortcut, a machine could solve a problem much faster than would otherwise be possible.
It was the birth of the theoretical framework behind modern computers.
When computers were as big as a house
Time is running out. In one hour or so, you have the presentation of the quarterly financials. Your boss is waiting for you in his office. You are about to panic. Yet, you take a deep breath, relax and open an Excel spreadsheet. Within that file, there are hundreds of tabs, tables and arrays. Armed with a plethora of Excel formulas, from IF statements to VLOOKUP, you are ready to summarize the huge amount of data in a nice and clean pivot table. You are ready to rock!
That same file, saved on a computer that weighs less than one kilogram, is something that you take for granted. Yet it took decades for computers to become as useful to humans as Alan Turing first imagined. Also, computers were not the slim and light machines that we are used to, but vast machines that occupied entire rooms.
Thanks to Turing’s theoretical framework, the first computers were created. It was 1943, in the middle of the Second World War. There were two huge battles going on. One was visible to everyone: the war on the battlefields between the Allied soldiers and Germany. The other was far less visible to ordinary people, yet of extreme importance: the contest between the Allied intelligence services and the Germans. It was not fought with weapons but through the rise of a new technological tool, the modern computer.
In that scenario, in 1943, the British scientist Tommy Flowers developed the Colossus, a computer used by the British to decrypt German messages. Not long after, Eckert and Mauchly from the University of Pennsylvania developed the ENIAC. In size, this machine looked more like a modern supercomputer: it occupied 1,800 square feet (167 square meters), contained 18,000 vacuum tubes and weighed about 30 tons. Yet to perform the same calculations that you can run on your 11-inch MacBook Air, it would probably have taken many ENIAC computers combined. In other words, imagine an entire castle occupied by those first digital computers, only to perform the calculations you are now able to do in a simple Excel spreadsheet.
From batch processing to time sharing
Today programmers around the world write their lines of code, press the enter button, and that’s it. The computer executes the command almost instantly. Yet that is not how things worked back in the 1950s. At that time computers performed one task at a time. Those huge machines required a cooling system so powerful that it made sense to lock them in a dedicated room. That computer room was accessible only to a few people, called “operators.” As Paul Allen, co-founder of Microsoft, narrates in his book “Idea Man,” programmers had to use a keypunch machine to convert their code into punch cards. Each punch card corresponded to one line of code. Those punch cards were handed to the operators who, according to their schedules and priorities, fed them into those huge machines to be eventually executed. This meant that if a punch card contained a single error, or was bent, the whole work of the programmer was invalidated. Worse, the programmer would not find out for days. This kind of system was called “batch processing,” and it made programmers’ lives miserable.
It is not surprising, then, that a new system emerged only a few years later, in 1957. Rather than having those computers controlled by only a few operators, a remote connection was created. This remote connection allowed each programmer to communicate with the computer directly. In other words, programmers could finally write their code without relying on punch cards and operators, and multiple users could work on the same computer. “Time-sharing” was the first revolutionary system that allowed a leap forward in computer programming: finally, a computer could “communicate” with multiple users. Yet another step, one that would revolutionize humanity by making the internet possible, was the ability of computers to communicate with each other. How?
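The difference between batch processing and time-sharing described above can be sketched in a few lines. This is only an illustration: the job names and work units are hypothetical, and the round-robin policy (each user gets a small slice in turn) is an assumption, since the text does not say how early time-sharing systems scheduled their users.

```python
from collections import deque


def run_batch(jobs):
    """Batch processing: each job runs to completion before the next starts."""
    timeline = []
    for name, units in jobs:
        timeline.extend([name] * units)
    return timeline


def run_time_sharing(jobs, quantum=1):
    """Time-sharing: jobs take turns, each getting `quantum` units per turn."""
    queue = deque(jobs)
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        used = min(quantum, remaining)
        timeline.extend([name] * used)
        if remaining > used:
            # Unfinished job goes to the back of the queue and waits its turn.
            queue.append((name, remaining - used))
    return timeline


# Hypothetical workload: (user, units of computer time needed).
jobs = [("alice", 3), ("bob", 1), ("carol", 2)]
print("batch:       ", run_batch(jobs))
print("time-sharing:", run_time_sharing(jobs))
```

In the batch timeline bob has to wait for all of alice's work to finish, while in the time-sharing timeline every user gets a response after a short wait, which is what made the machine feel interactive.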
From the ARPANET to the chaotic web
As Kevin Kelly would put it, computers did not become interesting until they were connected to the internet. The internet of today is not that of yesterday; it evolved over a few decades. Yet Kelly’s point stands. Before, computers were giant, boring machines that performed clerical tasks. It took the advent of the web before computers could really make a difference in people’s lives. But why did the web take off in the first place?
The World Wide Web
In less than a decade, the World Wide Web exploded! Today it comprises more than a billion websites!
How did it all start? To put it very briefly, Sir Tim Berners-Lee, in a side project he was working on while at CERN in Geneva, figured out he could connect web pages with what we all know today as hypertext. He therefore came up with a protocol, born of an extreme need to give the web a standard. In fact, Tim Berners-Lee later said:
I just had to take the hypertext idea and connect it to the Transmission Control Protocol and domain name system ideas and—ta-da!—the World Wide Web … Creating the web was really an act of desperation, because the situation without it was very difficult when I was working at CERN later. Most of the technology involved in the web, like the hypertext, like the internet, multifont text objects, had all been designed already. I just had to put them together. It was a step of generalising, going to a higher level of abstraction, thinking about all the documentation systems out there as being possibly part of a larger imaginary documentation system.
However, surfing the web was still limited, because you could only go from one page to the next through links: the effort it took to find what you were looking for was massive.
Search engines take over
Although the chaotic web had been in some ways tamed by Sir Tim Berners-Lee’s protocol, it was still a confusing place. That is why many ventured to find a way to search through those pages and match specific content to queries.
This idea led, at the beginning of the 90s, to the creation of search engines. In short, those were websites that scanned the web pages available at the time to find what a user was looking for through keywords.
Things changed when two young Ph.D. students at Stanford University came up with an algorithm able to rank web pages by using a system similar to that used for research papers. In short, one of the most powerful ways to assess the popularity of a research paper is to look at the citations it receives from other publications. The same citation mechanism was used to rank web pages, with links playing the role of citations. A website receiving a link from another website receives a so-called backlink.
The more quality backlinks a website received, the higher it could rank in the SERP (search engine results page). Backlinks are still the backbone of the web. However, on that spine, a new web blossomed.
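The idea of ranking pages by the backlinks they receive can be illustrated with a simplified version of the algorithm commonly known as PageRank. This is a minimal sketch, not Google's actual implementation: the four-site toy web is hypothetical, and the damping factor of 0.85 and the fixed number of iterations are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links.

    `links` maps each page to the list of pages it links to. Every linked
    page must also appear as a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page passes its rank on, split evenly among its links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical toy web: three sites link to c.example, which links back to a.example.
toy_web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
print(best, round(ranks[best], 3))
```

Note that a.example also ends up with a high score even though it has a single backlink, because that backlink comes from the best-ranked page. That is the sense in which the quality of a backlink matters, not just the count.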
What Web are you looking for through Google?
When you surf the web through a commercial search engine like Google, you are only scratching the surface:
What you see on Google or in online communities is just the so-called surface web, which may comprise a small percentage of the total internet.
Therefore, if we had to proceed by layers, the layers of the web would be the following:
Surface Web > visible and indexable web
Deep web > non-indexable web and dark web
Let’s see the differences.
What is the indexable web?
Search engines use algorithms to rank web pages on the surface web. However, before ranking those pages, a commercial search engine has to build an index. To build that index, the search engine uses software called spiders, or web crawlers, that crawl web pages. Those web pages contain hyperlinks. By following those links, the web crawlers can index the web. However, what they index is only the so-called indexable web.
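The link-following behavior described above can be sketched as a breadth-first walk over a tiny in-memory "web". Everything here is hypothetical: the page names, the link structure and the `crawl` function are made up for illustration, and a real crawler would download and parse pages over the network instead of reading a dictionary.

```python
from collections import deque

# Hypothetical toy web: each page maps to the pages it links to.
TOY_WEB = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["post-1", "post-2"],
    "post-1": ["blog"],
    "post-2": [],
    "orphan": ["home"],  # no page links here, so the crawler never finds it
}


def crawl(seed, web):
    """Follow links breadth-first from `seed` and return pages in visit order."""
    index = []
    seen = {seed}
    frontier = deque([seed])
    while frontier:
        page = frontier.popleft()
        index.append(page)
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return index


indexed = crawl("home", TOY_WEB)
print(indexed)
```

The "orphan" page exists in the toy web but never appears in the index, because no crawled page links to it. That is one simple way a page can end up outside what a crawler sees.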
In fact, there are web pages which are not accessible to those crawlers. A website can publish a file called robots.txt that instructs a web crawler how to behave on its pages. In short, it can say something like “you shall not crawl this page!”
All the pages that crawlers are allowed to reach and index under the robots.txt rules form the indexable web.
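To make the robots.txt mechanism concrete, here is a small sketch using Python's standard `urllib.robotparser` module. The rules, the crawler name "ExampleBot" and the example.com URLs are all hypothetical. The rules are parsed from a string, so nothing is fetched from the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler asks before fetching each URL.
allowed = parser.can_fetch("ExampleBot", "https://example.com/blog/post")
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/notes")
print(allowed, blocked)
```

Keep in mind that robots.txt is a convention that crawlers choose to follow. It is a request, not an access control mechanism.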
What is the visible web?
When you type something in Google’s search bar, you are not accessing the whole web but only its “point of view” of the internet or, if you like, of the world.
Indeed, while Google’s bots crawl all the indexable pages, that doesn’t mean those pages will be shown to users; quite the opposite. Google’s search algorithm performs a sort of “censorship” of those pages to decide what is relevant and what is not. That is how you get the results you want. The results you get from Google, from any other commercial search engine, and also from online communities such as Reddit, Facebook and so on, are only partial versions of the indexable web. That is the visible web.
In short, the main difference between the indexable and the visible web lies in the “censorship” or “selection” applied by an algorithm to decide what users can see.
What is the deep web?
Everything outside the indexable web is part of the deep web. These can be pages protected by passwords, membership pages, but also pages that, for instance, companies keep off their public websites because they are not useful to their users.
There is a subset of the deep web called the dark web, where you can find anyone from political activists to smugglers. That part of the web is accessible with dedicated tools such as the Tor browser.
Summary and conclusions
Throughout this article we went through the development of the web, from the first computers to the World Wide Web. Eventually, search engines like Google took over. Today Google has a 92% market share worldwide, which means that every single day almost three and a half billion searches go through it! That is why it is easy to confuse Google with the web. However, Google is not the web. This is far from a definitive or complete guide (it would take a book for that), but I hope this very short guide has been of help and that things look much clearer now.