YaCy: The Distributed Search Engine Based on a Decentralized Web

YaCy (read “ya see”) is a free distributed search engine built on the principles of peer-to-peer (P2P) networks. Its core is a computer program written in Java that, as of September 2006, ran on several hundred computers, the so-called YaCy-peers. Each YaCy-peer independently crawls the Internet, analyzes and indexes the web pages it finds, and stores the indexing results in a common database (the so-called index), which is shared with other YaCy-peers using P2P principles.
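To make the crawl, index, and share cycle more concrete, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions: the `Peer` class, its method names, and the example URLs are hypothetical and are not part of YaCy's actual (Java) code base, and real crawling over the network is replaced by text passed in directly.

```python
from collections import defaultdict
import re


class Peer:
    """Hypothetical stand-in for a YaCy-peer; not YaCy's real API."""

    def __init__(self, name):
        self.name = name
        # word -> set of URLs containing that word (an inverted index)
        self.index = defaultdict(set)

    def index_page(self, url, text):
        """Analyze a fetched page and record which words occur in it."""
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.index[word].add(url)

    def share_with(self, other):
        """Copy this peer's index entries into another peer's index."""
        for word, urls in self.index.items():
            other.index[word] |= urls

    def search(self, word):
        """Return the URLs this peer knows for a single word."""
        return sorted(self.index.get(word.lower(), set()))


# Two peers index different pages, then one shares its results.
alice = Peer("alice")
bob = Peer("bob")
alice.index_page("http://example.org/a", "Distributed search engines")
bob.index_page("http://example.org/b", "Peer-to-peer search")
alice.share_with(bob)
results = bob.search("search")
```

After the sharing step, `bob` can answer a query for "search" with pages that only `alice` crawled, which is the basic idea behind a shared index.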

Which features does YaCy have?

YaCy is a free search engine, based on a distributed network of computers, driven by a non-proprietary algorithm.

Free search engine

Other search engines, like Google, are also free to use. However, they are commercial search engines. In short, you get the service for free, but the data that leaks through Google’s search gets sold to marketers or companies through its advertising networks.

Distributed search engine

Instead of being stored in a single facility, the part of the web indexed by YaCy is distributed among the computers that make up the network. In other words, while a traditional search engine like Google stores the web in giant datacenters, YaCy uses all the computers that are part of the network.

Google datacenter

YaCy “datacenter”

Your computer, together with the other computers in the network, is what stores the web indexed by YaCy.
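One generic way to spread an index over many machines is to hash each word and let the hash decide which peer stores it. The sketch below shows that idea only; it is an assumption for illustration, not a description of how YaCy actually assigns data to peers, and the function `peer_for_word` and the peer names are hypothetical.

```python
import hashlib


def peer_for_word(word, peers):
    """Pick the peer responsible for a word by hashing the word.

    Illustrative placement rule only; not YaCy's real scheme.
    """
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and map it onto the peer list.
    return peers[int.from_bytes(digest[:8], "big") % len(peers)]


peers = ["peer-a", "peer-b", "peer-c"]  # hypothetical peer names
words = ["yacy", "search", "index", "crawler"]
placement = {word: peer_for_word(word, peers) for word in words}

for word, peer in placement.items():
    print(f"{word!r} is stored on {peer}")
```

Because the rule is deterministic, every peer can work out where a given word lives without asking a central server, which is what removes the need for a single datacenter.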

Non-proprietary algorithm

While Google and other commercial search engines keep their algorithms secret (Google’s algorithm is also called a black box, because you know the input and the output but not what happens in between), YaCy’s algorithm can be tweaked by anyone. For instance, you can contribute to it on GitHub, an open community for developers.

How many computers are there in the YaCy network?

According to yacy.net:

YaCy has about 1.4 billion documents in its index (and growing – download and install YaCy to help out!) and more than 600 peer operators contribute each month. About 130,000 search queries are performed with this network each day.

Is YaCy web search comparable to Google?

If we compare YaCy’s index with Google’s as of 2014, we get a rough idea of how small YaCy’s web is next to Google’s. While YaCy has about 1.4 billion documents in its index, Google had already indexed 30 trillion pages in 2014.
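As a quick back-of-the-envelope check, the two figures quoted here can be divided to see the size of the gap. The variable names are illustrative, and the calculation treats "documents" and "pages" as comparable units, which is a simplification.

```python
yacy_documents = 1_400_000_000          # about 1.4 billion, per yacy.net
google_pages_2014 = 30_000_000_000_000  # 30 trillion, Google in 2014

# How many times larger Google's index was than YaCy's.
ratio = google_pages_2014 / yacy_documents

# YaCy's index as a percentage of Google's.
share_percent = yacy_documents / google_pages_2014 * 100

print(f"Google's index is roughly {ratio:,.0f} times larger")
print(f"YaCy's index is about {share_percent:.4f}% of Google's")
```

In other words, on these numbers Google's index was more than twenty thousand times the size of YaCy's.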

Comparing YaCy and Google would be like comparing apples to oranges. However, YaCy is an alternative to the traditional commercial search engine model.

How does YaCy web crawling work?

What does the YaCy network look like?

Source: yacy.net
