1. Depth-first crawling
The Baidu spider and Googlebot that we usually see crawl a site in two ways: depth first and breadth first. To make this easier for everyone to understand, I will use my own site as the illustration.
With breadth-first crawling, depth increases one level at a time. It is similar to "Home > Company > Products > Price > Company…": the crawler comes to your site and works down the columns level by level, and only after a section such as "Introduction" has been fetched does it move on to the sub-columns on the next level. There is a reason for crawling breadth first. Because of the way sites are laid out, important pages are often relatively close to the seed (the seed site is the starting point of the crawl), so this order has become customary.
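The level-by-level order described above can be sketched in a few lines of Python. This is only an illustration: the `SITE` dictionary and its column names are a made-up site structure, and `breadth_first_order` is a hypothetical helper, not how any real search engine spider is implemented.

```python
from collections import deque

# Hypothetical site structure: each page maps to the pages it links to.
SITE = {
    "Home": ["Company", "Products", "Price"],
    "Company": ["Introduction"],
    "Products": ["Road Sweeper Series"],
    "Price": [],
    "Introduction": [],
    "Road Sweeper Series": ["Article 1", "Article 2"],
    "Article 1": [],
    "Article 2": [],
}

def breadth_first_order(links, seed):
    """Return pages in the order a breadth-first crawl would fetch them."""
    seen = {seed}
    queue = deque([seed])          # pages waiting to be fetched, oldest first
    order = []
    while queue:
        page = queue.popleft()     # always take the page closest to the seed
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

print(breadth_first_order(SITE, "Home"))
```

Every top-level column is fetched before any sub-column, which is why pages close to the seed are reached first.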
Recently Baidu has been stepping up its anti-spam enforcement, which has made many rankings fluctuate, and my site is no exception. But I have always understood one point: search engines keep adjusting their algorithms in order to serve the user experience, so as long as we run our sites from the user's point of view, rankings will naturally not be bad. Today, taking my site's indexing as an example, I would like to share how the two ways a spider crawls should change the way we lay out a site's structure.
Depth-first crawling on my road sweeper site looks like this: Home > Products > Road Sweeper Series. The spider prefers to keep crawling down one path, through these columns, until it reaches the articles in the "Road Sweeper Series" section. That is the depth-first strategy. It works like a family tree: the eldest son and then the grandson are followed as one line before the second son.
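As a rough sketch of that "follow one path to the bottom first" behaviour, here is a small depth-first traversal in Python. The `SITE` dictionary is an invented layout modelled on the example path, and `depth_first_order` is a hypothetical helper for illustration only.

```python
# Hypothetical site structure: each page maps to the pages it links to.
SITE = {
    "Home": ["Products", "Company"],
    "Products": ["Road Sweeper Series"],
    "Road Sweeper Series": ["Article 1", "Article 2"],
    "Company": ["Introduction"],
    "Introduction": [],
    "Article 1": [],
    "Article 2": [],
}

def depth_first_order(links, seed):
    """Return pages in the order a depth-first crawl would fetch them."""
    seen = set()
    order = []
    stack = [seed]                 # most recently discovered page is on top
    while stack:
        page = stack.pop()
        if page in seen:
            continue
        seen.add(page)
        order.append(page)
        # Push links in reverse so the first link on the page is followed first.
        for target in reversed(links.get(page, [])):
            if target not in seen:
                stack.append(target)
    return order

print(depth_first_order(SITE, "Home"))
```

The crawl runs Home, Products, Road Sweeper Series and its articles before it ever returns to Company, which matches the eldest-son-then-grandson picture.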
So on a large portal site you are most likely to see news items first: pages close to the seed can be understood as the more important pages. Second, the Chinese web is not as deep as we imagine. There is more than one path by which a crawler can reach a page, so it can always find the shortest path to the current page; according to the relevant data, the depth of the Chinese web is 17. Another point is that breadth first suits cooperation among multiple crawlers: under such rules, most crawlers start with the pages inside a site and only gradually turn to links leading out of it, so the crawl stays relatively closed.
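To make the "shortest path from the seed" idea concrete, the sketch below computes how many clicks each page is from the seed when a page can be reached by more than one route. The `SITE` structure and the `click_depth` helper are assumptions made up for this example.

```python
from collections import deque

# Hypothetical structure in which "Article" is reachable by two routes:
# Home > News > Article, and Home > Products > Series > Article.
SITE = {
    "Home": ["News", "Products"],
    "News": ["Article"],
    "Products": ["Series"],
    "Series": ["Article"],
    "Article": [],
}

def click_depth(links, seed):
    """Map each reachable page to its shortest distance (in clicks) from the seed."""
    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:            # first visit is the shortest route
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

for page, clicks in click_depth(SITE, "Home").items():
    print(page, clicks)
```

Here "Article" gets a depth of 2 through the News column, even though the route through Products is longer. This is the sense in which a breadth-first crawl always reaches a page by its nearest path.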
2. Breadth-first crawling
Site indexing is a topic that many of my webmaster friends keep asking about. We usually say that a sitemap is something a site cannot do without, but what matters even more is the hierarchical layout of the site. Why? I will go through the two ways a spider crawls one by one:
Based on the two crawling methods above