Technical SEO
Lesson 1 of 22 · Site Accessibility

How Search Engines Find Pages

Googlebot, the search crawler
"Hi! I'm Googlebot. Every day I crawl billions of pages across the web. Want to know how I find your site and decide what makes it into search?"
Crawling is the process of a search robot following links and downloading page content. Indexing is the analysis and storage of those pages in the search database.
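To make the "following links" part concrete, here is a minimal sketch of how a crawler might pull links out of a downloaded page. It is an illustration only, not how Googlebot is actually implemented; the class name `LinkCollector`, the helper `discover_links`, and the sample HTML are all made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects absolute http(s) URLs from <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL and drop #fragments.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        is_web_url = absolute.startswith(("http://", "https://"))
        if is_web_url and absolute not in self.links:
            self.links.append(absolute)


def discover_links(page_url, html):
    """Returns the unique links found in the HTML, in document order."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content, for illustration only.
sample_html = """
<a href="/catalog/">Catalog</a>
<a href="product-1.html#reviews">Product 1</a>
<a href="mailto:hi@example.com">Email</a>
"""
found = discover_links("https://example.com/shop/", sample_html)
print(found)
# ['https://example.com/catalog/', 'https://example.com/shop/product-1.html']
```

A real crawler would add each discovered URL to a queue, download it, and repeat. That is why a page with no inbound links is hard for a crawler to find.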

How Does Googlebot Find Pages?

Links: follows links from already known pages
Sitemap: reads the XML sitemap
Submit URL: manual submission via Search Console
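The sitemap route is the easiest one to inspect yourself. The sketch below reads the `<loc>` entries from a sitemap using only Python's standard library. The function name `sitemap_urls` and the sample sitemap are assumptions made for the example; a real sitemap would be downloaded from your own site first.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Returns the <loc> values listed in an XML sitemap document."""
    root = ET.fromstring(xml_text)
    locations = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locations if loc.text]


# Hypothetical sitemap content, for illustration only.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/catalog/</loc></url>
</urlset>"""

print(sitemap_urls(sample_sitemap))
# ['https://example.com/', 'https://example.com/catalog/']
```

Comparing this list with the pages you actually want in search is a quick way to spot important URLs that are missing from the sitemap.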

Three Conditions for Page Indexing

Condition | How to check
✅ The page has inbound links | Ahrefs / GSC: Internal links report
✅ No Disallow rule in robots.txt blocks the URL | Check robots.txt in GSC
✅ No noindex tag on the page | Look for <meta name="robots" content="noindex"> in the source, or an X-Robots-Tag: noindex HTTP response header
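Two of these checks can be scripted. The sketch below tests a URL against robots.txt rules and looks for a noindex directive in a page's HTML, using only Python's standard library. The function names `is_crawl_allowed` and `has_noindex` and the sample inputs are assumptions for illustration. Note that `urllib.robotparser` does simple prefix matching and may not handle wildcard rules the way Google does, so treat the result as a first pass and confirm in GSC.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """True if the given robots.txt text does not disallow the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() in ("robots", "googlebot"):
            content = attributes.get("content") or ""
            self.directives.update(part.strip().lower() for part in content.split(","))


def has_noindex(html):
    """True if the page's robots meta tag contains a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical inputs, for illustration only.
robots_txt = "User-agent: *\nDisallow: /filter/\n"
page_html = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(is_crawl_allowed(robots_txt, "https://example.com/product/1"))   # True
print(is_crawl_allowed(robots_txt, "https://example.com/filter/red"))  # False
print(has_noindex(page_html))                                          # True
```

This sketch does not inspect HTTP response headers, so a noindex sent through X-Robots-Tag would need a separate check.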

Crawl Budget

Google cannot crawl your site without limit. Each site gets a crawl budget: roughly the number of URLs Googlebot can and wants to crawl in a given period. For large sites, it's critical to spend that budget on the right pages.

How to avoid wasting crawl budget:

โ˜ Block filter/sort pages in robots.txt
โ˜ Fix 404 pages with 301 redirects
โ˜ Exclude duplicates via canonical tags
โ˜ Avoid infinite URLs with parameters
โ˜ Submit XML sitemap in GSC
Alex audits a client's site
"Look, you have 50,000 filter pages in the index. Googlebot burns its entire budget on them and never reaches the important product pages. That's why you have no traffic."
Remember: crawling ≠ indexing. The bot can visit a page and still not add it to the index (if the content is weak or there's a noindex tag). Track both metrics in Google Search Console.