Have you launched a new website and later realized that you didn’t think through how to structure your categories?
Too many folks end up shooting themselves in the foot when it comes to taxonomies (a geeky, Greeky word describing the way the content is systematized).
But not you. Let’s look into which websites benefit from extensive categorizations and which don’t, as well as the SEO pitfalls to avoid.
But first, let’s review the rise of taxonomies on the Web.
How Did It All Start?
The categories craze has its roots in the days when links as a basis for ranking websites felt like the invention of the wheel and Google had yet to become a powerful, publicly traded media company using AI in its algorithms.
You could say that, in the early days, search engines resembled an injured football player. Back when search algorithms lacked sophistication, our digital athlete had to rely on crutches (the meta keywords tag, keyword repetition in content, keyword-cannibalized categories, and the like) simply to function.
The name “Google Directory” sounds so strange today, doesn’t it?
With directories, you browsed, you dug, you drilled until you struck something.
But, apart from death and taxonomies, nothing stays the same.
A recovered athlete doesn’t need braces to walk. And as I’ve watched search engines over the last two decades make incredible leaps forward in terms of understanding user intent, taxonomies have become less important.
Users don’t drill down into categories. Search engines take you directly to the page you want.
When was the last time you turned to a general web directory for answers?
I recall the excitement I felt when I was accepted as an Open Directory editor in a low-level category. And speaking of drilling down, I unearthed this credential from one of my ancient resumes:
“Promote quality web sites as editor for major Internet directories such as DMOZ, Zeal, JoeAnt and others.”
Strange, indeed. But I find knowing how things worked back then to be helpful.
Which Websites Benefit the Most from Categories?
To help you navigate through taxonomies, let’s look at the types of websites that benefit most from broad category listings:
- eCommerce sites
- Large informational websites
- News and media outlets
As you can see, the common denominator is a large number of pages (measured in thousands).
Take Fox News and Amazon (imagine Amazon with no categories, LOL) as examples. With millions of pages, there must be a logical category structure to systematize the content. The same applies to their smaller competitors.
In addition, category pages pull in extra organic search traffic that converts well, because their content maps closely to highly searched keywords.
Just think of a user searching “buy fridge online” and landing on the Fridges category of an online electronics store—or looking for “political news” and landing on the Politics category on Fox News.
You get to see a nice selection of goodies that you can browse, research, and compare. Just what you want!
Unfortunately, you can’t apply the same logic to these cases:
- Business sites
- Small informational websites (in narrow niches)
For these types of sites, I prefer to apply the “less is more” approach as a rule of thumb. Here’s why:
1. Ranking categories can bring trouble.
Trying to rank these types of sites in organic search with category pages, usually by placing some content below the categorized items, may mean putting your resources in the wrong place.
The reason is that taxonomy pages on such websites make for a poor user experience, which counts against you if Google interprets it that way.
Let me show you what I mean.
First, consider typical desktop screen resolutions; 1920×1080 remains one of the most common.
The golden rule of usability claims that the main content must be accessible on the first screen (the content a user immediately sees when viewing a page) so that your visitors won’t get frustrated and leave in seconds.
And that’s where categories fail because those pages by definition have to show the categorized content first.
Now picture the first screen of a category page at 1920×1080 resolution: the listed posts fill the entire viewport. If the page were to rank thanks to content placed after those posts, users at that resolution would have to scroll through roughly 4.5 screens to get what they came for.
And with mobile devices, it’s way worse.
Will your visitors subject themselves to such an ordeal?
2. Categories may outrank content pages in personalized search.
The scenario described above, although real, happens quite rarely as search engines typically prioritize content pages over taxonomies.
However, things go south pretty quickly when we add into the equation personalized search, a customization feature that pushes the websites you regularly visit to the top of your organic results.
While it helps Google better predict your needs, it can hurt your most dedicated audience since categories may compete with your other pages for the same keywords—and even win.
In personalized results, a category can even outrank a content page for the very same term.
Sure, you can argue that personalization affects merely a small fraction of overall searches.
But since different users see slightly different results for the same term, you can never know when excessive categorization is hurting you, let alone estimate the damage it does.
3. Categories promote duplicate content.
Duplicate content clutters search results and forces search engines to choose a winning page.
Since most content management systems (CMS) were not built with much forethought about SEO, taxonomy pages often repeat the same content.
When taxonomy pages haven’t been optimized properly, Google ends up seeing the same material in several places. In short, the content of the original blog post gets duplicated on several different pages within the same website, an issue that piles up as a site grows.
In other words, you see it as categorization. But to search engines, it’s clutter, which may impact your relationship with them.
4. Redundant categories distract your visitors from the main content.
I remember working at a search agency where a team member thought he was doing a good thing by deeply thinking through every potential category and assigning them all to a client’s WordPress site. The result was total taxonomy overkill. I believe we ended up with more categories than posts.
Ask yourself if users would like to see that extra bunch of categories.
Every element of your website must facilitate reaching your business goals. Anything else must be tossed into the trash bin—and categories are no exception.
Visitors derive value from content.
When did you last get really excited about a business blog’s category page, so overwhelmed that you just had to tell someone about it?
There’s only one database on the Web that you need to find anything—and that’s Google.
Cats, not categories, rule the Internet.
How Many Is Too Many?
Now, before you destroy every category on your website with a vengeance, remember the golden rule of creating a site structure: Every category must have a clear purpose.
Point blank, period. A powerful website structure charms your prospects into taking the actions you want them to. The rest should be discarded.
Take our very own Leverable SEO blog. It exclusively covers one topic: SEO. And while one may advocate dividing it into various subcategories, that doesn’t serve any purpose as the site is relatively small.
The same logic applies to other cases. Sure, no rigid rule dictates how many categories you should have. But asking yourself these questions will help you approach creating a site structure from the right angle:
- What’s the purpose of my website?
- How many pages will the site have?
- What types of pages are they?
- What page types do I want to rank in search?
Also, be on the lookout for other taxonomy pages siphoning away Google’s attention:
- Tag pages
- Author pages
- Pagination pages
It’s funny that a WordPress site with 12 awesome posts you worked hard to write could, within a year, accumulate 4 category pages, 6 tag pages, 2 author pages, and 12 archive pages. That’s twice as many taxonomy pages as posts. Many WordPress sites I see have more taxonomy pages indexed than posts.
So, how can you get rid of the dead weight?
While you can simply delete unnecessary taxonomies, most of the time you’ll just want to remove those pages from the search results.
How to Prevent Search Engines from Indexing Taxonomy Pages
Leverable optimizes a lot of WordPress sites. WordPress forces you to have at least one category if you make a blog post. And people don’t always realize that when they tag a WordPress post in the admin, it creates a taxonomy page that Google indexes.
You can deny search engines access to some of the pages and folders on your website in multiple ways. Here are some of them:
Method #1: SEO Plugins for WordPress Websites
Let’s start off with the easiest way. If you happen to use WordPress, as over a third of all websites do, blocking taxonomies from indexing won’t take more than five minutes of your time.
Chances are, you have one of two SEO plugins installed on your website (and you’d better get one if not): All in One SEO Pack or Yoast SEO.
These two have been dominating the WordPress realm for years, so I’ll focus on them. In case you opted for the first plugin, here’s how to solve the taxonomy riddle:
In the dashboard, hover over “All in One SEO Pack” and choose “General Settings.”
Once there, scroll down the settings, check the boxes shown below to block the corresponding taxonomy pages from indexing, and click “Update options” to apply the changes.
Here’s how you can do the same in Yoast SEO:
First, choose “Search Appearance” from the plugin menu.
Then, select the Taxonomies tab and toggle the switch to “No” to block categories (if you need to), tags, and formats (another type of duplicate content from an SEO perspective).
Then, move to the Archives tab and block author and date archives as well.
Easy, peasy, lemon squeezy.
Method #2: Robots Meta Tag
Another quick-and-dirty way to restrict search engines from indexing a page is to use the robots meta tag, which goes in the head section of a page (between <head> and </head>).
Simply copy and paste this line of code into the header:
<meta name="robots" content="noindex" />
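If you want to confirm the tag actually made it into a page’s markup, here’s a minimal sketch using only Python’s standard `html.parser`. The sample HTML is illustrative; in practice you’d feed in the fetched page source.

```python
# A sketch that checks whether a page's HTML carries a robots meta tag
# containing "noindex", using only Python's standard library.
from html.parser import HTMLParser

class RobotsMetaChecker(HTMLParser):
    """Scans meta tags for name="robots" whose content includes "noindex"."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        content = (attr_map.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True

def has_noindex(html: str) -> bool:
    checker = RobotsMetaChecker()
    checker.feed(html)
    return checker.noindex

page = '<html><head><meta name="robots" content="noindex" /></head><body></body></html>'
print(has_noindex(page))  # True
```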
Method #3: Disallow in Robots.txt
If you happen to be in the proud minority of website owners who resist the WordPress hegemony, this versatile method can be applied to any CMS and makes it possible to block hundreds of pages in one go.
We won’t delve into how to create and upload robots.txt to your website as Google has a great brief guide on that.
Instead, here’s a quick dive into the syntax of robots.txt so that we can get down to business straight away:
| Directive | What it does |
| --- | --- |
| `User-agent: *` | Addresses all search engine robots (Google, Yahoo, Bing, etc.) |
| `Disallow: /folder/` | Blocks crawling of a specific folder |
| `Disallow: /page.html` | Blocks crawling of a specific page |
| `Allow: /folder/` | Allows crawling of a specific folder |
| `Allow: /page.html` | Allows crawling of a specific page |
You must know the basics so you won’t harm your own website. If you’ve mistakenly blocked certain URLs via robots.txt, search engines can no longer crawl through those pages to discover others.
Let’s begin the creation of a generic robots.txt file for WordPress sites (notice how the path to a page or folder always starts with a slash):
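Here’s a generic sketch. The slugs for tags, authors, and categories assume WordPress defaults, so adjust them to your own permalink structure:

```
User-agent: *
Disallow: /?s=*
Disallow: /tag/
Disallow: /author/
Disallow: /category/
```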
In general, if you copy and paste this into your robots.txt, it deals with most of the duplicates that tend to fall through the cracks. Just a few notes to make sure we’re on the same page:
“Disallow: /?s=*” blocks all the pages generated by users running search queries on your website.
The last rule involving categories is optional and should be applied based on your site structure.
As for archives, WordPress builds date-archive URLs from the date by default: /2019/ for a yearly archive and /2019/05/ for a monthly one.
Hence, one simple rule denies search engines access to every archive page created in 2019:
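Assuming your posts themselves don’t use date-based permalinks (otherwise this rule would block the posts too), a single line does it:

```
Disallow: /2019/
```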
An important note: always, always, always check whether a URL has a trailing slash. A path without one matches every URL that merely begins with that string, while a path ending in a slash matches only what lives inside that folder. Mark the difference with these examples:
Example #1: https://leverable.com/seo-blog/
Example #2: https://leverable.com/seo-blog
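To see the difference in action, here’s a minimal sketch using Python’s built-in `urllib.robotparser`. The rules and URLs are illustrative examples, not a real configuration:

```python
# A sketch of how robots.txt prefix matching treats trailing slashes,
# using Python's built-in urllib.robotparser.
from urllib import robotparser

def can_fetch(rules: str, url: str) -> bool:
    """Return True if a crawler obeying `rules` may fetch `url`."""
    parser = robotparser.RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch("*", url)

# Without a trailing slash, the rule matches every URL that merely
# starts with the string, including /seo-blog-tips:
no_slash = "User-agent: *\nDisallow: /seo-blog"
print(can_fetch(no_slash, "https://leverable.com/seo-blog"))       # False (blocked)
print(can_fetch(no_slash, "https://leverable.com/seo-blog-tips"))  # False (blocked)

# With a trailing slash, only the folder's contents are matched;
# the bare /seo-blog URL itself stays crawlable:
with_slash = "User-agent: *\nDisallow: /seo-blog/"
print(can_fetch(with_slash, "https://leverable.com/seo-blog"))       # True (allowed)
print(can_fetch(with_slash, "https://leverable.com/seo-blog/post"))  # False (blocked)
```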
That way, robots.txt makes it possible for you to be in charge of the way search engines crawl your website—and we haven’t even scratched the surface.
You now know how to deal with taxonomies easily, painlessly, and permanently.
“Little by little, a little becomes a lot.”
This Tanzanian proverb perfectly sums up how small, seemingly minuscule things can make a huge difference in the long run.
And from an SEO perspective, taxonomy pages are the little thing you shouldn’t let remain under the radar.
Remember, every single category must have a purpose. Every single taxonomy page has to bring real value. Expunge or block the rest.
How do you categorize your website?