Author: Kaitlin McMichael
Last updated: 08/06/2026
In 2019, I had a problem. I was working at a company where I had a small team: myself and two new folks. We were under pressure from the directors and the CMO to deliver increased SEO traffic to our websites.
There was, as always, external competitive pressure too – I knew our main competitors had bigger budgets, testing tools, bigger teams.
But there was another type of internal pressure too – every other team in the business used data-driven decision-making. They were running product pricing tests and geo tests, and nearly every change to our websites was A/B tested, except for SEO-related changes. I wanted to be able to run SEO tests, and show the results of those tests to my managers – just like the other product managers in the business were doing.
I planned to use SEO testing to provide direction on what would work, what wouldn’t, and to settle debates with other internal teams. Ultimately, I wanted to use these tests to leapfrog the competition.
I got quotes from a range of SEO testing vendors, put together a business case, and pitched it to the decision-makers.
Job done, right?
Actually, there were still obstacles. Few people at the company wanted to do SEO testing. There was little appetite to support a new type of testing tool, especially since there was already an A/B testing tool being used.
There wasn’t a clear opportunity to educate decision-makers on the unique value of an SEO testing tool, and there wasn’t an appetite to purchase more SEO-specific tools. Instead, the desire was for more broadly applicable tools and solutions. Furthermore, engineering was not on board with a tool that would allow non-engineers to make changes to the website outside the normal scope of changes deployed via the CMS. It seemed risky and new.
I had to go through the process of getting buy-in, first from my manager, and then from engineering and other decision-makers. Once I had convinced my manager, getting buy-in from everyone else was a lot easier, and it helped that I showed our competitors, as well as others in our industry, were also using this tool. I also addressed concerns both verbally in meetings and via written documentation with detailed FAQs. In 2021, I secured the budget I needed to proceed. I went through this rigorous process again at the next company I worked at too.
In the first 6 months, we went from zero to 18 SEO tests. We moved winning experiences into production (and tests that were unsuccessful were prevented from moving to production). Debates were settled about best practices and other deeply-held convictions, and we raised the bar on test design quality.
In this article I’ll talk you through my journey with SEO testing, and what I learned along the way – if you’re in the process of implementing SEO testing in your organization, I hope you find it useful.
SEO testing is full funnel testing, covering Discovery, Engagement, Consideration, and Conversion.
Whereas traditional A/B testing is either conversion-rate focused or focused on improving on-page engagement, SEO testing targets people in the discovery phase. In other words, the test audience is people who are searching Google before visiting your website.
In this sense, when companies support A/B testing but not SEO testing, they are not testing at the top of the funnel. In fact, they could be releasing changes that have a detrimental effect on the discoverability of those pages in search engines and answer engines.
So SEO testing is important if your company relies upon or wants to increase discoverability in SEO and AEO. And yet, very few companies have actually implemented true SEO testing solutions that are baked into their website strategy.
Many marketers say they have done SEO tests, but when you ask how they set up the test, they say they made a change and then evaluated the impact. This could mean that they conducted a before/after test (also referred to as a pre/post test or a timed test), or it could mean they simply did something new to see what would happen.
There are several issues with this approach to SEO testing. When you conduct a test, you need to make sure it is a fair test in order to be confident in your results. Let’s say you publish new pages, or launch a new campaign, and then you analyze the results. How do you know that the results are better or worse than anything else you could have done?
You need a clear comparison in order to evaluate your results. This is how A/B testing works – by having an A and a B group that are very similar, with a controlled change that helps to eliminate bias or other issues that might muddy the waters and make it difficult to draw conclusions with confidence.
Without a control group, you can’t tell if your test was actually successful.
The biggest issue with before/after SEO tests is false correlations. It’s also a fairly slow way to conduct analysis.
I remember wanting to get away from that type of testing at my former companies – I felt hampered by the inability to conduct multiple tests at once and see when the results were statistically significant. Deep down I knew these tests were flawed. The websites I worked on always had seasonality, not to mention volatility due to Google algorithm updates, and sometimes even unrelated bugs could go out (without my knowledge) causing traffic fluctuations. With A/B testing, you can control for these things much better than with before/after tests.
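To make the difference concrete, here is a minimal sketch comparing a before/after reading with a control-adjusted one. The visit counts and function names are invented for illustration, and the simple ratio adjustment is an assumption; commercial SEO testing tools typically use more sophisticated statistical models.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, e.g. 0.10 means +10%."""
    return (after - before) / before


def control_adjusted_lift(
    test_before: float,
    test_after: float,
    control_before: float,
    control_after: float,
) -> float:
    """Estimate the test group's lift after removing the change that the
    control group saw over the same period (seasonality, algorithm
    updates, unrelated bugs)."""
    # What the test group would have done if it had moved like the control.
    expected_after = test_before * (control_after / control_before)
    return (test_after - expected_after) / expected_after


# Hypothetical average daily SEO visits, invented for illustration.
test_before, test_after = 1000.0, 1200.0
control_before, control_after = 1000.0, 1150.0

naive = pct_change(test_before, test_after)
adjusted = control_adjusted_lift(
    test_before, test_after, control_before, control_after
)
print(f"Before/after reading: {naive:+.1%}")
print(f"Control-adjusted lift: {adjusted:+.1%}")
```

In this made-up scenario the before/after view credits the change with the whole increase, while the control group shows that most of it would have happened anyway.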
Many websites use traditional A/B testing tools. According to Research and Markets, the market for these tools was worth $1.3 billion in 2025, and it is expected to be worth $2.73 billion by 2032.
However, these tools are not built with SEO in mind, and search engines can get confused by traditional A/B tests. Google documentation states that for minor changes you don’t need to do anything special for Googlebot, because Google will essentially ignore the test experience.
But what if you don’t want Googlebot to ignore the test experience? With SEO testing, we want Googlebot to pay attention – to crawl, index and rank a change.
Search engines get confused by traditional tests.
SEO testing tools are built with search engines in mind. Googlebot needs consistent page experiences if it is going to index and rank those changes. Since bots do not accept cookies, a testing tool that uses browser cookies to assign each visitor to one page experience or another will not work. The page that the search engine requests would flip-flop between the test and control versions each time the bot visits, which can be several times per day or week.
Whereas traditional A/B testing takes one page and splits its visitors between A and B versions, SEO testing splits groups of similar pages into A and B versions.
In other words, traditional A/B testing is based on user groups, while SEO testing is based on page groups.
SEO tests run on groups of pages, not groups of users.
In order to run SEO tests, you need groups of similar pages, such as pages with a shared template. These groups of pages can be split up into your test and control versions, using SEO traffic as the main KPI.
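As a rough illustration of page-based splitting, the sketch below assigns each URL to a group using a hash of the URL path, so a given page always serves the same version to every visitor and every bot. The function name, the test ID, and the URL pattern are hypothetical, and a plain hash split is an assumption made for simplicity; it does not balance the groups by historical traffic.

```python
import hashlib


def assign_bucket(url_path: str, test_id: str) -> str:
    """Deterministically assign a page to 'test' or 'control'.

    The bucket depends only on the URL path and the test ID, never on
    cookies or the visitor, so bots and users always get the same version.
    """
    key = f"{test_id}:{url_path}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return "test" if int(digest, 16) % 2 == 0 else "control"


# Hypothetical product pages that share one template.
pages = [f"/products/item-{n}" for n in range(1, 9)]
for path in pages:
    print(path, assign_bucket(path, "title-tag-test"))
```

Including the test ID in the hash means a page can land in different groups across different tests, while staying fixed for the duration of any single test.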
There are tools on the market that will ensure the test and control versions have a high correlation for similar SEO visits in the 90 days leading up to the start of the test. This will give high confidence that any differences after the start of the test between the test and control groups are because of the changes which were made.
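The correlation check can be sketched as follows. This is a minimal example, assuming daily SEO visit totals per group and a hypothetical threshold of 0.9; the numbers cover only seven days to keep the example short, whereas a real check would use the full 90-day window, and each vendor applies its own method and thresholds.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("A constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical daily SEO visits per group before the test starts.
control_visits = [480.0, 510.0, 530.0, 495.0, 470.0, 350.0, 340.0]
test_visits = [500.0, 520.0, 540.0, 505.0, 490.0, 360.0, 355.0]

MIN_CORRELATION = 0.9  # assumed threshold, not a vendor default

r = pearson(control_visits, test_visits)
print(f"Pre-test correlation: {r:.3f}")
print("Groups are comparable" if r >= MIN_CORRELATION else "Re-split the groups")
```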
This type of testing works best when there is a significant number of pages and a significant amount of traffic to those pages. Ideally you want 50 to 100 pages or more in a test group, with at least 1,000 SEO visits per day. For smaller sites, this can be tricky. You may need to make your test groups larger or run tests for longer.
There are several vendors on the market who will automate this process for you. Here are the ones I’m aware of: SEOTesting, seoClarity, Semrush, SearchPilot, SEO Scout, and Statsig.
You can use SEO testing tools to run tests on virtually anything.
Using one of the SEO testing tool vendors I mentioned above, you can go from a test idea to starting a test within just a few minutes. Plus, depending on how the vendor script is implemented on your site, you can make virtually any customer-facing or HTML change. Here are a few examples of things you can test on a page:
The easiest SEO testing solutions to implement run via third-party JavaScript tags, which are usually deployed via a tag management system. To increase security control, you could look into first-party hosting.
To implement one of these types of solutions, you’ll need to work with engineering to implement data validation and sanitization to reduce the risk of cross-site scripting (XSS) attacks.
Whilst the set-up for these types of solutions is pretty hassle-free, one of the downsides is that they do not allow you to test interactive changes. You’re limited to the static HTML changes listed above.
Another downside is that there can be a slight flicker when the page initially loads, which is especially noticeable when the user is on a slow connection. Pay particular attention to this when you test above-the-fold changes.
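As one small example of what that validation might look like, the sketch below checks and escapes variant text before it is written into a page. The function name and the length limit are hypothetical, and this covers only plain-text values such as titles or headings; your engineering team will have its own requirements for anything more complex.

```python
import html

MAX_TEXT_LENGTH = 120  # hypothetical limit for a title or heading


def sanitize_variant_text(raw: str, max_length: int = MAX_TEXT_LENGTH) -> str:
    """Validate plain-text test content and escape it for use in HTML."""
    # Collapse runs of whitespace, including newlines and tabs.
    text = " ".join(raw.split())
    if not text:
        raise ValueError("Variant text is empty")
    if len(text) > max_length:
        raise ValueError(f"Variant text is longer than {max_length} characters")
    # Escape <, >, & and quotes so the text cannot inject markup or scripts.
    return html.escape(text, quote=True)


print(sanitize_variant_text("Cloud Storage   | Free Tier"))
print(sanitize_variant_text("<script>alert(1)</script>"))
```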
To remove the flicker effect and speed up the time it takes to render your test changes, you can use pre-rendering, dynamic rendering, or server-side rendering. Another option is often called middleware, in which a vendor’s CDN sits on top of your site’s CDN, and makes the update when data is sent to the CDN edge locations.
Another issue is that AEO testing, in which you make changes with the aim of increasing your brand’s visibility in answer engines/LLMs, requires a server-side rendering implementation. This is because LLMs other than Google Gemini do not render JavaScript, so your test changes would be invisible to them (i.e. to ChatGPT, Perplexity, Claude).
Alternatively, you can use the vendor’s API to request changes be delivered at request or build time. With the API, you could then server-side render those changes. This allows for maximum control. These options will also allow for AEO testing since all search and answer engine bots would see the test changes delivered when they crawl the pages.
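A rough sketch of the server-side idea is shown below. The `VARIANTS` dictionary stands in for whatever a vendor API would return and is purely hypothetical, as are the paths and titles; no real vendor API is reproduced here. The point is that the change is applied to the HTML before it leaves the server, so bots that do not run JavaScript still see it.

```python
import html
import re

# Hypothetical stand-in for changes fetched from a vendor API.
VARIANTS = {
    "/products/item-1": {"title": "Item 1 | Free Tier Available"},
}

TITLE_RE = re.compile(r"<title>.*?</title>", re.IGNORECASE | re.DOTALL)


def render_page(path: str, base_html: str) -> str:
    """Return the HTML to serve, applying the test title for test-group pages."""
    variant = VARIANTS.get(path)
    if variant is None:
        # Control pages, and pages outside the test, are served unchanged.
        return base_html
    new_title = f"<title>{html.escape(variant['title'])}</title>"
    # A function replacement avoids backslash handling in the new title.
    return TITLE_RE.sub(lambda _match: new_title, base_html, count=1)


page = "<html><head><title>Item 1</title></head><body>...</body></html>"
print(render_page("/products/item-1", page))
print(render_page("/products/item-2", page))
```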
When conducting SEO tests, it’s important to start with well-crafted hypotheses, and to include an impact estimate. If you don’t, it will be hard to know where to start or where to prioritize.
If you are using an agentic integrated development environment (IDE) across your organization, you could create a simple agent skill that you can share with others in your company. It should prompt the user for the pertinent info – or better yet – provide MCP access to your website’s analytics to retrieve the required analytics data on the fly. It should also format the output of the hypothesis in a consistent way.
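Whether or not you use an agent, a consistent hypothesis format can be as simple as the sketch below. The field names, the sentence template, and the example numbers are all assumptions for illustration; adapt them to whatever your organization already uses for test briefs.

```python
from dataclasses import dataclass


@dataclass
class SeoTestHypothesis:
    change: str                   # what will be changed
    page_group: str               # which pages the test runs on
    expected_effect: str          # what we expect to happen
    baseline_daily_visits: float  # current SEO visits per day for the group
    expected_lift: float          # 0.05 means +5%

    def estimated_extra_daily_visits(self) -> float:
        """Impact estimate: extra daily SEO visits if the hypothesis holds."""
        return self.baseline_daily_visits * self.expected_lift

    def format(self) -> str:
        """Render the hypothesis in one consistent sentence structure."""
        extra = self.estimated_extra_daily_visits()
        return (
            f"If we {self.change} on {self.page_group}, "
            f"then {self.expected_effect}. "
            f"Estimated impact: {self.expected_lift:+.0%} "
            f"(about {extra:.0f} extra SEO visits per day)."
        )


# Hypothetical example values.
hypothesis = SeoTestHypothesis(
    change="add the word 'free' to the H1",
    page_group="free-tier product pages",
    expected_effect="organic visits will increase",
    baseline_daily_visits=2000,
    expected_lift=0.05,
)
print(hypothesis.format())
```

Sorting a backlog of these by `estimated_extra_daily_visits()` gives a simple first pass at prioritization.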
I recognise it can be tricky to know where to start with SEO testing – here are some examples of tests I have run:
In 2025, several of the tests I conducted were based on whether using AI-generated content was better or worse for SEO than human-generated content. I tested AI-generated titles on several page types, versus the human-written title tags. I also tried this with meta descriptions and page summaries.
In the cases where the human-written title tags were already well-crafted by SEO experts, there was no lift. But in the cases where title tags were not optimized at all, there was significant lift.
I also tested adding plain language descriptions of products to H1 tags, and in another test, adding the word “free” to the pages describing free-tier eligible products. I also tested dynamic headings based on product categories.
Some other tests I ran centered around the inclusion of FAQs for a couple of different page types, altered sales messaging, plus I also tested the impact of implementing a variety of types of schema.org JSON-LD markup.
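For readers unfamiliar with schema.org JSON-LD, here is a minimal sketch that builds `FAQPage` markup, one of the markup types that could be tested alongside on-page FAQs. The function name and the sample question are made up, and this is not the exact markup from my tests.

```python
import json


def build_faq_jsonld(faqs: list[tuple[str, str]]) -> str:
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    return json.dumps(data, indent=2)


# Hypothetical FAQ content.
markup = build_faq_jsonld(
    [("Is there a free tier?", "Yes, the free tier includes 5 GB of storage.")]
)
print(markup)
```

The resulting JSON is placed in the page inside a `<script type="application/ld+json">` element.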
AEO, or answer engine optimization, is built on the foundation of SEO. When you ask ChatGPT or another LLM a question, it will attempt to answer from its data model, but if it needs to augment that, it will conduct searches on your behalf. Those searches will retrieve information from the sites that rank well in search engines.
Furthermore, LLM data models are trained in part on the datasets retrieved from search engines. So the technical foundations of SEO still hold true – pages still need to be discoverable in search and many of the quality signals that are true for organic search are also true for AI answer engines.
However, there are also differences.
Because LLMs attempt to answer the user’s question within their conversational interface, LLMs do not typically drive traffic to websites at the same rate that search engines do. The low volume of AI referral traffic may make it tricky to reach statistical significance in your AEO tests.
Plus, as discussed earlier, most answer engine bots do not yet read JavaScript – except for Gemini. So while with SEO testing you can get away with a client-side testing solution, you can’t do that for AEO testing, unless you only want to focus on Gemini.
Perhaps most problematic is the probabilistic nature of LLMs. LLMs are doing quick math every time you ask them something. Each word they produce is a next best guess, sampled from the words most likely to come next given everything written so far. If you ask an LLM the same question over and over, you will likely get different responses each time.
LLM responses are also even more personalized than organic search results – each response you get from an LLM is based on your past prompts.
Only a few SEO testing solutions vendors currently offer AEO testing products – at least publicly.
Of these, most recommend using LLM referrer visits as the main KPI. The methodology of SEO split testing based on URL groups remains the same with AEO testing. But instead of using SEO visits as your KPI, swap it for LLM referrer visits, which you can retrieve from your website’s analytics. Just segment the LLM referrers you want to track. It’s likely the traffic to your website from LLMs is small, so you will need to run tests over a longer period than with SEO tests.
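Segmenting LLM referrers can be sketched like this. The hostname list is an assumption and will need maintaining as answer engines change their domains; the sample referrer URLs are invented, and in practice you would apply the same logic as a segment or filter in your analytics platform.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse

# Assumed referrer hostnames; check these against your own analytics data.
LLM_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}


def classify_llm_referrer(referrer_url: str) -> Optional[str]:
    """Return the answer engine name for a referrer URL, or None."""
    host = urlparse(referrer_url).hostname
    if not host:
        return None
    for known_host, label in LLM_REFERRER_HOSTS.items():
        # Match the host itself or any subdomain of it, such as "www.".
        if host == known_host or host.endswith("." + known_host):
            return label
    return None


# Hypothetical referrer values from a day of visits.
referrers = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=seo",
    "https://www.google.com/",
    "https://chat.openai.com/",
    "",
]
labels = (classify_llm_referrer(url) for url in referrers)
llm_visits = Counter(label for label in labels if label)
print(llm_visits)
```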
Another point to consider is that LLM training datasets are fairly static; only RAG (retrieval-augmented generation) causes an LLM to visit a website and use that information to inform its responses in near real time.
As such, you might elect to focus on the types of prompts where LLMs use RAG to generate responses. Because LLMs only use RAG when they need to augment the information in their model, this could mean that fresh content will perform best. If LLMs notice that your pages are dynamic and often updated, they may be more likely to visit often and to look to supplement their answers with content from your pages.
Think about tests like adding the top five most recent relevant blog posts to otherwise static category pages. Does that dynamic, fresh content cause the page to receive more LLM referral traffic? Sounds like the start of a promising test hypothesis.
In this era of rapid innovation, it's more important than ever to make sure that you are moving fast with confidence. If you rely upon what other people say worked for them, you are leaving your SEO strategy up to chance. And with AI search, you need to be even more confident that you are making the best choices for optimizing your brand's web presence.
Implementing an SEO testing solution has allowed me to be more creative and rigorous in my work, and it's accelerated my ability to deliver meaningful results for the organization I work in.
I’m now able to quickly implement tests that would previously have taken months to get design, product, and engineering sign-off. I’m then able to showcase my test results, say with confidence what the impact of a change was, and what it could be if it were deployed across the site.
Without SEO testing, you can still move fast. But what you lack is the certainty that the changes you’re making are actually having an impact.
Implementing an SEO testing solution has been game-changing for me. And if you’re struggling to deliver results and secure investment in SEO in your organization, I suspect it will be game-changing for you too.
Kaitlin McMichael - Sr. Product Manager - Technical, Search, at Amazon Web Services (AWS)
Kaitlin is a digital marketing and SEO leader with more than 15 years of experience across in-house, agency, and consultancy roles. She currently works within the Marketing organisation at Amazon Web Services (AWS), where she partners with engineering teams to develop products, tools, and web experiences that improve the customer journey. Previously, she was Senior SEO Manager at Getty Images, leading teams of SEO strategists and data analysts. With expertise in technical and international SEO, content strategy, analytics, and SEO testing, Kaitlin is passionate about using data-driven approaches to drive growth and create better digital experiences.