How does ChatGPT decide which sites to cite in responses?

The logic behind ChatGPT's source selection — and what you can do to be chosen

ChatGPT selects the sources it cites based on three main factors: the site's ranking in Bing's index (which is queried in real time when web browsing is active), the relevance and factual density of the content for the specific query, and how easily the text can be extracted by RAG systems. There is no fixed list of approved sources. The process is dynamic and changes with each response.

How ChatGPT's search works behind the response

When a user asks a question that requires current or specific information, ChatGPT activates web browsing and queries Bing. The technical process works as follows:

1. ChatGPT sends the query (or a reformulated version) to Bing's search API.
2. Bing returns a list of ranked pages with text snippets from each result.
3. ChatGPT processes these snippets and selects the most relevant ones to include in the response.
4. A source is cited if its snippet was used as the basis for part of the response.
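
The steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not OpenAI's or Bing's actual implementation: the `SearchResult` type, the `term_overlap` scoring, and the example URLs are all assumptions made up for this sketch, and the real relevance scoring is not public.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    url: str
    snippet: str
    rank: int  # position in the search engine's result list (1 = top)


def term_overlap(query: str, text: str) -> float:
    """Share of query words that also appear in the text (toy relevance score)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def select_sources(query: str, results: list[SearchResult], top_k: int = 2) -> list[str]:
    """Pick the snippets that best match the query and return their URLs as citations."""
    # Best overlap first; the search engine's own rank breaks ties.
    scored = sorted(results, key=lambda r: (-term_overlap(query, r.snippet), r.rank))
    return [r.url for r in scored[:top_k] if term_overlap(query, r.snippet) > 0]


# Hypothetical results standing in for what a search API might return.
results = [
    SearchResult("https://example.com/a", "Our company history and mission", 1),
    SearchResult("https://example.com/b", "ChatGPT cites sources ranked by Bing", 2),
    SearchResult("https://example.com/c", "How ChatGPT picks sources to cite", 3),
]
print(select_sources("how chatgpt picks sources", results))
# prints ['https://example.com/c', 'https://example.com/b']
```

Note that the top-ranked page is not cited here because its snippet says nothing about the question: ranking gets a page into the candidate list, but the snippet decides whether it is used.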

This means ChatGPT visibility depends directly on Bing visibility. Sites that don't appear on Bing for a given query will not be consulted by ChatGPT for that same query.

What determines which snippet is selected

Within the results Bing returns, ChatGPT applies its own extraction logic based on RAG (Retrieval-Augmented Generation). Snippets with the highest probability of being selected have specific characteristics:

Direct answer at the start of the paragraph: RAG systems process text blocks and extract the most "extractable" snippet — the one that answers the question completely without depending on prior context. An article that begins with a direct answer to the title question has an immediate advantage.
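
A rough way to audit your own paragraphs for this is to check whether the opening sentence leans on earlier text. The sketch below is a heuristic written for this article, not how any RAG system actually scores passages; the list of openers is an assumption and deliberately incomplete.

```python
import re

# Openers that usually depend on earlier text (illustrative, incomplete list).
CONTEXT_DEPENDENT_OPENERS = (
    "this ", "that ", "these ", "those ", "it ", "they ",
    "as mentioned", "as noted", "however", "therefore",
)


def first_sentence(paragraph: str) -> str:
    """Return the text up to the first sentence-ending punctuation mark."""
    match = re.match(r"\s*(.+?[.!?])(\s|$)", paragraph, flags=re.DOTALL)
    return match.group(1) if match else paragraph.strip()


def is_self_contained_opening(paragraph: str) -> bool:
    """True when the opening sentence does not start with a context-dependent word."""
    opening = first_sentence(paragraph).lower()
    return not opening.startswith(CONTEXT_DEPENDENT_OPENERS)


print(is_self_contained_opening("This means it depends on Bing. More text."))
print(is_self_contained_opening("ChatGPT cites pages that Bing returns for the query."))
```

The first call returns False because the sentence only makes sense after whatever "This" refers to; the second returns True.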

Concrete and verifiable data: a 2023 study on generative engine optimization (GEO) led by Princeton University researchers reported that adding statistics, quotations, and source citations to content raised its visibility in generative engine responses by up to 40% compared with unmodified content. Quantified claims appear to be favored over narrative without data.
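
To compare drafts, you can approximate "factual density" by counting numeric tokens per 100 words. This metric is an assumption introduced here for illustration; it is not taken from the research mentioned above, and it says nothing about whether the numbers are correct.

```python
import re

# Matches integers, decimals, and percentages such as "40", "3.5", or "40%".
NUMBER_PATTERN = re.compile(r"\d+(?:[.,]\d+)*%?")


def factual_density(text: str) -> float:
    """Numeric tokens per 100 words; a rough proxy, not a measure of accuracy."""
    words = text.split()
    if not words:
        return 0.0
    numbers = NUMBER_PATTERN.findall(text)
    return 100 * len(numbers) / len(words)


# Made-up sentences used only to contrast the two writing styles.
narrative = "Our clients are generally happy with the results they see"
quantified = "In 2023, 62% of 450 surveyed clients renewed within 30 days"

print(factual_density(narrative), factual_density(quantified))
```

The narrative sentence scores 0.0, while the quantified one contains four numeric tokens in eleven words.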

Semantic match with the query: ChatGPT doesn't search for exact keywords, but for intent matching. An article that answers exactly the question the user asked has an advantage over one that addresses the topic tangentially.
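
Intent matching of this kind is commonly implemented by comparing embedding vectors with cosine similarity. The sketch below shows only the comparison step. The three-dimensional vectors are hand-made for illustration (real embeddings come from a trained model and have far more dimensions), and whether ChatGPT uses exactly this measure is an assumption.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: one for the user's question, two for candidate passages.
query_vec = [0.9, 0.1, 0.0]
direct_answer_vec = [0.8, 0.2, 0.1]   # passage that answers the question itself
tangential_vec = [0.2, 0.3, 0.9]      # passage that only touches on the topic

print(cosine_similarity(query_vec, direct_answer_vec) >
      cosine_similarity(query_vec, tangential_vec))  # prints True
```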

Clear heading structure: H2 and H3 that function as direct questions or statements allow RAG to identify relevant sections without having to process the entire article.
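
To see why headings help, consider how a retrieval pipeline might split a page into sections before scoring them. The sketch below uses Python's standard `html.parser`; the `SectionSplitter` class and the sample page are hypothetical, and real systems may chunk content quite differently.

```python
from html.parser import HTMLParser


class SectionSplitter(HTMLParser):
    """Collects [heading, body text] pairs for every H2/H3 in an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.sections: list[list[str]] = []  # each item: [heading, body]
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            self._in_heading = True
            self.sections.append(["", ""])

    def handle_endtag(self, tag):
        if tag in ("h2", "h3"):
            self._in_heading = False

    def handle_data(self, data):
        if not self.sections:
            return  # text before the first heading is ignored in this sketch
        index = 0 if self._in_heading else 1
        self.sections[-1][index] += data


def split_sections(html: str) -> list[tuple[str, str]]:
    """Return (heading, body) pairs with whitespace normalized."""
    parser = SectionSplitter()
    parser.feed(html)
    parser.close()
    return [(h.strip(), " ".join(b.split())) for h, b in parser.sections]


page = """
<h2>How does ChatGPT pick sources?</h2>
<p>It queries Bing and reads the returned snippets.</p>
<h3>Does domain age matter?</h3>
<p>Less than structure and indexing.</p>
"""
for heading, body in split_sections(page):
    print(f"{heading} -> {body}")
```

When each heading is phrased as a question, every (heading, body) pair reads as a standalone question and answer, which is the property the paragraph above describes.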

What ChatGPT doesn't consider (or considers less)

Domain authority on Google: third-party metrics such as Domain Rating (Ahrefs) or Page Authority (Moz), which estimate a site's strength in Google rankings, have little correlation with ChatGPT visibility. Bing has its own ranking model, and a site with high Google authority may be less relevant on Bing for the same query.

Site size or history: new sites with well-structured and well-indexed content on Bing can appear in ChatGPT before established sites with generic content.

Keywords in meta tags: ChatGPT processes page content, not SEO metadata. Meta keywords don't influence source selection, and the meta description matters only to the extent that Bing uses it for the snippet it returns.

The difference between being indexed by ChatGPT and being cited

There are two distinct layers of how ChatGPT processes content:

Model knowledge (training): each GPT model is trained on data up to a certain cutoff date. Content published before that cutoff may have been incorporated into the model's internal knowledge, with no need for real-time search. For that to happen, GPTBot (OpenAI's crawling agent) must have been able to access the site before training.

Real-time search (Browse): for queries requiring recent or specific information, ChatGPT queries Bing at the moment of the response. Here, Bing visibility is the prerequisite: a page that Bing doesn't return for the query can't be cited, however good its content.

For companies publishing new content, the focus should be on real-time search — ensuring Bing correctly indexes the site and that GPTBot isn't blocked.
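
One quick check is whether your robots.txt blocks a given crawler. The sketch below uses Python's standard `urllib.robotparser` on a robots.txt string, so it runs without network access. The helper name `is_agent_allowed`, the sample rules, and the example URL are assumptions for illustration. The user agent is a parameter because OpenAI may use other agents besides GPTBot, so check its current crawler documentation for the exact names.

```python
from urllib.robotparser import RobotFileParser


def is_agent_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules let the given user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks GPTBot but allows every other crawler.
blocking_rules = """
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

url = "https://example.com/blog/post"
print(is_agent_allowed(blocking_rules, "GPTBot", url))   # prints False
print(is_agent_allowed(blocking_rules, "bingbot", url))  # prints True
```

To test a live site, paste the contents of its /robots.txt into the string; a False result for GPTBot means that crawler is blocked.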

How to monitor if you're being cited

The most direct method is to test the most relevant queries in your market manually in ChatGPT. Ask the questions your customers ask and note which sources are cited. If competitors appear and you don't, that gap is what needs diagnosing.
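
Once you have recorded the cited URLs for each test query, a small script can turn the notes into a comparable number per domain. This is a minimal sketch: the `citation_share` function, the queries, and the `.example` domains are all hypothetical, and the data has to be collected by hand as described above.

```python
from collections import Counter
from urllib.parse import urlparse


def citation_share(observations: dict[str, list[str]]) -> dict[str, float]:
    """Fraction of tested queries in which each domain was cited at least once."""
    counts = Counter()
    for cited_urls in observations.values():
        # A set, so a domain cited twice for one query still counts once.
        domains = {urlparse(u).netloc.removeprefix("www.") for u in cited_urls}
        counts.update(domains)
    total = len(observations)
    return {domain: n / total for domain, n in counts.items()} if total else {}


# Hypothetical notes: query -> URLs cited in the response.
observations = {
    "best crm for small business": [
        "https://www.competitor.example/crm",
        "https://yoursite.example/guide",
    ],
    "crm pricing comparison": ["https://competitor.example/pricing"],
    "how to migrate crm data": [
        "https://other.example/migrate",
        "https://competitor.example/migrate",
    ],
}

for domain, share in sorted(citation_share(observations).items()):
    print(f"{domain}: {share:.0%}")
```

In this made-up data the competitor is cited for all three queries and your site for one of three. Rerunning the same queries over time shows whether the gap is closing.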

FRT Digital performs this diagnosis in a structured way as part of the AIO Score audit, which includes Bing visibility analysis and citation testing for the client's strategic queries.