
ChatGPT Now Cites its Sources


Image generated by ChatGPT

I stopped using AI when I discovered that Meta, the parent company of Facebook, Instagram and WhatsApp, was using LibGen’s database of pirated material to train its AI models. I also discovered that my book, Managing the Training Process, had been uploaded to LibGen.

I then had a question that was best answered by AI. Not wanting to use Meta’s AI, I decided to go back to ChatGPT.

While ChatGPT was answering my question, I noticed an image that I hadn’t seen before:

Clicking on the image displayed the following information:

As I had missed the introduction of citations, I was curious as to when ChatGPT introduced them. The obvious way to find out was to ask ChatGPT itself and this is the response I got:

ChatGPT began displaying the sources of its information in July 2024 with the introduction of a feature called ChatGPT Search (originally known as SearchGPT). This feature was first launched as a prototype on July 25, 2024, allowing ChatGPT to perform real-time web searches and include citations to external websites in its responses (Wikipedia).

The feature was initially available to a limited group of test users but was gradually rolled out to a broader audience. By October 31, 2024, ChatGPT Search became available to ChatGPT Plus and Team subscribers, and by December 16, 2024, it was accessible to all logged-in users in regions where ChatGPT is available (Wikipedia, OpenAI).

With ChatGPT Search, users can receive timely answers that include links to relevant web sources, enhancing the chatbot’s ability to provide up-to-date and verifiable information. This development marked a significant step in improving the transparency and reliability of ChatGPT’s responses (OpenAI).

Note that ChatGPT also included citations in this response. Not every response includes citations, so I asked why:

ChatGPT doesn’t provide citations for all responses for a few key reasons:

1. Not All Answers Are Based on Specific Sources
Many responses are generated from general knowledge learned during training, which includes patterns from a wide range of texts rather than specific documents. For example, basic math, definitions, or common facts (e.g., “the Earth orbits the Sun”) don’t typically require citation because they’re widely agreed upon and not sourced from a specific article.

2. The Model Doesn’t Track Sources Natively
During training, ChatGPT doesn’t memorize or store specific sources. It learns patterns in language and information from a massive dataset, so it can’t “look back” and say exactly where a piece of information came from — much like how a person might remember a fact without remembering the original book or article.

3. When Citations Are Needed, They Must Be Retrieved Actively
When citations are crucial (e.g., in research or when the user asks for sources), the model can:

- Use tools (like web search) to find and cite current, specific sources.

- Generate references when prompted, though these can sometimes be fabricated unless real-time search is used.

4. Some Responses Are Based on Reasoning or Synthesis
Many answers are a combination of different ideas or logical steps. For example, an explanation of a complex concept might draw on many influences and doesn’t match a single source that could be cited.

If you need citations or source references in a response, you can ask for them directly (e.g., “Can you provide sources?”), and I can either generate credible references or use real-time search to find accurate ones.

Conclusion

Although AI still has quite a way to go before it can be considered trustworthy, citing its sources is a step in the right direction.

I also have to ask whether AI is the real cause of writers’ work being exploited. More about this in a future story.
