
Artificial Intelligence: Student Guide to ChatGPT

Fact-checking is always needed

AI "hallucination"
In the field of AI, the official term for this is "hallucination": the model sometimes "makes stuff up," stating things confidently that are not true. This happens because these systems are probabilistic, not deterministic; they generate likely-sounding text rather than retrieving verified facts.
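To see what "probabilistic" means in practice, here is a minimal sketch that sends the same question to the model several times and prints each answer. It assumes the openai Python package (v1.x) and an API key in the OPENAI_API_KEY environment variable; the model name is only an example, and SDK details may differ by version.

```python
# Minimal sketch: ask the same question several times and compare the answers.
# Assumes the `openai` Python package (v1.x) and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

question = "In one sentence, who first proposed the theory of plate tectonics?"

for i in range(3):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # example model name; substitute the one you use
        messages=[{"role": "user", "content": question}],
        temperature=1.0,        # a temperature above 0 allows varied output
    )
    print(f"Run {i + 1}: {response.choices[0].message.content}")

# Because the model samples from a probability distribution over words,
# the three answers can differ, and none of them is guaranteed to be factual.
```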

Which models are less prone to this?
GPT-4 (the more capable model behind ChatGPT Plus and Microsoft Copilot) is less prone to hallucination. According to OpenAI, it is "40% more likely to produce factual responses than GPT-3.5 on our internal evaluations." It is still not perfect, though, so you still need to verify its output.

ChatGPT makes up fictional sources
One area where ChatGPT frequently gives fictional answers is when it is asked to create a list of sources: it often invents plausible-looking citations to papers that do not exist. See the Twitter thread, "Why does chatGPT make up fake academic papers?" for a useful explanation of why this happens.

Since we've had many questions from students about this, we offer this FAQ:
I can’t find the citations that ChatGPT gave me. What should I do?

There is progress in making these models more truthful
However, there is progress in making these systems more truthful by grounding them in external sources of knowledge. Some examples are Microsoft Copilot and Perplexity AI, which use internet search results to ground their answers. Those internet sources could themselves contain misinformation or disinformation, but at least with Microsoft Copilot and Perplexity you can follow links to the sources used and begin verifying them.
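As a rough illustration of what "grounding" means, the sketch below assembles a prompt that includes retrieved passages and instructs the model to answer only from them, citing each one. The search results here are placeholder values; a real grounded system such as Copilot or Perplexity would fetch them from a live web search.

```python
# Illustrative sketch of "grounding": the model is asked to answer only from
# supplied sources and to cite them. The sources below are placeholders; a
# real grounded system would retrieve them from a live web or database search.

search_results = [
    {"url": "https://example.org/article-1", "snippet": "Placeholder passage one."},
    {"url": "https://example.org/article-2", "snippet": "Placeholder passage two."},
]

question = "What does the research say about study breaks and memory?"

# Assemble a prompt that pins the answer to the retrieved passages.
sources_text = "\n".join(
    f"[{i + 1}] {r['url']}\n{r['snippet']}" for i, r in enumerate(search_results)
)
grounded_prompt = (
    "Answer the question using ONLY the numbered sources below. "
    "Cite the source number after each claim. "
    "If the sources do not contain the answer, say so.\n\n"
    f"Sources:\n{sources_text}\n\nQuestion: {question}"
)

print(grounded_prompt)  # this prompt would then be sent to the language model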

Scholarly sources as grounding
There are also systems that combine language models with scholarly sources. For example:

  • Elicit
    A research assistant using language models like GPT-3 to automate parts of researchers’ workflows. Currently, the main workflow in Elicit is Literature Review. If you ask a question, Elicit will show relevant papers and summaries of key information about those papers in an easy-to-use table. 
  • Consensus
    A search engine that uses AI to search for and surface claims made in peer-reviewed research papers. Ask a plain-English research question, and get word-for-word quotes from research papers related to your question. The source material used in Consensus comes from the Semantic Scholar database, which includes over 200M papers across all domains of science (a short sketch of searching this database follows the list).
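To give a concrete sense of how these tools tap scholarly databases, here is a minimal sketch that queries the public Semantic Scholar Graph API, the database Consensus draws from. It assumes the Python requests package; the endpoint and field names follow the API's public documentation and may change.

```python
# Minimal sketch: search the Semantic Scholar Graph API (the database behind
# Consensus) for papers matching a plain-English question. Assumes `requests`.
import requests

query = "effect of sleep on memory consolidation"
response = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={"query": query, "fields": "title,year,externalIds", "limit": 5},
    timeout=30,
)
response.raise_for_status()

for paper in response.json().get("data", []):
    doi = (paper.get("externalIds") or {}).get("DOI", "no DOI")
    print(f"{paper.get('year')}  {paper.get('title')}  ({doi})")
```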

Checking For Credibility


Evaluating all information for credibility is highly recommended, regardless of where you find it. This is especially true for responses from generative AI, given the issues described above. There are many different tools, checklists, and strategies to help you evaluate your sources, but none of them is a black-and-white checklist for determining whether a source is credible and whether you should use it.

Here are two strategies for evaluating information provided by generative AI tools:

1. Lateral Reading

Don't take what ChatGPT tells you at face value. Check whether other reliable sources contain the same information and can confirm what ChatGPT says. This could be as simple as searching for a Wikipedia entry on the topic or doing a Google search to see whether a person ChatGPT mentions exists. Consulting multiple sources is the heart of lateral reading and helps you avoid the bias of relying on a single source.

Watch Crash Course's "Check Yourself with Lateral Reading" video (14 min) to learn more.

2. Verify Citations

If a generative AI tool provides a reference, first confirm that the source exists. Try copying the citation into a search tool like Google Scholar or the Library's Search Everything search tool, do a Google search for the lead author, or check for the publication in the Journal Title Lookup Tool.

Second, if the source is real, check that it contains what ChatGPT says it does: read the source or its abstract.
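
If you want to automate the first check, here is a minimal sketch that looks a citation up in the public Crossref API using the Python requests package. The citation string is a made-up example; if nothing resembling the cited title and authors comes back, that is a strong sign the reference was fabricated.

```python
# Minimal sketch: look up a citation in the public Crossref API to see whether
# a matching paper actually exists. Assumes the `requests` package.
import requests

# Made-up example citation; paste in the one the AI tool gave you.
citation = "Smith, J. (2019). Example title of the cited paper. Journal of Examples."

response = requests.get(
    "https://api.crossref.org/works",
    params={"query.bibliographic": citation, "rows": 3},
    timeout=30,
)
response.raise_for_status()

for item in response.json()["message"]["items"]:
    title = (item.get("title") or ["(no title)"])[0]
    journal = (item.get("container-title") or ["(no journal)"])[0]
    print(f"{title} | {item.get('DOI', 'no DOI')} | {journal}")

# If none of the returned records resembles the citation, treat it as likely
# fabricated and ask a librarian or search the library catalog before using it.
```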