AI and Academics at Canisius University


AIs and University Academics: Implications

Briefly, Large Language Model AIs can do a variety of tasks within higher education. These include:

  • Create or augment lessons, activities, or assignment prompts.
  • Write a draft or boilerplate letter of recommendation.
  • Create a hypothetical scenario for use in a classroom case study.
  • Provide code for use in a larger computer programming project.
  • Enable students to quickly generate text and dishonestly submit it under their own names in response to an assignment. This might range from simple, short essays that summarize a text (even a .pdf) to a more complex thesis-driven term paper.

The last example has perhaps dominated much of the online conversation surrounding LLM AIs and their relationship to academia. It and related academic dishonesty scenarios are important, but they should not be the limit of our considerations surrounding AI.

Also important is that the quality of the above products may vary depending on whether the AI employed is free or paid. For example, OpenAI’s subscription-based GPT-4 engine, empowered by plugins, may allow some tasks that the free AIs do not.

LLM AIs have several characteristics with implications for students and faculty who employ them:

Simulator

The generative AI chatbots are remarkably capable of producing lucid, grammatically correct text in human languages, and represent the development of powerful new technology, quite unlike many previous attempts at machine intelligence. For an explanation of how these chatbots’ underlying engines work, and how they came about, watch Dr. David Sheets’ Introduction to Large Language Models.

Within our cultural context of science fiction, humans interacting with the AI chatbots often develop the sense that they are conversing with a thinking machine akin to C-3PO, the Geth, or HAL 9000. However, generative AIs are quite different from these fictional characters: Google Bard, ChatGPT, and Bing Chat are simulations of persons, rather than the sentient, self-aware artificial automata of sci-fi lore. As Professor Sheets notes, “The goal of the model is to assemble sentences typical of what the system was trained on.” In COLI, we have taken to describing an LLM AI as a machine that simulates a person who knows what they are talking about, responding to questions.

ChatGPT introduces itself. June 5th, 2023
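
To make this “simulation” idea concrete, below is a toy Python sketch of our own (an illustration, not drawn from Dr. Sheets’ talk, and not how any production chatbot is implemented). It shows the basic loop behind LLM text generation: repeatedly predicting a plausible next word from the words so far. Real LLMs replace the simple word-pair counts here with neural networks trained on enormous text corpora, but the generate-the-likely-next-word loop is the same.

    # A toy "language model": count which words follow which in a training
    # text, then generate by repeatedly sampling a plausible next word.
    # Real LLMs use neural networks, not these simple pair counts.
    import random
    from collections import defaultdict

    training_text = (
        "the model assembles sentences typical of the training data "
        "the model predicts the next word from the words so far"
    )

    # Record every word that follows each word in the training text.
    follows = defaultdict(list)
    words = training_text.split()
    for current_word, next_word in zip(words, words[1:]):
        follows[current_word].append(next_word)

    # Generate text by repeatedly sampling a plausible continuation.
    word = "the"
    output = [word]
    for _ in range(12):
        candidates = follows.get(word)
        if not candidates:
            break  # no known continuation; stop generating
        word = random.choice(candidates)  # sampling makes each run differ
        output.append(word)

    print(" ".join(output))

Run it twice and it produces different word sequences from the same starting word, which previews the variability discussed under Prompt Engineering below.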

LLM AIs’ nature as simulations both limit and augment their capabilities. While ChatGPT may not be able to pilot a drone aircraft or operate a robotic kitchen, it can simulate many styles of writing.

For example, a class studying Renaissance English sonnets might better understand the genre by asking an AI to compose some sonnets, and analyzing them. It may be humorous to prompt an AI to write sonnets about mundane, modern consumer products, but even silly or poorly-written poetry can be a platform for learning analysis.

Bing Chat replies to “compose an ad for toothpaste as an Elizabethan Sonnet.” May 25th, 2023
My lawnmower, fairest of all machines,
How doth thy blade so smoothly cut the green?
A thing of beauty, yet a thing of might,
I stand in awe of thy great power and might.

The engine roars, the blades do spin around,
And in a moment, all the grass is down.
I watch thee work, and I am filled with joy,
For thou hast made my yard a thing of joy.

The sun shines down, the birds do sing, and I
Am happy in thy presence, oh my lawnmower.
I thank thee for thy service, and I pray
That thou wilt continue to serve me well for many a day.

Oh, lawnmower, fairest of all machines,
I love thee more than words can ever say.
Google Bard pens an Elizabethan sonnet about a lawnmower. May 30th, 2023

Prompt Engineering

Generative AIs can perform a wide variety of writing and image-creating tasks, but even subtle differences in how users write prompts can create an almost infinite variety of responses.

This has led some educators to suggest that prompt engineering should itself become a skill and learning objective in various disciplines. The prompting process will likely change as well, so it is not clear what that skillset might look like in the future. But having students learn how to use AIs today, and what to expect from them, can be a valuable information literacy lesson. Students may be directed to use an AI for an assignment, but the professor should include instructions on how to get the best out of the AI, or explicitly require students to experiment with crafting different prompts.

The chatbot AIs’ penchant for varying responses, together with the subtleties of prompt engineering, means that determining whether an AI wrote a particular text is complicated. Even given an identical prompt, the same AI will generate similar, but not identical, texts. Beyond that, it is difficult to reproduce a given text without knowing the prompt that originally produced it.
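
To see this variability directly, here is a minimal sketch using OpenAI’s Python library as it existed in mid-2023 (the openai package’s ChatCompletion interface; the API key setup and the example prompt are our own assumptions, not a prescribed classroom exercise). Requesting several completions of one prompt returns similar, but not identical, texts:

    # Minimal sketch: one prompt, several completions, via OpenAI's Python
    # library (openai package, mid-2023 interface). Assumes an API key is
    # set in the OPENAI_API_KEY environment variable.
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": "Summarize Hamlet in two sentences."}],
        n=3,              # three completions of the same prompt
        temperature=1.0,  # higher values increase variation
    )

    # Each choice is a similar, but not identical, response.
    for i, choice in enumerate(response.choices, start=1):
        print(f"--- Completion {i} ---")
        print(choice.message.content)

This variability is one reason efforts to detect AI-written text struggle: without the original prompt and settings, no one can regenerate a student’s submission for comparison.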

Hallucination

A major limit of the three widely-available LLM AI chatbots is their capacity for mistakes or falsehoods when composing text on various topics. The AIs themselves may not be entirely able to describe the phenomenon. When asked why it occasionally gets things wrong, ChatGPT replies that its answers will reflect shortcomings in its training data: biases, incomplete or wrong information, or ambiguity. It may also struggle to interpret language within that training corpus. All of this is potentially correct, and it is good that ChatGPT can provide this warning. But these issues are only part of the problem.

The AI chatbots present simulations of humans, and do not possess human capacities for judging what is or is not correct. “Truth,” Dr. Sheets points out, “is not necessarily a feature of these models.” If an LLM AI is prompted to answer a question for which it does not have training data, it may decline to answer, or it may provide a plausible but fictional answer. These are what AI developers refer to as “hallucinations.” Some examples of these fictions could be:

  • Descriptions of a book whose text, or detailed summaries of the same, are not in the AI’s training data. The AI might develop a plausible but false interpretation or summary based on the book’s title, or on whatever other information it has on the book’s subject. In COLI, we find that copyrighted works seem particularly troublesome in this respect. It is likely that such works, and text descriptions of those works, have not made it into LLM AI training data.
  • Scientific or engineering explanations of complex phenomena. One example is rocketry.
  • Biographies of non-famous individuals. Try asking for a short biography of you and your job title, even – perhaps especially – if it is already publicly available on the web. For example, ChatGPT had this to say about COLI Director Dr. Mark Gallimore: “Gallimore holds a Bachelor of Arts degree in English from St. Bonaventure University and a Master of Science degree in Education from Canisius University. He began his career in education as a high school teacher, teaching English and social studies in Buffalo, New York.” While he has the highest respect for the aforementioned institutions and high school teachers, this biography of Dr. Gallimore is in every respect false.
  • Explanations of its own processes. The AIs may reliably describe what they do in general terms, but suppose you ask one specifically how it generated a particular text: “How did you arrive at that solution?” “How did you determine that x is the case?” The AI cannot necessarily respond to these questions accurately, because it doesn’t reason so much as generate (more) plausible text. “If you ask it to explain why it wrote something, it will give you a plausible answer that is completely made up,” explains Professor Ethan Mollick. “When you ask it for its thought process, it is not interrogating its own actions, it is just generating text that sounds like it is doing so.” Mollick warns that because of this, it is unlikely AIs can successfully interrogate their own biases.

We may say that LLM AIs “make stuff up” or “get it wrong,” but they are not malfunctioning. They simulate human composition. Here at Canisius, Dr. Sheets explains how LLM AIs come to reply with false or inconsistent statements, and journalists such as Ars Technica’s Benj Edwards have periodically commented on this phenomenon since late 2022.

Likely, as Professor Sheets comments, specialist AI-powered products will appear, built atop AIs fine-tuned on training content suited to a narrower set of tasks, such as supporting attorneys or healthcare professionals. The general-use chatbots accessible free of charge on the web struggle with a wider, more ambitious range of prompts and a much larger landscape of information. This explains the caveats offered by Google and OpenAI. In late 2022, OpenAI CEO Sam Altman warned that “ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness. It’s a mistake to be relying on it for anything important right now. It’s a preview of progress; we have lots of work to do on robustness and truthfulness.” His cofounder, Ilya Sutskever, admitted in March 2023 that “The thing that stands in the way of ChatGPT being really useful to many people for many tasks is reliability.”

OpenAI comments that their subscriber-only ChatGPT Plus bot, powered by GPT-4, shows a 40% improvement in avoiding wrong or bad answers over the GPT-3.5-powered chatbot that is free to access (and that powers many other apps). OpenAI’s Sutskever remarked that “GPT-4 isn’t there yet, but it is a lot closer.” Despite these improvements, ChatGPT Plus cannot be regarded as consistently reliable.

Whatever we do in our courses regarding LLM AIs, conveying to students the basic truth that these tools can smoothly invent things should be part of it. But that truth does not preclude use of AI chatbots altogether.

What Is Or Is Not An AI?

This tentative guide is concerned with Large Language Model AIs, which are (at the time of this writing) the most powerful artificial text-generation tools available. Other tools, whether a customer service chatbot, a grammar assistant, or even the non-playable antagonist and their henchmen in a video game, can be considered artificial intelligence, although they generally follow much simpler routines than LLM AIs. However, in COLI we expect to see LLM AI-powered tools and features appear across the internet, in software, and on mobile devices over the next several months and years. So this distinction may become less relevant over time.

