Here, COLI is assembling a list of resources concerning artificial intelligence and its possible implications for pedagogy and scholarship. This resource is tentative, certainly incomplete, may age rapidly, and is probably wrong in places.
Understanding Large Language Model (LLM) Generative Artificial Intelligence (AI)s
Large Language Model AIs, or more properly their chat versions, such as ChatGPT, Google Bard, or Bing Chat, are designed to simulate human typed-text conversation. They are computer programs that, in addition to running code written by software engineers and developers, have been trained on large quantities of (mostly) human-generated text. Much of this text comes from the open internet, but other sources have occasionally been added to the training corpus as well. Versions have been in development for years, but LLM AIs emerged into broader public attention in late 2022, when the firm OpenAI made ChatGPT, which was powered by the GPT-3.5 LLM, available for free use by anyone on the internet.
What LLM AIs Do
These AIs can create things or perform tasks by generating text. A few examples:
- recipes for food dishes,
- lesson plans for secondary school science classes,
- a cover letter accompanying a job application,
- a thank-you note,
- an essay on the development of the Code Napoleon,
- a simulation of a fifteen-year-old blogger reviewing a video game,
- code for a module or particular task within a computer program.
LLM AI Google Bard describes itself as follows:
As an AI, how are you classified?
"I am classified as a large language model, also known as a conversational AI or chatbot trained to be informative and comprehensive. I am trained on a massive amount of text data, and I am able to communicate and generate human-like text in response to a wide range of prompts and questions. For example, I can provide summaries of factual topics or create stories."
Can you describe your training corpus?
"Sure. My training corpus is a massive dataset of text and code that I was trained on. It includes text from a variety of sources, such as books, articles, websites, and code. The corpus is designed to teach me about the structure of language and how to use language to communicate effectively. The corpus is also designed to teach me about the world. It includes text about a variety of topics, such as science, history, and current events. This helps me to understand the world around me and to generate text that is relevant and informative. The corpus is constantly being updated with new text. This helps me to stay up-to-date on the latest information and to generate text that is accurate and up-to-date."
Any successful LLM AI production requires a carefully written prompt from the user. The prompt must properly describe the user's intent, and if the AI's response is not to depend on its training corpus, the prompt must supply the relevant data as well. Already, education and technology experts speak of "prompt engineering" as a skill students will probably need in the future, but it remains to be seen whether that is true as these tools (more or less) evolve.
Limits to LLM AIs
Importantly, the current LLM AIs have limits. If you ask them to describe those limits, they will usually enumerate them. For example, when asked why it occasionally gets things wrong, ChatGPT replies that its answers will reflect shortcomings in its training data: biases, incomplete or wrong information, or ambiguity. It adds that it may struggle to interpret language within that training corpus.
Wrong Answers
Perhaps most important, although something that LLMs might not be able to articulate themselves, is that they present simulations of humans rather than possessing human concepts of truth or correctness. If an LLM AI is prompted to answer a question for which it does not have training data, it may decline to answer, or it may provide a plausible but fictional answer. These fictions are what AI developers refer to as "hallucinations." Some examples could be:
- descriptions of a book whose text or detailed summaries of the same are not in the AI's training data. The AI might develop a plausible but false interpretation or summary based on the book's title, or what information it may have on the book's subject. In COLI, we find that copyrighted works seem particularly troublesome in this respect. It is likely that such works, and text descriptions of those works, have not made it into LLM AI training data.
- scientific or engineering explanations of complex phenomena.
- biographies of non-famous individuals. (Try asking for a short biography of yourself, giving your name and title, if these are already publicly available on the web. You may receive a fantastic, if false, biography.)
This amounts to LLM AIs "making stuff up" or "getting it wrong." That may be how it looks to us, but strictly in terms of how they were developed, they are not malfunctioning. They simulate human composition. Ars Technica's Benj Edwards offers a good explanation of this. But the short version was offered by OpenAI CEO Sam Altman, soon after ChatGPT became available in late 2022: "ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness. It's a mistake to be relying on it for anything important right now. It’s a preview of progress; we have lots of work to do on robustness and truthfulness."
Whatever we do in our courses regarding LLM AIs, conveying to students the basic truth that these tools can smoothly invent things should be part of it. But that doesn't preclude their use altogether, for example for brainstorming, or when we can provide the AI with the facts of the situation as part of the prompt.
How AIs do or do not develop to overcome these limits will be important for education. At any point, in what ways can students and faculty responsibly use these AIs? In what ways are these AIs not appropriate, given their limitations?
Prompt Engineering
At least at present, how the user crafts a question or command for an LLM AI, known as prompt engineering, determines the AI's product. Even subtle re-wording of what is (to humans) the same question can produce radically different results from the AI. It seems that, for the time being, prompt engineering might be a useful skill to cultivate in students. Which specific practices or procedures produce which kinds of outcomes may vary by discipline. But the subtleties of prompt engineering may mean that using LLM AIs in academics, honestly or otherwise, may occasionally be frustrating.
What Is Or Is Not An AI?
This tentative guide is concerned with Large Language Model AIs, which are (at the time this is written) the most powerful artificial text generation tools available. Other tools, be it a customer service chatbot, a grammar assistant, or even the non-playable antagonist and their henchmen in video games, can be considered artificial intelligence, although they generally follow much simpler routines than LLM AIs. However, in COLI we expect to see in the next several months and years LLM AI-powered tools and features appear across the internet, in software and on mobile devices. So this distinction may not be relevant over time.
Pedagogy
Each discipline will need to determine the extent to which LLM AIs compel adaptation or alteration of their curriculum. Each faculty member will also likely need to consider the relationship between their teaching style and methods, and LLM AIs. However, here is a list of things that might spur some inspiration.
Sources
LLM AIs have been trained primarily on openly available content. This might be content on the internet, or books that are out of copyright. There may be exceptions in unpublished training aids. But much of what we assign is copyrighted content, out of necessity, since that is where specialized disciplinary knowledge is found. Writing assignments that ask students to focus on these specialized resources draw on material that generative AIs are unlikely to have.
Similarly, having students do primary research is both pedagogically sound and beyond the reach of AIs. If students must do the lab work, or labor in the archives, they acquire familiarity with the foundations of knowledge. ChatGPT itself points to "original research" as something it cannot perform or simulate.
Micro Examples
LLM AIs will not have extensive access to specific examples that illustrate larger trends. For example, asking students to read testimonies, letters, or documents that were written in the past but are not particularly famous can help them connect greater ideas to specific people or events. Aside from the issue of LLM AIs, this often generates greater interest among students. For example, having students read a letter written by a nurse during the 1918 Influenza epidemic, or a Treasury Department report about a specific corporate fraud case, can help students understand larger arguments or legal conceptions within the structure of a compelling story. Because these cases are not published on the open internet, LLM AIs may not be able to write about them with authority, and this gives students the opportunity to draw their own conclusions.
Scaffolded Work
"One-and-Done" assignments are where LLM AIs shine. If you require students to complete a project in stages, providing formative feedback at each stage, students are more likely to learn research, writing, computational, and other skills, and acquire more confidence along the way. This isn't something they can hand off to AIs.
Reflective Writing
Have students write reflections on course concepts or their learning. For example, have a student describe how they arrive at a (perhaps tentative) conclusion based on available evidence. Have a student describe how they arrived at their method for coding a program.
At the Top of Bloom's Taxonomy
Assignments that require creation or evaluation are particularly suited to humans and not AIs. Have students make arguments based on (original or primary) evidence. Or have students provide an interpretation, or assessment of quality, of a particular composition or source.
Creative Production That Isn't Text
Have students create narrated videos: documentaries, tutorials, explainers, and so on. While these could in theory be scripted by an AI, you may reasonably require composition that is more closely tied to visuals on screen, which might make AI-generated information less useful. As with all of the above, this is solid pedagogy regardless of AIs, since students are compelled to think critically about media that they are more likely to encounter than a traditional college essay.
To Do
Update Your Course Policies
To start with, mention generative AIs in your syllabus. Should students avoid them altogether, for some or all assignments? Can students use them for certain purposes? This policy may be imperfect at first, until you acquire greater familiarity with LLM AI capabilities. But it is better than nothing.
Periodically reflect on what exactly your position on AIs is from a curricular or pedagogical standpoint. Do they have no place in your classroom or activities? Could they reasonably assist students in some parts of their work, so that students can better focus their efforts on other, more important things? Would you like students to experiment with AIs, to determine for themselves a sense of what LLM AIs are or are not capable of in your discipline?
Experiment With AIs, Generally
On Your Own
Try having conversations with the LLM AIs, both on topics within your professional discipline and on just about anything else. Get a sense of how they respond.
With Your Students
When we say that AIs are likely a part of the professional future for many of our students, and we need to prepare them to work with or around AIs, what we might mean is simply making students aware of how AIs currently tend to behave, and developing in students a habit of remaining aware of trends in AI. For example, if you are teaching economics or anthropology, you might periodically prompt AIs to discuss the day's class subject, content or activity, and then discuss with students how the AI responds.
Companies like OpenAI and Google are sensitive to charges that AIs inherit bias and discrimination present in their human creators or training data. They have taken steps to prevent this, but their products remain controversial. This might be an important conversation to have with students, based on various perspectives presented in sources. However, one should be cautious about in-class or assigned experimentation with, for example, engineering prompts designed to provoke racist replies.
Another concern is that students must create accounts at OpenAI, Google, or Microsoft to experiment with these AIs. Asking students to provide these companies with personally identifiable information (PII) may be problematic. Students should be encouraged to consult and understand the terms of service, even if the experimentation is optional.
But classroom experimentation with AIs might be beneficial, to determine how forms of knowledge and ways of thinking in your discipline interact with AIs. Use a single account, and project it on the big screen in class. Work together as a class to generate or modify prompts. Students might see how AIs stumble with certain questions, or provide simulated but incorrect answers. If an AI cannot perform the kinds of analyses, creativity, or other skills you hope students learn in the course and in the process of assignments, it is good for students to see that for themselves while you are present to answer questions. In engineering prompts and discussing AI outcomes, you also have an opportunity to demonstrate ways of thinking, habits, practices, and procedures that are the substance of your course learning objectives.
Ask AIs to Do Your Assignments
If you suspect that an LLM AI might be able to complete your students' assignments for them, ask the LLM AIs to do the assignments, and see how they do. You may need to vary your prompt a bit to ensure that the AI understands exactly what is asked since, among other reasons, the AI hasn't been a student in your class for several weeks before the assignment is to be completed.
Based on the AI's performance, you can determine which assignments might need to be scrapped, which need to be altered, and which assignments prompt poor responses from the AI. If you ask the AI to regenerate those responses several times, you will likely see familiar patterns, since in response to the identical prompt, it is less likely to provide radically different answers.
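One rough way to see how much regenerated responses overlap is to paste them into a short script and compare them pairwise. The following is only a sketch in Python using the standard library's difflib module; the function name pairwise_similarity and the sample texts are hypothetical placeholders, and a character-level similarity score is a crude stand-in for a careful reading of the responses.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_similarity(responses):
    """Return (i, j, ratio) for every pair of responses.

    ratio runs from 0.0 (nothing in common) to 1.0 (identical text).
    """
    results = []
    for (i, a), (j, b) in combinations(enumerate(responses), 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        results.append((i, j, round(ratio, 2)))
    return results


if __name__ == "__main__":
    # Placeholder text only: replace with the responses you collected.
    responses = [
        "First regenerated response pasted here.",
        "First regenerated response pasted here.",
        "A third response that takes a different approach entirely.",
    ]
    for i, j, ratio in pairwise_similarity(responses):
        print(f"response {i + 1} vs response {j + 1}: {ratio}")
```

High scores across most pairs would suggest that the AI keeps returning to the same structure and phrasing for that prompt.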
Look for falsehoods. For example, if you ask it to draw details from a copyrighted book in order to make a case, you may see where the AI provides fictional, or more properly, simulated details. If you have a good grasp of the source in question, this is obvious.
Determine AI's Effectiveness
Beyond your current assignment prompts, ask the AIs to perform the types of analyses that are core skillsets for your discipline. Can it accurately perform calculations of a sort? Can it interpret types of evidence commonly used by professionals? Can it identify important elements in a given text, according to certain scholarly or professional priorities? Can it discuss relevant literature on a particular topic, or describe debates within a discipline? And with all of the above, can it provide depth, detail, or precision that you expect students to exhibit when completing assignments?
Ask the AI
As a starting point, it can be helpful to ask the AIs directly which typical learning outcomes of a discipline they cannot achieve. They might answer with good insights. For example, ChatGPT suggests several things that undergraduate students should learn to do in college classes, but that LLM AIs will not be able to do effectively:
History
Biology
Management
To generate answers similar to these, here's the prompt: What are some assignments for an undergraduate university (discipline) course that have students practice or demonstrate things LLM AIs cannot do for them?
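If you want to try the same prompt for several disciplines, a small script can fill in the placeholder for you. This is a minimal sketch in Python; the names PROMPT_TEMPLATE and build_prompt are hypothetical, the template follows the wording of the prompt above, and the script only produces text for you to paste into a chat window. It does not contact any AI service.

```python
# Hypothetical helper: fills the prompt above with a discipline name.
PROMPT_TEMPLATE = (
    "What are some assignments for an undergraduate university {discipline} "
    "course that have students practice or demonstrate things LLM AIs "
    "cannot do for them?"
)


def build_prompt(discipline: str) -> str:
    """Return the prompt with the discipline placeholder filled in."""
    return PROMPT_TEMPLATE.format(discipline=discipline.strip())


if __name__ == "__main__":
    # Example disciplines only; substitute your own.
    for discipline in ("history", "biology", "management"):
        print(build_prompt(discipline))
        print()
```

Keeping the wording fixed and changing only the discipline makes it easier to compare the AI's answers across fields.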
Like a lot of responses from LLM AIs, these suggestions are typically vague. And an LLM AI (ChatGPT, for example) may attempt to simulate the things it says it cannot do very well, if prompted by you (or a student). Therefore, the faculty member needs strong command of any disciplinary knowledge involved in the assignment, if they are to assess student work for accuracy or integrity. But these can be a good starting point for your process of thinking about assignments that are "AI proof."
Innovative Pedagogy
If you need to "AI-proof" your course, you have the opportunity to do something more. Can you make your assignments more effective as opportunities for students to practice or demonstrate the skills embodied in your learning objectives and goals?
Academic and professional disciplines across campus usually advertise some or another form of critical thinking and analysis skills as part of their learning outcomes. This is also present in our general education programs, the Core Curriculum and All-College Honors Programs. These tend to correlate with higher levels of Bloom's Taxonomy. They can also be especially challenging to assess on classroom exams, at least in something approaching a real-world scenario.
But perhaps we can develop authentic assessments that challenge students to complete tasks that LLM AIs do poorly or cannot do at all. Many of these assignments would have been especially valuable even before AIs existed.
Sources
https://cndls.georgetown.edu/ai-composition-tools/
Chat GPT Cheat Sheet: https://drive.google.com/file/d/1UOfN0iB_A0rEGYc2CbYnpIF44FupQn2I/view?usp=sharing
OpenAI's ChatGPT blog: https://openai.com/blog/chatgpt/
"Practical Responses to ChatGPT" https://www.montclair.edu/faculty-excellence/practical-responses-to-chat-gpt
Good list of links concerned with AI and pedagogy https://www.chronicle.com/newsletter/teaching/2023-03-16