Kalani Craig, Ph.D.

Fall 2024 H699 Week 13

Week 13: Images and machine-learning AI

  1. Week 13 Overview
  2. Week 13 Reading and Discussion
  3. Week 13 Lab: Machine-learning computer-vision classification

Week 13 Overview

Images, machine learning and AI

As with last week, we’ll be focusing on some of the “What’s next” in digital history. Unlike last week, we’ll be looking at fairly well-established ideas of what’s next: machine learning, in which humans give a machine a task and a dataset, and the task has built-in loops that let the machine revise some of what it’s expected to do or look for. Automatic image detection is fairly new in digital humanities, but machine learning has been around for a while, so we’ll tackle the image-analysis-as-new aspect of AI with machine learning rather than with generative AI like Bing or MidJourney.

Reading: Our independent reading looks at how image-based machine learning works from both a technical perspective (Arnold) and a historian’s cultural/analytical perspective (Tilton).

Lab: Our lab this week uses Google Colab and a computer-vision machine-learning task to classify newspaper advertisements. The computer vision here combines textual and visual analysis, so it’s a nice bridge from the largely textual focus we’ve had so far to a more image-focused analytical process. We’ll also tackle AI ethics using this machine-learning tutorial; keep an eye out for that at the end of the lab.

Ethics in AI

Go back to the Programming Historian tutorial and look at the two paragraphs on “Transfer Learning,” which explain that a machine-learning process trained on one dataset can then “transfer” the classification strategies it “learned” to the training of a new, unfamiliar dataset.
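
To make “transfer” concrete, here is a minimal, illustrative sketch of transfer learning in PyTorch/torchvision. It is my own example rather than code from the tutorial, and the two-class newspaper-ad labels are hypothetical:

import torch
import torch.nn as nn
from torchvision import models

# Start from a network whose weights were already "learned" on ImageNet (the original dataset).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained layers so their learned pattern-detectors are reused rather than retrained.
for param in model.parameters():
    param.requires_grad = False

# Swap in a new final layer sized for the new task, e.g. two hypothetical classes of newspaper ads.
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new layer's weights are updated when we train on the new, unfamiliar dataset;
# whatever the original dataset taught the model about edges, shapes, and textures rides along.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

The discussion questions below turn on exactly that reuse: whatever was in the original dataset travels into the new task.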

  1. What happens to our ability to contextualize a dataset and its training outcome when transfer learning is at play?
    1. How do we know what’s in that dataset?
    2. Do the people who contributed to the original dataset know how it’s being used to shape newly generated text and images?
  2. How does our understanding of “what a dataset was meant for” change as we practice history in the age of generative AI?
    1. That is, we use newspapers all the time. Newspaper reporters know historians exist. Did letter writers? Do we need to rethink how we use archives because we now understand the concept of transfer learning?
  3. Where is your personal line for the use of generative AI? Of non-generative but still AI-based machine-learning analysis?
    1. Consider what you know about how machine-learning approaches and transfer learning underlie the generative part of generative AI.
    2. Think about the likely datasets that contribute to generative-AI chatbot “behavior.”
    3. Consider the lawsuits that have been brought against generative-AI companies and what that tells you about where they got their data.

OPTIONAL BUT REALLY GREAT: For a long-watch that ties the technical development of LLMs to the ethics of LLM use, check out Emily Bender (a famous computational linguist) on “ChatGP-why: When, if ever, is synthetic text safe, appropriate, and desirable?”.

This talk is about Bender’s take on uses of ChatGPT and does a great job of:

  • explaining the technology behind Large Language Models (you’ll recognize distributional analysis like topic modeling and n-grams)
  • explaining why Large Language Models aren’t actually creating meaning
  • showing how this lack of actual “generation” is an issue.

I recommend that you watch this at some point soon and save the link for your own records.

Week 13 Reading and Discussion

How does computer vision work?

The key to understanding new and emerging technologies is to find an analog in what you already know. Here, we want to think about pixels.

A pixel in a photograph, stripped to its most essential characteristic, is a single block surrounded by 8 other blocks:

1 2 3
4 PX 5
6 7 8

Each pixel in a photograph has a red/green/blue numeric value that, mixed together using additive light-based color theory, displays a color that is visible to a human being. A bright green, for instance, has a Red value of 6 (out of 255), a Green value of 255, and a Blue value of 6. Because these are numbers, computer vision can compare the numbers for each pixel and its surrounding pixels one by one and use them in a giant dataset to look for patterns that are visible in numeric form to the computer and in visual, light-based-color form to the human eye.
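
If you want to see what “comparing the numbers” looks like, here is a toy sketch (my own illustration, not code from the readings) that treats the 3x3 grid above as numeric red/green/blue data:

import numpy as np

# A 3x3 patch of pixels, one (red, green, blue) triple per pixel, each value 0-255.
# The center pixel is the bright green from the example above: (6, 255, 6).
patch = np.array([
    [[255, 255, 255], [255, 255, 255], [  0,   0,   0]],
    [[255, 255, 255], [  6, 255,   6], [  0,   0,   0]],
    [[  0,   0,   0], [  0,   0,   0], [  0,   0,   0]],
], dtype=np.uint8)

center = patch[1, 1]  # the pixel marked PX in the grid above

# Compare the center pixel to each of its 8 neighbors, one number per comparison.
# Computer vision scales this kind of numeric comparison up across millions of pixels
# to find the edges, shapes, and textures a human eye sees as light and color.
for row in range(3):
    for col in range(3):
        if (row, col) != (1, 1):
            difference = np.abs(center.astype(int) - patch[row, col].astype(int)).sum()
            print((row, col), difference)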

Then read Taylor Arnold, Lauren Tilton, and Justin Wigard. “Understanding Peanuts and Schulzian Symmetry: Panel Detection, Caption Detection, and Gag Panels in 17,897 Comic Strips Through Distant Viewing.” Journal of Cultural Analytics, vol. 8, no. 3, Sept. 2023, https://doi.org/10.22148/001c.87560.

As you read, consider two overview concepts and track three related specifics in the article, using prompts similar to last week’s.

  • Details
    • What makes sense based on your existing training as a digital historian and on your existing technical skillset (which varies for each one of you)?
    • What doesn’t make sense or is worded in such technical language that you get lost?
    • Where do the analogies and simplifications the authors provide help you navigate the parts that don’t make sense?
  • Overview concepts
    • How do you become a translator for other people who are less well-trained in digital history?
      • We each have to find our own idiom when we’re describing how digital history works and helping other people who aren’t trained in any aspect of digital history encounter and assess new tools.
    • What can you learn about digital humanities more generally (not just about computer vision) from how Arnold/Tilton/Wigard explain their own computer-vision research?

Week 13 Lab: Machine-learning computer-vision classification

Lab background

This week, our lab will focus on the technical steps to classify advertisements in a newspaper using machine learning: Daniel van Strien, Kaspar Beelen, Melvin Wevers, Thomas Smits, and Katherine McDonough, “Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 1),” Programming Historian 11 (2022), https://doi.org/10.46430/phen0101.

You may or may not be able to do this as a hands-on tutorial, but you should all read it at least once.
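
For orientation before your first read, here is a rough, hypothetical sketch of the shape of the workflow the tutorial builds up with the fastai library; the folder, file names, and labels below are my own invented placeholders, and the tutorial’s Colab notebook is what you should actually run:

from fastai.vision.all import *

# Hypothetical setup: a folder of newspaper-ad images plus a CSV listing each
# file name and its label (for example, "illustration" vs. "text-only").
dls = ImageDataLoaders.from_csv(
    "ads/",              # folder containing the images (hypothetical path)
    "ads_labels.csv",    # CSV with file-name and label columns (hypothetical file)
    fn_col="name",
    label_col="label",
    valid_pct=0.2,       # hold out 20% of the images to test how well the model generalizes
)

# Start from a model pretrained on ImageNet and fine-tune it on the ads (transfer learning again).
learn = vision_learner(dls, resnet18, metrics=accuracy)
learn.fine_tune(3)

# Once trained, the model assigns a label (and a confidence score) to a new, unseen image.
label, _, probabilities = learn.predict(PILImage.create("ads/unseen_ad.png"))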

For everyone

  1. Read. Carefully. Twice. (but slightly differently, depending on whether you tackle this hands-on with Google Colab or not)
  2. Try to make as much sense of the section on “An In-Depth Guide to Computer Vision using Deep Learning” as possible.

Other resources

Giles Bergel et al. and their library of software for visual computational analysis: https://www.robots.ox.ac.uk/~vgg/software/

If you want to try using Google Colab (which works this week with minor changes)

  1. Read. Carefully. Go all the way through once before trying the hands-on version.
  2. Make sure you use the Google Colab notebook provided at the beginning of the tutorial.

There are 2 places where you’ll get errors, both of which have cells that contain this code:

%matplotlib inline  
import matplotlib.pyplot as plt  
plt.style.use('seaborn')  
Replace everything in those cells with the following and then run them (newer versions of matplotlib renamed the built-in “seaborn” style, which is why the tutorial’s original cells fail):

!pip install seaborn  
%matplotlib inline  
import matplotlib.pyplot as plt  
import seaborn as sns # Import the seaborn library  
plt.style.use('seaborn-v0_8') # set the seaborn style

If the thought of customizing Google Colab or installing a new piece of software makes you itch

  1. Read. Carefully. Twice.
  2. Pay special attention to the commands themselves, the order in which they are presented, and how the syntax works.
  3. Focus on the fact that you are learning a second language, not learning a computer-specific skill.
