E IV Studio Spring 2020
January 14, 2020 — February 27, 2020
Background: The Turing Test was developed by Alan Turing in 1950, and is used to analyze a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. The Loebner Prize is an annual competition in artificial intelligence that awards prizes to the computer programs judged to be the most human-like.
Project: Computers are advancing, but some argue that humans’ ability to communicate is regressing. We wanted to create an experience to visualize and reflect on the human-ness of conversations today. The process is as follows: there are two human participants (one judge and one competitor) as well as one bot. The human competitor and the bot compete to prove their “human-ness”, and the judge infers, to the best of their ability, which of the two responses came from the human. As the test progresses, a 3D print is generated in real time.
The end product is a conversation piece and a tangible artifact of the experience. At the end of the experience, we hope to leave users with a few takeaway questions:
Will this change the way you approach conversations in the future? How do you think human conversation will change in the future? Where does computer conversation fit within this?
Week 1: Research and Exploration
First week of classes. Defining and understanding the project theme: Autographic visualizations.
Autographic visualization is a speculative counter-model to data visualization based on the premise that data are something material rather than something abstract and symbolic.
We have looked at some examples of things that seem to create a record or trace of something that’s happened to them — how they are used by people, or how something in the surrounding environment affects them. Sometimes this is accidental or incidental, or solely a side-effect of a property of the materials.
Week 2 (1/21)
Assigned the project brief:
“Create a physical or digital (or hybrid) prototype, which can either be placed/used in a particular environment, or is itself a designed environment.”
Both individually and within groups in class, we discussed a breadth of possible ideas and concepts that we were interested in:
- respiration systems / breathing / awareness of breathing rate and intensity
- acceptance (in physical, mental, etc. contexts)
- conversation (dynamics of conversation, intensity, words spoken, communication across humans and across robots)
- microbes (natural and technical)
- story-telling, throughout generations, cultures, etc.
- sound waves
- digestive systems
The first set of ideas I explored spanned a variety of themes and mediums, approaching the challenge at times phenomenon first, and at other times medium / execution first.
Week 3 (1/28)
From this, Davis and I further explored the ideas of conversation and autonomy. We were particularly inspired by the Turing test — “developed by Alan Turing in 1950, is a test of a machine’s ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human.” (Wikipedia). When reading more about this test, we came across the Loebner Prize, the “most human computer”, and the “most human human”.
“The Loebner Prize is an annual competition in artificial intelligence that awards prizes to the computer programs considered by the judges to be the most human-like…the format of the competition is that of standard Turing Test” (Wikipedia)
We found the One Word Turing Test particularly interesting: the prompt of this exercise was to convince a human judge that you were the human (against a computer), given only one word to do so. The results graph showed that “poop” was the most successful word in convincing the human judge that the competitor was human.
Computers are advancing, but some people argue that humans’ ability to communicate is regressing. Through text messages, emojis, GIFs, video chats, and other digital mediums, the ways in which humans communicate have shifted greatly, in many ways changing how we interact with each other. In previous years, the idea that a human could be fooled into thinking a computer was a human was out of the question. Now, especially in the context of one word, we see that humans are genuinely finding it harder to tell whether a robot is a robot and a human is a human.
We thought that one interesting approach with this would be to, in some form, visualize this phenomenon of human-computer conversation. For example, using 3-d prints to translate real-time information about the conversation.
Stemming from this, Davis and I broke down the experience into two main parts — the 3-d print experience, and the overall test experience.
Davis started off playing with Gcode:
And I began trying out different methods of testing:
Week 4 (2/4)
To frame and outline the final manifestation of our project, Davis and I diagrammed out our system at large. Overall, the system will include:
- one human judge
- one human competitor
- one bot competitor
- a chat interface (housed on two monitors)
- a 3-d printed piece
The process is as follows:
- Two human participants (one judge, one competitor), and one bot enter an online chat.
- The human and the bot compete in proving their “human-ness” to the human judge through a series of questions (one word, long, TBA, etc.)
- The human judge then infers which is human, to the best of their ability.
As the test progresses, a 3D print is generated in real time. The end product is a conversation piece and a tangible artifact of the experience.
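The round structure described above can be sketched in code. This is only a minimal illustration, not our actual Processing implementation: the names `Round` and `present`, and the slot labels "A" and "B", are hypothetical. The idea is that the human's and bot's answers are shown to the judge in a random order, and the judge's guess is then compared against where the human's answer really was.

```python
import random
from dataclasses import dataclass


@dataclass
class Round:
    """One question in the test (hypothetical structure for illustration)."""
    question: str
    human_answer: str
    bot_answer: str
    human_slot: str        # "A" or "B": where the human's answer was shown
    judge_guess: str = ""  # filled in once the judge picks a slot

    @property
    def judge_correct(self) -> bool:
        # The judge is correct when they pick the slot holding the human's answer.
        return self.judge_guess == self.human_slot


def present(question, human_answer, bot_answer, rng):
    """Randomly assign the two answers to slots A and B for the judge."""
    human_slot = rng.choice(["A", "B"])
    bot_slot = "B" if human_slot == "A" else "A"
    shown = {human_slot: human_answer, bot_slot: bot_answer}
    return Round(question, human_answer, bot_answer, human_slot), shown


if __name__ == "__main__":
    rng = random.Random()
    round_, shown = present("Describe where the cup is.",
                            "on the table next to the plant",
                            "on the table", rng)
    print("A:", shown["A"])
    print("B:", shown["B"])
    round_.judge_guess = "A"  # in the real test this comes from the judge
    print("Judge correct?", round_.judge_correct)
```

Each round's `judge_correct` value is the kind of signal that could then drive the print.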
A few questions for reflection that we hope visitors take away from the experience:
- Will this change the way you approach conversations in the future?
- How do you think human conversation will change in the future?
- Where does computer conversation fit within all this?
Currently, Davis has managed to manipulate the prints to look as such:
And on the front end, I have managed to find ways for windows to communicate with each other (yay! basic framework together?)
Week 5 (2/11)
Mid-point check-in: taking a step back to the basic context, we explored the logistics of the test in greater depth.
A couple of questions to consider, raised from Dan and the class:
- What are the pros and cons of being able to glance at the final artifact and immediately see how “many” times there was an error?
- In what way should someone be able to see a mistake and reflect upon what exactly was said?
- Stepping back and conducting an in-person “one-word” survey amongst humans (human vs ai)
- To what extent does error, and certain words, bias people into detecting a human vs a robot?
- How can you interpret the different pauses in human judges, to make the final artifact less of a binary product?
Week 6 (2/18)
A/B User Testing for One Word
From this user testing, we found quite interesting results with regard to the varying types of answers. For example, the most notable differences for each question were as follows:
1. What did you do today?
- “I was bawling my eyes out because I just found out my best friend got into a car accident right when I got out of the shower this morning… and like… I don’t know what to do or think. Everything just hurts.”
- “it’s my personal information”
- “SSS, commuted to work, and now working”
- “I woke up and ate breakfast”
- “ go to class”
- “went to classes, had team meetings”
2. Describe where the cup is.
- “so if you imagine yourself sitting at a teeny tiny desk, there is a cute plant by your right… A laptop and book in front of you. And a cup at the reach of your right hand. Convenient if you’re a righty. Inconvenient if you’re not.”
- “on the table”
- “on the table in front of the plant”
- “on the table next to the plant”
- “it’s on the table”
- “on the right side of the table next to the plant”
3. Give one word to convey you are human.
- “screw you”
Analysis from One-Word Tests
From this, it was interesting to see the lengths that some people went to in rewiring either the way they talked, or the information they gave, when aware that they were competing with AI. For example, in the case of the “bawling my eyes out” response about a friend’s car accident, there was an extreme push to hammer in the aspect of “emotion”.
Another very common pattern for the test group with regards to the one word test was a gravitation towards juvenile words, or a mish-mash of words in order to pass off as one “word”.
Week 7 (2/25)
This week, Davis and I continued working on our respective aspects of the project: Davis on implementing real-time G-code, and me on finalizing the skeleton framework in Processing to house the test.
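To give a sense of what generating G-code from the test could look like, here is a rough sketch. It is not Davis's actual G-code or mapping: the function `layer_gcode`, the idea of widening a layer when the judge guesses wrong, and all dimensions and feed rates are assumptions for illustration. It also assumes the printer has been put into relative extrusion mode (`M83`) beforehand, and that the bed center is at X100 Y100.

```python
import math


def layer_gcode(layer_index, judge_correct, center=(100.0, 100.0),
                base_radius=20.0, bump=5.0, layer_height=0.3,
                segments=24, extrude_per_mm=0.05, feed=1200):
    """Return G-code lines for one ring-shaped layer (hypothetical mapping).

    A correct guess prints a ring at base_radius; a wrong guess prints a
    wider ring, so errors show up as bulges in the finished piece.
    """
    cx, cy = center
    radius = base_radius if judge_correct else base_radius + bump
    z = (layer_index + 1) * layer_height

    lines = [f"; layer {layer_index} judge_correct={judge_correct}"]
    # Travel (no extrusion) to the start of the ring at this layer's height.
    lines.append(f"G0 X{cx + radius:.2f} Y{cy:.2f} Z{z:.2f}")

    # Length of each straight segment approximating the circle.
    chord = 2 * radius * math.sin(math.pi / segments)
    for i in range(1, segments + 1):
        angle = 2 * math.pi * i / segments
        x = cx + radius * math.cos(angle)
        y = cy + radius * math.sin(angle)
        # Relative extrusion amount for this segment (assumes M83 was sent).
        lines.append(f"G1 X{x:.2f} Y{y:.2f} E{chord * extrude_per_mm:.4f} F{feed}")
    return lines


if __name__ == "__main__":
    # One layer per round: True = judge guessed right, False = judge was fooled.
    for index, correct in enumerate([True, False, True]):
        print("\n".join(layer_gcode(index, correct)))
```

In a real-time setup, each layer's lines would be sent to the printer as soon as the judge answers, rather than printed to the console.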
We also came together during this week to finalize the spatial setup of the project. The possible solutions were as follows:
- Human competitor can see the 3D print, but the human judge cannot. Neither human competitor nor human judge can see each other. Users sit back to back, with the 3D print machine sitting in front of the human competitor.
- Both human competitor and human judge can see the 3D print, but not each other. Human competitor and human judge sit in an “L” shape, perpendicular to each other, with the 3D print machine at the corner of this shape.
- Both human competitor and human judge can see the 3D print, but not each other. Human competitor and human judge sit in a line, parallel to each other, with the 3D print machine in between the two users. Judge and human competitor turn their heads right or left to see the machine, and users do not see each other (with the help of one-sided mirror film).
Environmental Set Up
As part of our environmental setup, it was important to us to have an intentional, designated space for both the judge and the human competitor. We needed to maintain privacy between the two while still allowing both to clearly see the 3D printer. To do this, we sketched out several desk setup structures, and Davis constructed one of them in the end:
Week 8 (3/03)
We had a couple of different thoughts and tried two approaches to introducing our project. Ideally, we wanted users to understand the process at the end of the print. However, when we introduced the project scope at the end, we ran the risk of users being bored and not paying attention to the print, as they did not understand what was happening in the moment (thus ruining the real-time aspect of the experience).
Overall, this project was extremely rewarding, thought-provoking, and fun to tackle as we learned new hard skills along the way. There are several elements we would love to improve upon, and we expect more insights as we continue to reflect on our experience with this project.