We’ve previously written about the rise of artificial intelligence and the current and anticipated effects of AI on employment. (See the links to previous blog posts, below.) Two recent articles examine the effects of AI on the assessment of students and the hiring of employees.
In her recent article for NPR, “More States Opting To ‘Robo-Grade’ Student Essays By Computer,” Tovia Smith discusses how so-called “robo-graders” (i.e., computer algorithms) are increasingly being used to grade students’ essays on state standardized tests. Smith reports that Utah and Ohio currently use computers to read and grade students’ essays and that Massachusetts will soon follow suit. Peter Foltz, a research professor at the University of Colorado, Boulder, observes, “We have artificial intelligence techniques which can judge anywhere from 50 to 100 features…We’ve done a number of studies to show that the (essay) scoring can be highly accurate.” Smith also notes that Utah, which once had humans review students’ essays after they had been graded by a machine, now relies on the machines almost exclusively. Cyndee Carter, assessment development coordinator for the Utah State Board of Education, reports, “…the state began very cautiously, at first making sure every machine-graded essay was also read by a real person. But…the computer scoring has proven ‘spot-on’ and Utah now lets machines be the sole judge of the vast majority of essays.”
Despite support for “robo-graders,” automated essay assessment has its critics. Smith details how one critic, Les Perelman at MIT, has created an essay-generating program, the BABEL generator, that produces nonsense essays designed to trick the algorithmic “robo-graders” for the Graduate Record Exam (GRE). When Perelman submits a nonsense essay to the GRE computer, the algorithm gives the essay a near-perfect score. “It makes absolutely no sense,” Perelman observes, shaking his head. “There is no meaning. It’s not real writing. It’s so scary that it works….Machines are very brilliant for certain things and very stupid on other things. This is a case where the machines are very, very stupid.”
Critics of “robo-graders” also worry that students might learn how to game the system, that is, give the algorithms exactly what they are looking for, and thereby receive undeservedly high scores. Utah’s Cyndee Carter describes instances of students gaming the state test: “…Students have figured out that they could do well writing one really good paragraph and just copying that four times to make a five-paragraph essay that scores well. Others have pulled one over on the computer by padding their essays with long quotes from the text they’re supposed to analyze, or from the question they’re supposed to answer.”
Despite these shortcomings, developers continue to refine the grading algorithms, and it’s anticipated that more states will soon use them to read and grade student essays.
Grading student essays is not the end of computer assessment. Once you’ve left school and started looking for a job, you may find that your resume is read not by an employer eager to hire a new employee, but by an algorithm whose job it is to screen for appropriate job applicants. In the brief article, “How Algorithms May Decide Your Career: Getting a job means getting past the computer,” The Economist reports that most large firms now use computer programs, or algorithms, to screen candidates seeking junior jobs. Applicant Tracking Systems (ATS) can reject up to 75% of candidates, so it is increasingly important for applicants to send resumes filled with keywords that will pique the screening computers’ interest.
Once your resume passes the initial screening, some companies use computer-driven video interviews to further screen and select candidates. “Many companies, including Vodafone and Intel, use a video-interview service called HireVue. Candidates are quizzed while an artificial-intelligence (AI) program analyses their facial expressions (maintaining eye contact with the camera is advisable) and language patterns (sounding confident is the trick). People who wave their arms about or slouch in their seat are likely to fail. Only if they pass that test will the applicants meet some humans.”
Although one might think that computer-driven screening systems would avoid some of the biases of traditional recruitment processes, it seems that AI isn’t bias-free, and that algorithms may favor applicants who have the time and money to continually retool their resumes so that they contain the code words that employers are looking for. (This is similar to gaming the system, described above.) “There may also be an ‘arms race’ as candidates learn how to adjust their CVs to pass the initial AI test, and algorithms adapt to screen out more candidates.”
“More States Opting To ‘Robo-Grade’ Student Essays By Computer,” Tovia Smith, NPR, June 30, 2018
“How Algorithms May Decide Your Career: Getting a job means getting past the computer,” The Economist, June 21, 2018
“Welcoming our New Robotic Overlords,” Sheelah Kolhatkar, The New Yorker, October 23, 2017
“AI, Robotics, and the Future of Jobs,” Pew Research Center
“Artificial intelligence and employment,” Global Business Outlook
Asking questions is a critical aspect of learning. We’ve previously written about the importance of questions in our blog post “Evaluation Research Interviews: Just Like Good Conversations.” In a recent article, “The Surprising Power of Questions,” which appears in the Harvard Business Review, May–June 2018, authors Alison Wood Brooks and Leslie K. John offer suggestions for asking better questions.
As Brooks and John report, we often don’t ask enough questions during our conversations. Too often we talk rather than listen. Brooks and John, however, note that recent research shows that by asking good questions and genuinely listening to the answers, we are more likely to achieve both genuine information exchange and effective self-presentation. “Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding.”
Although asking more questions in our conversations is important, the authors show that asking follow-up questions is critical. Follow-up questions “…signal to your conversation partner that you are listening, care, and want to know more. People interacting with a partner who asks lots of follow-up questions tend to feel respected and heard.”
Another critical component of question-asking is to be sure that we ask open-ended questions, not simply categorical (yes/no) questions. “Open-ended questions…can be particularly useful in uncovering information or learning something new. Indeed, they are wellsprings of innovation—which is often the result of finding the hidden, unexpected answer that no one has thought of before.”
Asking effective questions depends, of course, on the purpose and context of the conversation. That said, it is vital to ask questions in an appropriate sequence. Counterintuitively, asking tougher questions first and leaving easier questions until later “…can make your conversational partner more willing to open up.” On the other hand, asking tough questions too early in the conversation can seem intrusive and sometimes offensive. If the ultimate goal of the conversation is to build a strong relationship with your interlocutor, especially with someone you don’t know, or don’t know well, it may be better to open with less sensitive questions and escalate slowly. Tone and attitude are also important: “People are more forthcoming when you ask questions in a casual way, rather than in a buttoned-up, official tone.”
While question-asking is a necessary component of learning, the authors remind us that “The wellspring of all questions is wonder and curiosity and a capacity for delight. We pose and respond to queries in the belief that the magic of a conversation will produce a whole that is greater than the sum of its parts. Sustained personal engagement and motivation—in our lives as well as our work—require that we are always mindful of the transformative joy of asking and answering questions.”
“The Surprising Power of Questions,” Alison Wood Brooks and Leslie K. John, Harvard Business Review, May–June 2018 (pp. 60–67)
In a recent article in the May 2, 2018 Harvard Business Review, “Learning Is a Learned Behavior. Here’s How to Get Better at It,” Ulrich Boser rejects the idea that our capacities for learning are innate and immutable. He argues, instead, that a growing body of research shows that learners are not born, but made. Boser says that we can all get better at learning how to learn, and that improving our knowledge-acquisition skills is a matter of practicing some basic strategies.
Learning how to learn is a matter of:
- setting clear and achievable targets about what we want to learn
- developing our metacognition skills (“metacognition” is a fancy way to say thinking about thinking) so that as we learn, we ask ourselves questions like, Could I explain this to a friend? Do I need to get more background knowledge? etc.
- reflecting on what we are learning by taking time to “step away” from our deliberate learning activities so that during periods of calm and even mind-wandering, new insights emerge
Boser says that research shows we’re more committed if we develop a learning plan with clear objectives, and that periodic reflection on the skills and concepts we’re trying to master, i.e., utilizing metacognition, makes each of us a better learner.
In a recent article, “Against Metrics: How Measuring Performance by Numbers Backfires,” Jerry Z. Muller argues that companies, educational institutions, government agencies, and philanthropies are now in the grip of what he calls “metric fixation”: “…the belief that it is possible – and desirable – to replace professional judgment (acquired through personal experience and talent) with numerical indicators of comparative performance based upon standardized data (metrics).”
In this brief and important article, Muller critiques the growing phenomenon of paying employees for performance. He points out that such schemes often narrow the measure of what is desirable for the organization, lead members of an organization to “game the system,” undermine organizations’ ability to think more broadly about their purposes, and, most importantly, impede innovation.
Looking at the unintended outcomes of metric fixation, he writes:
“When reward is tied to measured performance, metric fixation invites just this sort of gaming. But metric fixation also leads to a variety of more subtle unintended negative consequences. These include goal displacement, which comes in many varieties: when performance is judged by a few measures, and the stakes are high (keeping one’s job, getting a pay rise or raising the stock price at the time that stock options are vested), people focus on satisfying those measures – often at the expense of other, more important organizational goals that are not measured. The best-known example is ‘teaching to the test’, a widespread phenomenon that has distorted primary and secondary education in the United States since the adoption of the No Child Left Behind Act of 2001.”
Pay-for-performance schemes, however, are not alone in eliciting a narrowing of goals and a tendency to game the system. Metric fixation (or what I term the “tyranny of measurement”) can be a risk for a range of non-profit organizations and educational institutions that often feel that demands for accountability can be addressed by merely counting the number of participants who receive services, or the number of students who score well on reading tests. While it is important to have clear goals, and to be able to indicate whether these goals are met, organizations, in their rush to address demands from funders and other stakeholders for accountability, must be careful not to reduce their goals, indeed their organizations’ vision, to only a few countable variables. “What can and does get measured is not always worth measuring, may not be what we really want to know, and may draw effort away from the things we care about” (Muller). As a saying often attributed to Albert Einstein puts it, “Not everything that counts can be counted, and not everything that can be counted, counts.”
“Against Metrics: How Measuring Performance by Numbers Backfires,” Jerry Z. Muller, Aeon, April 24, 2018
The Tyranny of Metrics, Jerry Z. Muller, Princeton University Press, 2018
Brad recently presented an introductory seminar on program evaluation at Nonprofit Net, a Lexington, Massachusetts-based organization. Nonprofit Net is a forum for Massachusetts nonprofit leaders and nonprofit consultants which offers seminars on topics of importance to the nonprofit community. Brad’s presentation provided an introduction to program evaluation, outlined the benefits of evaluating outcomes in order to demonstrate programs’ achievements and challenges, introduced the use of logic models, and reviewed the key questions that nonprofit leaders should consider as they approach a program evaluation.
The seminar was based on the materials in the following white papers, which you can download for free: