The Foreign Service Journal, May-June 2026

We keep millions of records of prior visa applications that could be sliced by nationality, profession, residence, age, gender, and scores of other differentiators. Even better, we know prior outcomes of those applications in the form of validation studies. If a 25-year-old Indian entry-level employee from Chennai wanted to go to a trade show in Las Vegas, we could compare that application with prior issuances to see if people with that profile historically traveled well. Sounds straightforward and easy enough.

Further, AI is already widely used in legal contexts. A 2025 Thomson Reuters survey of legal professionals showed that 80 percent of respondents said AI would have a high or transformational impact on their work. Among users, most relied on it for document review, legal research, and opinion summaries. In the public sector, Estonia is using so-called “robo-judges” to dismiss simple cases based on procedural errors. Meanwhile, Chinese courts are using robo-judges for small claims, misdemeanors, and traffic violations.

But a 2025 University of Chicago study argues that while black-and-white or low-impact decisions are easier use cases, those involving human intent and experience are beyond AI’s capability. The study took real war crimes cases from the International Criminal Tribunal for the former Yugoslavia and let an AI model (in this case, GPT-4) read transcripts and make decisions. The AI’s decisions followed legal precedent, but a defendant’s character and explanations did little to move the needle. Even when it was instructed to consider sympathy—admittedly an odd request for a computer—GPT-4 dismissed human explanations as legally irrelevant. The authors said the AI’s decisions aligned with those reached by law students but missed the nuance applied by seasoned judges.
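The profile-comparison idea described earlier—matching an applicant against prior issuances with the same profile—can be sketched in a few lines of code. This is a purely hypothetical illustration: the field names, categories, and issuance figures are all made up, and no real consular system or dataset is implied.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Profile:
    nationality: str
    age_band: str     # e.g. "18-29" (hypothetical bucketing)
    profession: str
    purpose: str      # e.g. "trade show"

# Hypothetical history: profile -> (visas issued, known bad outcomes)
HISTORY = {
    Profile("India", "18-29", "entry-level", "trade show"): (1200, 90),
    Profile("India", "30-44", "manager", "trade show"): (800, 20),
}

def historical_travel_rate(profile):
    """Share of prior issuances with this profile that traveled well."""
    record = HISTORY.get(profile)
    if record is None:
        return None  # no comparable history: a judgment call, not a lookup
    issued, bad = record
    return (issued - bad) / issued

applicant = Profile("India", "18-29", "entry-level", "trade show")
rate = historical_travel_rate(applicant)  # 0.925 on this made-up data
```

Even this toy version exposes the catch discussed below: the "rate" it produces is only as trustworthy as the historical data behind it.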
Rather than confirming the popular belief that AI “thinks” or “reasons,” the study showed that large language models like ChatGPT simply predict the most appropriate word to follow the last one, based entirely on the training data fed into their algorithms.

Apart from the inability to replicate human reasoning, there is the concern of bias. A study from the International Journal for Court Administration noted that historical training data carries the biases of that data, and it is impossible to divorce the bias in the outcomes from the outcomes themselves. As the old programming adage goes: garbage in, garbage out. The process becomes a “vicious circle, since many machine learning approaches are creating their own algorithms based on the datasets in which they are trying to identify and recreate patterns.” They take the pattern, regardless of its desirability, as the norm. In one example cited in the paper, Amazon had to scrap a hiring tool because it downgraded women applicants, having learned from historical data in which few women held executive roles at the company.

What AI Has to Say for Itself

But that’s the opinion of academics. What do AI models “think” about it themselves? To find out, I submitted the same two-part prompt to three of the most popular chatbots: OpenAI’s ChatGPT, Anthropic’s Claude, and xAI’s Grok. The first part asked each model to explain the visa interview process generally, to confirm its understanding of the question; the second part asked whether AI could replace the role of the consular officer.

ChatGPT’s answer was careful throughout, concluding that AI could not do the job fully or safely, at least not with near-term technology: “A machine cannot be held legally accountable. A consular officer can.” It noted that AI would struggle to assess honesty, could not eliminate bias and systemic discrimination, and could not handle “outlier” cases such as complex life situations. Luckily, no visa applicant has a complex life situation!
Claude said AI could possibly identify “very low-risk individuals,” without explaining what would make a person “low-risk,” and flag risk factors. But it was the most cautious of the bunch, stating that AI might be able to reduce variation in outcomes but will always lack human judgment about credibility.

Grok is well known for its odd “tweaks,” such as a recent one in which it earnestly claimed that its owner, Elon Musk, would dominate every Major League Baseball hitter except, maybe, L.A. Dodgers phenom Shohei Ohtani. Grok was similarly confident about consular work, stating that AI could soon easily take over visa decisions. It said that consular sections already use AI to flag fake documents and to cross-check all nonimmigrant visa (NIV) application data against every public record in the world in seconds during the visa interview.
