🔴 Viewpoint: The Delegation Threshold
When AI Stopped Waiting for Instructions
I noticed the shift while preparing English language materials for a cybersecurity module. I had been experimenting with AI to generate contextually appropriate exercises—nothing extraordinary, just seeking efficiency in the tedious work of constructing scaffolded learning activities. I had grown accustomed to the iterative process: draft a prompt, review the output, refine my instructions, generate again. Each cycle required me to specify exactly what I wanted—vocabulary level, sentence complexity, the particular grammatical structures to emphasise.
But sometime in late 2024, the process changed. I found myself typing broader instructions: "Create a B1-level lesson on network security that builds toward discussing ethical hacking." What came back was not merely text matching my specifications—it was a complete pedagogical sequence. The system had made decisions about pacing, selected supporting examples, designed assessment questions, even suggested discussion prompts that might generate productive debate. I had not asked for these elements explicitly. The system had inferred them from my stated objective and constructed a path toward it.
The distinction seemed subtle at first. Yet as I encountered similar patterns across other domains throughout 2025—reading about AI systems that managed supply chains, coordinated medical trials, handled customer retention—I began to recognise a common thread. The discourse had shifted. We were no longer talking about AI as sophisticated pattern-matching, reflecting our prompts back to us with elaboration. We were talking about AI as systems with something approximating agency—tools that do not wait for step-by-step instructions but instead pursue goals we articulate in broad strokes, determining their own paths toward completion.
This, I have come to believe, represents the most significant AI advancement of 2025—not because of any particular technical breakthrough, though the underlying improvements in reasoning and planning are substantial, but because it marks a philosophical threshold. Agentic AI does not simply amplify human capability; it displaces human decision-making. And in that displacement lies a transformation so fundamental that we are only beginning to grasp its implications. The question is no longer merely what AI can do, but what we are willing to let it decide—and what forms of human judgment, relationship, and meaning we risk sacrificing in the process.
The Nature of Agentic AI
To understand what makes 2025's agentic AI genuinely novel, we must first acknowledge what it is not. It is not the first time machines have acted without immediate human supervision. Industrial automation has long operated according to programmed rules, executing predefined sequences with mechanical precision. Even earlier generations of AI could optimise certain parameters within bounded domains—adjusting manufacturing tolerances, routing delivery trucks, recommending products. These systems were sophisticated, certainly, but they were fundamentally reactive. They responded to conditions according to logic we had inscribed into their operation.
What distinguishes the agentic systems emerging in 2025 is their capacity for what Maryam Ashoori, Director of Product Management at IBM watsonx.ai, describes as "intelligent entities with reasoning and planning capabilities that can autonomously take action" (Belcic & Stryker, 2025). The technical specifics involve advances in large language models, enhanced reasoning architectures, and sophisticated tool-use frameworks that allow AI systems to chain together multiple operations. Foundational research on chain-of-thought prompting demonstrated that language models exhibit emergent reasoning capabilities when prompted to articulate intermediate steps (Wei et al., 2022), while subsequent work on ReAct (Yao et al., 2023a) and Tree of Thoughts (Yao et al., 2023b) established architectures enabling more deliberate, strategic problem-solving. These technical developments have converged to enable systems that can engage in what Lilian Weng (2023) characterises as "task decomposition, self-reflection, and refinement"—the hallmarks of autonomous operation.
But the functional result is more philosophically interesting than the technical mechanisms: these systems can interpret a high-level objective, decompose it into constituent tasks, select and sequence appropriate actions, monitor their own progress, and adapt their approach based on outcomes—all without returning to us for approval at each decision point. Wang et al. (2024), in their comprehensive survey of large language model-based autonomous agents, propose a unified framework comprising four essential modules: profile, memory, planning, and action. This architecture enables what they term "autonomous operation" across complex, multi-step tasks. The planning module, drawing on techniques ranging from chain-of-thought prompting to more sophisticated tree-search algorithms, allows these systems to construct and execute elaborate action sequences with minimal human intervention.
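To make this architecture tangible, the following is a minimal, framework-agnostic sketch in Python of the profile, memory, planning, and action loop that Wang et al. describe. The `call_llm` stub, the `Agent` class, and the example goal are illustrative assumptions of my own; they do not reproduce the code of any system cited here, and a real agent would replace the stub with an actual model call and genuine tool execution.

```python
from dataclasses import dataclass, field


def call_llm(prompt: str) -> str:
    """Stubbed model call. A deployed agent would query a language model here;
    the canned reply below simply lets the sketch run end to end."""
    return ("1. introduce core network-security vocabulary\n"
            "2. sequence a scaffolded reading on firewalls and phishing\n"
            "3. draft discussion prompts building toward ethical hacking")


@dataclass
class Agent:
    profile: str                                       # role and constraints the agent operates under
    memory: list[str] = field(default_factory=list)    # running record of actions and observations

    def plan(self, goal: str) -> list[str]:
        """Planning module: decompose a high-level goal into ordered sub-tasks."""
        prompt = f"{self.profile}\nGoal: {goal}\nPrior context: {self.memory}\nList the steps."
        return [line.strip() for line in call_llm(prompt).splitlines() if line.strip()]

    def act(self, step: str) -> str:
        """Action module: carry out one sub-task (trivially simulated) and record the outcome."""
        observation = f"completed: {step}"
        self.memory.append(observation)                # memory module: retain outcomes for later planning
        return observation

    def run(self, goal: str) -> list[str]:
        """Interpret the goal, plan, and act without returning to the user at each decision point."""
        return [self.act(step) for step in self.plan(goal)]


if __name__ == "__main__":
    agent = Agent(profile="You are a lesson-design agent for an English language teacher.")
    for outcome in agent.run("Create a B1-level lesson on network security"):
        print(outcome)
```

The point of the sketch is the control flow rather than the stub: once the goal is handed over, planning, execution, and memory updates proceed without further human input.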
Consider the difference between asking an AI to "draft a response to this customer complaint" versus telling an agent "reduce our customer churn rate by 15 per cent." The former requires the AI to generate text according to patterns it has learned. The latter requires something that looks remarkably like strategic thinking: analysing customer data to identify at-risk segments, designing retention interventions, determining which channels to deploy them through, executing those campaigns, measuring their effectiveness, and iterating based on results. The system does not merely follow our instructions; it formulates its own action plan in service of our stated goal. Multi-agent frameworks like MetaGPT (Hong et al., 2024) and AutoGen (Wu et al., 2024) extend this capability further, enabling multiple AI agents to collaborate on complex tasks through specialised roles—product managers, architects, engineers—coordinating their activities toward shared objectives.
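The multi-agent pattern extends the same loop across specialised roles. The sketch below, again with a stubbed model call, shows a sequential hand-off of artefacts between hypothetical roles; the class and function names are my own illustrations under those assumptions and do not correspond to the actual MetaGPT or AutoGen APIs.

```python
def call_llm(prompt: str) -> str:
    """Stubbed model call; a real pipeline would invoke a language model here."""
    return f"[artefact produced from: {prompt.splitlines()[-1][:60]}]"


class RoleAgent:
    """One specialised agent in a pipeline of collaborating roles."""

    def __init__(self, role: str, instruction: str) -> None:
        self.role = role
        self.instruction = instruction

    def handle(self, artefact: str) -> str:
        """Transform the upstream artefact according to this role's instruction."""
        return call_llm(f"As the {self.role}, {self.instruction}\n{artefact}")


# A hypothetical pipeline: each role consumes the previous role's output.
pipeline = [
    RoleAgent("product manager", "turn the objective into a requirements document."),
    RoleAgent("architect", "turn the requirements into a system design."),
    RoleAgent("engineer", "turn the design into an implementation plan."),
]

artefact = "Objective: reduce customer churn by 15 per cent."
for agent in pipeline:
    artefact = agent.handle(artefact)   # sequential hand-off toward the shared objective
    print(f"{agent.role}: {artefact}")
```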
This shift from instruction-following to goal-pursuit represents what Gartner (2025a) identifies as one of the fastest-advancing technologies on its 2025 Hype Cycle for Artificial Intelligence, noting that "AI agents and AI-ready data are the two fastest advancing technologies... experiencing heightened interest this year, accompanied by ambitious projections and speculative promises, placing them at the Peak of Inflated Expectations." Industry projections suggest that by 2028, at least 15 per cent of day-to-day work decisions will be made autonomously by agentic AI, up from effectively zero in 2024 (Gartner, 2024). These are not trivial administrative tasks but substantive operational choices: which vendors to contract with, how to allocate marketing budgets, when to adjust production schedules. We are, in short, beginning to delegate not just labour but judgment.
Yet scepticism remains warranted. As Marina Danilevsky of IBM Research observes: "I'm still struggling to truly believe that this is all that different from just orchestration. You've renamed orchestration, but now it's called agents" (Belcic & Stryker, 2025). This tension—between revolutionary capability and sophisticated rebranding—runs through the discourse on agentic AI, and honest engagement with this technology requires holding both possibilities simultaneously. The gap between marketing claims and operational reality often proves substantial; Gartner itself projects that over 40 per cent of agentic AI projects will be cancelled by the end of 2027 due to inflated expectations meeting implementation realities (Gartner, 2025a).
The Gen AI Paradox
There is an irony embedded in this development that bears examination. For the past two years, organisations have been deploying generative AI at a remarkable pace—by some estimates, nearly 80 per cent of companies have integrated it into their operations in some form. Yet the same proportion reports no material impact on their bottom line. McKinsey terms this the "gen AI paradox," noting that "nearly eight in ten companies report using gen AI—yet just as many report no significant bottom-line impact" and "fewer than 10 percent of use cases deployed ever make it past the pilot stage" (Sukharevsky et al., 2025). This pattern echoes historical technology adoption cycles, where initial enthusiasm often outpaces practical implementation, creating what researchers at the Harvard Kennedy School's Ash Center describe as a "hype bubble" that must eventually deflate before realistic value can be extracted (Widder & Hicks, 2024).
Agentic AI emerges as a proposed solution to this paradox precisely because it promises to move beyond what these researchers call "horizontal" applications—enterprise-wide copilots and chatbots that assist employees across many functions but generate value that proves difficult to quantify. Instead, agentic systems target "vertical," function-specific use cases where they can autonomously execute complete business processes. The economic logic is compelling: if an AI agent can handle customer onboarding from qualification through activation without human intervention, the productivity gains become measurable, the cost savings concrete. The shift represents a movement from AI as assistant to AI as autonomous operator—a distinction with profound implications for how work is organised and who controls its execution.
Early case studies suggest this promise is not merely theoretical. At 1-800-Accountant, autonomous agents reportedly resolved 70 per cent of customer chats during peak tax season without human intervention. Grupo Globo documented a 22 per cent increase in customer retention using autonomous engagement systems. OI Infusion Services reduced prior authorisation processing from approximately 30 days to 3 days through agentic automation (Sukharevsky et al., 2025). In healthcare, UC San Diego Health deployed AI-generated message drafts that clinicians preferred 78.6 per cent of the time for their clarity, empathy, and completeness. These are substantive operational improvements that translate directly to business outcomes.
But I find myself deeply ambivalent about the terms in which this promise is articulated. The language surrounding agentic AI is saturated with metaphors of liberation and optimisation. Workers will be "freed" from repetitive tasks to focus on "higher-value" activities. Businesses will achieve unprecedented "agility" and "responsiveness." Decision-making will be "accelerated" by orders of magnitude. All of this may be true. Yet the rhetoric systematically elides the more troubling question lurking beneath: what happens to human agency when we construct systems designed to replace human decision-making? The liberation narrative assumes that the tasks being automated are unambiguously low-value—but value is not an objective property of tasks; it depends on perspective, context, and whose interests are centred.
I keep returning to a phrase that appeared in reporting on Moderna's organisational restructuring, describing how the pharmaceutical company merged its HR and IT leadership because "AI is no longer a tool, it's a colleague" (Dupont-Calbo, 2025). The intention is clearly positive—to signal AI's elevated status in organisational life. But the metaphor unsettles me. Colleagues collaborate; they negotiate; they disagree and persuade and occasionally overrule one another. They operate within a shared understanding of mutual respect and accountability. An agentic AI system, no matter how sophisticated its reasoning, does not participate in that relational ecology. It optimises toward objectives we specify, unconstrained by the social and ethical considerations that govern human collaboration. When we start calling it a "colleague," we risk anthropomorphising a fundamentally different kind of entity—and in doing so, obscuring the asymmetry of power that defines our relationship with it.
Education as Test Case
My work coordinating international mobility at a higher education institution puts me in daily contact with the kind of processes that agentic AI is designed to transform. Student applications involve countless moving parts: academic transcripts, language certifications, visa documentation, housing arrangements, course equivalencies. Each step depends on the previous one; delays cascade through the system; exceptions require individual attention. It is precisely the sort of complex, multi-stakeholder workflow that advocates celebrate as ideal territory for autonomous agents.
And indeed, I can imagine—have been pitched, in fact—systems that would handle much of this orchestration independently. An agent could monitor application deadlines, request missing documents, coordinate with partner institutions, flag potential issues before they become crises. Students would receive more consistent support. Administrative staff would be released from tedious tracking tasks. The efficiency gains would be real. The question is whether efficiency is the only—or even the primary—metric by which we should evaluate such systems.
I wonder what we would lose in the translation. The international mobility process, at its best, is not merely administrative but deeply formative. Students are navigating unfamiliar institutional structures, learning to advocate for themselves, and developing the organisational competencies that will serve them throughout their professional lives. When a student emails me panicking about a visa rejection, I do not simply provide information; I help them understand the system they are operating within, model problem-solving approaches, build their confidence to navigate bureaucratic complexity independently. These interactions are inherently pedagogical, even when they occur outside formal classroom settings. The friction they encounter is not a bug to be optimised away but a feature of their educational development.
If we delegate such interactions to an agentic system—one that efficiently handles the problem without involving me—we optimise for immediate resolution at the cost of developmental opportunity. The student gets the outcome they need, but misses the process of learning how to secure it. This is not an argument against automation per se, but a recognition that efficiency and education sometimes work at cross-purposes. Not all friction is waste; sometimes it is the very medium through which growth occurs. The educator's task includes creating productive struggle—challenges calibrated to student capability that stretch without overwhelming—and this requires a form of judgment that current AI systems cannot exercise.
This insight finds support in four decades of research on dialogic pedagogy. Robin Alexander's work demonstrates that quality classroom talk operates through cumulative, purposeful, and reciprocal dialogue that cannot be reduced to information transfer (Alexander, 2020). García-Carrión et al. (2020) document the social-emotional dimensions of dialogic learning that resist algorithmic replication. Education, at its most profound, is an intersubjective encounter—a relationship within which both teacher and student are transformed. The Learning Sciences have consistently demonstrated that meaningful learning emerges not from content delivery alone but from the social construction of understanding through dialogue, debate, and collaborative meaning-making.
Paulo Freire understood this when he critiqued what he called the "banking model" of education, in which "the teacher talks about reality as if it were motionless, static, compartmentalized, and predictable" and "education thus becomes an act of depositing, in which the students are the depositories and the teacher is the depositor" (Freire, 1970, p. 72). True education, Freire insisted, involves dialogue, mutual recognition, and critical consciousness—conscientização. It requires the teacher to remain genuinely open to being challenged, to learning alongside students, to allowing the curriculum to evolve in response to their questions and insights. This is not merely pedagogical preference but an ethical stance about the nature of knowledge and the dignity of learners.
An agentic AI can personalise instruction with remarkable precision. Research continues to demonstrate that AI tutoring systems can adapt content presentation to individual learning styles and paces (Plass & Froehlich, 2025). What these systems cannot do—what they are structurally incapable of doing—is participate in dialogue as an authentic interlocutor. They can simulate responsiveness, but they cannot be genuinely affected by what students say. They optimise toward pedagogical objectives we define, but they cannot question whether those objectives are appropriate, whether they reflect the students' actual needs, or whether they perpetuate unexamined assumptions about what knowledge is worth acquiring.
Audrey Watters' historical analysis in Teaching Machines demonstrates that the behaviourist assumptions underlying pre-digital "teaching machines" from the 1920s–1960s continue to inform contemporary educational technology (Watters, 2021). Current AI tutoring systems may represent technological sophistication, but they often reinscribe the same transmission-oriented model of learning that critical pedagogy has long critiqued. The personalisation is real; whether it constitutes genuine education—in Freire's sense of humanising, consciousness-raising encounter—is another question entirely. As Gulson, Sellar, and Webb (2022) argue in Algorithms of Education, the increasing datafication of education creates an illusion of greater control while actually reducing meaningful human agency over educational processes.
The Ironies of Automation
Perhaps the deepest source of my unease with agentic AI lies in what it reveals about our collective relationship with control and delegation. The promise of these systems is premised on a particular anthropology—an understanding of human beings and their limitations. We are told, again and again, that we lack the cognitive bandwidth to process information at scale, to monitor complex systems in real-time, to make decisions with the speed that contemporary business demands. We are bottlenecks. We introduce latency. Our attention wavers; our judgment falters; we require sleep and suffer from bias and forget important details. This narrative positions human limitations as problems to be solved rather than as features of our embodied, social existence that carry their own forms of value.
All of this is true, of course. We are finite creatures, gloriously and frustratingly limited. But agentic AI proposes to address our limitations not by augmenting our capabilities—extending our reach, amplifying our perception—but by replacing our decision-making altogether. The system does not wait for us to review options and choose a course of action; it acts autonomously, presenting us with completed outcomes rather than intermediate deliberations. This represents a qualitative shift from earlier forms of automation, which typically handled execution while leaving decision-making to humans.
The distinction matters because it determines our relationship to the processes that structure our lives. When AI serves as a tool—even a sophisticated one—we remain in the position of agents exercising judgment. We might rely on its analysis, defer to its recommendations, but the ultimate decision is ours to make. Responsibility remains clearly located. We can be held accountable because we maintained the capacity to choose differently.
But when we delegate decision-making to autonomous agents, we enter more ambiguous territory. The legal scholar Frank Pasquale has written extensively about algorithmic accountability gaps, arguing that "an intelligible society would assure that key decisions of its most important firms are fair, nondiscriminatory, and open to criticism" (Pasquale, 2015, p. 218). Santoni de Sio and Mecacci (2021) identify not one but four distinct responsibility gaps with AI: culpability gaps (who is blameworthy when AI causes harm?), moral accountability gaps (who must answer for AI decisions?), public accountability gaps (how can AI be subject to democratic oversight?), and active responsibility gaps (who bears ongoing duty to prevent AI-caused harm?). Each of these gaps requires different interventions, and none is adequately addressed by current governance frameworks.
Who is responsible when an agentic AI makes a hiring decision that perpetuates bias, or approves a loan that a human underwriter would have flagged, or allocates resources in ways that systematically disadvantage certain populations? The humans who designed the system? The ones who deployed it? The organisation that benefits from its operations? The AI itself? Andreas Matthias (2004) identified this "responsibility gap" two decades ago, arguing that autonomous learning machines create situations where no human can be legitimately held accountable for outcomes they could neither predict nor prevent. The emergence of agentic AI—systems that not only learn but actively pursue goals through self-directed action—amplifies these concerns exponentially.
The standard response is that humans remain "in the loop"—that agentic systems operate under supervision, with ultimate authority residing in human oversight. But this framing obscures the practical dynamics of delegation. Research on human-automation interaction consistently demonstrates that as systems become more reliable, human supervisors increasingly defer to their recommendations, even when those recommendations are erroneous. Parasuraman and Manzey (2010) developed an integrated model showing that automation complacency occurs dynamically across multiple-task conditions, with attention playing the central role in oversight failures. Goddard et al. (2012), in their systematic review, found that clinical decision support systems increased the risk of incorrect decisions by 26 per cent when the system provided erroneous advice—clinicians trusted the machine even when it was wrong.
We experience what Lisanne Bainbridge (1983) termed the "ironies of automation." In her foundational analysis, Bainbridge observed that "the designer who tries to eliminate the operator still leaves the operator to do the tasks which the designer cannot think how to automate" and "physical skills deteriorate when they are not used, particularly the refinements of gain and timing" (pp. 775-776). Perhaps the final irony, she noted, is that "it is the most successful automated systems, with rare need for manual intervention, which may need the greatest investment in human operator training" precisely because intervention opportunities become so rare that skills atrophy. After four decades, these ironies remain unresolved (Strauch, 2018). Rinta-Kahila et al. (2023) document "vicious circles" of skill erosion, where automation dependence leads to capability degradation, which increases dependence, which accelerates degradation.
Moreover, the economic logic driving agentic AI adoption actively undermines meaningful human oversight. These systems are valuable precisely because they reduce the need for human involvement—because they enable organisations to accomplish more with fewer people, to operate faster by eliminating the delays inherent in human deliberation. Building in robust oversight mechanisms contradicts the efficiency imperatives that motivate deployment in the first place. Organisations face a fundamental tension: the more effective human oversight becomes, the less it serves the cost-reduction goals that justified automation.
Meredith Whittaker, co-founder of the AI Now Institute and current president of Signal, argues that corporate ethics initiatives often function as "a smokescreen that avoids conversations about real accountability and liability and topics of power" (Whittaker, 2019). We end up with what might be termed accountability theatre: formal structures of human review that exist more to satisfy ethical and regulatory requirements than to exercise substantive control. Langer et al. (2024), in their interdisciplinary analysis presented at ACM FAccT, argue that effective human oversight requires not merely presence but three specific conditions: sufficient causal power to intervene, suitable epistemic access to understand system operations, and proper motivation to exercise vigilance. Absent any of these conditions, oversight becomes performative rather than substantive. Their analysis of the EU AI Act's Article 14 oversight requirements suggests significant gaps between regulatory intent and practical implementation.
The Philosophy of Machine Agency
There is a philosophical dimension to agentic AI that I have been circling but have not yet addressed directly. These systems raise fundamental questions about the nature of intelligence, intentionality, and purpose that extend beyond their practical applications.
When we describe an AI system as "autonomous" or "goal-directed," when we say it "plans" and "decides" and "pursues objectives," we are employing metaphors borrowed from the vocabulary of human agency. But what does it actually mean for a system to have a goal, in the absence of consciousness, desire, or any subjective experience of striving? This is not merely an academic question; the metaphors we deploy shape how we think about these systems, what we expect from them, and how we structure our relationship to them.
The philosopher Daniel Dennett made a useful distinction between what he called the "intentional stance" and the "physical stance." Taking the intentional stance toward an entity means treating it as if it has beliefs, desires, and intentions—as if its behaviour can be understood and predicted by attributing mental states to it. This stance is, Dennett argued, "the strategy of prediction and explanation that attributes beliefs, desires, and other states to systems and predicts future behavior from what it would be rational for an agent to do, given those beliefs and desires" (Dennett, 1987, p. 49). We adopt it toward animals, toward corporations, even toward thermostats ("the thermostat 'wants' to keep the room at 72 degrees").
The catch is that adopting the intentional stance does not necessarily commit us to the metaphysical claim that the entity in question genuinely possesses mental states. It is a pragmatic posture, a way of making sense of behaviour, not a declaration about underlying reality. When we describe an agentic AI as "pursuing goals," we are adopting the intentional stance because it provides a compact way to characterise complex behaviour patterns. But we should not mistake this convenient fiction for literal truth. The danger lies in what philosophers call the "reification" of metaphor—treating a useful way of speaking as if it described an underlying reality.
Luciano Floridi (2025), in recent work on artificial agency, proposes what he terms the "Multiple Realisability of Agency" thesis, arguing that AI represents a genuinely new form of agency—but one that operates without intelligence in any phenomenologically rich sense. This framing preserves the functional utility of describing AI systems as agents while resisting the anthropomorphic conflation of machine operation with human experience. Agency, on this view, admits of degrees and kinds; AI possesses a form of agency sufficient for goal-pursuit without possessing the consciousness, affect, or subjective experience that characterise human agency.
Here is why this matters: the metaphors we use to describe agentic AI shape how we think about our relationship with these systems and about the nature of delegation itself. If we genuinely believe that AI systems possess something like agency—that they have interests, perspectives, objectives that might diverge from ours—then our relationship with them becomes one of negotiation and potential conflict. We must attend to their autonomy not just instrumentally (to ensure they function as intended) but ethically (to respect their status as quasi-subjects).
But if we recognise that "agency" in agentic AI is metaphorical—that these systems do not actually experience goals or make decisions in any phenomenologically rich sense—then our relationship is fundamentally different. We remain the only genuine locus of agency in the system. The question is not how to negotiate with autonomous AI agents but how to maintain our own agency in the face of systems that obscure human decision-making behind veils of algorithmic complexity. The philosophical stakes are practical: how we frame machine agency determines whether we approach AI governance as a matter of alignment (getting AI to share our values) or oversight (maintaining human control over powerful tools).
This distinction has profound implications for how we approach the governance of agentic AI. If we adopt the first framing, we might focus on developing mechanisms for AI systems to explain or justify their decisions, on ensuring they operate according to ethically aligned values, on creating frameworks for holding them accountable. If we adopt the second framing, we focus instead on maintaining human agency: on ensuring that the deployment of autonomous systems does not erode our capacity to understand and intervene in the processes that shape our lives, on preserving domains of human judgment that we deem too important to delegate, on structuring sociotechnical systems so that algorithmic outputs remain legible and revisable.
I find myself drawn to the second framing, not because I am certain that AI systems lack anything deserving the name "agency" in some abstract sense, but because I am convinced that treating them as genuine agents prematurely forecloses important questions about power, responsibility, and the conditions for human flourishing. When we say that AI is "no longer a tool, it's a colleague," we naturalise a particular configuration of human-machine relations—one in which substantial decision-making authority migrates from human to algorithmic agents—and make it harder to question whether that configuration serves our deepest purposes.
The Regulatory Response
The emergence of agentic AI has prompted regulatory responses across multiple jurisdictions, though these remain in early stages of implementation and their adequacy for addressing autonomous systems is far from clear. The regulatory landscape reflects a fundamental tension between the pace of technological development and the deliberative processes through which democratic societies establish governance frameworks.
The European Union's Artificial Intelligence Act, which entered into force in August 2024 with provisions rolling out through 2027, represents the most comprehensive regulatory framework to date. The Act defines AI systems as "machine-based system[s] designed to operate with varying levels of autonomy and that may exhibit adaptiveness after deployment" (European Parliament, 2024, Article 3). Notably, this definition explicitly contemplates the autonomous operation that characterises agentic AI. The Act establishes a risk-based classification system, with prohibited practices (such as social scoring and certain forms of biometric surveillance) banned outright, high-risk applications subject to stringent requirements, and lower-risk systems facing lighter-touch regulation.
Article 14 of the EU AI Act mandates human oversight requirements for high-risk AI systems, stipulating that such systems must enable users to "fully understand the capacities and limitations" of the system, maintain "awareness of the possible tendency of automatically relying on or over-relying on the output," and retain the "ability to decide, in any particular situation, not to use the high-risk AI system or otherwise disregard, override or reverse the output" (European Parliament, 2024). These provisions directly address the concerns raised by research on automation bias and the erosion of human agency. Prohibited AI practices became effective February 2, 2025, with high-risk system requirements following in August 2026.
However, as Langer et al. (2024) observe in their analysis of Article 14, there exists a significant gap between regulatory intent and practical implementation. Effective oversight requires not merely that humans theoretically retain override capability but that they possess the epistemic access, causal power, and sustained motivation to exercise it meaningfully. The EU framework establishes requirements; whether those requirements translate to genuine human control remains an open question that will only be answered through implementation experience and enforcement practice.
In the United States, regulatory development has proceeded more slowly and through different mechanisms. The National Institute of Standards and Technology (NIST) published its AI Risk Management Framework in January 2023, establishing voluntary guidelines organised around four core functions: GOVERN, MAP, MEASURE, and MANAGE (NIST, 2023). The framework emphasises "context-dependent" risk assessment and provides detailed guidance for identifying, assessing, and mitigating AI-related risks. While not legally binding, the NIST framework has influenced corporate AI governance practices and may shape future regulatory requirements. Its emphasis on continuous monitoring and adaptive governance may prove particularly relevant as agentic systems' behaviours evolve through deployment.
Legislative efforts continue to develop. The Algorithmic Accountability Act, reintroduced in the 119th Congress as S. 2164, would require large companies to conduct impact assessments of automated decision systems and establish transparency requirements for when and how such systems are deployed (U.S. Congress, 2025). The bill focuses on "critical decisions" affecting housing, employment, education, credit, and healthcare—precisely the domains where agentic AI deployment raises the most significant concerns about algorithmic accountability. However, as of late 2025, the bill remains in committee, and the regulatory landscape for AI in the United States continues to be characterised more by voluntary frameworks than binding requirements.
Beyond governmental regulation, industry standards are emerging. ISO/IEC 42001:2023 establishes a structured framework for governing AI projects with 38 specific controls across 10 clauses, providing organisations with a comprehensive approach to AI management systems (ISO, 2023). Whether voluntary standards can adequately constrain the deployment of autonomous systems in the absence of binding regulation remains contested. Critics argue that self-regulation tends to privilege industry interests over public accountability, while proponents contend that flexible standards can adapt more quickly to technological change than legislation.
Critical scholars have raised concerns about the limitations of both regulatory and voluntary approaches. Regilme (2024) argues that current AI governance frameworks inadequately address the global dimensions of AI development, particularly the labour exploitation and environmental damage concentrated in the Global South. Data labelers in Venezuela, he notes, earn $0.90-$2 per hour compared to $10-25 per hour for similar work in the United States. Correa Lucero and Martens (2025) identify "colonial structures" embedded in AI systems, including data colonialism, labour coloniality, and what they term "digital feudalism." These perspectives suggest that effective AI governance requires attention not merely to technical safety but to the political economy of AI development and deployment—questions that current regulatory frameworks largely elide.
Alternative Futures
The historian of technology David Edgerton (2007) argues that we systematically overvalue innovation while undervaluing the actual patterns of technology use. His "use-centred" approach suggests that the significance of a technology lies not in its moment of invention but in how it is actually deployed, maintained, and integrated into social practice over time. This perspective offers a corrective to the innovation-centric discourse surrounding agentic AI, reminding us that technological trajectories are shaped by choices—individual, organisational, and political—rather than determined by capability alone. We need not conflate what AI can do with what we ought to let it do.
Some futures are easier to imagine than others. One trajectory—already underway in many organisations—involves the progressive automation of decision-making across increasingly consequential domains. Administrative tasks give way to operational choices; operational choices yield to strategic planning; strategic planning eventually encompasses resource allocation, organisational structure, and competitive positioning. In this future, human work becomes primarily about setting high-level objectives and monitoring outcomes, while the actual process of determining how to achieve those objectives unfolds autonomously, mediated by networks of collaborating AI agents.
This future promises remarkable efficiency. It also concentrates power in the hands of those who control these systems—who determine their training, shape their objective functions, decide which domains remain open to autonomous operation and which require human judgment. It is a future where most people experience work not as a domain of agency and decision-making but as a context where they respond to directives generated by systems whose logic they neither understand nor influence. The democratisation of AI tools does not automatically translate to democratisation of AI governance.
Recent research on human-AI collaboration suggests the picture is more complex than either techno-optimists or pessimists suggest. Vaccaro et al. (2024), analysing over 100 studies, found that on average, AI-human combinations do not outperform the best human-only or AI-only systems—synergy is difficult to achieve except in specific creative tasks. This finding challenges the assumption that hybrid approaches automatically yield superior outcomes and suggests that thoughtful task allocation, rather than blanket collaboration, is required. The question is not whether to use AI but which decisions warrant autonomous operation and which require human judgment.
Brynjolfsson et al. (2025), in their study of generative AI deployment at a large software company, document a 15 per cent average productivity increase with AI assistance—but with highly uneven distribution. Novice and low-skilled workers saw improvements of up to 34 per cent, while experienced workers showed minimal gains. The researchers interpret this as AI disseminating the tacit knowledge of top performers to less experienced colleagues. The implications for skill development are ambiguous: does AI assistance accelerate learning by providing expert scaffolding, or does it create dependency that impedes the development of independent expertise? MIT Sloan research suggests both dynamics operate simultaneously, with outcomes depending heavily on implementation choices (MIT Sloan, 2024).
Another future is possible. One where we deploy agentic AI selectively, in contexts where autonomous operation genuinely serves human flourishing—where it liberates attention for more meaningful work, where it handles genuinely routine decisions so humans can focus on complex, value-laden judgments. One where we invest at least as much in developing human capability as we do in delegating tasks to machines—where education emphasises the kinds of judgment that resist automation, where we cultivate the capacities that make us indispensable collaborators with intelligent systems rather than obsolete bottlenecks. This future requires intentionality; it will not emerge by default.
Shannon Vallor's (2016) work on "technomoral virtues" offers a philosophical foundation for this alternative trajectory. She argues that human flourishing in technological contexts requires cultivating virtues like practical wisdom (phronesis), justice, and humility that enable us to navigate the ethical challenges posed by powerful technologies. Neither blind embrace nor reflexive rejection of AI serves human interests; what is required is the wisdom to discern appropriate applications and the courage to maintain boundaries around domains that should remain under human control. This is not technophobia but what we might call "technological prudence"—the capacity to assess technological affordances in light of human purposes.
The World Economic Forum's Future of Jobs Report 2025 projects that technological change will create 170 million new jobs globally by 2030 while displacing 92 million—a net gain of 78 million positions (WEF, 2025). The report emphasises that 39 per cent of workers' core skills are expected to change by 2030, highlighting the scale of adaptation required. The International Labour Organisation, analysing occupational exposure to generative AI, concludes that "transformation, not replacement, is the most likely outcome" for the majority of exposed occupations (ILO, 2025). These projections suggest that the future of work alongside AI remains genuinely open, contingent on the choices we make about deployment, training, and governance. They also underscore that "transformation" carries its own costs and demands, particularly for workers whose current skills become devalued.
Complicating the Critique
I want to complicate what I have said so far, to acknowledge dimensions of this development that my initial framing may have obscured. My critique of agentic AI risks suggesting that I long for a pre-technological innocence, for purely human systems unmediated by algorithmic logic. This would be both nostalgic and intellectually dishonest. My own professional practice is thoroughly entangled with digital tools; I rely on email systems, database platforms, collaborative software that would have been unimaginable a generation ago. I am not opposed to delegation per se, nor to automation that genuinely serves human flourishing. The question is not whether to use technology but how to use it wisely.
What troubles me is not that we build systems with increasing autonomy, but that we seem to be doing so without sustained attention to what such systems displace—what forms of knowledge, relationship, and judgment get devalued or rendered obsolete when autonomous agents become the primary mechanisms through which work gets done. Every technological system embodies assumptions about what matters and what can be safely ignored; the question is whether we have made those assumptions explicit and subjected them to critical scrutiny.
Let me offer a concrete example that complicates my earlier concerns about education. As I mentioned earlier, in preparing English language materials for B1-level students, I recently experimented with using AI to generate contextually rich scenarios for teaching cybersecurity concepts. The system produced exercises that were surprisingly nuanced, incorporating authentic discourse patterns, culturally relevant examples, and appropriate scaffolding for the target proficiency level. It accomplished in minutes what might have taken me hours of careful construction.
But here is what struck me: the AI did not simply execute my instructions; it made thousands of micro-decisions about vocabulary selection, sentence complexity, content sequencing—decisions that, in my manual approach, I would have made reflectively, adjusting based on my embodied knowledge of these particular students, their interests, their struggles with specific grammatical structures. The agent optimised for a general model of B1 competence, producing materials that were pedagogically sound in a generic sense but lacking the specificity that comes from situated knowledge.
Yet I found myself grateful for the materials. They provided a starting point I could adapt, a scaffold I could customise. The agent's autonomy did not eliminate my expertise but reframed how I exercised it—shifting my work from construction to curation, from generation to refinement. This felt like a legitimate form of collaboration, one where the system's capabilities enhanced rather than replaced my judgment. The value I added was precisely the situated knowledge that the AI lacked: understanding of these particular students, this particular context, these particular learning objectives.
The question, then, is under what conditions agentic AI genuinely augments human capability versus when it substitutes for it in ways that diminish our agency. I do not think there is a universal answer. It depends on the domain, on implementation, on the power dynamics governing deployment. Some forms of delegation are liberating; others are alienating. The challenge is developing the discernment to distinguish between them—and the institutional structures to ensure that such discernment actually shapes how these systems are built and deployed.
Wisdom Over Sophistication
I began by describing 2025's agentic AI as the most significant advancement in artificial intelligence, not because of any particular technical breakthrough but because it represents a philosophical threshold—the moment when we stopped asking AI to amplify our intentions and began delegating the formation of intentions themselves.
I want to conclude by questioning whether "advancement" is the right frame for understanding this development. Advancement suggests unidirectional progress, movement toward a state we recognise as better than what preceded it. But the emergence of agentic AI seems to me more ambiguous than that characterisation allows. It represents genuine technical achievement, the successful realisation of capabilities that have been theoretical aspirations for decades. It opens possibilities for addressing problems that have seemed intractable at human scale. It will undoubtedly generate forms of value, convenience, and capability that we cannot yet fully anticipate.
But it also poses risks that we are only beginning to acknowledge: risks to human agency, to the kinds of relationships that make work meaningful, to the forms of knowledge that resist algorithmic capture, to the democratic accountability of consequential decision-making. These are not inevitable outcomes but possibilities that become more or less likely depending on how we choose to develop and deploy these systems. Technology does not determine its own use; humans do, within the constraints and affordances that technologies create.
The real advancement, I suspect, will be measured not by the sophistication of the autonomous agents we build but by the wisdom with which we decide when and how to use them. By whether we manage to preserve domains of human judgment that we deem too important to delegate. By whether we structure these systems to enhance rather than erode our capacity to shape the conditions of our own lives. By whether we resist the technocratic fantasy that all meaningful questions can be reduced to optimisation problems with computable solutions.
Today, I found myself once again asking an AI system to add a few more contextual language exercises, and remarking almost in passing: "I don't tell it what to do anymore. I tell it what I want, and it figures out how to get there." The distinction still resonates. What we want—the values, purposes, and forms of life we deem worth pursuing—cannot itself be delegated to autonomous agents. That remains irreducibly human work. The question facing us in 2025 and beyond is whether we will preserve the institutional, educational, and political conditions necessary for that work to flourish, or whether we will gradually cede that ground to systems optimising toward objectives we articulated in haste, pursuing futures we never consciously chose.
The technical capabilities exist. The regulatory frameworks are emerging. The philosophical questions have been articulated. What remains is the collective will to shape this technology rather than merely adapt to it—to insist that agentic AI serve purposes we have deliberately chosen rather than defaulting to purposes that emerge from the path of least resistance. In that insistence lies the difference between a future we have made and one that has simply happened to us.
References
Alexander, R. (2020). A dialogic teaching companion. Routledge.
Bainbridge, L. (1983). Ironies of automation. Automatica, 19(6), 775–779. https://doi.org/10.1016/0005-1098(83)90046-8
Belcic, I., & Stryker, C. (2025). AI agents in 2025: Expectations vs. reality. IBM Think. https://www.ibm.com/think/insights/ai-agents-2025-expectations-vs-reality
Brynjolfsson, E., Li, D., & Raymond, L. (2025). Generative AI at work. The Quarterly Journal of Economics, 140(2), 889–942. https://doi.org/10.1093/qje/qjae044
Correa Lucero, H., & Martens, C. (2025). Colonial structures in AI: A Latin American decolonial literature review of structural implications for marginalised communities in the Global South. AI & Society. https://doi.org/10.1007/s00146-025-02547-9
Dennett, D. C. (1987). The intentional stance. MIT Press.
Dupont-Calbo, J. (2025, May 15). L'IA n'est plus un outil, c'est un collègue: Moderna fusionne sa DRH et sa DSI [AI is no longer a tool, it's a colleague: Moderna merges its HR and IT leadership]. Les Echos.
Edgerton, D. (2007). The shock of the old: Technology and global history since 1900. Oxford University Press.
European Parliament and Council of the European Union. (2024). Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union, L 1689.
Floridi, L. (2025). AI as agency without intelligence: On artificial intelligence as a new form of artificial agency and the multiple realisability of agency thesis. Philosophy and Technology, 38(1), 1–27. https://doi.org/10.1007/s13347-024-00819-w
Freire, P. (1970). Pedagogy of the oppressed (M. B. Ramos, Trans.). Herder and Herder. (Original work published 1968)
García-Carrión, R., López de Aguileta, G., Padrós, M., & Ramis-Salas, M. (2020). Implications for social impact of dialogic teaching and learning. Frontiers in Psychology, 11, Article 140. https://doi.org/10.3389/fpsyg.2020.00140
Gartner, Inc. (2024, October 21). Gartner identifies the top 10 strategic technology trends for 2025 [Press release]. https://www.gartner.com/en/newsroom/press-releases/2024-10-21-gartner-identifies-the-top-10-strategic-technology-trends-for-2025
Gartner, Inc. (2025a, August 5). Gartner Hype Cycle identifies top AI innovations in 2025 [Press release]. https://www.gartner.com/en/newsroom/press-releases/2025-08-05-gartner-hype-cycle-identifies-top-ai-innovations-in-2025
Goddard, K., Roudsari, A., & Wyatt, J. C. (2012). Automation bias: A systematic review of frequency, effect mediators, and mitigators. Journal of the American Medical Informatics Association, 19(1), 121–127. https://doi.org/10.1136/amiajnl-2011-000089
Gulson, K. N., Sellar, S., & Webb, P. T. (2022). Algorithms of education: How datafication and artificial intelligence shape policy. University of Minnesota Press.
Hong, S., Zhuge, M., Chen, J., Zheng, X., Cheng, Y., Zhang, C., Wang, J., Wang, Z., Yau, S. K. S., Lin, Z., Zhou, L., Ran, C., Xiao, L., Wu, C., & Schmidhuber, J. (2024). MetaGPT: Meta programming for a multi-agent collaborative framework. In Proceedings of the International Conference on Learning Representations (ICLR 2024).
International Labour Organization. (2025, May 20). Generative AI and jobs: A refined global index of occupational exposure (ILO Working Paper 140). https://www.ilo.org/resource/news/one-four-jobs-risk-being-transformed-genai
International Organization for Standardization. (2023). ISO/IEC 42001:2023 Information technology—Artificial intelligence—Management system. ISO.
Langer, M., Oster, D., Speith, T., Hermanns, H., Kästner, L., Schmidt, E., Sesing, A., & Baum, K. (2024). On the quest for effectiveness in human oversight: Interdisciplinary perspectives. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT '24) (pp. 2310–2329). ACM. https://doi.org/10.1145/3630106.3659051
Matthias, A. (2004). The responsibility gap: Ascribing responsibility for the actions of learning automata. Ethics and Information Technology, 6(3), 175–183. https://doi.org/10.1007/s10676-004-3422-1
MIT Sloan School of Management. (2024). When humans and AI work best together—and when each is better alone. MIT Sloan Ideas Made to Matter. https://mitsloan.mit.edu/ideas-made-to-matter/when-humans-and-ai-work-best-together-and-when-each-better-alone
National Institute of Standards and Technology. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0) (NIST AI 100-1). U.S. Department of Commerce. https://doi.org/10.6028/NIST.AI.100-1
Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors, 52(3), 381–410. https://doi.org/10.1177/0018720810376055
Pasquale, F. (2015). The Black Box Society: The secret algorithms that control money and information. Harvard University Press.
Plass, J. L., & Froehlich, D. (2025). The future of personalized learning with AI. Computers & Education. https://doi.org/10.1016/j.compedu.2025.105189
Regilme, S. S. F. (2024). Artificial intelligence colonialism: Environmental damage, labor exploitation, and human rights crises in the Global South. SAIS Review of International Affairs, 44(2), 75–92. https://doi.org/10.1353/sais.2024.a943819
Rinta-Kahila, T., Penttinen, E., Salovaara, A., Soliman, W., & Ruissalo, J. (2023). The vicious circles of skill erosion: A case study of cognitive automation. Journal of the Association for Information Systems, 24(5), 1378–1412.
Santoni de Sio, F., & Mecacci, G. (2021). Four responsibility gaps with artificial intelligence: Why they matter and how to address them. Philosophy and Technology, 34(4), 1057–1084. https://doi.org/10.1007/s13347-021-00450-x
Strauch, B. (2018). Ironies of automation: Still unresolved after all these years. IEEE Transactions on Human-Machine Systems, 48(5), 419–433. https://doi.org/10.1109/THMS.2017.2732506
Sukharevsky, A., Kerr, D., Hjartar, K., Hämäläinen, L., Bout, S., Di Leo, V., & Dagorret, G. (2025, June 13). Seizing the agentic AI advantage: A CEO playbook to solve the gen AI paradox and unlock scalable impact with AI agents. McKinsey QuantumBlack. https://www.mckinsey.com/capabilities/quantumblack/our-insights/seizing-the-agentic-ai-advantage
U.S. Congress. (2025). Algorithmic Accountability Act of 2025, S. 2164, 119th Congress. https://www.govtrack.us/congress/bills/119/s2164
Vaccaro, M., Almaatouq, A., & Malone, T. W. (2024). When AI and humans work together: Evidence from 100+ studies. Nature Human Behaviour. https://doi.org/10.1038/s41562-024-02024-1
Vallor, S. (2016). Technology and the virtues: A philosophical guide to a future worth wanting. Oxford University Press.
Wang, L., Ma, C., Feng, X., Zhang, Z., Yang, H., Zhang, J., Chen, Z., Tang, J., Chen, X., Lin, Y., Zhao, W. X., Wei, Z., & Wen, J.-R. (2024). A survey on large language model based autonomous agents. Frontiers of Computer Science, 18(6), Article 186345. https://doi.org/10.1007/s11704-024-40231-1
Watters, A. (2021). Teaching machines: The history of personalized learning. MIT Press.
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (pp. 24824–24837).
Weng, L. (2023, June 23). LLM powered autonomous agents. Lil'Log. https://lilianweng.github.io/posts/2023-06-23-agent/
Whittaker, M. (2019, April). Reclaiming the future: Privacy, ethics & organizing in tech [Lecture]. City Arts & Lectures, San Francisco, CA.
Widder, D. G., & Hicks, M. (2024). Watching the generative AI hype bubble deflate (arXiv:2408.08778). Harvard Kennedy School Ash Center. https://ash.harvard.edu/resources/watching-the-generative-ai-hype-bubble-deflate/
World Economic Forum. (2025, January). Future of Jobs Report 2025. WEF. https://www.weforum.org/publications/the-future-of-jobs-report-2025/
Wu, Q., Bansal, G., Zhang, J., Wu, Y., Li, B., Zhu, E., Jiang, L., Zhang, X., Zhang, S., Liu, J., Awadalla, A., White, R. W., Burger, D., & Wang, C. (2024). AutoGen: Enabling next-gen LLM applications via multi-agent conversation. In Proceedings of the Conference on Language Modeling (COLM 2024).
Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2023a). ReAct: Synergizing reasoning and acting in language models. In Proceedings of the International Conference on Learning Representations (ICLR 2023).
Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. (2023b). Tree of Thoughts: Deliberate problem solving with large language models. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023).
🔴 Viewpoint is a random series of spontaneous considerations about subjects that linger in my mind just long enough for me to write them down. They express my own often inconsistent thoughts, ideas, assumptions, and speculations. Nothing else. Quote me at your peril.