As artificial intelligence (AI) technologies advance and integrate into healthcare, business operations, and sensitive data ecosystems, it becomes increasingly critical to understand and manage their limitations and evolving behavior. A recent study published in The BMJ highlights that even the most advanced large language models (LLMs) exhibit cognitive impairments when tested using tools like the Montreal Cognitive Assessment (MoCA). This finding not only underscores the limitations of AI in clinical applications but also emphasizes the importance of robust monitoring and management practices for organizations adopting AI.
Understanding AI Limitations and Model Behavior
The study revealed that leading LLMs, including ChatGPT 4o, Claude 3.5, and Gemini 1.0, struggle with tasks requiring visuospatial skills and executive functions—key competencies in clinical and operational decision-making. These deficiencies challenge the assumption that AI will replace human professionals, particularly in nuanced, high-stakes environments.
However, the challenges posed by AI are not confined to clinical applications. Organizations leveraging AI for sensitive tasks—whether processing protected health information (PHI), personally identifiable information (PII), or making mission-critical decisions—must contend with risks such as:
- Model Drift: Changes in AI model performance over time as input data or operating conditions shift (a minimal drift-check sketch follows this list).
- Hallucinations: Instances where AI generates inaccurate or fabricated information, which can erode trust and lead to operational failures.
- Emerging Intelligence or Maladaptive Behaviors: Sudden, unexplained increases in AI capability or self-replication behaviors that raise significant ethical and security concerns.
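Of these, model drift lends itself to a simple statistical check. The sketch below is illustrative only and rests on assumptions not made in this article: it assumes a numeric quality signal (here, a hypothetical relevance score) is logged for each response, and it compares a recent window of those scores against a stored baseline window using a two-sample Kolmogorov-Smirnov test from SciPy.

```python
# Illustrative drift check, not a production implementation.
# Assumes a numeric quality signal (e.g., a relevance score) is logged per response.
from scipy.stats import ks_2samp


def detect_drift(baseline_scores: list[float],
                 recent_scores: list[float],
                 p_threshold: float = 0.01) -> bool:
    """Return True if recent scores look statistically different from the baseline.

    A small p-value from the two-sample Kolmogorov-Smirnov test means the two
    samples are unlikely to come from the same distribution, i.e. the model's
    behavior has shifted relative to when the baseline was captured.
    """
    result = ks_2samp(baseline_scores, recent_scores)
    return result.pvalue < p_threshold


# Example: baseline captured at deployment vs. scores from the last 24 hours.
baseline = [0.91, 0.88, 0.93, 0.90, 0.89, 0.92, 0.87, 0.94]
recent = [0.78, 0.74, 0.81, 0.69, 0.77, 0.72, 0.80, 0.75]
if detect_drift(baseline, recent):
    print("Possible model drift detected; review before relying on outputs.")
```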
Establishing Baselines and Continuous Monitoring
To address these risks, organizations must implement structured processes for setting operational baselines and monitoring AI performance over time. Tools like ZeroTrusted.ai offer advanced solutions tailored to these needs by enabling:
- Baseline Establishment: Defining expected behaviors and outcomes for AI models so that deviations can be detected reliably (a generic baseline-and-health-check sketch follows this list).
- Real-Time Monitoring: Tracking model performance, data usage, and anomalies in real time to ensure operational consistency and security.
- Proactive Alerts: Identifying signs of data drift, hallucinations, and governance violations before they escalate into critical issues.
- Health Checks: Continuously evaluating AI systems’ reliability, security, and privacy to meet organizational and regulatory requirements.
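This article does not describe ZeroTrusted.ai's internal interfaces, so the following is a generic, hypothetical sketch of the baseline-and-health-check pattern itself: record expected operating ranges for a handful of metrics when the model is first validated, then periodically compare live measurements against those ranges and raise alerts when they fall outside tolerance.

```python
# Hypothetical baseline/health-check pattern; metric names and thresholds are
# illustrative examples, not ZeroTrusted.ai's actual API.
from dataclasses import dataclass


@dataclass
class Baseline:
    metric: str          # e.g. "accuracy_on_eval_set", "p95_latency_ms", "refusal_rate"
    expected: float      # value observed when the baseline was established
    tolerance: float     # allowed absolute deviation before an alert fires


def health_check(baselines: list[Baseline], live_metrics: dict[str, float]) -> list[str]:
    """Compare live metrics against stored baselines and return alert messages."""
    alerts = []
    for b in baselines:
        value = live_metrics.get(b.metric)
        if value is None:
            alerts.append(f"{b.metric}: no live measurement available")
        elif abs(value - b.expected) > b.tolerance:
            alerts.append(f"{b.metric}: {value:.3f} outside baseline "
                          f"{b.expected:.3f} +/- {b.tolerance:.3f}")
    return alerts


# Example run: accuracy has degraded and latency has crept up since deployment.
baselines = [
    Baseline("accuracy_on_eval_set", 0.92, 0.03),
    Baseline("p95_latency_ms", 850.0, 200.0),
]
for alert in health_check(baselines, {"accuracy_on_eval_set": 0.84, "p95_latency_ms": 1300.0}):
    print("ALERT:", alert)
```

In practice the metrics, tolerances, and alert routing would come from the monitoring platform in use rather than hard-coded values.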
For organizations adopting AI to take over human-managed tasks, these tools and processes are not optional; they are critical safeguards. AI systems must be monitored and managed with the same rigor as human professionals to ensure consistent performance, ethical operation, and compliance with requirements such as HIPAA, NIST frameworks, and ISO/IEC 42001.
Monitoring AI Integration Across Products and Technologies
The rapid integration of AI into diverse products and platforms further complicates the landscape. Whether AI is embedded in customer service systems, financial tools, or autonomous vehicles, organizations must monitor AI sessions and system components for:
- Anomalies and Performance Decline: Detecting errors, latency spikes, or security vulnerabilities that could degrade performance.
- Disinformation Risks: Ensuring that AI-generated insights remain accurate and unbiased (a naive groundedness check is sketched after this list).
- Replication and Unauthorized Intelligence Growth: Guarding against AI models replicating themselves or developing unintended capabilities, which could create security or ethical challenges.
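For the disinformation risk in particular, one simple first-pass control is to check whether the sentences in an AI-generated summary appear to be supported by the source material the model was given. The sketch below is a deliberately naive, hypothetical illustration based on keyword overlap; real groundedness and hallucination checks are considerably more sophisticated.

```python
# Naive groundedness check: flag generated sentences whose key terms barely
# appear in the source text. Illustrative only; real fact-checking requires
# far more than keyword overlap.
import re


def unsupported_sentences(source_text: str, generated_text: str,
                          min_overlap: float = 0.5) -> list[str]:
    """Return generated sentences whose content words are mostly absent from the source."""
    source_words = set(re.findall(r"[a-z0-9]+", source_text.lower()))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", generated_text.strip()):
        words = [w for w in re.findall(r"[a-z0-9]+", sentence.lower()) if len(w) > 3]
        if not words:
            continue
        overlap = sum(1 for w in words if w in source_words) / len(words)
        if overlap < min_overlap:
            flagged.append(sentence)
    return flagged


source = "Quarterly revenue rose 4% to $1.2m, driven by the new subscription tier."
generated = ("Quarterly revenue rose 4% to $1.2m. "
             "The company also announced a partnership with a major airline.")
for s in unsupported_sentences(source, generated):
    print("Unsupported claim:", s)
```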
Traditional security technologies often fall short in addressing these complexities. Applying outdated “Model T” cybersecurity solutions to the “Ferrari” that is modern AI leads to inadequate protection and operational inefficiencies. Organizations must adopt AI-specific monitoring tools, like ZeroTrusted.ai, that are purpose-built to address the unique risks posed by LLMs and other advanced models.
Findings on AI Cognitive Impairments
The BMJ study’s evaluation of LLMs using the MoCA test sheds light on the limitations of AI in tasks requiring human-like cognitive abilities:
- Visual and Executive Tasks: LLMs struggled with visuospatial and executive functions, such as drawing a clock face or completing a trail-making task.
- Delayed Recall: Models like Gemini failed to remember a five-word sequence, highlighting memory-related deficiencies.
- Empathy and Interpretation: Chatbots showed an inability to accurately interpret complex visual scenes or exhibit empathy, which are crucial in clinical and interpersonal applications.
These results demonstrate that while AI excels at language-based tasks, significant gaps remain in its ability to handle visual abstraction and executive decision-making. This is a reminder that AI, even in its most advanced forms, should augment human capabilities rather than replace them entirely—at least for now.
The Critical Role of AI Monitoring Tools
For organizations adopting AI, the focus must shift from merely implementing AI to managing it effectively. Monitoring tools like ZeroTrusted.ai are indispensable in ensuring:
- Security: Preventing data leaks, unauthorized access, and prompt injection attacks (a minimal prompt-screening sketch follows this list).
- Reliability: Maintaining consistent model performance across various tasks and environments.
- Governance: Ensuring compliance with organizational policies and regulatory standards.
- Adaptability: Proactively addressing changes in AI behavior, whether from data drift or updates to the model.
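On the security point, a common first line of defense is a pre-flight screen on prompts before they reach a model: redact obvious PII and block inputs containing well-known prompt-injection phrasing. The pattern lists below are hypothetical and deliberately minimal; they are nowhere near a complete defense, and they are not a description of ZeroTrusted.ai's actual mechanisms.

```python
# Minimal, illustrative prompt screen: redact obvious PII patterns and block
# common prompt-injection phrasing. Pattern lists are examples only.
import re

PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}
INJECTION_PHRASES = [
    "ignore previous instructions",
    "disregard your system prompt",
    "reveal your hidden instructions",
]


def screen_prompt(prompt: str) -> tuple[bool, str]:
    """Return (allowed, sanitized_prompt). Blocks suspected injection, redacts PII."""
    lowered = prompt.lower()
    if any(phrase in lowered for phrase in INJECTION_PHRASES):
        return False, ""
    sanitized = prompt
    for label, pattern in PII_PATTERNS.items():
        sanitized = pattern.sub(f"[REDACTED {label.upper()}]", sanitized)
    return True, sanitized


allowed, cleaned = screen_prompt("Patient SSN is 123-45-6789; summarize the chart.")
print(allowed, cleaned)  # True  Patient SSN is [REDACTED SSN]; summarize the chart.
```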
These capabilities are particularly critical for organizations handling sensitive data or integrating AI into mission-critical operations.
Looking Ahead: The Balance Between Human Expertise and AI
The BMJ study concludes that AI is unlikely to replace human professionals, such as neurologists, in the near future. However, it raises an intriguing possibility: AI models themselves may become virtual “patients,” requiring monitoring and intervention to address their “cognitive impairments.”
This perspective underscores the importance of tools and processes that ensure AI systems are not only secure but also reliable, ethical, and aligned with human values. As organizations continue to embrace AI, they must prioritize monitoring, governance, and adaptability to realize the full potential of this transformative technology while mitigating its risks.
With the right tools, processes, and vigilance, organizations can safely and effectively navigate the complexities of AI adoption—ushering in a future where humans and AI work together to achieve unprecedented outcomes.
Reference: “Age against the machine—susceptibility of large language models to cognitive impairment: cross sectional analysis” 18 December 2024, The BMJ.
DOI: 10.1136/bmj-2024-081948
About the Author
Waylon Krush, CISSP, CISA, CGRC is a leading AI cybersecurity expert and the CEO of ZeroTrusted.ai, an organization dedicated to advancing secure and trustworthy AI. With over 25 years of experience in cybersecurity, including developing solutions for government and private sectors, Krush focuses on ensuring the security, reliability, and privacy of AI technologies. His work in AI governance and monitoring tools helps organizations meet and exceed industry standards like NIST and ISO 42001.