AI Consulting & Development

We build AI systems
that actually work.

Ideatum is a team of PhDs in AI and Science. We consult, design, and build production AI — and we assess the systems others have built. From strategy to deployment, we bring the rigour that matters.

AI Strategy · ML Engineering · AI Risk Assessment · System Architecture · AI Governance · Clinical AI · EU AI Act · NLP & Computer Vision
PhD
Team of PhDs in AI & Science
8
Risk dimensions per system
v2.0
AI-SOFA · Zenodo DOI
E2E
End-to-end: strategy to production

AI consulting & development,
built on science.

We work across the full AI lifecycle — from identifying where AI creates value, to building and deploying production systems, to assessing and governing the ones already running. Every engagement is led by senior researchers with deep domain expertise.

01

AI Strategy & Roadmap

We help organisations identify where AI creates real value — and where it doesn't. Practical, evidence-based roadmaps that align AI capabilities with business objectives and technical feasibility.

Strategy
02

Custom AI Development

End-to-end design and engineering of production AI systems — from data pipelines and model training to deployment infrastructure. ML, NLP, computer vision, and domain-specific solutions.

Development
03

AI Risk & Due Diligence

Rigorous assessment of AI systems using the AI-SOFA framework. Quantitative risk profiling for investors, boards, and regulators — built to withstand scrutiny.

Assessment
04

MLOps & Production Support

Monitoring, maintenance, and continuous improvement of deployed AI systems. We ensure your models stay accurate, your pipelines stay stable, and your systems degrade gracefully.

Operations
05

Regulatory & Compliance

Pre-audit mapping against EU AI Act, DORA, and sector-specific requirements. Documentation packages built for supervisory expectations and board-level governance.

Compliance
06

Scientific Advisory

Board-level and technical advisory on AI strategy, model governance, and publication-quality system documentation — particularly for life sciences, health tech, and financial services.

Advisory
"The question is not whether an AI system can perform a task. It is whether it degrades gracefully — or collapses."
AI-SOFA Framework · Ideatum

AI deployment is outpacing our ability to assess it.

Frontier AI systems are being deployed into production faster than the frameworks to evaluate them have matured. Investors fund them. Boards approve them. Regulators are still writing the rules.

When they fail — they fail in ways that are opaque, sudden, and systemic. The financial system has stress tests. Clinical medicine has validated endpoints. AI deployment, by contrast, largely relies on benchmark scores and vendor assurances.

That gap is where catastrophic risk lives. We built Ideatum to close it — with both the scientific depth to understand AI failure and the engineering capability to prevent it.

"Knight Capital lost $440M in 45 minutes. Zillow wrote down $569M. In both cases, the system passed every internal benchmark before deployment."

AI-SOFA Case Analysis

"Rigorous AI development and rigorous AI assessment are not separate disciplines. They are the same discipline, applied at different stages."

Ideatum Methodology

Science-first AI that holds up.

01

Quantify what others narrate

AI risk cannot be assessed with checklists or qualitative impressions. We apply clinical severity scoring methodology — adapted from ICU medicine — to derive a structured, numerical risk profile for every system we build or assess.

02

Build with failure modes in mind

Our research identified a specific threshold — the AI-Shock Condition — that separates recoverable failures from terminal ones. We engineer systems to stay above that boundary, and we assess others against it.

03

Deliver work that withstands scrutiny

Whether it's a production system or an assessment report — our deliverables are built for board scrutiny, regulatory audit, and real-world stress. The methodology is derived, not assembled.

AI-SOFA — measuring systemic risk at its source.

AI-SOFA (AI Systemic Operational Failure Assessment) is Ideatum's proprietary quantitative framework for evaluating catastrophic failure risk in deployed AI systems. It was developed by analogy with the SOFA score used in intensive care medicine to predict organ failure, applying the same logic of multi-dimensional severity scoring to AI systems in production.

The framework assesses eight dimensions: Robustness, Controllability, Transparency, Alignment, Scalability, Dependency, Reversibility, and Oversight. Each is scored and combined into a risk profile that drives specific, prioritised recommendations.

The core empirical finding — the AI-Shock Condition — emerged from case analysis of historical AI failures across financial services, real estate, and health technology. Systems scoring above the threshold on both Robustness and Controllability degraded gracefully; those below it did not recover.

Published
Zenodo · CC BY-NC-ND 4.0
Version
2.0 · 2024
Protocol
Confidential
Core Finding · AI-Shock Condition

The condition Robustness > 2 AND Controllability > 2 cleanly separates terminal failures from survivable ones. This threshold drives all remediation priorities.
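In code, the framework's logic can be sketched as follows. This is a minimal illustration, not the confidential Assessment Protocol: the eight dimension names and the AI-Shock Condition come from the published framework description, but the 0–5 scoring scale, the class name, and the prioritisation helper are illustrative assumptions.

```python
from dataclasses import dataclass

# The eight AI-SOFA dimensions as described in the framework overview.
DIMENSIONS = (
    "robustness", "controllability", "transparency", "alignment",
    "scalability", "dependency", "reversibility", "oversight",
)

@dataclass(frozen=True)
class SofaProfile:
    """Hypothetical risk profile: one integer score per dimension
    (0–5 scale assumed here; the published protocol may differ)."""
    robustness: int
    controllability: int
    transparency: int
    alignment: int
    scalability: int
    dependency: int
    reversibility: int
    oversight: int

    def is_stable(self) -> bool:
        # AI-Shock Condition: Robustness > 2 AND Controllability > 2
        # separates survivable failures from terminal ones.
        return self.robustness > 2 and self.controllability > 2

    def remediation_priorities(self) -> list[str]:
        # Illustrative ordering: lowest-scoring dimensions first.
        scores = {d: getattr(self, d) for d in DIMENSIONS}
        return sorted(scores, key=scores.get)

profile = SofaProfile(robustness=4, controllability=3, transparency=2,
                      alignment=3, scalability=4, dependency=1,
                      reversibility=2, oversight=3)
print(profile.is_stable())                  # True: above the AI-Shock boundary
print(profile.remediation_priorities()[0])  # "dependency" scores lowest
```

A system failing either half of the condition is classified on the terminal side of the boundary regardless of its other scores, which is why the threshold drives remediation priorities.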

Read the AI-SOFA Paper on Zenodo
[Figure: AI-SOFA Risk Matrix — Robustness vs. Controllability, with the AI-Shock / Stable boundary at a score of 2 on each axis]

Independent.
Rigorous.
Documented.

We carry no vendor relationships, no software commissions, no platform affiliations. Our only incentive is the quality of the work — whether we're building a system from scratch or assessing one built by others.

The Assessment Protocol — the operational heart of AI-SOFA — remains unpublished. It is the reason clients trust our assessments and the foundation of everything we build.

I

Measurement before opinion

AI risk cannot be assessed qualitatively. We quantify all eight SOFA dimensions and derive risk classifications from evidence, not narrative.

II

The AI-Shock Condition

Our core empirical finding separates terminal failures from survivable ones. This single threshold drives the entire remediation agenda for every system we build or assess.

III

Structural transfer across domains

Our methodology draws simultaneously on clinical medicine, algebraic statistics, and machine learning. This cross-domain depth cannot be replicated by single-discipline teams.

IV

Confidentiality by design

All proprietary methodology remains unpublished. Client data never leaves the engagement perimeter. NDAs are standard from first contact.

The right team
for what comes next.

Whether you need to build an AI system, assess one, or navigate the regulatory landscape — we'd like to hear from you.

Start a conversation