The Tech Due Diligence Manifesto
A new standard for evaluating software companies in the AI era. Built from 80+ engagements across Europe - for investors who want the truth, and companies who want to improve.
We have done over 80 technical due diligences. And we can tell you exactly what most of them look like when they go wrong.
A junior consultant spends two days in a codebase they have never seen before. They run a static analysis tool, produce a list of issues ordered by severity, and deliver a 40-page report that the investor cannot read and the startup will not act on. The deal closes. Six months later, the portfolio company needs to re-platform because the architecture cannot support the growth the round was supposed to fund.
Or the other version: no technical due diligence at all. The product demo was compelling. The founder was impressive. The commercial traction was real. Nobody looked under the hood. The 70% of deal failures linked to tech issues did not happen because the investors were careless. They happened because the model for evaluating technical risk is broken.
This is what we believe instead.
It gets skipped. Early-stage deal cycles move fast. Technical due diligence feels expensive and slow. So it gets cut, or compressed into a conversation that covers the surface but not the substance. A 2025 audit found high-risk vulnerabilities in 74% of target codebases. Almost none of those were flagged before the deal closed. The cost of fixing them post-close runs three to five times higher than if they had been caught pre-investment. Skipping technical due diligence is not a way to save money. It is a way to pay more later, under worse conditions.
It is treated as a one-time event. Software is a living system. Codebases evolve every week. Infrastructure changes. AI models drift. Teams turn over. The engineer who built the authentication layer leaves. A company that looked solid at Series A can be carrying serious structural risk by the time Series B arrives - not through negligence, but through the natural accumulation of decisions made under pressure. A snapshot taken at one moment tells you the state of the system. It tells you nothing about the trajectory. And trajectory is what you are actually funding.
Recommendations disappear into a folder. We have seen this more times than we can count. The report is accurate. The findings are real. The recommendations are prioritized. And six months later, nothing has changed - because the report was delivered as a document, not as a process. Without operational follow-through, technical due diligence is an expensive way to produce something no one acts on. The value is not in identifying problems. The value is in building the system that resolves them.
Investors are not buying code. They are buying a company's ability to execute repeatedly over time. Technical due diligence should reflect that.
The most common mistake in technical due diligence is starting with the code. Code is the output of a system. Evaluating the output without understanding the system is like reading the last chapter of a book to understand the plot.
We start at the top and work down. We begin with the vision, the product experience, and the market strategy. We move to the team and how they execute. Then we go deep into the technology. By the time we are reading source code and reviewing infrastructure, we understand the context in which every decision was made. That context is the difference between a finding and a conclusion.
A feature built quickly with known technical debt is a trade-off. The same feature built quickly without any awareness of the debt is a risk. The code looks identical. The organizational reality is completely different. You only know which one you are looking at if you started from the top.
The meetings are not just information-gathering sessions. They are also the only opportunity to read the people. Across all five meetings, we deliberately meet as many members of the team as possible - not just the CEO and CTO, but the engineers, the product lead, the person who owns infrastructure. Each profile tells us something the org chart does not.
We are watching for ego. A team that deflects hard questions, blames external factors for every failure, or cannot acknowledge trade-offs they have made is a team that will struggle to improve. We are watching for ownership - the engineer who says “I built that” and can explain every decision, versus the one who says “that was before my time” about a system they are currently responsible for. We are watching for cultural coherence: do the different people we speak to describe the same company, or five different versions of it?
In a startup, culture is not a values poster on the wall. It is visible in how people handle a question they cannot answer, how they talk about their colleagues, and whether accountability is shared or quietly assigned to someone who is not in the room. By the end of the five meetings, we have a view of the technical system and a view of the human system behind it. Both matter. Often the human system matters more.
Before the first meeting, we have already spent time on the product. We go through the website, the onboarding, the core user flows. We form a view before anyone has the chance to brief us. Poor onboarding is almost never a UX problem - it is a prioritization problem. A confusing value proposition is almost never a copywriting problem - it is a strategy problem. The product experience is a mirror of the organization that built it.
In the meeting, we are assessing one fundamental question: does this team have a clear and realistic path from where they are today to where the roadmap says they will be in eighteen months? We have seen roadmaps that assumed a team of twelve when the company had four engineers. We have seen AI product strategies built on a data infrastructure that could not support them. We have seen CEOs who described technical capabilities their CTO had no awareness of. Any of these gaps, caught here, changes everything that follows.
This is often the most revealing meeting. Not because of what the CPO says - but because of what they cannot answer.
We get asked regularly whether we prefer Scrum, Kanban, Shape Up, or something else entirely. The honest answer is that we do not care. Every methodology works in the right hands. Every methodology fails in the wrong culture. A team running disciplined Kanban with clear WIP limits and genuine flow metrics will consistently outperform a team doing “Scrum” that is really just chaotic sprints with a retrospective nobody acts on.
What we actually assess is alignment and velocity. Not velocity as a number of story points - that metric is almost always gamed - but velocity as the lived experience of the organization. Can a business decision made on Monday become a product change in the hands of users by Friday? Is there a clear, short path from a business goal to a product release? Or does every initiative disappear into a planning process that takes longer than the market opportunity it was meant to capture?
The most dangerous pattern we see is misalignment between three layers that should move as one: business goals, product decisions, and engineering execution. When these are out of sync, the company builds things that do not matter, misses things that do, and cannot explain why the roadmap looks the way it does. We have seen CPOs who could not tell us which features on their roadmap were directly tied to a revenue or retention objective. We have seen CTOs who were designing the next quarter's releases without any input from the engineering team on feasibility. Both are symptoms of the same dysfunction: the product is being built by three teams that are not really talking to each other.
We also assess knowledge concentration. In almost every early-stage company, there is one person who holds a disproportionate share of critical system knowledge. Sometimes it is the CTO. Sometimes it is a senior engineer who has been there since day one. When that person leaves - and at some point, they always do - what happens? We have seen Series B companies grind to a halt because the one engineer who understood the payment integration quit. That is not a people problem. It is a structural risk that should have been flagged and addressed long before.
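One rough way to make knowledge concentration visible is to look at how commit authorship is distributed across modules. The sketch below is illustrative only: the `commits` data, the module names, and the 0.75 threshold are hypothetical, and commit authorship is at best a proxy for who actually understands a system.

```python
from collections import Counter, defaultdict


def ownership_share(commits):
    """Map each module to (top_author, share of that module's commits)."""
    per_module = defaultdict(Counter)
    for author, module in commits:
        per_module[module][author] += 1
    result = {}
    for module, counts in per_module.items():
        top_author, top_count = counts.most_common(1)[0]
        result[module] = (top_author, top_count / sum(counts.values()))
    return result


def concentrated_modules(commits, threshold=0.75):
    """Modules where a single author wrote at least `threshold` of the commits."""
    return sorted(
        module
        for module, (_, share) in ownership_share(commits).items()
        if share >= threshold
    )


# Hypothetical commit history as (author, module) pairs.
commits = [
    ("alice", "payments"), ("alice", "payments"), ("alice", "payments"),
    ("alice", "payments"), ("bob", "payments"),
    ("bob", "frontend"), ("carol", "frontend"), ("dave", "frontend"),
]
flagged = concentrated_modules(commits)
print(flagged)
```

A flagged module is a prompt for an interview question, not a finding in itself: the useful follow-up is who else could debug that module if its main author left tomorrow.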
The question we use to open this meeting: if a user reports a bug right now, how long does it take your team to understand what happened, why it happened, and who was affected?
The answer tells us almost everything about the maturity of the system. Companies with real observability can answer that question in minutes. Companies without it start a conversation about pulling logs, checking timestamps, and cross-referencing three different tools that do not talk to each other. DORA research distinguishes monitoring - watching predefined metrics - from observability, which is the ability to actively debug a system through patterns not defined in advance. Most early-stage companies have monitoring. Almost none have observability. The gap costs them every time something breaks at 2am.
We also look at frontend architecture maturity: component structure, state management, testing, deployment. Not because these are the most critical issues at this stage, but because they reveal the engineering culture. A frontend that is chaotic almost always reflects a broader culture of moving fast without leaving the system in a state that others can understand and extend.
This is where we find the most expensive surprises. Not security vulnerabilities - though those appear regularly - but architectural decisions that looked sensible at a hundred users and become serious problems at a hundred thousand.
For companies with an AI layer - which is now most of them - we add a distinct set of questions. We have assessed companies that described themselves as “AI-native” and, on inspection, had a single prompt wrapped around a GPT-4 API call with no evaluation pipeline, no observability on model outputs, no fallback, and no understanding of their inference costs per user. That is not AI maturity. That is AI exposure - all the risk, none of the defensibility.
AI systems carry a category of technical risk that traditional software does not. Outputs are probabilistic. Models drift. Providers change pricing or deprecate models. Hallucinations create operational uncertainty that no amount of unit testing eliminates. We assess whether the company has built the infrastructure to manage that uncertainty - or simply inherited it.
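As a minimal sketch of what "managing that uncertainty" can look like in code, the example below tries a list of model providers in order and only accepts an output that passes a validation check. Everything here is an assumption for illustration: the provider functions are stubs, not real SDK calls, and the `ALLOWED` label set is hypothetical.

```python
def call_with_fallback(prompt, providers, validate):
    """Try each (name, call) provider in order; return the first validated output."""
    errors = []
    for name, call in providers:
        try:
            output = call(prompt)
        except Exception as exc:  # e.g. outage, timeout, deprecated model
            errors.append((name, f"error: {exc}"))
            continue
        if validate(output):
            return {"provider": name, "output": output, "errors": errors}
        errors.append((name, "failed validation"))
    # Nothing usable came back: the caller decides how to degrade gracefully.
    return {"provider": None, "output": None, "errors": errors}


# Stub providers standing in for real model APIs (hypothetical).
def primary(prompt):
    raise RuntimeError("model deprecated")


def secondary(prompt):
    return "REFUND_APPROVED"


ALLOWED = {"REFUND_APPROVED", "REFUND_DENIED", "ESCALATE"}

result = call_with_fallback(
    "Classify this support ticket.",
    providers=[("primary", primary), ("secondary", secondary)],
    validate=lambda out: out in ALLOWED,
)
print(result)
```

The design choice worth noting is that validation sits between the model and the user: a probabilistic output is treated as untrusted input until it passes an explicit check.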
Infrastructure is where optimism meets physics. Every startup believes their architecture will scale. Very few have tested that belief against real load, real data volume, or real failure scenarios.
The question we ask here is not “can this handle 10x growth?” - that is too abstract. We ask: what breaks first, at what load, and how long does it take the team to know? The answer reveals both the technical design and the operational maturity. A team that can point to their current bottleneck, explain why it exists, and describe their plan to address it before it becomes critical is a team that understands their system. A team that says “we haven't had scaling issues yet” has not thought about it - which is its own kind of answer.
The pattern we see most often at this layer is not spectacular failure - it is quiet accumulation. A database schema that made complete sense at one thousand users becomes a migration nightmare at fifty thousand. A deployment process that worked fine when the team shipped once a week becomes a daily source of friction when the team needs to ship ten times a day. These are not dramatic problems. They are the kind that grind a company down slowly - slowing delivery, draining engineering morale, and consuming disproportionate senior time on issues that should have been designed away. By the time they are visible, they are already expensive. The point of this meeting is to find them before that.
Every technical due diligence we conduct now includes a formal AI maturity assessment. Not simply whether the company uses AI - that question is no longer meaningful. Almost every company uses AI in some form. The question is how, how deeply, and whether it creates genuine competitive advantage or just surface-level differentiation that a competitor can replicate in a weekend.
We assess AI maturity across two distinct axes that most evaluations collapse into one. The first is how the team uses AI: are engineers using AI-assisted development tools, automated testing pipelines, AI-powered code review? Is the product team using AI to accelerate research, design, and prioritization? Is operations using AI agents to handle tasks that previously required manual intervention? A company where AI is genuinely embedded in how the team works moves faster, scales more efficiently, and compounds its velocity over time. A company where AI is a talking point in investor decks but absent from daily workflows is not an AI company - it is a company with an AI strategy document.
The second axis is how AI is embedded in the product itself. This is where the real differentiation lives - and where the real risk lives too. We look at the depth of AI integration: is it a single summarization feature bolted onto an otherwise traditional product, or is AI woven into the core user experience, the data model, and the competitive value proposition? We assess the quality of the underlying data infrastructure - because AI products are only as good as the data they run on, and weak data foundations are the most common reason AI features underdeliver in production. We evaluate the model orchestration layer: how prompts are structured, how context is managed, how outputs are validated before they reach users.
Critically, we assess AI observability - the dimension almost every company gets wrong. Running AI in production without observability is the equivalent of running a backend with no logs. You cannot improve what you cannot measure, and you cannot debug what you cannot see. We look for observability frameworks like Langfuse that give teams trace-level visibility into every model interaction: what prompt went in, what response came out, how long it took, what it cost, and whether the output met quality thresholds. Without this layer, AI systems are black boxes. When they fail - and they will fail - the team has no systematic way to understand why or prevent recurrence. We have seen companies running significant AI workloads with no evaluation pipeline, no cost tracking per feature, and no way to detect when model outputs degraded. That is not an AI product. That is an AI experiment running in production.
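To make "trace-level visibility" concrete, here is a minimal sketch of the record such a layer captures for each model interaction: prompt, response, latency, cost, and a quality flag. It is written in plain Python and is not the API of Langfuse or any other tool; the `Trace` fields, the per-token price, and the `fake_model` stub are all assumptions for illustration.

```python
import time
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trace:
    feature: str
    prompt: str
    response: str
    latency_s: float
    cost_usd: float
    passed_quality: bool


def traced_call(feature, prompt, model_call, price_per_1k_tokens, quality_check, log):
    """Call the model, record one Trace in `log`, and return the response."""
    start = time.perf_counter()
    response, tokens_used = model_call(prompt)  # stub returns (text, token count)
    latency = time.perf_counter() - start
    log.append(Trace(
        feature=feature,
        prompt=prompt,
        response=response,
        latency_s=latency,
        cost_usd=tokens_used / 1000 * price_per_1k_tokens,
        passed_quality=quality_check(response),
    ))
    return response


def cost_per_feature(log):
    """Total recorded cost in USD, grouped by product feature."""
    totals = defaultdict(float)
    for trace in log:
        totals[trace.feature] += trace.cost_usd
    return dict(totals)


# Hypothetical stand-in for a real model client.
def fake_model(prompt):
    return "A short summary.", 500


log = []
traced_call("summarize", "Summarize this ticket.", fake_model,
            price_per_1k_tokens=0.002,
            quality_check=lambda text: len(text) > 0, log=log)
print(cost_per_feature(log))
```

With records like these, the questions in the paragraph above become queries: cost per feature is a group-by, and degraded output shows up as a falling share of traces with `passed_quality` set.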
We assess companies across four levels of AI maturity - from early experimentation to genuinely AI-native operations - based on the framework we described in detail in The 4 Levels of AI Maturity. Level 1 is an LLM API call with no surrounding infrastructure. Level 4 is an organization where AI agents operate autonomously across product and operations, with full observability, continuous evaluation, and compounding velocity that a non-AI-native competitor structurally cannot match.
The assessment tells investors whether AI is a marketing claim or a moat. It tells companies exactly what they need to build to reach the next level - and in the current environment, where AI maturity is increasingly the primary driver of competitive differentiation, that clarity is one of the most valuable outputs we produce.
After enough engagements, the manual work becomes the bottleneck. Repository analysis that took two days of senior engineer time was producing findings we could have surfaced in hours with the right tooling. We were spending time on signal extraction that should have been spent on interpretation and interviews - the parts that actually require human judgment.
So we built TechSignal. It is the intelligence layer behind every engagement we run: AI-assisted code analysis, engineering signal detection, dependency mapping, architecture intelligence, continuous monitoring. The work that previously required days now runs before the first meeting.
The insight behind it is simple: the value of technical due diligence is not in reading code. It is in understanding what the code reveals about the organization that wrote it. TechSignal handles the reading. Our senior CTOs handle the understanding. That separation is what makes continuous technical diligence economically viable - fast enough to run at the pace of a deal cycle, affordable enough to run between funding rounds, and precise enough to surface the signals that matter before they become problems.
Our reports run around fifteen pages. We have a rule: if it takes longer than fifteen minutes for an investor to understand the verdict, the report has failed.
The Executive Summary is written for investors. It leads with a clear verdict, not a list of findings. It answers the question the investor is actually asking: should I be more or less confident in this company as a result of what we found? It then provides the supporting evidence: the key risks, the organizational strengths, the scalability readiness, and the execution confidence.
The Recommendations & Action Steps section is written for the startup. It is prioritized by impact and urgency. It is written to be actionable, not to be comprehensive. A list of 47 recommendations is not a roadmap - it is noise. We focus on the ten things that matter most, why they matter, and what done looks like. This section should go straight into the startup's backlog. If it does not, it has not been written correctly.
A report that only an engineer can understand has failed. A report that only an investor can understand has also failed. The goal is clarity for both - from the same document.
The final thing we believe, and the thing the industry is slowest to accept: a technical assessment is not an event. It is a practice.
Companies that undergo thorough technical due diligence are 2.8 times more likely to achieve successful outcomes, according to McKinsey research on technology acquisitions. That gap does not come from doing a better one-time assessment. It comes from building a continuous relationship between technical reality and investment decisions.
Portfolio companies that are monitored continuously do not surprise their investors. Problems surface when they are small, not when they have compounded into crises. Recommendations get implemented because there is accountability - the next signal shows whether they were acted on or ignored. And the startup benefits too: continuous technical feedback is one of the most valuable things an investor can provide, and almost none of them do it.
The future of technical due diligence is not a better report. It is a different model entirely - one where technical intelligence runs continuously alongside investment decision-making, where risks surface when they are small rather than when they have compounded, and where the gap between diagnosis and execution closes because accountability is built into the process.
The investors who will look back on this period and feel good about their decisions are not the ones who moved fastest. They are the ones who understood what they were buying. Technical due diligence, done properly, is how you know the difference between a company that will execute its vision and a company that is selling one.
That distinction is worth knowing before the wire transfer, not after.
Above The Clouds has delivered 80+ technical due diligences across Europe for VCs, PE firms, family offices, and M&A advisors. If you are evaluating a software company and want a clear, honest technical verdict - or if you are a startup preparing for due diligence and want to know where you stand - let's talk. You can also explore our Tech Due Diligence service and our TechSignal platform.