From Interviews to Framework

Methodology

March 2026

This framework was built using a modified grounded theory methodology — adapted for AI-assisted qualitative research. The core commitment: let the data produce the framework, rather than testing a hypothesis against the data.

Grounded theory works by moving from raw observations to emergent categories through successive rounds of coding and abstraction. Each round compresses the data while preserving the specific, surprising, idiosyncratic material that makes the analysis worth reading. The adaptation for AI-assisted research introduced several constraints designed to prevent the failure modes specific to LLM-assisted qualitative work — principally, the tendency of language models to smooth away the high-entropy signal that makes qualitative data valuable.

The Pipeline

1. Semi-Structured Interviews: all functions
2. Open Coding: emergent codes + analytic memos
3. Code Consolidation: same-phenomenon merges only
4. Axial Coding: clusters → candidate axes
5. Framework Synthesis: 6 dimensions × 5 levels
6. Individual Assessments
7. Organizational Analysis
8. Implications

Data Collection

Semi-structured interviews were conducted across all functions at Tribe AI — engineers, GMs, PMs, data scientists, designers, operations, finance, recruiting, sales, and delivery leads. Interviews lasted 30–60 minutes.

The interviews explored how each person uses AI in their work: what they build, what tools they use, how they verify outputs, how they share knowledge, what blocks them, and how they think about AI's role in their function. An existing 6-category interview guide structured the conversations, but this guide was explicitly treated as a conversation tool, not a hypothesis to be confirmed.

Open Coding

Each transcript was processed independently with no cross-transcript knowledge. This is the standard grounded theory constraint: the first pass must treat each data source as if the others don't exist, to prevent premature pattern-matching.

Each transcript produced two outputs: a set of emergent codes (each with a cleaned verbatim quote, speaker attribution, and analytical note) and an analytic memo, 400–800 words of opinionated analytical reflection foregrounding high-entropy material rather than summary. Memos ran roughly 80% interviewee words to 20% analyst scaffolding.
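As a minimal sketch, the per-transcript outputs can be thought of as two record types. The field names here are illustrative, not taken from the actual pipeline.

```python
from dataclasses import dataclass, field

@dataclass
class OpenCode:
    """One emergent code from a single transcript (illustrative shape)."""
    label: str            # short name for the phenomenon, e.g. "verifies-by-spot-check"
    quote: str            # cleaned verbatim quote supporting the code
    speaker: str          # attribution within the transcript
    analytical_note: str  # why this moment matters analytically

@dataclass
class AnalyticMemo:
    """400-800 word opinionated reflection on one transcript (illustrative shape)."""
    transcript_id: str
    text: str                                          # mostly interviewee words, light analyst scaffolding
    codes: list[OpenCode] = field(default_factory=list)
```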

Code Consolidation

The 764 open codes were merged where they described the same phenomenon — not merely related phenomena. This is a strict standard: two codes about "verification" that describe different verification behaviors remain separate. Only codes that point to the identical underlying behavior or insight get consolidated.

764 codes became 683 consolidated codes (11% merge rate). All quotes and attributions were preserved through the merge. The relatively low merge rate reflects the diversity of the data — these interviews genuinely covered different ground.
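The merge rate quoted above is simple arithmetic; this snippet only makes the calculation explicit.

```python
open_codes = 764
consolidated_codes = 683

merged = open_codes - consolidated_codes   # 81 codes absorbed into others
merge_rate = merged / open_codes           # 81 / 764
print(f"{merge_rate:.1%}")                 # ~10.6%, reported as 11%
```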

Axial Coding

The axial coding phase grouped the 683 codes into conceptual clusters, validated those clusters against the original memos, and then organized validated clusters into candidate axes for the framework.
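A minimal sketch of the intermediate shapes, with hypothetical cluster names and code IDs: consolidated codes are grouped into named clusters, and only clusters that survive validation against the memos are promoted into candidate axes.

```python
# Hypothetical intermediate structures for the axial coding pass.
# Cluster names and code IDs are illustrative; the real groupings
# emerged from the data.
clusters: dict[str, list[str]] = {
    "verification-behaviors": ["code-017", "code-142", "code-519"],
    "context-management": ["code-033", "code-288"],
}

# Validated clusters, organized under candidate axes.
candidate_axes: dict[str, list[str]] = {
    "Verification Design": ["verification-behaviors"],
    "Context Architecture": ["context-management"],
}
```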

Framework Synthesis

The candidate axes were tested against five selection criteria to arrive at the final 6 dimensions.

Individual Assessments

Each employee was assessed against the 6-dimension framework using their interview transcript as the sole evidence source. Self-reported levels from the pre-existing framework were explicitly ignored; they were found to be uncorrelated with demonstrated capability. Assessments were developmental, direct, and role-contextualized: a CFO is assessed differently from an engineer on the same dimension.

Organizational Analysis

A single aggregate analysis synthesized patterns across all individual assessments. The methodology was narrative-first: the analytical narrative was written from a full read of every assessment, before any aggregate scores were computed, and was not reverse-engineered from numbers. No individuals were identified. The register is "most people" rather than "73%": numbers provide context, not conclusions.

Aggregate statistics (per-dimension averages, min, max, percentiles, and distributions) were computed after the narrative was written, as a validation step — confirming that the narrative's claims held up quantitatively.
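As a sketch of that validation step, assuming each assessment reduces to a mapping of dimension to assessed level (input shape hypothetical):

```python
import statistics

# Hypothetical input: one dict of dimension -> assessed level per person.
assessments = [
    {"Verification Design": 4, "Context Architecture": 2},
    {"Verification Design": 2, "Context Architecture": 3},
    {"Verification Design": 1, "Context Architecture": 3},
]

for dim in assessments[0]:
    levels = [a[dim] for a in assessments]
    print(
        dim,
        "mean:", round(statistics.mean(levels), 2),
        "min:", min(levels),
        "max:", max(levels),
        "median:", statistics.median(levels),
    )
```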

Implications

The implications were written as fresh analytical work grounded in all individual assessments and the organizational analysis — not reshuffled from earlier stages. This distinction matters: earlier implications were written before individual data existed; the final implications were written after assessing every person.

Business Impact Heuristic

The claim that each dimension-level improvement adds roughly 12% to a person's effective capacity was derived empirically, not asserted.

The individual assessments identified anchor examples — practitioners whose interview transcripts contained specific, concrete capacity claims. Each anchor described work they now do that previously required additional headcount or wasn't done at all: a deal review system that replaced a research analyst function, automated verification that removed a QA role from a project, prototyping capability that collapsed an engineering team's timeline.

For each anchor, the implied headcount multiplier was computed from their specific claim. These per-person multipliers were then converted to a per-dimension-level rate. The 12% figure is the conservative estimate — it falls below both the median (14%) and mean (18%) of the anchor examples.

The compounding logic: six dimensions at 12% per level means one level up across all six dimensions roughly doubles a person's effective capacity.
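The compounding arithmetic, assuming per-level gains multiply across dimensions:

```python
per_level_gain = 0.12
dimensions = 6

# One level up on every dimension, gains compounding multiplicatively.
combined = (1 + per_level_gain) ** dimensions
print(round(combined, 2))  # 1.97 -- roughly a doubling of effective capacity
```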

Example 1: One practitioner built an automated verification pipeline that absorbed what had been a dedicated QA function on a client project, enabling the team to remove a front-end engineer role. Their assessment placed them at L4 on Verification Design (3 levels above baseline). Implied multiplier: ~1.4x on that dimension alone. Per-level rate: ~13%.

Example 2: A non-technical role built a deal review orchestration system that replaced what would have been a research analyst function — work that previously wasn't done at all. Their assessment placed them at L3 on Context Architecture (2 levels above baseline). Implied multiplier: ~1.3x. Per-level rate: ~15%.
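A sketch of the conversion used in these two examples, assuming the extra capacity is spread evenly across the levels above baseline; this reproduces the ~13% and ~15% figures quoted, though the conversion applied to the full anchor set is not shown here.

```python
def per_level_rate(multiplier: float, levels_above_baseline: int) -> float:
    """Spread the extra capacity evenly across the levels above baseline."""
    return (multiplier - 1) / levels_above_baseline

print(f"{per_level_rate(1.4, 3):.0%}")  # Example 1: ~13% per level
print(f"{per_level_rate(1.3, 2):.0%}")  # Example 2: ~15% per level
```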

Methodological Principles

These principles were enforced across all stages:

The Entropy Principle. LLMs add their own signal — smooth, diplomatic, abstract — which drowns out interviewees' high-entropy signal (surprising, contradictory, idiosyncratic). Every pass must maximize the ratio of transcript signal to LLM signal.

Reasoning before judgment. Analytical reasoning must be written before determining levels, scores, or categories. This prevents anchoring bias and ensures conclusions follow from evidence rather than evidence being selected to support conclusions.

Evidence chains are scaffolding, not output. Quotes travel between passes to keep the analysis honest, but the final output is genuine synthesis — analytical prose, not a quote catalogue.

No generic content. All analytical text is something that could only have been written after reading these specific transcripts. No change-management boilerplate.

Falsifiable claims. Say what things are, not what they aren't. Specific, testable assertions — not negations or abstractions.

Normalize out the input framework. The interview guide shaped conversations, not conclusions. The analysis pipeline was given the existing framework explicitly so it could recognize and deliberately avoid reproducing it.

Narrative before numbers. Write analytical narratives before computing statistics. Numbers validate prose; prose is not reverse-engineered from numbers.

Calibrate, then batch. For any assessment involving many units (individual profiles, dimension coding), hand-review calibration examples before running unsupervised batches.