Module 2 — Conversation to Commitment · Lesson 2.3
Auto-Generating Tasks from Conversation
What a good candidate looks like, what the agent fills in, what the human still has to decide
~15 min
What you'll learn
- Describe what makes a commitment recognizable to the agent
- Read an agent task proposal and identify what was inferred vs. observed
- Use the agent's create_task and recommend_assignees flows confidently
- Decide which proposals to accept, edit, or dismiss — and why
Auto-task generation in Kavanah is not a black box. The agent does five distinct things — recognize, enrich, infer KVN, recommend an assignee, and estimate — and presents the result as a task proposal. The human's job is no longer to type the task; it is to make the small set of decisions the agent cannot make on its own. This lesson walks the pipeline step by step so you know what to trust, what to verify, and what to override.
Recognition: what counts as a commitment
The agent looks for three signal patterns. The first is an explicit commitment: a first-person verb with a deliverable and a time horizon, whether vague or specific. 'I'll send you the analysis by Friday' is the textbook case.
The second is an implicit ownership change: someone in the conversation accepts a problem without using commitment-shaped language. 'Yeah, let me take a look at the auth bug.' The signal is the acceptance, not the verb. The agent picks this up reliably in chat, less reliably in email (where politeness obscures the signal), and most reliably in AI agent transcripts (where you usually phrase things directly to the agent).
The third is an unowned task: a problem identified in conversation that nobody volunteered for. 'We should figure out what's going on with the deploy timing.' The agent surfaces these as candidates without an assignee, flagged for the manager (or persona with the tasks_projects scope) to triage.
What the agent ignores: rhetorical first-person ('I'd love to do that someday'), past-tense reports ('I sent that yesterday' — this is a status update, not a commitment), and external commitments where the actor is a customer or third party. These show up in the audit log but do not become candidate tasks.
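To make the three patterns and the ignored cases concrete, here is a deliberately naive sketch. It uses keyword rules only, which is an assumption made for illustration; the lesson does not describe how the real recognizer works. The names `Signal` and `classify` are hypothetical.

```python
from enum import Enum


class Signal(Enum):
    EXPLICIT = "explicit commitment"
    IMPLICIT = "implicit ownership change"
    UNOWNED = "unowned task"
    IGNORED = "audit log only"


def classify(utterance: str, speaker_is_external: bool = False) -> Signal:
    """Toy keyword rules (hypothetical, not Kavanah's recognizer)."""
    text = utterance.lower().strip()
    # Commitments by customers or third parties never become candidates.
    if speaker_is_external:
        return Signal.IGNORED
    # Rhetorical first-person and past-tense status reports are skipped.
    if "someday" in text or text.startswith("i sent "):
        return Signal.IGNORED
    # First-person future verb: the textbook explicit commitment.
    if text.startswith(("i'll ", "i will ")):
        return Signal.EXPLICIT
    # Acceptance of a problem without commitment-shaped language.
    if "let me " in text:
        return Signal.IMPLICIT
    # A problem named in conversation that nobody volunteered for.
    if text.startswith(("we should ", "someone should ")):
        return Signal.UNOWNED
    return Signal.IGNORED
```

Run against the examples in this section, the sketch labels 'I'll send you the analysis by Friday' as explicit, 'Yeah, let me take a look at the auth bug.' as implicit, and 'I sent that yesterday' as ignored.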
Enrichment: filling in what the conversation left out
A raw commitment is usually missing most of the data a task needs. 'I'll send the analysis by Friday' has no project, no priority, no description beyond the verb, no skills, no time estimate. The agent enriches the candidate using:
The conversation context. The thread the commitment was made in is usually scoped to a project; the agent links the task there automatically.
The speaker's profile. The skills associated with the speaker filter the candidate skill-tag list; the agent picks the most relevant tag from the canonical vocabulary.
The workspace and project KVN charters. The Vision frames the description; the Negations filter out candidates that violate them outright (these get dropped with a log entry, not promoted to tasks).
The related history. If similar commitments have produced similar tasks before, the agent reuses the structure — same priority pattern, similar estimate, similar skill tag — adjusted by how the current commitment differs.
Enrichment is the step where the agent's value is highest. The recognition step is mostly pattern matching; the enrichment step is where the agent uses its access to the whole work-graph to produce a candidate that already has 80% of the fields a manager would otherwise have to fill in.
KVN inference: what the conversation implied
When the conversation makes the K, V, or N explicit, the agent uses it directly. When it doesn't, the agent infers from context — but flags the inference rather than presenting it as ground truth.
For Know-How, the agent uses the workspace's standing K plus any skill tags the commitment touched. If the conversation referenced a specific system, that system's known dependencies get pulled in.
For Vision, the agent uses the project Vision if available, narrowed by the specific phrasing of the commitment. 'I'll send the analysis' under a project whose Vision is 'so the customer can see Q3 trends at a glance' becomes 'analysis output that the customer can see at a glance' — not a vague deliverable.
For Negation, the agent surfaces the relevant Negations from the project and workspace charters. If the candidate task would, on its face, cross a Negation, the agent flags it for human review before creating the task — it does not silently create.
The practical effect is that most agent-generated tasks land with at least two of the three KVN axes already populated, plus a clear flag on what was inferred and what was observed.
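One way to picture the observed-versus-inferred flag is a small record per KVN axis. This is a sketch under assumptions: the field names, the `source` labels, and the `resolve_axis` helper are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KvnAxis:
    value: Optional[str]
    source: str  # "observed", "inferred", or "missing" (assumed labels)


def resolve_axis(stated: Optional[str], from_charter: Optional[str]) -> KvnAxis:
    # Prefer what the conversation said outright; fall back to charter context.
    if stated:
        return KvnAxis(stated, "observed")
    if from_charter:
        return KvnAxis(from_charter, "inferred")
    return KvnAxis(None, "missing")


def populated_axes(kvn: dict[str, KvnAxis]) -> int:
    # Count the axes that carry a value, whether observed or inferred.
    return sum(1 for axis in kvn.values() if axis.source != "missing")


kvn = {
    "know_how": resolve_axis(None, "workspace standing K plus touched skill tags"),
    "vision": resolve_axis(None, "analysis output that the customer can see at a glance"),
    "negation": resolve_axis(None, None),
}
```

In this example two of the three axes are populated, both marked as inferred, and the missing Negation is visible to the reviewer rather than silently blank.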
Assignment recommendation: routing by capability
If the commitment was explicit ('I'll send the analysis'), the speaker is the assignee. If nobody volunteered ('we should look at this'), the agent calls its recommend_assignees tool, which ranks members for the task using declared skills, learned skills (from the reinforcement loop), recent throughput, and current load.
The recommendation surfaces as a ranked list with reasons. 'Sarah Chen (top match: declared Expert in design systems; shipped 3 design-systems tasks last sprint; current load 1 active task)' is more useful than 'Sarah Chen.' The manager sees the reasoning and can override.
For candidates that touch AI Employee scopes, the agent also considers personas as assignees. If you have a research-assistant AI Employee with the knowledge and email scopes, and the commitment is 'someone should pull the SOC 2 evidence,' the persona is a candidate the same way a human is.
Module 3 covers the assignment system in depth. For now, the relevant point is that the agent does not pick assignees blindly; it uses the explicit work the workspace has done to declare capabilities.
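A ranking over the four signals named above could look roughly like the following. The weights, the 0 to 3 declared-skill scale, and the `rank_candidates` helper are all assumptions made for illustration; this is not the recommend_assignees implementation, and 'Dev Patel' is an invented second member.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    declared_level: int       # 0 = none .. 3 = expert (assumed scale)
    learned_score: float      # 0.0 .. 1.0 (assumed scale)
    shipped_last_sprint: int  # recent throughput on similar tasks
    active_tasks: int         # current load


def score(c: Candidate) -> float:
    # Weights are illustrative assumptions, not product values.
    return (2.0 * c.declared_level + 3.0 * c.learned_score
            + 0.5 * c.shipped_last_sprint - 1.0 * c.active_tasks)


def rank_candidates(candidates: list[Candidate]) -> list[tuple[str, str]]:
    # Highest score first; each entry carries a human-readable reason.
    ranked = sorted(candidates, key=score, reverse=True)
    return [
        (c.name,
         f"declared level {c.declared_level}; shipped {c.shipped_last_sprint} "
         f"similar tasks last sprint; current load {c.active_tasks} active")
        for c in ranked
    ]


team = [
    Candidate("Dev Patel", declared_level=1, learned_score=0.9,
              shipped_last_sprint=1, active_tasks=4),
    Candidate("Sarah Chen", declared_level=3, learned_score=0.8,
              shipped_last_sprint=3, active_tasks=1),
]
ranking = rank_candidates(team)
```

Returning the reason string alongside the name mirrors the point made earlier: the manager can only override sensibly if the reasoning is visible.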
Estimation: how long this will take
Every candidate task arrives with a time estimate generated by the task-intelligence layer. The estimate is computed from three signals: the skill tags on the task, the historical time-to-complete distribution for tasks with similar tags by similar members, and any explicit dependencies the description names.
The estimate is presented with a confidence band, not a single number. 'Estimated 2–4 hours (median 3, p90 5)' tells the manager more than '3 hours,' because it surfaces the uncertainty the model is operating under.
Where the estimate is high-confidence (similar tasks have a tight distribution), the manager usually accepts it. Where it is low-confidence (the task is novel or the band is wide), the manager either tightens the description or breaks the task into smaller pieces. Module 4 covers the estimation system in detail.
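The band can be illustrated with percentiles over historical completion times. This sketch assumes the band is the interquartile range and that "high-confidence" means a band narrow relative to the median; the lesson does not specify either, and the function names and threshold are hypothetical.

```python
import statistics


def estimate_band(history_hours: list[float]) -> dict[str, float]:
    # quantiles() needs at least two data points on older Python versions.
    if len(history_hours) < 2:
        raise ValueError("need at least two historical durations")
    # 99 cut points; cuts[i - 1] is the i-th percentile.
    cuts = statistics.quantiles(history_hours, n=100, method="inclusive")
    return {
        "low": cuts[24],     # 25th percentile (assumed lower edge of the band)
        "median": cuts[49],
        "high": cuts[74],    # 75th percentile (assumed upper edge of the band)
        "p90": cuts[89],
    }


def is_high_confidence(band: dict[str, float], max_relative_width: float = 0.5) -> bool:
    # Assumed rule: the band is tight if its width is small relative to the median.
    return (band["high"] - band["low"]) <= max_relative_width * band["median"]


band = estimate_band([2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 5.0])
```

A tight history like the one above passes the confidence check, while a scattered one such as `[1, 2, 6, 12]` does not, which is the cue to tighten the description or split the task.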
What the human still decides
After the agent has done its five things, the manager has a candidate task with project link, assignee recommendation, KVN inferred where possible, skill tags, time estimate, and provenance pointer back to the source conversation. The human's job is to make exactly five decisions on each candidate:
Is this a task at all, or is it noise? (Lesson 2.4 covers this triage.)
Is the assignee right? (Override if the recommendation missed context only the human has.)
Does the KVN need refinement? (If the inferred Vision is generic, sharpen it. If the Negation is missing, add one.)
Is the estimate believable? (If wildly off, break it down.)
Does this need scheduling? (If it has a hard deadline, set it. Otherwise leave it for the planning pass.)
This is the entire human contribution. Five decisions, ten seconds per candidate, in batches of a dozen at a time. The capture problem dissolves into a triage problem, and the triage problem is straightforward.
Run the loop on a real conversation
- 1. Have a five-minute conversation with the AI agent
Open /ai-agent and talk through what your team needs to do this week. End by asking the agent to list the tasks.
- 2. Review the resulting task proposals
Each should arrive enriched: project, assignee, KVN inferred, skill tag, estimate, provenance. Notice what the agent inferred vs. observed.
- 3. Accept, edit, or dismiss each
Practice the five decisions. Most candidates need at most one tweak. Some get dismissed; the agent learns from the dismissal pattern.
- 4. Check the audit log entry
Each created task has a provenance pointer back to the conversation. Click through and confirm the link to the source.
Proposal pipeline health
- Acceptance rate
- Fraction of agent-proposed tasks that get accepted into the board.
- Healthy signal: 60–80%. Below 50% suggests the agent's recognition is over-eager; above 95% may mean it is under-proposing.
- Edit rate on accepted tasks
- Fraction of accepted tasks where the manager edited at least one field before saving.
- Healthy signal: 30–60% in the first month, falling as the agent learns the team's patterns.
- Provenance hit rate
- Fraction of created tasks whose provenance pointer is followed at least once by anyone.
- Healthy signal: Above 25%. Below 10% means the provenance is decoration, not used; consider tightening it.
- Time from commitment to task
- Median time between an in-conversation commitment and the task appearing on the board.
- Healthy signal: Under 10 minutes when the agent is running; under 60 seconds when the conversation was with the AI agent directly.
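The first three ratios above are simple to compute from a log of proposals. The sketch below assumes a hypothetical `ProposalRecord` shape and treats "created tasks" as the accepted proposals; neither is taken from the product's actual schema.

```python
from dataclasses import dataclass


@dataclass
class ProposalRecord:
    accepted: bool
    edited_before_save: bool = False
    provenance_followed: bool = False


def rate(hits: int, total: int) -> float:
    # Guard against an empty denominator.
    return hits / total if total else 0.0


def pipeline_health(records: list[ProposalRecord]) -> dict[str, float]:
    accepted = [r for r in records if r.accepted]
    return {
        # Share of all proposals that reached the board.
        "acceptance_rate": rate(len(accepted), len(records)),
        # Share of accepted tasks edited before saving.
        "edit_rate": rate(sum(r.edited_before_save for r in accepted), len(accepted)),
        # Share of accepted tasks whose provenance link was opened.
        "provenance_hit_rate": rate(sum(r.provenance_followed for r in accepted), len(accepted)),
    }


records = (
    [ProposalRecord(True, edited_before_save=True, provenance_followed=True)] * 2
    + [ProposalRecord(True, edited_before_save=True)]
    + [ProposalRecord(True)] * 4
    + [ProposalRecord(False)] * 3
)
health = pipeline_health(records)
```

With seven of ten proposals accepted, three of those edited, and two provenance links followed, the sample lands inside the healthy ranges for acceptance and edit rate and just above the 25% provenance threshold.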
Key takeaways
- The agent does five things on every candidate: recognize, enrich, infer KVN, recommend assignee, estimate.
- Recognition is conservative on email and on rhetorical first-person; this is by design.
- KVN inference is flagged, not presented as truth; the manager sees what was observed vs. assumed.
- Assignment uses declared and learned skills, not the agent's hunch.
- The human's job is five small decisions per candidate. Capture becomes triage.
Capture is solved. Now the manager's bottleneck is triage: deciding which proposals are work and which are noise. The next lesson is a short, practical guide to that decision.