Unified analysis skill - Python data analysis (--data) or KB gap identification (--kb)
The analysis skill provides two modes for generating PM insights from structured sources:
| Mode | Purpose | Output |
|---|---|---|
| --data | Python-based analysis of CSV/Excel files (retention, funnel, segmentation) | outputs/insights/data-analysis-YYYY-MM-DD.md |
| --kb | Knowledge Base gap analysis (pain points, missing articles, AI opportunities) | outputs/insights/kb-gaps-YYYY-MM-DD.md |
--data mode reads files from the inputs/data/ folder.

Step 1: Choose Analysis Method
Your goal is to provide the most accurate analysis. Autonomously select the best method (Direct LLM Calculation or Python Script Execution) based on the data's size and structure and the user's query.
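One way to make this choice explicit is a small heuristic; the thresholds below are illustrative assumptions, not part of the skill spec:

```python
def choose_method(row_count: int, needs_charts: bool) -> str:
    """Pick an analysis method for --data mode (illustrative heuristic)."""
    if needs_charts or row_count > 50:
        return 'Python Script Execution'  # too large or too visual to verify by hand
    return 'Direct LLM Calculation'       # small enough to compute inline
```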
Step 2: Execute Analysis
IF Direct LLM Calculation was chosen: compute the metrics directly from the file contents, showing each calculation step.
IF Python Script Execution was chosen: start with an exploration script, for example:

```python
import pandas as pd

df = pd.read_csv('inputs/data/file.csv')  # or pd.read_excel for .xlsx files
print(df.shape, df.dtypes, df.head(), df.describe(), df.isnull().sum(), sep='\n')
```
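As a sketch of the analysis itself, a simple weekly retention calculation might look like the following; the column names (`user_id`, `signup_week`, `active_week`) and the inline sample data are hypothetical, not part of the skill spec:

```python
import pandas as pd

# Hypothetical event log: one row per user per active week.
df = pd.DataFrame({
    'user_id':     [1, 1, 2, 2, 3],
    'signup_week': [0, 0, 0, 0, 1],
    'active_week': [0, 1, 0, 2, 1],
})

# Weeks elapsed since signup for each activity row.
df['week_offset'] = df['active_week'] - df['signup_week']

# Retention = share of all users still active N weeks after signup.
cohort_size = df['user_id'].nunique()
retention = df.groupby('week_offset')['user_id'].nunique() / cohort_size
print(retention)
```

In a real run the DataFrame would be loaded from inputs/data/ instead of constructed inline.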
Save any generated charts to outputs/insights/.

Step 3: Generate Output
Write to outputs/insights/data-analysis-YYYY-MM-DD.md. The output must be structured as follows and must include the analysis_method field in the YAML frontmatter.
---
generated: YYYY-MM-DD HH:MM
skill: analyze --data
analysis_method: "Direct LLM Calculation" # or "Python Script Execution"
sources:
- inputs/data/filename.csv (modified: YYYY-MM-DD)
downstream: []
---
# Data Analysis: [Dataset Name]
## Analysis Method
This analysis was performed via **[Direct LLM Calculation / Python Script Execution]**.
## Dataset Overview
| Attribute | Value |
|-----------|-------|
| Rows | N |
| Columns | N |
| Date range | [if applicable] |
## Data Dictionary
| Column | Type | Example Values | Meaning |
|--------|------|----------------|---------|
| ... | ... | ... | Explicit/Unknown |
## Key Metrics
| Metric | Value | Source |
|--------|-------|--------|
| [Metric name] | [Number] | [Direct Calculation / Python output] |
## Findings
1. **[Finding]** - Evidence: [Direct Calculation / Python output]
## Hypotheses (require validation)
1. **[Hypothesis]** - Based on: [observation]
## Visualizations
- [Chart description]: outputs/insights/[filename].png
## Sources Used
- [file paths]
## Claims Ledger
| Claim | Type | Source |
|-------|------|--------|
| [Metric] | Evidence | [Direct Calculation / Python output] |
| [Trend interpretation] | Hypothesis | [Based on metric X] |
Step 1: Gather Sources
Read files in:
- inputs/knowledge_base/ - KB article exports
- outputs/insights/voc-synthesis-*.md - VOC insights (if available, for correlation)

Step 2: Analyze Article Coverage
For each KB article (or category), note: article count, last-updated date, and complexity (the fields of the Coverage Overview table in the output).
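A minimal sketch of the per-category tally, assuming a hypothetical export layout with one subfolder per category:

```python
from collections import Counter
from pathlib import Path

def coverage_by_category(kb_dir: str) -> dict:
    """Count KB articles per category, assuming one subfolder per category
    (a hypothetical export layout; adapt to the actual KB structure)."""
    return dict(Counter(p.parent.name for p in Path(kb_dir).rglob('*.md')))
```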
Step 3: Identify Gaps
Look for: topics with user demand (e.g., VOC mentions) but no matching article, and articles with stale last-updated dates.
Step 4: Assess AI Opportunities
For each gap, evaluate:
| Opportunity Type | Criteria | Risk Level |
|---|---|---|
| Better search/IA | Hard to find articles | Low |
| Guided resolution | Multi-step process | Low-Medium |
| AI-assisted | Can be automated with citations | Medium |
| DO NOT automate | Compliance, billing, trust-sensitive | High |
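The rubric in the table above can be expressed as a small lookup; the type keys are illustrative names, not a fixed schema:

```python
# Illustrative mapping from opportunity type to risk level.
AI_OPPORTUNITY_RISK = {
    'better_search_ia':  'Low',         # articles exist but are hard to find
    'guided_resolution': 'Low-Medium',  # clear multi-step process
    'ai_assisted':       'Medium',      # automatable with citations
    'do_not_automate':   'High',        # compliance, billing, trust-sensitive
}

def risk_for(opportunity_type: str) -> str:
    """Return the risk level, defaulting to High when the type is unknown."""
    return AI_OPPORTUNITY_RISK.get(opportunity_type, 'High')
```

Defaulting unknown types to High matches the rubric's cautious stance.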
Step 5: Generate Output
Write to outputs/insights/kb-gaps-YYYY-MM-DD.md:
---
generated: YYYY-MM-DD HH:MM
skill: analyze --kb
sources:
- inputs/knowledge_base/*.md
- outputs/insights/voc-synthesis-*.md (if used)
downstream:
- outputs/roadmap/Qx-YYYY-charters.md
---
# KB Gap Analysis: [Date]
## Executive Summary
[2-3 sentences: What's the state of KB? Where are the biggest gaps?]
## Coverage Overview
| Category | Article Count | Last Updated | Complexity | Gap Score |
|----------|---------------|--------------|------------|-----------|
| [Category 1] | N | YYYY-MM-DD | Simple/Complex | High/Med/Low |
## High-Volume Topics
*Categories with most articles (signal: users struggle here)*
| Topic | Article Count | Sample Titles | VOC Correlation |
|-------|---------------|---------------|-----------------|
| [Topic] | N | [title1, title2] | [Yes/No/Unknown] |
## Missing / Outdated Articles
| Gap | Type | Evidence | Priority |
|-----|------|----------|----------|
| [Topic with no article] | Missing | VOC mentions in [file] | High |
| [Article X] | Outdated | Last updated [date] | Medium |
## AI Opportunity Assessment
### Safe to Automate (Low Risk)
| Opportunity | Type | Rationale |
|-------------|------|-----------|
| [Better search for X] | Search/IA | Articles exist but hard to find |
| [Guided wizard for Y] | Guided resolution | Clear steps, no judgment needed |
### Automate with Caution (Medium Risk)
| Opportunity | Type | Guardrails Needed |
|-------------|------|-------------------|
| [AI assist for Z] | AI-assisted | Must cite source article, human review |
### DO NOT Automate (High Risk)
| Topic | Reason |
|-------|--------|
| [Billing disputes] | Financial, requires human judgment |
| [Data deletion] | Compliance, irreversible |
| [Access control] | Trust/security sensitive |
## Recommendations
1. **[Recommendation]** - Evidence: [source]
## Sources Used
- [file paths]
## Claims Ledger
| Claim | Type | Source |
|-------|------|--------|
| [High volume in X] | Evidence | [article count] |
| [Users struggle with Y] | Evidence | [VOC file] |
| Action | Command |
|---|---|
| Load CSV | pd.read_csv('inputs/data/file.csv') |
| Load Excel | pd.read_excel('inputs/data/file.xlsx') |
| Save chart | plt.savefig('outputs/insights/output.png') |
| Check nulls | df.isnull().sum() |
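Tying the commands above together, a minimal end-to-end sketch (paths and the bar-chart choice are placeholders):

```python
import pandas as pd
import matplotlib
matplotlib.use('Agg')  # headless backend so savefig works without a display
import matplotlib.pyplot as plt

def analyze_and_chart(csv_path: str, out_png: str) -> pd.DataFrame:
    """Load a CSV, report nulls, and save a simple bar chart."""
    df = pd.read_csv(csv_path)   # or pd.read_excel for .xlsx files
    print(df.isnull().sum())     # null check before any metric work
    df.plot(kind='bar')
    plt.savefig(out_png)
    return df
```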
| Risk Level | Examples | Action |
|---|---|---|
| Low | Search improvements, FAQ bots | Safe to build |
| Medium | Troubleshooting assistants | Build with guardrails |
| High | Billing, compliance, security | Human only |
| Mode | Primary Output | History |
|---|---|---|
| --data | outputs/insights/data-analysis-YYYY-MM-DD.md | history/analyze/data/ |
| --kb | outputs/insights/kb-gaps-YYYY-MM-DD.md | history/analyze/kb/ |
| Claim | Type | Source |
|---|---|---|
| [Metric] | Evidence | [Python output] |
| [Trend interpretation] | Hypothesis | [Based on metric X] |
| [Column meaning] | Evidence/Unknown | [User confirmed / Not stated] |
| [47 articles on X] | Evidence | [KB export count] |
| [Users complain about Y] | Evidence | [VOC file:line] |
| [Safe to automate Z] | Assumption | [no compliance concern identified] |