Building a DevSecOps MCP Server - Unified Security Finding

"Why do I have to open five tabs to find out what's on fire?" -- every engineer, every Monday morning

Modern engineering orgs run a small zoo of security scanners. A dependency scanner watches your supply chain. A SAST scanner watches your source code. A container scanner watches your images. A CNAPP watches your cloud. An attack surface scanner watches your public surface. A GRC platform watches your compliance posture. Pentest vendors mail you PDFs once a quarter.

Each one is great in isolation. Together, they create a different problem: nobody has one place to look. SLA breaches hide in tabs. The same CVE is reported three times with three different statuses. "Who owns this repo?" turns into a Slack archaeology project.

This post walks through how I built a unified security dashboard to fix that, and then took it one step further: a Model Context Protocol (MCP) server that lets developers, security engineers, and CI pipelines all talk to the same dashboard from Claude, Cursor, or a GitHub Actions step. List open criticals in plain English, draft a fix as a merge request, re-query after CI to confirm the finding closed, or block a release automatically. One tool, one contract, every persona.

No browser. No tab-switching. No mental context loss between "there is a vulnerability" and "the vulnerability is closed."

1. The problem in concrete terms

Before the dashboard existed, "what is the security state of our org" required four jumps:

Log into the dependency scanner UI, filter alerts by repo, eyeball the criticals.
Log into the SAST scanner, pick the right project key (different naming convention per org), filter by vulnerability type, ignore code smells (different lifecycle), eyeball the criticals.
Cross-reference the last pentest PDF that landed in someone's inbox.
Check the CNAPP for cloud findings, with its own owner taxonomy that doesn't match the source repo.

Nothing was wrong with any of those tools individually. The problems were structural:

Problem	What it looked like
No common schema	Severity meant different things in different tools. The SAST scanner's `CRITICAL` is closer to "high"; calling it critical inflated the headline KPI by 3x.
No common identity	The "same" CVE in the dependency scanner and the container scanner was two unrelated rows. Marking one fixed told you nothing about the other.
No common ownership	Every tool guessed who owned a finding from a different signal. Nothing was authoritative.
No SLA discipline	Age was anchored to "when we first saw it," not to when the vendor first reported it. A 200-day-old alert ingested today read as zero days old.
No audit trail	When a finding disappeared, you couldn't tell if it was fixed, suppressed, or the scanner just stopped reporting.
No exec view	"How are we doing?" was a Slack DM to the security team, who built a spreadsheet by hand every Monday.

So I built a tool. The design constraints were ruthless: read-only of the source systems (I never mutate state in the upstream scanners -- those are the systems of record), one schema for everything, ownership in YAML in git (not a CMDB nobody updates), and AI assistants are first-class clients from day one.

2. Architecture at a glance

+--------------------------------------------------------------------+
|  Sources                                                           |
|  Dependency Scanner * SAST Scanner * Container Scanner             |
|  CNAPP * Attack Surface Scanner * GRC Platform * Pentest          |
+------+-------------------------------------------------------------+
       |  REST / GraphQL pulls * CI SARIF uploads * manual PDFs
       v
+--------------------------------------------------------------------+
|  Ingestion layer -- Cloud Run Jobs, one per source                 |
|  Output: full snapshot to object storage + 1 Pub/Sub message       |
|  carrying {source, poll_id, raw_uri}                               |
+------+-------------------------------------------------------------+
       v
+--------------------------------------------------------------------+
|  Normalizer -- Cloud Run Service, push-subscribed to Pub/Sub       |
|   1. Idempotency guard (UNIQUE on processed_payloads)              |
|   2. Map to unified Finding schema                                 |
|   3. Resolve owner_team from ownership.yaml                        |
|   4. UPSERT findings; emit FindingEvent on every change            |
|   5. Auto-close findings absent from N consecutive snapshots       |
|   6. Commit (atomic per snapshot)                                  |
+------+-------------------------------------------------------------+
       v
+--------------------------------------------------------------------+
|  Storage                                                           |
|   * Postgres: findings * finding_events * daily_metrics *          |
|     processed_payloads * api_tokens * risk_accepts * report_jobs   |
|   * Object storage: raw scan payloads (1-year retention)           |
+------+-------------------------------------------------------------+
       v
+--------------------------------------------------------------------+
|  Nightly rollup job -- daily_metrics                               |
|  (Powers trend charts without scanning the raw table every load.)  |
+------+-------------------------------------------------------------+
       v
+--------------------------+    +------------------------------------+
|  FastAPI (Cloud Run)     |    |  MCP Server (Cloud Run, public,   |
|  * Auth at the edge      |<---+  bearer-only)                     |
|  * RBAC per request      |    |  * Forwards caller's bearer       |
|  * One set of endpoints  |    |    token to the API               |
+--------------------------+    |  * Exposes read + gate tools      |
           v                    |  * Same RBAC, same contract       |
+--------------------------+    +------------------------------------+
|  Next.js frontend        |
|  Executive / Developer / |
|  Platform / Admin views  |
+--------------------------+

The shape is deliberately boring. Each box has one job; the seams between boxes are real network calls (not in-process function calls dressed up as a "service"), so when something is on fire I can answer "which box?" before I answer "which line?"

The stack:

Backend: FastAPI + SQLAlchemy 2.x + Alembic, Python 3.12.
Frontend: Next.js 16 (App Router, server components, Turbopack), React 19, Tailwind CSS, Recharts, React Query (5-minute stale).
Database: Postgres (managed, regional, HA).
Messaging: Pub/Sub, push subscription with dead-letter topic.
Object storage: GCS, 1-year retention on raw scan payloads.
Compute: Cloud Run (services + jobs) -- autoscale-to-zero everywhere except the API.
IaC: Terraform.
MCP: Anthropic's mcp Python SDK, streamable-http transport in production, stdio in local dev.

3. The Finding -- a single schema that scanners speak through

Everything downstream (the dashboard, the API, the MCP, the metrics) sits on one entity:

class Finding:
    id: UUID                         # uuid5(NAMESPACE, f"{source}:{native_id}")
    source: str                      # "dependency_scanner" | "sast_scanner" | "container_scanner" | ...
    native_id: str                   # per-source identifier (defined in an ADR)

    title: str
    description: str
    severity: Enum                   # critical | high | medium | low | info
    cve_id: str | None
    cwe_id: str | None

    asset_id: str                    # repo:owner/name * image:registry/path * cloud:project/...
    asset_root: str                  # used to correlate across scanners
    owner_team: str                  # resolved from ownership.yaml; "unowned" if missing
    correlation_group_id: UUID | None  # links the same CVE across scanners

    status: Enum                     # open | triaged | in_progress | fixed |
                                     # auto_closed | risk_accepted | suppressed

    first_seen_at: datetime          # when I first ingested it (provenance only)
    last_seen_at: datetime           # touched every ingest, no event emitted
    upstream_created_at: datetime    # the source's own creation time -- THE age anchor
    reopened_at: datetime | None     # set on auto_closed -> open transitions
    consecutive_misses: int          # increments per snapshot the finding is absent

    raw_payload_uri: str             # gs:// link to the original scanner JSON
    tags: list[str]

    @property
    def sla_started_at(self) -> datetime:
        # coalesce(reopened_at, upstream_created_at, first_seen_at)
        # This is what every age display, SLA check, and MTTR calculation reads.
        ...

A few decisions in there are load-bearing enough to spell out.

3.1 Identity is deterministic, not random

Finding.id = uuid5(NAMESPACE_SECDB, f"{source}:{native_id}")

The (source, native_id) pair is the sole identity contract. Every source defines its native_id explicitly (e.g. for the dependency scanner it's {owner}/{repo}#{alert_number}; for the SAST scanner it's {org}/{project_key}#{issue_key}). Same scanner, same alert, same UUID, forever, with no database round-trip required to compute it.

This single decision is responsible for the fact that ingestion is idempotent. Re-running yesterday's poll is a no-op. Pub/Sub redelivers a message and we UPSERT the same UUIDs and nothing duplicates. The cost of doing this wrong (non-deterministic IDs) is duplicated findings, broken auto-close, and an audit log that gaslights you about whether a vulnerability is the same vulnerability you closed last week.

3.2 Age is anchored upstream, not at ingest

This sounds pedantic until it bites you.

A naive dashboard sets created_at = NOW() when a finding is first ingested. So when you bring on a new source today, every finding ever reported by that source (including the one that's been open in the vendor system for 200 days) reads as "0 days old, 0 SLA breaches." The headline KPI shows your team in great shape on a day they're objectively the worst they've been all year.

The fix is to read the vendor's own creation timestamp and use that as the SLA anchor:

sla_started_at = coalesce(reopened_at, upstream_created_at, first_seen_at)

Why the coalesce? reopened_at wins if it's set, because a reopened finding gets a fresh SLA window. upstream_created_at wins next, because that's the truth. first_seen_at is the fallback for sources that don't expose a creation time at all.

The two implementations of this rule (a Python @property for response payloads and a SQL func.coalesce(...) for ORDER BY and aggregate queries) are kept in lockstep by tests. Drift here would mean the chart on the dashboard disagreed with the breach count on the same page.

3.3 Per-scanner findings, with correlation groups

When a CVE is reported by both the dependency scanner (the repo) and the container scanner (the image built from that repo), I keep two finding rows. They share a nullable correlation_group_id, and views decide whether to deduplicate.

The temptation is to merge at ingest. I didn't, for three reasons:

Independent fix lifecycles -- closing the dependency alert says nothing about whether the image was rebuilt. The container is still vulnerable until it's pulled with the new SHA.
Status-merge logic is fragile -- whose status wins when two scanners disagree? Every answer is wrong sometimes.
The gap is the point -- the delta between "source-level fixed" and "runtime-level fixed" is exactly what platform engineers need to see.

So the rule is: dedup is a view-layer concern. The Executive view dedups by correlation group for the headline KPIs. The Platform view shows everything per-scanner so the rebuild gap is visible.

3.4 Auto-close is its own status, not "fixed"

When a scanner stops reporting a finding for N consecutive polls, I transition it to auto_closed -- not fixed. Because I genuinely don't know.

| Source                    | Misses to auto-close |
|---------------------------|----------------------|
| Dependency Scanner        | 3                    |
| SAST Scanner              | 2                    |
| Container Scanner         | 3                    |
| CNAPP                     | 3                    |
| Attack Surface Scanner    | 3                    |
| GRC Platform              | 2                    |
| Pentest                   | never (manual only)  |

A finding disappearing from a snapshot could mean the developer fixed it (great), the developer suppressed it upstream (also valid, but a different state), or the scanner had a transient outage and missed a poll (I don't want to false-close on that, hence the consecutive-miss threshold).

Calling all three fixed is dishonest. Calling them auto_closed is honest, preserves the audit trail, and lets MTTR count them as closures without erasing the distinction.

If a previously auto-closed finding reappears, I transition auto_closed -> open, set reopened_at = now(), keep the original first_seen_at, restart the SLA clock from reopened_at, and emit a FindingEvent(event_type=reopened). Same UUID. Same audit log. Different SLA window.

4. Ingestion: snapshot-per-message, Pub/Sub, atomic auto-close

Each pull-mode scanner has a poller -- a Cloud Run Job triggered by Cloud Scheduler. The poller's job is tiny and deliberately stupid:

Hit the vendor's API. Follow pagination. Assemble the full response.
Write it to gs://secdb-raw/{source}/{poll_id}.json.
Publish one Pub/Sub message: {source, poll_id, raw_uri}.

That's the entire poller. No parsing, no DB writes, no business logic. It's the smallest possible thing that could be replaced when (not if) the vendor's API changes.

The normalizer is a Cloud Run Service, push-subscribed to the ingest topic. On every message it runs one DB transaction:

1. INSERT processed_payloads(source, poll_id)
   ON CONFLICT DO NOTHING
   -- if the row already exists, ack and return (Pub/Sub redelivery, work is done)

2. Download the raw payload from object storage and parse it.

3. UPSERT every finding in the snapshot:
   - new                -> INSERT, emit `discovered`, set first_seen_at = now
   - reopened           -> status auto_closed -> open, emit `reopened`,
                          set reopened_at = now, preserve first_seen_at
   - per-field changed  -> emit one typed FindingEvent per (severity / status /
                          title / description) change
   - always             -> last_seen_at = now, consecutive_misses = 0

4. ABSENT findings of the same source (still in an open status):
   - consecutive_misses += 1 (bulk UPDATE, no events)

5. Findings whose consecutive_misses crossed the source's threshold:
   - status -> auto_closed, emit `auto_closed`

6. Commit.

Two things that matter about this:

Auto-close runs inside the snapshot transaction. Not as a nightly cron. The set of findings present in the snapshot and the set of findings not present are computed against the same point in time, atomically. There is no window where I've ingested the snapshot but haven't yet aged the absent ones.

Idempotency is the cheapest possible thing. UNIQUE(source, poll_id) on processed_payloads. Pub/Sub redelivers a message, INSERT ... ON CONFLICT DO NOTHING returns no row, I ack and exit. The whole guard is one indexed insert.

If the transaction fails (malformed payload, vendor schema change, transient DB error), the message NACKs. Pub/Sub redelivers up to five times, then sends it to a dead-letter topic. A separate Cloud Run Service subscribed to the DLQ pings my webhook so on-call sees it. The raw payload is in object storage for a year, so replay is always trivial.

4.1 The "snapshot vs. event" choice

I considered publishing one message per finding instead of one per snapshot. It's more "stream-native" and parallelises trivially. I rejected it for one specific reason: auto-close becomes a coordination problem. Per-finding messages mean the normalizer has to know "have all the messages from poll P been processed?" before it can decide what's absent. That requires a poll_complete sentinel or sequence numbers, neither of which Pub/Sub does natively. Per-snapshot makes the whole problem disappear.

The downside: a poisoned snapshot (malformed payload, schema drift) takes the whole poll to the DLQ instead of degrading gracefully. I pay that cost in exchange for atomic per-poll semantics and replay-from-object-storage.

5. The connectors -- what each scanner actually looks like in code

The shape of every connector is identical: poll API, write raw, publish message. What changes per source is the mapper that converts the vendor's JSON into the unified Finding schema.

5.1 Dependency Scanner

GET /orgs/{org}/dependency/alerts?state=open&per_page=100

Native ID: dependency_scanner:{owner}/{repo}#{alert_number}. Alert numbers are assigned by the upstream system and stable across rescans, so the derived UUID stays put.

Severity comes straight off the advisory severity field and maps 1:1 to my enum.

Owner team resolution: the alert's repository.full_name (owner/repo) is looked up in ownership.yaml.assets[*] where id = "repo:owner/repo". Hit stamps owner_team. Miss stamps unowned and the finding shows up on the Admin view's triage list.

5.2 SAST Scanner

This is the messiest connector and is worth dwelling on.

GET {SAST_BASE_URL}/api/issues/search
    ?organization={org}
    &types=VULNERABILITY
    &statuses=OPEN,CONFIRMED,REOPENED
    &ps=500&p=N

Three things to highlight:

(a) Severity is intentionally remapped down. The SAST scanner's CRITICAL tier is broader than mine -- it includes things I'd call "high." Promoting it would inflate the headline open_criticals KPI. The mapper uses a conservative table:

BLOCKER  -> critical
CRITICAL -> high       <- deliberately demoted
MAJOR    -> medium
MINOR    -> low
INFO     -> info

Changing that table changes a load-bearing metric. It requires an Architecture Decision Record edit, not a casual diff. I pin it with a test.

(b) Multi-organization support is real. Some orgs run more than one SAST organization (different billing units, acquisitions, etc.). The poller takes a CSV environment variable of org keys and runs them sequentially, publishing one message per org. Each message carries its own organization field so the mapper can prefix native IDs correctly.

(c) External issues get split into a separate logical source. The SAST scanner lets teams import third-party scanner reports as "external issues." If a team has done this, container scanner CVE entries surface on the same issues search response as native SAST rule violations, indistinguishable by type (both report as VULNERABILITY).

I detect them with two signals (either is sufficient):

The rule field matches external_<engineId>:<ruleId> where the engine portion contains a known container scanner identifier.
The issue's message contains a deep-link URL that the container scanner hard-codes in every CVE finding.

Findings that match get routed to a separate logical source (sast_scanner_external) that is admin-only by default. They're excluded from every team-facing view so that turning on external-issue imports doesn't suddenly inflate every team's posture overnight. Admins see them on a dedicated triage section.

The deterministic UUID is source-stable -- derived from the SAST scanner namespace regardless of which side of the split the finding lands on. So when the detection logic improves on a future deploy, an existing row that's now recognisable as a container finding gets reclassified in place on the next poll, with no history loss.

5.3 Container Scanner (CI-uploaded SARIF)

The poller pattern doesn't fit the container scanner because it runs in CI, not against an API. Instead:

CI uploads the SARIF artifact to gs://secdb-raw/container/{poll_id}.sarif.
A GCS notification publishes a Pub/Sub message to the same ingest topic.
The normalizer recognises the source and routes to the container scanner mapper.

Same downstream shape, different on-ramp. The seam is honest because the ingest topic only ever carries {source, poll_id, raw_uri} -- it doesn't care whether raw_uri was put there by a poller or by a CI job.

5.4 CNAPP, Attack Surface Scanner, GRC Platform, Pentest

The other connectors follow the same two patterns. The CNAPP and GRC platform are pollers (snapshot-per-poll). The attack surface scanner is CI-uploaded (snapshot-per-CI-run). Pentest reports are uploaded manually through the dashboard UI; risk-accept is the only lifecycle operation allowed on them directly in the dashboard, because pentest findings have no upstream system of record to defer to.

6. Ownership in YAML, not in a CMDB

Every finding gets stamped with an owner_team at normalization time. The source of truth is ownership.yaml, versioned in git:

pillars:
  product-a:
    name: "Product A"
  product-b:
    name: "Product B"
  platform:
    name: "Platform"

assets:
  - id: repo:org/orders-api
    type: github_repo
    team: orders-team

  - id: repo:org/payments-monorepo
    type: github_repo
    team: payments-shared      # first-wins for repos owned by multiple teams

  - id: sast_proj:org-billing-service
    type: sast_project         # synthetic asset for monorepos with
                               # per-service SAST project keys
    team: billing-team

  - id: image:registry/org/orders-api
    type: container_image
    team: orders-team

teams:
  orders-team:
    pillar: product-a
    display_name: "Orders"
    slack: "#orders-team"      # optional opt-in for finding alerts
    engineering_manager:
      name: "Engineering Manager Name"
      email: "em@org.example.com"
    members:
      - dev1@org.example.com
      - dev2@org.example.com

A few things this gives for free:

Auditable. Every change is a PR with a reviewer and a diff. No CMDB drift, no Slack DM that nobody remembers.

One source of truth. The same YAML feeds the dashboard's owner column, the developer view's team filter, the per-team metrics, and the routing of opt-in Slack alerts.

Pluggable. Adding a metadata field (Jira project, on-call rotation, security PoC, anything) is a YAML edit and a loader change, not a schema migration.

Tells the truth about gaps. Assets not in the YAML get stamped owner_team = "unowned" -- never silently dropped. They surface on the Admin view with a clear resolution path: add an entry, redeploy. The alternative ("just guess from the repo name") is wrong in roughly the same fraction of cases that you'd want to know about.

When ownership.yaml changes (which is a Terraform-mediated config refresh), a re-resolution job replays existing findings against the new map and emits FindingEvent(event_type=ownership_changed) for every row that moves. So the audit trail shows you not just who owns this today but who used to own this before the takeover.

7. The three (well, four) audience views

A unified backend is only half the win. The other half is stop shipping one dashboard for everyone. The C-suite, a developer, and a platform engineer all need to look at the same data with completely different shapes.

7.1 Executive view

One screen. No scrolling. No row-level finding tables. No filters. Read-only.

A pillar selector at the top splits the org into product pillars; the selector pills carry per-pillar open-critical counts on the chip itself, so the split is visible at a glance. A KPI strip shows open criticals (with week-over-week delta), SLA compliance %, and MTTR for criticals (rolling 30 days). A 90-day stacked-area trend chart shows open findings by severity. A ranked offender list covers every sub-team in the pillar with a "By team / By service" toggle.

Auto-refreshes every five minutes silently. Designed to live on an always-on screen in a security director's office.

7.2 Developer view

Charts-first. The findings table is behind a <details> disclosure so the page is fast and quiet by default; clicking a card auto-expands the table pre-filtered to that slice.

The lead row is three independent per-source ratings -- SAST Scanner, Dependency Scanner, Pentest -- each an A-D grade. Developers think per-source: the daily question is not "what's my composite letter" but "how is my team doing on SAST this week, and is anything still open from last quarter's pentest?"

The high-volume code scanners share one ladder:

A: 0 criticals AND 0 highs
B: <= 2 criticals AND <= 10 highs
C: <= 10 criticals AND <= 30 highs
D: > 10 criticals OR > 30 highs

Pentest gets a stricter, severity-weighted ladder (every pentest finding is a deliberate professional alarm -- a single critical lands the team at C, not D):

A: 0c AND 0h AND 0m
B: 0c AND 0h AND <= 2m
C: 1c OR 1-2h OR >= 3m
D: >= 2c OR >= 3h

Default filter across every findings table: severity in {critical, high}. Lower severities are reachable via a row of severity pills or "Clear all filters." Dashboards drive remediation; remediation is critical and high. The default-everything view buries the signal.

7.3 Platform view

Identical shape to the Developer view (same <PosturePanel> component, parameterised by team scope, header text, and deep-link base path). Different data -- hard-scoped to the platform engineering teams' assets. The two views looking and reading the same is intentional: developers and platform engineers should never have to relearn the UI when they switch context.

7.4 Admin view

Operational triage surface for the security team: an unowned posture KPI strip, a scanner health grid (last seen, expected cadence, status), and triage disclosures for unowned findings and admin-only external issues.

The admin role is a per-email superuser flag -- holders see org-wide data on every other view too. It's not a separate scope; it's a capability that widens the data they can see.

8. Auth at the edge -- and how AI assistants get in

8.1 Browser auth

Every browser request hits an Identity-Aware Proxy at the cloud load balancer. The IAP allowlist (managed in Terraform) is the gate. Beyond IAP, the only authorization signal in code is a per-email is_admin flag set in a YAML config in git. Two-tier RBAC: can you reach the dashboard at all? (IAP) and can you see admin-only surfaces? (is_admin). There is no Role enum, no X-Active-Role header, no in-database role table.

The simplicity is the feature. Role models grow weed-like complexity over time. I resisted by deleting the role model entirely and pushing all access-control nuance to two places: the network edge (IAP allowlist) and the surface (each page chooses its own scope). The result is RBAC code that fits on a sticky note.

8.2 Non-browser auth (the MCP path)

AI assistants can't complete a browser login flow. So I need a separate credential -- and I want it to be self-service, so any user with dashboard access can wire up Claude or Cursor without filing a ticket.

The mechanism is personal API tokens:

Generated in the dashboard at /settings/mcp. Any authenticated user can mint their own.
Format: secdb_live_<urlsafe-32-bytes> -- high entropy, not user-chosen.
Stored as SHA-256 hex in api_tokens.token_hash (bcrypt is unnecessary for random 256-bit secrets and would just slow lookups).
Maximum 5 active tokens per user, optional expiry (never / 30 / 90 / 365 days / custom), revocable instantly.
Plaintext shown once at creation; the list endpoint only ever returns the 12-character prefix so users can tell tokens apart.

The API's identity verifier is a layered chain:

1. If Authorization: Bearer secdb_live_... -> look up SHA-256 in api_tokens.
   Re-derive admin status from the *current* admin_emails list against the
   token owner's email on every request.
2. Else (browser request) -> trust the IAP-injected identity header.

The crucial property: admin status tracks current membership, not a frozen snapshot. If I remove someone from the admin list, every one of their outstanding tokens is downgraded on the very next call. No revocation race, no snapshot drift.

8.3 Hosted MCP -- public ingress, bearer-only, no fallback

The hosted MCP service runs on Cloud Run with public ingress (no IAP -- IAP is browser-only and MCP clients can't do the OIDC dance). It accepts streamable-http MCP transport and forwards the caller's bearer token to the API.

The single most important line in the MCP client:

def _headers(self, bearer: str | None) -> dict[str, str]:
    if bearer:
        return {"Authorization": f"Bearer {bearer}", "Accept": "application/json"}
    if hosted_mode():
        # Falling back to X-Dev-Identity in hosted mode would let any
        # unauthenticated caller impersonate the default DEV_IDENTITY_EMAIL
        # against the API. Public ingress is only acceptable because of
        # this guard.
        raise PermissionError(
            "Authorization: Bearer secdb_live_... is required for hosted MCP."
        )
    return {"X-Dev-Identity": _dev_identity_header(), "Accept": "application/json"}

In hosted mode, no bearer = no call. The local stdio path keeps the dev-identity envelope so engineers running the stack on their laptop don't need to mint a token, but that path is unreachable from the public URL.

9. The MCP server -- one tool, every persona

This is where the dashboard stops being a destination and starts being plumbing for the tools people already live in. The design goal is deliberately ambitious: the same MCP server, the same four tools, serves a developer in Cursor, a security engineer running Claude from the CLI, and a CI pipeline in GitHub Actions -- without any of them needing a different contract or a different integration.

9.1 The tools

Phase 1 is intentionally small. Four read-only tools:

@mcp.tool()
async def get_findings(
    ctx: Context,
    team: str | None = None,
    repo: str | None = None,
    asset: str | None = None,
    severity: list[Severity] | None = None,
    status: list[Status] | None = None,
    source: str | None = None,
    limit: int = 50,
) -> dict[str, Any]:
    """List open security findings. Filters combine with AND;
    multi-value filters combine with OR within themselves."""
    ...

@mcp.tool()
async def get_sla_breaches(
    ctx: Context, team: str | None = None, limit: int = 50
) -> dict[str, Any]:
    """List open critical+high findings past their SLA window,
    sorted by age descending."""
    ...

@mcp.tool()
async def get_scanner_health(ctx: Context) -> dict[str, Any]:
    """Per-source status: active / stale / dark / no_data."""
    ...

@mcp.tool()
async def get_metrics_summary(
    ctx: Context, team: str | None = None, trend_days: int = 30
) -> dict[str, Any]:
    """Headline KPIs (open_criticals, sla_compliance_pct, mttr) plus
    a per-severity trend over the last N days. Optionally narrow by team."""
    ...

Each one is a thin wrapper over an existing API endpoint. The MCP layer adds zero business logic -- it's a translation surface, not a second source of truth. The same RBAC and the same default visibility rules apply because the same API enforces them. A developer calling get_findings sees only their team's assets. A security engineer with admin rights sees the whole org. The token determines the scope; the tool is identical.

9.2 Setup is under a minute

The dashboard renders copy-paste config snippets at /settings/mcp for each supported client. The Cursor variant is one file:

{
  "mcpServers": {
    "secdb": {
      "url": "https://mcp.your-org.example.com/mcp",
      "headers": {
        "Authorization": "Bearer secdb_live_YOUR_TOKEN_HERE"
      }
    }
  }
}

Drop that into .cursor/mcp.json, reload the window, and the four tools above show up in the assistant. The Claude Desktop variant is one JSON block in claude_desktop_config.json. The same server, different client config.

9.3 The agentic loop in practice

The reason the MCP approach is worth the investment is that it collapses three separate workflows into one continuous session. Here's what that looks like concretely:

The developer session (Cursor or Claude Desktop):

You: what are my open critical findings on the orders-api repo?

Cursor (calls get_findings(repo="org/orders-api", severity=["critical"])): You have three open criticals on org/orders-api:

CVE-2025-XXXXX in lib-foo@2.4.1 -- Dependency Scanner, opened 14 days ago, SLA breach in 3 days. Fix available in 2.4.3.

CWE-79 (Cross-site Scripting) in src/handlers/feedback.ts:42 -- SAST Scanner, opened 6 days ago.

Pentest finding "Unauthenticated GraphQL introspection" -- Pentest 2026-Q1, opened 47 days ago, already past SLA.

You: let's fix the dependency scanner one. open a branch and bump the dep.

Cursor: Created branch security/cve-2025-xxxxx-lib-foo. Bumped lib-foo from 2.4.1 to 2.4.3 in package.json and refreshed the lockfile. Ran the test suite -- all green. Want me to open the PR?

You: yes, link the finding ID in the description and tag the security team.

Cursor: Opened PR #847: "chore(deps): bump lib-foo to 2.4.3 (closes finding secdb:c91f...)" Added the security label and requested review from @security-team. The PR description includes the dashboard deep-link.

After the PR merges and CI redeploys, the dependency scanner rescan triggers, the alert closes upstream, and the next snapshot picks up the absence. After three consecutive misses, the finding auto-closes in the dashboard with a FindingEvent(event_type=auto_closed) in the audit log.

But the developer doesn't have to wait for that. They can re-query immediately:

You: is finding secdb:c91f... still open?

Cursor (calls get_findings(...) filtered to that finding's slice): The finding is still listed as open, but the underlying alert is now in dismissed/fixed state. The dashboard will auto-close it after three consecutive misses from the poller (next ingest cycle in ~45 min). I can ping you again in an hour, or you can re-run this query then.

The security engineer session (Claude CLI, org-wide scope):

A security engineer running Claude with admin token access uses the exact same tools, but sees the full org:

You: which teams have the most SLA breaches right now?

Claude (calls get_sla_breaches()): Four teams are currently over SLA on criticals: payments-shared (6 findings, oldest 43 days), orders-team (3 findings), infra (2 findings), unowned (11 findings -- no team assigned yet).

You: what's the scanner health looking like?

Claude (calls get_scanner_health()): The SAST scanner last polled 2 hours ago (active). Container scanner last seen 19 hours ago (stale -- expected cadence is 6h, CI job may have failed). GRC Platform last seen 4 hours ago (active). Dependency Scanner last seen 1 hour ago (active).

The security engineer gets a Monday-morning org posture report without opening a single browser tab.

The CI pipeline (GitHub Actions, automated gate):

Phase 2 closes the loop with one new tool, called from CI:

@mcp.tool()
async def gate_release(
    ctx: Context,
    service: str,
    commit_sha: str,
) -> dict[str, Any]:
    """Returns the open critical+high findings against the service.
    Caller (GitHub Actions) treats a non-empty result as a release block
    unless every finding has an active risk_accept or is on an allow-list."""

A release pipeline calls gate_release on every push to a long-lived branch. If the result is empty (or every finding is covered by a risk-accept), the gate passes. If not, the developer gets a deep-link to the dashboard and a clear list of "you need to acknowledge or fix these N findings before this release goes out."

The point of placing the gate at the MCP layer rather than as a bespoke GitHub Action talking directly to the API: the same tool the developer uses to draft the fix is the tool the pipeline uses to verify it. There's no second contract to keep in sync. If the rules change (new severity floor, new exemption mechanism), they change in one place.

9.4 Why "one tool, many people" is the real win

A conventional approach would wire each persona to a different system: the developer consults the dashboard UI, the security engineer builds a spreadsheet, the CI pipeline hits a bespoke API endpoint. Each integration drifts independently. Each one needs its own auth story. Each one needs to be updated when the underlying data model changes.

The MCP approach collapses all three into a single versioned contract. When I add a new field to get_findings, every client gains it simultaneously -- the developer in Cursor, the security engineer in Claude, and the GitHub Actions step. When I tighten the RBAC rules, every token type respects them because they all flow through the same API. The blast radius of a change goes down; the consistency guarantee goes up.

The practical consequence: I can give any new engineer a secdb_live_... token, point them at the MCP config snippet, and they're querying the full security posture of their team in under a minute, from whatever AI assistant they already have open, with no training on a new UI.

10. The Postgres schema, in one view

For the curious, here's the storage shape that everything sits on:

-- Findings: one row per scanner per native finding ID.
-- Identity is deterministic: id = uuid5(NAMESPACE, source || ':' || native_id).
CREATE TABLE findings (
    id                    UUID         PRIMARY KEY,
    source                VARCHAR(64)  NOT NULL,
    native_id             VARCHAR(512) NOT NULL,
    title                 TEXT         NOT NULL,
    description           TEXT         NOT NULL DEFAULT '',
    severity              severity     NOT NULL,   -- enum
    cve_id                VARCHAR(64),
    cwe_id                VARCHAR(64),
    asset_id              VARCHAR(512) NOT NULL,
    asset_type            VARCHAR(64)  NOT NULL,
    asset_root            VARCHAR(512) NOT NULL,
    asset_display         VARCHAR(512) NOT NULL DEFAULT '',
    owner_team            VARCHAR(128) NOT NULL DEFAULT 'unowned',
    correlation_group_id  UUID,
    status                status       NOT NULL DEFAULT 'open',   -- enum
    first_seen_at         TIMESTAMPTZ  NOT NULL DEFAULT now(),
    last_seen_at          TIMESTAMPTZ  NOT NULL DEFAULT now(),
    upstream_created_at   TIMESTAMPTZ,
    reopened_at           TIMESTAMPTZ,
    consecutive_misses    INTEGER      NOT NULL DEFAULT 0,
    raw_payload_uri       TEXT,
    tags                  TEXT[]       NOT NULL DEFAULT '{}',

    UNIQUE (source, native_id)
);

-- Append-only event log. Every state change emits one row.
CREATE TABLE finding_events (
    id          BIGSERIAL    PRIMARY KEY,
    finding_id  UUID         NOT NULL REFERENCES findings(id) ON DELETE CASCADE,
    event_type  event_type   NOT NULL,   -- discovered | severity_changed |
                                         -- status_changed | fixed | suppressed |
                                         -- auto_closed | reopened |
                                         -- ownership_changed
    from_value  TEXT,
    to_value    TEXT         NOT NULL,
    occurred_at TIMESTAMPTZ  NOT NULL DEFAULT now(),
    actor       TEXT         NOT NULL,   -- "system:dependency_scanner_poll" |
                                         -- "system:auto_close" | user email
    reason      TEXT
);

-- Idempotency for Pub/Sub: redelivery is a cheap conflict-on-insert.
CREATE TABLE processed_payloads (
    id              UUID         PRIMARY KEY DEFAULT gen_random_uuid(),
    source          VARCHAR(64)  NOT NULL,
    poll_id         UUID         NOT NULL,
    raw_uri         TEXT         NOT NULL,
    finding_count   INTEGER      NOT NULL,
    processed_at    TIMESTAMPTZ  NOT NULL DEFAULT now(),
    CONSTRAINT uq_processed_payloads_source_poll UNIQUE (source, poll_id)
);

-- Pre-aggregated rollups for fast trend charts.
CREATE TABLE daily_metrics (
    date                DATE         NOT NULL,
    team                VARCHAR(128) NOT NULL,
    source              VARCHAR(64)  NOT NULL,
    severity            severity     NOT NULL,
    open_count          INTEGER      NOT NULL,
    new_count           INTEGER      NOT NULL,
    fixed_count         INTEGER      NOT NULL,
    sla_breached_count  INTEGER      NOT NULL,
    median_age_days     NUMERIC(8,2) NOT NULL,
    PRIMARY KEY (date, team, source, severity)
);

-- Personal API tokens for non-browser clients (MCP).
CREATE TABLE api_tokens (
    id            UUID         PRIMARY KEY DEFAULT gen_random_uuid(),
    email         TEXT         NOT NULL,
    label         TEXT         NOT NULL,
    prefix        VARCHAR(12)  NOT NULL,
    token_hash    VARCHAR(64)  NOT NULL UNIQUE,   -- sha256 hex
    created_at    TIMESTAMPTZ  NOT NULL DEFAULT now(),
    expires_at    TIMESTAMPTZ,
    revoked_at    TIMESTAMPTZ,
    last_used_at  TIMESTAMPTZ
);

-- Dashboard-side risk accepts. Only valid for pentest findings --
-- everything else defers to upstream suppression.
CREATE TABLE risk_accepts (
    finding_id   UUID         PRIMARY KEY REFERENCES findings(id) ON DELETE CASCADE,
    reason       TEXT         NOT NULL,
    accepted_by  TEXT         NOT NULL,
    expires_at   DATE         NOT NULL,
    created_at   TIMESTAMPTZ  NOT NULL DEFAULT now()
);

The rollup table is what keeps the trend chart fast: a nightly Cloud Run Job at ~01:00 UTC aggregates the previous day's findings and events into one row per (date, team, source, severity). The chart reads from daily_metrics instead of scanning the raw findings table on every page load.

11. On-demand vendor reports

Adjacent to the dashboard, I also ship on-demand PDF reports for the SAST scanner -- generated from open-source reporting tooling that exists in the security community but is awkward to run by hand.

Any authenticated user picks an organization, picks projects from a live API list, picks a report type, and clicks Generate. The backend inserts a report_jobs row, kicks off the job in a background thread, and the page polls until the artifact is ready. Artifacts live in object storage with 7-day retention. Rate limit: 5 in-flight jobs per requesting email.

The interesting part is not the feature itself; it's the pattern. Take a thing that technically worked in isolation (a Streamlit prototype, in this case), and fold it into the same auth, the same UI, the same audit log, the same operational story.

12. What I deliberately said no to

Every system is partly defined by what it refuses to do.

I skipped Jira integration (ticketing is a downstream team workflow choice, not a dashboard concern), multi-tenancy (single internal tool for a single org, no current value in that complexity), and agentic write operations in Phase 1 (the MCP server stays read-only until I have real usage data on what Phase 2 actually needs to mutate). I also skipped auto-remediation (even obvious fixes like a version bump are a team-level intentionality question, not an automated one) and a bespoke notification system (system alerts go to one webhook; per-team alerts are opt-in via ownership.yaml).

The watchword throughout: be the dashboard, not the orchestrator. Other tools are already great at their jobs. I make their output legible together.

13. The takeaway

The interesting thing about this project isn't any single technology choice. It's how cheap the AI-as-CLI-client experience becomes once you've done the unglamorous middle work:

One deterministic identity contract per finding so every downstream computation is idempotent and replay-safe.
One normalizer that's atomic per snapshot, with auto-close inside the same transaction so there's never a window of inconsistent state.
One API with a tiny, ruthless RBAC model (email is in the list or it isn't).
One token mechanism that lets any user wire an AI assistant up in 60 seconds without filing a ticket.
One MCP wrapper that's a translation surface (zero business logic, mirrors the API one-to-one) and scales across every persona that needs to query security posture -- the developer mid-sprint, the security engineer on Monday morning, and the CI pipeline before a release.

Once those are in place, the developer experience changes shape. Security stops being a tab. It stops being a Monday-morning Slack ping. It becomes a question you ask the assistant that's already open, in plain English, while you're already in the file you'd need to edit. The fix is a branch and a PR away. The verification is a re-query away.

That's the goal: friction down, signal up. The dashboard exists to give humans one place to look and to give their AI assistants one place to ask.

Thanks for reading. If you're working on something in this space and want to compare notes, the comments are open.

Command Palette