The promise of spec-first API development is that the OpenAPI spec becomes the single source of truth, and reference documentation falls out automatically. In practice, it's closer to 70% automated. A generated reference page is a strong foundation, but it consistently falls short in the same four places: missing examples, incomplete error response documentation, prose descriptions that read like someone copy-pasted field names from the schema into sentences, and authentication documentation that names the scheme without explaining how to use it. If you publish unedited generated docs, you're shipping a partial document and calling it complete.
This isn't a criticism of OpenAPI spec generators or documentation tooling — it's a structural property of what a machine can infer from a schema versus what developers actually need to use an API correctly. The spec describes the shape of data; it can't describe intent, side effects, or the gotchas that only appear in production traffic.
What Generates Well: The Structural Skeleton
Spec-to-docs generation handles the structural parts reliably. Endpoint paths, HTTP methods, path and query parameters, request body schemas, response schemas, and authentication requirements (if the spec's securitySchemes and security fields are correctly populated) all generate accurately from a well-formed OpenAPI 3.x spec. If your spec uses $ref components correctly for reusable schemas, generators will resolve them and produce inline type tables without duplication.
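To make the $ref resolution step concrete, here is a minimal sketch of how a generator might inline local references before rendering a type table. It is not any particular tool's implementation: the function name resolve_refs, the Money schema, and the field names are assumptions for illustration, and only local "#/..." pointers are handled.

```python
from typing import Any


def resolve_refs(node: Any, root: dict, _seen: tuple = ()) -> Any:
    """Return a copy of `node` with local '#/...' $ref pointers inlined."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/"):
            if ref in _seen:
                # Leave recursive schemas as a pointer instead of looping forever.
                return {"$ref": ref}
            target: Any = root
            for part in ref[2:].split("/"):
                # JSON Pointer unescaping: "~1" is "/", "~0" is "~".
                target = target[part.replace("~1", "/").replace("~0", "~")]
            return resolve_refs(target, root, _seen + (ref,))
        return {key: resolve_refs(value, root, _seen) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(value, root, _seen) for value in node]
    return node


# Hypothetical spec fragment: Invoice reuses a shared Money schema.
spec = {
    "components": {
        "schemas": {
            "Money": {
                "type": "object",
                "properties": {
                    "amount": {"type": "integer"},
                    "currency": {"type": "string"},
                },
            },
            "Invoice": {
                "type": "object",
                "properties": {
                    "id": {"type": "string", "format": "uuid"},
                    "total": {"$ref": "#/components/schemas/Money"},
                },
            },
        }
    }
}

invoice = resolve_refs({"$ref": "#/components/schemas/Invoice"}, spec)
```

After resolution, invoice["properties"]["total"] holds the full Money schema, which is what lets a generator print one complete table per model while the spec defines Money only once.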
The generated output is especially strong for data models. A components/schemas/Invoice with 20 fields, each with a type, format, and brief description, turns into a readable table with minimal post-processing. This is the highest-ROI part of the spec-first workflow — defining your models once in the spec and having them appear consistently across every endpoint that references them.
Format attributes are also handled well: format: date-time generates as "ISO 8601 datetime string" in most tools, format: uuid as UUID. If your spec uses enum values, they appear as enumeration tables. If you've added minimum and maximum constraints to numeric fields, those surface in the parameter documentation. This is all useful and correct — generators don't need any help here.
What Breaks: The Four Consistent Failure Modes
Missing or thin request/response examples
The most common gap is examples. The OpenAPI spec supports example and examples fields on parameters, request bodies, and response objects, but many teams don't populate them when writing the spec. The generator's fallback — synthesizing an example from the schema types — produces valid-looking but semantically empty output. A synthesized example for a POST /v1/invoices request body might show "customer_id": "string" when what a developer actually needs is "customer_id": "cust_abc123" with a note that this is the ID from the GET /v1/customers response.
The fix is to add examples blocks to the spec for every non-trivial endpoint. This is the single highest-impact doc improvement per hour of work. A real example in the spec improves the generated output and can also improve SDK generation that reads the spec, since some generators carry example values into test fixtures.
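As a rough sketch of what "populated" looks like, the fragment below writes two operations as Python dicts (the same structure a YAML spec parses into) and lists the operations whose request bodies carry no example. The function names, the basic example key, and the second operation are assumptions for illustration; the check also ignores request bodies defined through $ref.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def has_example(media_type: dict) -> bool:
    """True if a media type object carries an `example` or a non-empty `examples`."""
    return "example" in media_type or bool(media_type.get("examples"))


def request_bodies_missing_examples(spec: dict) -> list[str]:
    """List 'METHOD /path' for operations whose request body has no example."""
    missing = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            content = op.get("requestBody", {}).get("content", {})
            if content and not any(has_example(mt) for mt in content.values()):
                missing.append(f"{method.upper()} {path}")
    return sorted(missing)


# Hypothetical spec fragment: one operation with a real example, one without.
spec = {
    "paths": {
        "/v1/invoices": {
            "post": {
                "requestBody": {
                    "content": {
                        "application/json": {
                            "schema": {"type": "object"},
                            "examples": {
                                "basic": {
                                    "summary": "Invoice for an existing customer",
                                    "value": {"customer_id": "cust_abc123"},
                                }
                            },
                        }
                    }
                }
            }
        },
        "/v1/charges": {
            "post": {
                "requestBody": {
                    "content": {"application/json": {"schema": {"type": "object"}}}
                }
            }
        },
    }
}

missing = request_bodies_missing_examples(spec)
```

Run against a full spec, a list like this gives a concrete backlog of example work rather than a general intention to improve the examples.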
Inaccurate or incomplete error response documentation
Most specs document the happy path thoroughly and treat error responses as an afterthought. A POST /v1/charges endpoint that can return 200, 400, 401, 402, 422, and 429 often has only the 200 response defined in the spec. The 422 response body — which carries actionable validation errors a partner needs to surface to their own users — is either missing or documented with a generic {"error": "string"} schema that tells you nothing about the actual error fields your API returns.
The correct spec for an error response includes the full error schema as a $ref component: the code field (machine-readable error type), the message field (human-readable), and any structured detail fields like field_errors for validation failures. Documenting this once in the spec's components/responses section and referencing it across all relevant endpoints means the generated docs are accurate for every endpoint simultaneously.
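The sketch below shows one possible shape for that shared error component, written as a Python dict, plus a small check that compares the status codes an operation documents against the codes the team knows it can return. The component names Error and ValidationError, the function undocumented_statuses, and the shape of field_errors are assumptions. The expected status list has to be supplied by hand, because the spec cannot know what the implementation actually returns.

```python
# Hypothetical spec fragment: one shared error schema, referenced by a
# reusable response that any operation can point to.
spec = {
    "components": {
        "schemas": {
            "Error": {
                "type": "object",
                "required": ["code", "message"],
                "properties": {
                    "code": {"type": "string", "description": "Machine-readable error type."},
                    "message": {"type": "string", "description": "Human-readable explanation."},
                    "field_errors": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "field": {"type": "string"},
                                "message": {"type": "string"},
                            },
                        },
                    },
                },
            }
        },
        "responses": {
            "ValidationError": {
                "description": "The request body failed validation.",
                "content": {
                    "application/json": {
                        "schema": {"$ref": "#/components/schemas/Error"}
                    }
                },
            }
        },
    },
    "paths": {
        "/v1/charges": {
            "post": {
                "responses": {
                    "200": {"description": "Charge created."},
                    "422": {"$ref": "#/components/responses/ValidationError"},
                }
            }
        }
    },
}


def undocumented_statuses(spec: dict, path: str, method: str, expected: set[str]) -> list[str]:
    """Return the expected status codes that the operation does not document."""
    documented = set(spec["paths"][path][method].get("responses", {}))
    return sorted(expected - documented)


# Status codes the team knows POST /v1/charges can return (supplied by hand).
gaps = undocumented_statuses(
    spec, "/v1/charges", "post", {"200", "400", "401", "402", "422", "429"}
)
```

Here gaps comes back as the four codes still missing from the spec, so the remaining work is visible per endpoint.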
Prose descriptions that read like YAML comments
The description fields in an OpenAPI spec are typically written by engineers while writing the spec, often in a hurry and often as terse field annotations rather than documentation prose. The generator renders these verbatim. "The customer ID" is not a useful description of the customer_id field in an API reference. A useful one reads: "The ID of the customer to attach this invoice to, returned by POST /v1/customers or GET /v1/customers. Required for all invoice creation calls."
The distinction matters more for endpoint-level descriptions than for individual field descriptions. The generated endpoint description needs to communicate what the operation does, what side effects it triggers (does creating an invoice immediately send an email to the customer? does it lock inventory?), and what the idempotency behavior is. None of this is inferred from the schema.
Authentication documentation that's structural, not operational
Generated docs accurately reflect the securitySchemes definition: "This API uses Bearer token authentication." They don't tell partners where to get the token, what scopes are required for each endpoint, how long tokens are valid, or what the error response looks like when a token is missing versus when it's valid but lacks the required scope (typically a 401 in the first case and a 403 in the second). This is the information partners need in the first 30 minutes of integration, and most of it can't be expressed in the spec: it requires prose sections that surround the generated reference.
The Post-Processing Workflow That Works
The workflow that closes the gap without abandoning the spec-first discipline: generate the reference docs, then layer prose annotations on top without modifying the spec or the generated output directly. Most documentation platforms allow markdown or HTML description overrides per-endpoint. The right approach is to define a "documentation layer" — a set of prose annotations keyed to operation IDs — that gets merged with the generated output at publish time.
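A documentation layer of this kind can be sketched in a few lines. The version below assumes the annotations are a mapping from operationId to prose and that the merge runs on the parsed spec at publish time. The function name merge_annotations, the operation IDs, and the annotation text are hypothetical, and real documentation platforms expose their own override mechanisms.

```python
import copy

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def merge_annotations(spec: dict, annotations: dict[str, str]) -> tuple[dict, list[str]]:
    """Overlay prose onto a copy of the spec, keyed by operationId.

    Returns the merged copy and the annotation keys that matched no operation.
    The spec passed in is left untouched.
    """
    merged = copy.deepcopy(spec)
    unused = set(annotations)
    for item in merged.get("paths", {}).values():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            op_id = op.get("operationId")
            if op_id in annotations:
                op["description"] = annotations[op_id]
                unused.discard(op_id)
    return merged, sorted(unused)


# Hypothetical inputs: a generated spec and a hand-written annotation file.
spec = {
    "paths": {
        "/v1/charges": {
            "post": {"operationId": "createCharge", "description": "Create a charge."}
        }
    }
}
annotations = {
    "createCharge": "Creates a charge for an existing invoice. "
                    "Document side effects and idempotency behavior here.",
    "deleteCharge": "Annotation for an operation that no longer exists.",
}

merged, stale = merge_annotations(spec, annotations)
```

Returning the unmatched keys is a deliberate choice: an annotation whose operationId has disappeared from the spec is a sign that the prose has drifted, and a publish step can fail on a non-empty stale list.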
Consider a payments API team at a growing developer platform that ships their first spec-generated reference in late 2025. The generated output is structurally correct but thin on examples and has no error documentation for their 422 responses. Rather than writing all the prose by hand, they start with a coverage audit: for each endpoint, score whether examples, error responses, and operational prose are present. The endpoints with the highest partner ticket volume (charge creation, webhook registration) go first. Within a sprint, those two endpoints have full example coverage and documented error codes. Partner support tickets referencing those endpoints drop measurably in the following weeks. The remaining endpoints get documented iteratively over subsequent releases.
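The coverage audit in that scenario could look something like the sketch below: score each operation on the three criteria, then sort so that high-ticket, low-coverage endpoints come first. The scoring rules, the operation IDs, and the ticket counts are invented for illustration, and the checks are deliberately crude (for example, any 4xx or 5xx response counts as documented errors).

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def _has_examples(op: dict) -> bool:
    """True if any request or response media type carries an example."""
    media = list(op.get("requestBody", {}).get("content", {}).values())
    for response in op.get("responses", {}).values():
        media.extend(response.get("content", {}).values())
    return any("example" in mt or bool(mt.get("examples")) for mt in media)


def audit(spec: dict, annotations: dict, tickets: dict) -> list[tuple[str, int, int]]:
    """Return (operationId, score 0-3, ticket count), most urgent first."""
    rows = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue
            op_id = op.get("operationId", f"{method} {path}")
            checks = [
                _has_examples(op),
                any(code.startswith(("4", "5")) for code in op.get("responses", {})),
                op_id in annotations,  # operational prose exists
            ]
            rows.append((op_id, sum(checks), tickets.get(op_id, 0)))
    # Most tickets first; among equals, the lowest coverage score first.
    return sorted(rows, key=lambda row: (-row[2], row[1]))


# Hypothetical inputs for three operations.
spec = {
    "paths": {
        "/v1/charges": {
            "post": {"operationId": "createCharge",
                     "responses": {"200": {"description": "OK"}}}
        },
        "/v1/webhooks": {
            "post": {"operationId": "registerWebhook",
                     "responses": {"201": {"description": "Created"},
                                   "422": {"description": "Invalid"}}}
        },
        "/v1/invoices/{id}": {
            "get": {"operationId": "getInvoice",
                    "responses": {
                        "200": {"description": "OK",
                                "content": {"application/json": {"example": {"id": "inv_1"}}}},
                        "404": {"description": "Not found"}}}
        },
    }
}
annotations = {"getInvoice": "Retrieves a single invoice by ID."}
tickets = {"createCharge": 42, "registerWebhook": 17, "getInvoice": 3}

report = audit(spec, annotations, tickets)
```

With these inputs the report puts createCharge and registerWebhook at the top, which matches the order the team in the scenario chose to work in.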
The spec-first discipline is preserved because the prose annotations live alongside the spec in the same repository and are reviewed in the same pull requests as API changes. When POST /v1/charges gains a new optional parameter in v1.3 of the spec, the annotation reviewer is prompted to update the operational description for that parameter.
When to Write Reference Docs Outside the Spec Entirely
There are categories of content that don't belong in a spec-generated reference at all: conceptual overviews (what is an Invoice in your data model and how does it relate to a Charge?), authentication setup guides (end-to-end walkthrough of getting a key and making a first call), webhook integration guides (setting up event subscriptions, verifying signatures, handling retries), and migration guides between API versions.
Generated reference docs are not sufficient documentation on their own. They're one layer of a documentation set that also includes conceptual content, how-to guides, and example workflows. The spec generates the reference layer accurately; the other layers require deliberate authorship. The teams that publish spec-generated docs and consider documentation complete are the teams whose partners still have a high support ticket rate six months in.
The test is simple: give your generated reference docs to a developer who has never used your API and ask them to complete a specific task, such as creating an invoice, charging it, and retrieving the transaction record. Track where they get stuck. The answers are always in the same places: they didn't know how to get credentials, they didn't know how the resource IDs relate to each other, they didn't know what the error codes mean, and they didn't know what happens to in-progress operations when they hit a rate limit. None of those answers are in a spec-generated reference without deliberate additions. All of them can be.