3/1/26

Our Approach to the PDF Problem

Everyone's trying to teach AI to read PDFs. We're trying to make you stop needing to.

The PDF Isn't the Problem. The Paper Bureaucracy Is.

There's been a lot of hand-wringing lately about PDFs and AI. The Verge ran a piece on how LLMs routinely choke on PDF parsing. The Economist is calling it a full-on war. Silicon Valley has spawned a cottage industry of startups promising to finally crack the format: better OCR, smarter extraction, multimodal parsing, the whole arms race.

We're watching this from Quellist and feeling a little sideways about it.

Not because it's wrong. PDFs are genuinely painful for AI systems, for well-understood reasons. The format was designed to make things look good on paper. It encodes visual layout, not semantic structure. A table in a PDF isn't a table. It's a grid of floating text coordinates that happens to look like a table when rendered. A form field might be a real interactive element, or it might be a box drawn with lines and a label floating nearby. Scanned PDFs are just photographs of bureaucracy. There's no ground truth. The machine has to guess.
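To make "the machine has to guess" concrete, here is a minimal sketch of what rebuilding a table from positioned text looks like. Everything in it is an assumption for illustration: the Fragment type, the coordinates, and the 2-point tolerance are made up, and real extractors also have to cope with merged cells, wrapped text, and rotated pages.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One piece of positioned text, as a PDF extractor might report it (hypothetical type)."""
    x: float  # distance from the left edge, in points
    y: float  # distance from the bottom edge, in points
    text: str


def guess_rows(fragments, y_tolerance=2.0):
    """Group fragments into rows by similar y, then order each row left to right.

    This is a heuristic: nothing in the file says these fragments form a table.
    """
    rows = []
    # Walk the page top to bottom (larger y is higher on the page).
    for frag in sorted(fragments, key=lambda f: (-f.y, f.x)):
        if rows and abs(rows[-1][0].y - frag.y) <= y_tolerance:
            rows[-1].append(frag)
        else:
            rows.append([frag])
    return [[f.text for f in sorted(row, key=lambda f: f.x)] for row in rows]


# Made-up fragments: note the baselines do not line up exactly.
fragments = [
    Fragment(72.0, 700.0, "Form"),
    Fragment(200.0, 700.5, "Status"),
    Fragment(72.0, 680.0, "I-765"),
    Fragment(200.0, 679.4, "Pending"),
]
table = guess_rows(fragments)
```

Tighten the tolerance and one row splits in two; loosen it and neighboring rows merge. That tuning knob is the guess.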

But here's the thing: obsessing over how to read PDFs better is like obsessing over how to more efficiently transcribe voicemails by hand. You're optimizing the wrong layer.

Why PDFs exist in the first place

PDFs are infrastructure for a world that needed documents to survive the journey from one desk to another (through fax machines, printers, email attachments, and filing cabinets) and still look the same on the other end. They solved a real problem. They solved it brilliantly, for 1993.

That world is still largely with us. If you're an immigration lawyer, you're filing I-485s, I-131s, and I-765s: government forms that haven't fundamentally changed in decades. If you're a contractor, you're submitting certified payroll on WH-347s. If you're in healthcare, you're dealing with prior authorizations that require fax machines in 2026, no joke.

The professional world is wallpapered in this stuff. It's not going away because some startup built a better PDF parser.

What we're actually doing at Quellist

We're not building better PDF readers. We're building systems that understand what professionals are actually trying to accomplish and handle the paperwork as a consequence, not as the primary activity.

When an immigration attorney sits down to prepare a case, their job is to understand their client's situation, assess risk, build a legal strategy, and communicate clearly. Filling out form fields is not their job. It's the tax they pay for existing in a system built on paper.

Quellist's approach is to understand the goal (what outcome does this professional need?) and then generate, populate, and route the necessary documents as a byproduct. Not "here's a better way to interact with a PDF." More like: "you told us what you need, we handled the forms."

This means that in many cases, we're bypassing the PDF problem entirely. We generate pre-filled forms ready for signature. We output structured data directly into the formats downstream systems expect. We're not parsing a PDF that a human filled out by hand. We're producing the output the human would have produced, faster and with fewer errors.
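As a rough illustration of "handled the forms," the idea can be sketched as one structured client record feeding several forms at once. This is not our production code. The field names, the FIELD_MAPS table, and the client data below are all hypothetical and are not the real field identifiers of any government form.

```python
from dataclasses import dataclass, asdict


@dataclass
class ClientRecord:
    """Facts captured once, in structured form (hypothetical schema)."""
    family_name: str
    given_name: str
    date_of_birth: str  # ISO 8601 date string


# Hypothetical mapping: form field name -> attribute of ClientRecord.
# Real forms use their own field identifiers, which are not shown here.
FIELD_MAPS = {
    "I-485": {
        "applicant_family_name": "family_name",
        "applicant_given_name": "given_name",
        "applicant_dob": "date_of_birth",
    },
    "I-765": {
        "last_name": "family_name",
        "first_name": "given_name",
        "birth_date": "date_of_birth",
    },
}


def populate_forms(record, form_ids):
    """Return pre-filled field values for each requested form."""
    data = asdict(record)
    return {
        form_id: {field: data[source] for field, source in FIELD_MAPS[form_id].items()}
        for form_id in form_ids
    }


# Made-up client, entered once, reused across both forms.
client = ClientRecord("Okafor", "Ada", "1990-04-12")
packet = populate_forms(client, ["I-485", "I-765"])
```

The design point is that the record, not the PDF, is the source of truth. The same data is typed once and rendered into whatever each form or downstream system expects.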

Where parsing still matters (and how we handle it)

That said, we live in the real world. Incoming documents exist. Clients send scanned passports and prior authorization letters and ancient lease agreements. You can't magic those away.

Our view here is also different from the standard "better extraction" framing. When we ingest a document, we're not trying to produce a perfect machine-readable transcript. We're trying to extract decision-relevant information: the facts and dates and names and statuses that actually change what a professional does next. That's a much more tractable problem. You don't need to perfectly parse every pixel of a prior auth letter to know whether it was approved or denied, for what procedure, and when it expires.
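A minimal sketch of that narrower question, assuming the letter's text has already been recovered by OCR or text extraction. The letter wording, the regular expressions, and the PriorAuthFacts type are assumptions for illustration only. Real letters vary far more than this, and a naive keyword match would misread a phrase like "not approved."

```python
import re
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class PriorAuthFacts:
    """Only the facts that change what happens next (hypothetical type)."""
    status: Optional[str]
    procedure: Optional[str]
    expires: Optional[date]


def extract_prior_auth_facts(text: str) -> PriorAuthFacts:
    """Pull out status, procedure, and expiry; leave anything not found as None."""
    status = re.search(r"\b(approved|denied)\b", text, re.IGNORECASE)
    procedure = re.search(r"procedure:\s*(.+)", text, re.IGNORECASE)
    expires = re.search(
        r"(?:expires|valid through):?\s*(\d{4}-\d{2}-\d{2})", text, re.IGNORECASE
    )
    return PriorAuthFacts(
        status=status.group(1).lower() if status else None,
        procedure=procedure.group(1).strip() if procedure else None,
        expires=(
            datetime.strptime(expires.group(1), "%Y-%m-%d").date() if expires else None
        ),
    )


# Made-up letter text standing in for OCR output.
letter = """
Re: Prior authorization request
Your request has been APPROVED.
Procedure: MRI lumbar spine
Valid through: 2026-09-30
"""
facts = extract_prior_auth_facts(letter)
```

A field that comes back as None is a signal to route the document to a person, not to guess. Three reliable answers are worth more than a perfect transcript nobody asked for.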

The frame matters. "Parse this document perfectly" is an unsolvable problem disguised as an engineering challenge. "What do I need to know from this document to do my job?" is a question we can actually answer.

The boring slop is the point

Quellist's whole thesis is that the most valuable thing AI can do for knowledge workers isn't to help them think harder. It's to take the thinking-free work off their plate entirely: the forms, the filings, the follow-ups, the status checks, the re-entry of data that already exists somewhere into a system that doesn't know it yet.

This work is beneath the professionals doing it. It's not what they trained for. It's not where their judgment matters. It's administrative friction that compounds daily, and it's crushing small practices that can't afford to hire armies of paralegals and admin staff to absorb it.

PDFs are the medium this friction lives in. But the friction is the enemy, not the format.

That's what we're here to eliminate.

Quellist automates administrative tasks for professionals, starting with the paperwork that should never have required human attention in the first place. If you're a lawyer, healthcare administrator, or other professional drowning in forms and filings, we'd love to talk.

Start your new workflow

Unlock the potential of your business with a new cutting-edge approach to PDF forms.
