Jerry Liu's Post

In the age of agentic AI, context is everything. But there are so many different forms of context.

While we started as a broad framework connecting all sorts of data and context to the model layer, today our mission is hyperfocused on unlocking a very specific but universal form of context: documents 📃📄📑

Today, we have best-in-class technology for parsing PDFs, Office docs, and others to unlock and extract context for your AI agents. That's it.

Next time you're in SF and you wonder, "didn't LlamaIndex use to be a RAG framework? What happened?" this sign on 2nd Street might help 😉

Come bring your hardest, nastiest PDFs, we will parse them with LlamaParse.

Sincerely,
We Parse Docs
LlamaIndex

13 Comments

Jerry Liu 1d

Reuploaded as .jpg (the colors look nice!)

2 Reactions

Pedro Fernandes Thomaz 1h

Context type matters as much as context volume. Retrieval, memory, tool state, conversation history: treating them all the same is where most agentic pipelines break down. Great framing.

Glauber Bannwart 12h

Hey Jerry, I have a stale public-good project. The project is simple and I would like to continue working on it. It's about small claims: someone had an issue and wants to file a claim. Lawyers are expensive and, in most cases, not a personal assistant to the client. So amid all this evolution, why are we still manually filling out long forms? Why do foreigners also have to do this? That's why there are 1500 different forms in multiple languages just in CA. Crazy, right? The issue I faced was not being able to reuse PDFs: some are editable, others are not. I wonder if LlamaIndex could be used in both directions, so not only extracting context for reasoning, but also filling critical PDFs precisely, or even transforming older versions of a PDF into its editable version. Let me know if your product can or will do that in the future.

Lyle Perrien 1d

Jerry, this is a smart and honest evolution. In agentic systems, the quality of the context you feed the model often matters more than the model itself — and documents are still one of the richest, most under-served sources of that context. The billboard is perfect. Most teams still underestimate how messy real-world documents actually are (scanned forms, mixed layouts, handwritten notes, inconsistent structure). Getting that right is foundational if we want agents that can reliably reason over the kind of information that actually exists in the world, not just clean web text. Appreciate the focus. Looking forward to seeing how LlamaParse handles the truly ugly stuff.

1 Reaction

Abdelmoula Khdoudi 7h

LlamaParse still isn't able to parse my PDF of scanned manufacturing records.

Yassine Letaief 10h

Docs are a goldmine of enterprise context, but notoriously hard to parse and index at scale. There are so many formats, layouts, and spatial relationships between doc components out there. Docs are also very visual, and pure text representations like Markdown lose that richness. Great to see LlamaIndex focus on this hard problem!

Michał Piszczek 20h

The nastiest PDFs are always the compliance docs nobody wanted to digitize in the first place

2 Reactions

Martin B. 20h

Document parsing is exactly where enterprise RAG breaks in production: scanned PDFs, nested tables, multilingual contracts. That said, at scale you still need a data quality SLA between the parser and the agent layer.

Jo Kristian Bergum 10h

Love the clear messaging!

1 Reaction

Karandeep Singh 1d

The best parser I can trust for PDF extraction and getting data ready for my agents.

1 Reaction
