Show HN: Long PDF Reader MCP

(pageindex.ai)

5 points | by mingtianzhang 12 hours ago

7 comments

LoMoGan 11 hours ago
Interesting. Does this rely on an external vector DB to store and process the PDF?
- mingtianzhang 11 hours ago
  Thanks for the great question! We actually use a reasoning-based, vectorless approach. In short, it follows this process:
```
  1. Generate a table of contents (ToC) for the document.

  2. Read the ToC to select a relevant section.

  3. Extract relevant information from the selected section.

  4. If enough information has been gathered, provide the answer; otherwise, return to step 2.
```
  We believe this approach closely mimics how a human would navigate and read long PDFs.
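  A minimal sketch of the loop above, for illustration only. It assumes the document is already split into titled sections, and it uses plain keyword overlap as a stand-in for the model's reasoning. The names `Section`, `build_toc`, `select_section`, `extract`, and `answer` are hypothetical and are not taken from the actual tool.

```python
from dataclasses import dataclass


@dataclass
class Section:
    title: str
    text: str


def build_toc(sections):
    """Step 1: the ToC here is simply the list of section titles."""
    return [s.title for s in sections]


def select_section(toc, question, visited):
    """Step 2: pick the unvisited title sharing the most words with the question.

    Keyword overlap is an assumed stand-in for reasoning over the ToC.
    Returns None once every section has been visited.
    """
    q_words = set(question.lower().split())
    best, best_score = None, -1
    for i, title in enumerate(toc):
        if i in visited:
            continue
        score = len(q_words & set(title.lower().split()))
        if score > best_score:
            best, best_score = i, score
    return best


def extract(section, question):
    """Step 3: keep sentences that contain any question word (toy extractor)."""
    q_words = set(question.lower().split())
    return [
        s.strip()
        for s in section.text.split(".")
        if s.strip() and q_words & set(s.lower().split())
    ]


def answer(sections, question, enough=1):
    """Step 4: stop once enough notes are gathered, else return to step 2."""
    toc = build_toc(sections)
    visited, notes = set(), []
    while len(notes) < enough:
        idx = select_section(toc, question, visited)
        if idx is None:  # nothing left to read
            break
        visited.add(idx)
        notes.extend(extract(sections[idx], question))
    return notes


if __name__ == "__main__":
    doc = [
        Section("Introduction", "This report covers the year."),
        Section("Revenue summary", "Total revenue was 5 million. Costs fell."),
    ]
    print(answer(doc, "revenue"))
```

  In a real system, steps 2 and 3 would presumably be model calls rather than keyword matching; the control flow is the part this sketch is meant to show.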
  - LoMoGan 10 hours ago
    Sounds interesting, will try it out.
    - mingtianzhang 10 hours ago
      Thanks, any feedback is welcome!
fighterhao 12 hours ago
Very valuable!
- mingtianzhang 11 hours ago
  thanks!
11 hours ago
[deleted]