An MCP server that extracts and transforms webpage content into clean, LLM-optimized Markdown using Mozilla's Readability algorithm.
For each page, the server returns the article title, main content, excerpt, byline, and site name. Readability strips ads, navigation, footers, and other non-essential elements while preserving the structure of the core content.
Features
- Removes ads, navigation, footers, and other non-essential content
- Converts the cleaned HTML into well-formatted Markdown using Turndown
- Returns article metadata (title, excerpt, byline, site name)
- Handles errors gracefully
Why Not Just Fetch?
Unlike simple fetch requests, this server:
- Extracts only relevant content using Mozilla's Readability algorithm
- Eliminates noise like ads, popups, and navigation menus
- Reduces token usage by removing unnecessary HTML/CSS
- Provides consistent Markdown formatting for better LLM processing
- Includes useful metadata about the content
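To give a rough feel for the token-reduction point above, here is a minimal TypeScript sketch. It is a deliberately crude stand-in, not Readability and not this server's code: it uses regular expressions to drop a few container elements and the remaining tags, then reports how much shorter the text is. The names `stripMarkup`, `reductionRatio`, and `samplePage` are hypothetical, chosen only for illustration.

```typescript
// Crude illustration only: NOT Mozilla's Readability and not this server's code.
// It shows how much of a page can be markup rather than readable text.
function stripMarkup(html: string): string {
  return html
    // Drop whole elements that rarely carry article content.
    .replace(/<(script|style|nav|footer|aside)\b[^>]*>[\s\S]*?<\/\1>/gi, " ")
    // Drop the remaining tags but keep the text between them.
    .replace(/<[^>]+>/g, " ")
    // Collapse runs of whitespace into single spaces.
    .replace(/\s+/g, " ")
    .trim();
}

// Fraction of characters removed, between 0 and 1.
function reductionRatio(html: string): number {
  if (html.length === 0) return 0;
  return 1 - stripMarkup(html).length / html.length;
}

// Hypothetical sample page used only for this sketch.
const samplePage = `
<html><head><style>body { margin: 0 }</style></head>
<body>
  <nav><a href="/">Home</a><a href="/about">About</a></nav>
  <article><h1>Hello</h1><p>Main content here.</p></article>
  <footer>Copyright</footer>
</body></html>`;

console.log(stripMarkup(samplePage)); // prints "Hello Main content here."
console.log(reductionRatio(samplePage).toFixed(2));
```

A real extractor has to do far more than this (scoring candidate nodes, keeping headings and links, handling malformed HTML), which is why the server delegates that work to Readability rather than to pattern matching.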
Tool Reference
parse
Fetches and transforms webpage content into clean Markdown.
Arguments:
Returns:
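As a sketch of how a client might invoke this tool, the TypeScript below builds an MCP `tools/call` request for `parse`. The JSON-RPC envelope follows the MCP convention for tool calls; the argument name `url` is an assumption, since the argument list is not shown here, and `buildParseRequest` is a hypothetical helper. Check the server's published tool schema before relying on it.

```typescript
// JSON-RPC 2.0 request shape used by MCP for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper. The `url` argument name is an ASSUMPTION, not taken
// from this server's documented schema.
function buildParseRequest(id: number, url: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "parse", arguments: { url } },
  };
}

const request = buildParseRequest(1, "https://example.com/article");
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library would normally construct and send this message over the transport for you; the sketch only shows what the request carries.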
Installation
Server Statistics
License: MIT
Local: No
Published: 12/24/2024