This Python code fetches and parses RSS/Atom feeds from multiple...

July 2, 2025 at 10:15 PM

import feedparser


def fetch_feed(url):
    feed = feedparser.parse(url)
    items = []
    for entry in feed.entries:
        items.append({
            "published": entry.get("published", ""),
            "title": entry.get("title", "No Title"),
            "link": entry.get("link", ""),
            "summary": entry.get("summary", "")
        })
    return items


def fetch_all(sources):
    all_items = []
    for name, url in sources.items():
        try:
            items = fetch_feed(url)
            for item in items:
                item['source'] = name
            all_items.extend(items)
        except Exception as e:
            print(f"Error fetching {name}: {e}")
    return all_items

This Python code fetches and parses RSS/Atom feeds from multiple sources and structures the data in a consistent format for further use. Here's what the code does:

1. Libraries Used:

  • feedparser: A library to parse RSS/Atom feeds.

2. fetch_feed(url):

  • Purpose: Retrieves and processes the feed entries from a specific URL.
  • How it works:
    • feedparser.parse(url): Fetches and parses the feed from the given url.
    • Iterates through feed.entries (the items in the feed).
    • Creates a dictionary for each entry with the following keys:
      • "published": The date/time the entry was published (defaults to an empty string if not available).
      • "title": The title of the entry (defaults to "No Title" if not available).
      • "link": The web link to the entry (defaults to an empty string if not available).
      • "summary": A summary or description of the entry (defaults to an empty string if not available).
    • Aggregates these dictionaries into the items list.
  • Returns: A list of dictionaries, each representing an entry in the feed.
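
To make the default handling concrete, here is a minimal sketch that applies the same .get() lookups to plain dictionaries standing in for feedparser entries. The helper name normalize_entry and the sample data are hypothetical and not part of the original code; the sketch only assumes that entries support .get() the way a dictionary does.

```python
def normalize_entry(entry):
    """Build the same four-key dict that fetch_feed creates for each entry."""
    return {
        "published": entry.get("published", ""),
        "title": entry.get("title", "No Title"),
        "link": entry.get("link", ""),
        "summary": entry.get("summary", ""),
    }


# Hypothetical sample entries; plain dicts stand in for feedparser entries.
complete = {
    "published": "Wed, 02 Jul 2025 10:00:00 GMT",
    "title": "Example headline",
    "link": "https://example.com/story",
    "summary": "A short description.",
}
sparse = {"link": "https://example.com/untitled"}  # only a link is present

full_item = normalize_entry(complete)
sparse_item = normalize_entry(sparse)

print(full_item)
print(sparse_item)
```

The sparse entry comes back with "No Title" as its title and empty strings for "published" and "summary", so downstream code can rely on all four keys being present.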

3. fetch_all(sources):

  • Purpose: Fetches and combines feeds from multiple sources.
  • How it works:
    • sources: A dictionary where keys are source names (e.g., "News Site A") and values are feed URLs.
    • Iterates over each feed source.
    • For each source:
      • Calls fetch_feed(url) to fetch the feed entries.
      • Adds a "source" key to each entry to identify the source of the item.
      • Appends all items from the current source to the all_items list.
    • If an exception occurs while fetching a feed, it prints an error message with the source's name and error details.
  • Returns: A combined list of all feed entries from all sources, with each entry including its source information.
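
The per-source try/except described above can be exercised without any network access. The following is a sketch under stated assumptions: fetch_all_with is a hypothetical variant of fetch_all that takes the per-feed fetcher as a parameter, and fake_fetch is a made-up stand-in for fetch_feed that fails for one URL.

```python
def fetch_all_with(sources, fetcher):
    """Same loop as fetch_all, but with the per-feed fetcher passed in."""
    all_items = []
    for name, url in sources.items():
        try:
            items = fetcher(url)
            for item in items:
                item["source"] = name  # tag each entry with its source name
            all_items.extend(items)
        except Exception as e:
            # A failing source is reported and skipped; the loop continues.
            print(f"Error fetching {name}: {e}")
    return all_items


def fake_fetch(url):
    """Hypothetical stand-in for fetch_feed: one URL fails, the others succeed."""
    if "broken" in url:
        raise ValueError("simulated failure")
    return [{"published": "", "title": f"Item from {url}", "link": url, "summary": ""}]


demo_sources = {
    "Good A": "https://example.com/a.xml",
    "Broken": "https://example.com/broken.xml",
    "Good B": "https://example.com/b.xml",
}

demo_items = fetch_all_with(demo_sources, fake_fetch)
print([item["source"] for item in demo_items])
```

In this sketch the "Broken" source produces an error message, while the items from the other two sources are still returned with their "source" key set.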

Example Usage:

sources = {
    "BBC": "http://feeds.bbci.co.uk/news/rss.xml",
    "CNN": "http://rss.cnn.com/rss/edition.rss",
}

all_feed_items = fetch_all(sources)
print(all_feed_items)

  • The sources dictionary maps the names "BBC" and "CNN" to their feed URLs.
  • The fetch_all function fetches each feed in turn; if fetching a source raises an exception, it prints the error and continues with the remaining sources.
  • The result, all_feed_items, is a single list of entry dictionaries from both sources, each tagged with a "source" key.
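
Printing the whole list at once can be hard to read. As one possible follow-up, the sketch below counts entries per source and prints one line per entry. The helper names count_by_source and format_item and the sample data are hypothetical; the sample only mimics the shape of the list that fetch_all returns.

```python
from collections import Counter

# Hypothetical sample in the shape fetch_all returns.
sample_items = [
    {"published": "", "title": "Story one", "link": "https://example.com/1",
     "summary": "", "source": "BBC"},
    {"published": "", "title": "Story two", "link": "https://example.com/2",
     "summary": "", "source": "CNN"},
    {"published": "", "title": "Story three", "link": "https://example.com/3",
     "summary": "", "source": "BBC"},
]


def count_by_source(items):
    """Count how many entries came from each source."""
    return Counter(item["source"] for item in items)


def format_item(item):
    """Render one entry as a single readable line."""
    return f"[{item['source']}] {item['title']} ({item['link']})"


counts = count_by_source(sample_items)
lines = [format_item(item) for item in sample_items]

print(dict(counts))
for line in lines:
    print(line)
```

Because every entry carries the same keys, helpers like these work on the combined list regardless of which feed an entry came from.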

Summary:

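This code is a compact way to pull structured data from RSS/Atom feeds: every entry comes back as a dictionary with the same keys plus its source name, and a source that raises an exception is reported and skipped rather than stopping the run. That makes it easy to manage and integrate feed content from multiple sources.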

