pages_to_rss.rb

The other day I wrote a program that will produce an RSS feed from a list of URLs. I did this because a few of the pages that I like to read on a daily basis (such as Pitchfork) don’t have feeds. And these days, if a site doesn’t have a feed, I usually don’t remember to read it.

If this is something you’d like to try out, leave a comment and I’ll send you the instructions for setting it up. I’d post it on this site, but it’s still way too rough for public use.

Also, if you know of a program or service that already does this, please let me know. I’d hate to waste my time replicating someone else’s work.

Posted in Miscellaneous

5 Responses to pages_to_rss.rb

  1. Mike M says:

    Matt, I’d love to test out your code. I have a couple of pages that I’ve experimented with screen-scraping in PHP.

    Also, while they’re not official, here are some feeds that I use to read Pitchfork. They work pretty well (but I’d rather take my “scraping” in-house than depend on someone I don’t know doing the work for me).

    Pitchfork Music News: http://www.marteydodoo.com/rss/pitchfork_news.rss

    Pitchfork Album Reviews: http://www.marteydodoo.com/rss/pitchfork_reviews.rss

    Pitchfork Track Reviews: http://www.marteydodoo.com/rss/pitchfork_wearetheworld.rss

  2. Matt says:

    Because it’s intended to be able to pull any URL, it doesn’t do any parsing of the pages. It just dumps all of the text (minus as much of the formatting as I can remove) into the feed.

    Still, it’s nicer than no feed at all.

  3. Lindsey says:

    Does it do more or less what Cheesegrater does?

  4. Matt says:

    Yes. I guess that’s as far as this project will go.

  5. Paul says:

    Yours may have been better. Goodness knows your Spotlight script pretty much 0wn3d Apple’s implementation. :-)
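
For anyone curious what the approach described in the second comment might look like (pull each URL, strip as much formatting as possible, dump the remaining text into a feed), here is a minimal Ruby sketch. It is not Matt's script, which was never posted. The names `strip_markup`, `fetch`, `xml_escape`, and `pages_to_rss` are hypothetical, the regex-based tag removal is a crude stand-in for whatever the original did, and the feed is assembled by hand as a bare-bones RSS 2.0 document.

```ruby
require 'net/http'
require 'uri'

# Characters that must be escaped inside XML text nodes.
XML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
  '"' => '&quot;', "'" => '&apos;'
}.freeze

def xml_escape(text)
  text.gsub(/[&<>"']/, XML_ESCAPES)
end

# Crude formatting removal (assumption: good enough for a rough feed).
# Drops script/style blocks, then every remaining tag, then collapses
# whitespace. A regex is not a real HTML parser.
def strip_markup(html)
  html.gsub(%r{<(script|style)\b.*?</\1>}mi, ' ')
      .gsub(/<[^>]+>/, ' ')
      .gsub(/\s+/, ' ')
      .strip
end

# Hypothetical helper: download one page body.
# No redirect following or error handling.
def fetch(url)
  Net::HTTP.get(URI(url))
end

# pages maps URL => raw HTML. Returns an RSS 2.0 document as a string,
# with one item per page whose description is the stripped page text.
def pages_to_rss(pages, title: 'pages_to_rss', now: Time.now)
  stamp = now.utc.strftime('%a, %d %b %Y %H:%M:%S +0000')
  items = pages.map do |url, html|
    <<~ITEM
      <item>
      <title>#{xml_escape(url)}</title>
      <link>#{xml_escape(url)}</link>
      <pubDate>#{stamp}</pubDate>
      <description>#{xml_escape(strip_markup(html))}</description>
      </item>
    ITEM
  end

  <<~RSS
    <?xml version="1.0" encoding="UTF-8"?>
    <rss version="2.0">
    <channel>
    <title>#{xml_escape(title)}</title>
    <link>#{xml_escape(pages.keys.first.to_s)}</link>
    <description>Text dumped from a list of pages</description>
    #{items.join}</channel>
    </rss>
  RSS
end
```

To run it against live pages, something like `puts pages_to_rss(urls.to_h { |u| [u, fetch(u)] })` would do, where `urls` is an array of addresses. Because nothing is parsed, every item carries the whole page text, which matches the "nicer than no feed at all" trade-off mentioned above.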
