Using MagpieRSS to scrape blog headlines to HTML

posted by ian grant on March 25, 2006 at 9:43 am | in digital art hacks, general, net art, web 2.0 |

This walkthrough assumes you have access to a server running PHP and the ability to change permissions on directories.

Step One: Get MagpieRSS. Head to SourceForge and grab the latest copy of the excellent MagpieRSS.

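Step Two: Read the docs. Quickstart: set up a directory on the web server that looks a bit like this:

[Screenshot: directory listing. The code below expects a “magpierss” directory next to the page, and a “cache” directory is needed as well.]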

Set the permissions on the “cache” directory to 777 (world-writable). You may be able to get away with more restrictive permissions, for example if the user the web server runs as owns the directory.
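Rather than finding out by trial and error, you can check the directory from PHP. This is only a rough sketch: the helper name `ensure_cache_dir` and the 0775 mode are my own choices for illustration and are not part of MagpieRSS. `MAGPIE_CACHE_DIR` is the constant MagpieRSS reads for its cache location, and it needs to be defined before `rss_fetch.inc` is included.

```php
<?php
// Hypothetical helper (not part of MagpieRSS): make sure the cache
// directory exists and that the user PHP runs as can write to it.
function ensure_cache_dir($path, $mode = 0775)
{
    if (!is_dir($path)) {
        // The third argument also creates any missing parent directories.
        if (!@mkdir($path, $mode, true)) {
            return false;
        }
    }
    return is_writable($path);
}

// Assumed layout: the cache directory sits next to this script.
$cacheDir = __DIR__ . '/cache';

if (ensure_cache_dir($cacheDir)) {
    // Tell MagpieRSS where to cache feeds. This must happen before
    // rss_fetch.inc is included.
    if (!defined('MAGPIE_CACHE_DIR')) {
        define('MAGPIE_CACHE_DIR', $cacheDir);
    }
} else {
    echo "Cache directory is not writable: $cacheDir\n";
}
```

If the check fails with tighter permissions, loosen them one step at a time and only fall back to 777 if nothing else works.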

Step Three: Use the code below as a starting point for exploration. You can see the results of it here.

There are several lines you can comment/uncomment to see the object MagpieRSS returns. The current example is set to return the results of a Blogger feed. With some extra code one can detect the feed type and provide summaries accordingly… that is to come.


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>MagpieRSS blog scrape</title>
</head>
<body>
<?php
require_once('magpierss/rss_fetch.inc');

// The @ suppresses PHP errors, so check the return value instead.
// Change the URL to the blog's Atom / RSS feed. If the feed is RSS rather
// than Atom, some of the item names will be different, so check the dump.
// Feed-level info is in the 'channel' array.
$rss = @fetch_rss('http://internetandnetworkart.blogspot.com/atom.xml');
// $rss = @fetch_rss('http://ellington.tvu.ac.uk/dev/?feed=rss2');

// fetch_rss() returns false on failure; magpie_error() holds the reason.
if (!$rss) {
    die('Could not fetch the feed: ' . magpie_error());
}

// Dump the object to the screen to study the structure Magpie returns.
echo '<pre>';
print_r($rss);
echo '</pre>';
// end dump

$channel = $rss->channel;
echo "Blog Title: " . $channel['title'];

// Display links to recent blog entries:
echo "<p>Latest blog additions:</p>\n<ul>\n";
foreach ($rss->items as $item) {
    $href    = $item['link'];
    $title   = $item['title'];
    $author  = $item['author_name'];
    $created = $item['created'];
    $content = $item['atom_content'];
    echo "<li><a href=\"$href\">$title</a> created by $author on $created</li>\n";
    echo "<li>\n $content\n</li>\n";
}
echo "</ul>\n";
?>
</body>
</html>
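Here is one way the feed detection mentioned earlier might look. Treat it as a sketch under assumptions: the function name `summarise_item` and the keys it returns are invented for this example. It assumes the parsed object exposes `$rss->feed_type` as 'Atom' or 'RSS', and that RSS items carry `description`, `pubdate` and a `dc` / `creator` entry. Check the `print_r` dump for your own feed, since not every feed supplies every field.

```php
<?php
// Hypothetical helper: map one MagpieRSS item to a uniform summary,
// whichever feed format it came from.
function summarise_item(array $item, $feedType)
{
    // Return the first listed key that is set and non-empty, else ''.
    $pick = function (array $keys) use ($item) {
        foreach ($keys as $key) {
            if (!empty($item[$key])) {
                return $item[$key];
            }
        }
        return '';
    };

    if (strcasecmp($feedType, 'Atom') === 0) {
        $author  = $pick(array('author_name'));
        $created = $pick(array('created', 'published', 'updated'));
        $content = $pick(array('atom_content', 'summary'));
    } else {
        // Assumed RSS layout: dc:creator ends up in $item['dc']['creator'].
        $author  = isset($item['dc']['creator'])
            ? $item['dc']['creator']
            : $pick(array('author'));
        $created = $pick(array('pubdate'));
        $content = $pick(array('description'));
    }

    return array(
        'title'   => $pick(array('title')),
        'link'    => $pick(array('link')),
        'author'  => $author,
        'created' => $created,
        'content' => $content,
    );
}

// Stand-in item shaped like the Atom entries in the listing above.
$sample = array(
    'title'        => 'Hello',
    'link'         => 'http://example.com/hello',
    'author_name'  => 'ian',
    'created'      => '2006-03-25T09:43:00Z',
    'atom_content' => '<p>First post</p>',
);
$summary = summarise_item($sample, 'Atom');
echo $summary['title'] . ' by ' . $summary['author'] . "\n";
```

With a real feed you would call `summarise_item($item, $rss->feed_type)` inside the `foreach` loop and echo the returned fields instead of reading the Atom-only keys directly.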

The sample files can be downloaded here: magpierss_blog_scrape.zip

1 comment

  1. Hey, awesome site! :) check out SimplePie for working with RSS moving forward. Magpie doesn’t seem to have moved forward in a long time.

    comment by steve cooley — March 20, 2007 #


(cc) ian grant some rights reserved