Using MagpieRSS to scrape blog headlines to HTML
posted by ian grant on March 25, 2006 at 9:43 am | in digital art hacks, general, net art, web 2.0
This walkthrough assumes you have access to a server running PHP and the ability to change permissions on directories.
Step One: Get MagpieRSS. Head to SourceForge and grab the latest copy of the excellent MagpieRSS.
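Step Two: Read the docs. Quickstart: set up a directory on the webserver that holds your script, a “magpierss” subdirectory containing the library (the code below loads magpierss/rss_fetch.inc), and a “cache” subdirectory.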
Set the permission of the “cache” directory to 777 - world writable. You may be able to get away with more restricted permissions.
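If you would rather check the cache directory from PHP than from a shell or FTP client, here is a minimal sketch. It assumes PHP 7 or later and that “cache” sits next to the script; `ensure_cache_dir` is a hypothetical helper and not part of MagpieRSS.

```php
<?php
// Hypothetical helper (not part of MagpieRSS): make sure the cache
// directory exists and is writable before any feeds are fetched.
function ensure_cache_dir(string $dir, int $mode = 0777): bool
{
    // Create the directory if it is missing.
    if (!is_dir($dir) && !mkdir($dir, $mode, true)) {
        return false;
    }
    // mkdir() is subject to the umask, so set the mode explicitly.
    // A failed chmod is tolerated as long as the directory is writable.
    @chmod($dir, $mode);
    clearstatcache();
    return is_writable($dir);
}

// Assumption: "cache" lives in the same directory as this script.
$cacheDir = __DIR__ . '/cache';
if (ensure_cache_dir($cacheDir)) {
    echo "Cache directory is writable: $cacheDir\n";
} else {
    echo "Cache directory is not writable: $cacheDir\n";
}
```

To try more restricted permissions, pass a tighter mode such as 0775 as the second argument and see whether the web server can still write to the directory.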
Step Three: Use the code below as a starting point for exploration. You can see the results of it here.
There are several lines you can comment/uncomment to see the object MagpieRSS returns. The current example is set to return the results of a Blogger feed. With some extra code one can detect the feed type and provide summaries accordingly… that is to come.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<?php
require_once('magpierss/rss_fetch.inc');

// The @ suppresses errors.
// Change the URL to the blog's Atom / RSS feed. If the feed is RSS rather
// than Atom, some of the item names will be different, so check the dump.
// The feed-level info is in the 'channel' array.
$rss = @fetch_rss('http://internetandnetworkart.blogspot.com/atom.xml');
// $rss = @fetch_rss('http://ellington.tvu.ac.uk/dev/?feed=rss2');

// Because errors are suppressed, stop here if the fetch failed.
if (!$rss) {
    die('Could not fetch the feed.');
}

// Dump the object to the screen to study the structure Magpie returns.
echo '<pre>';
print_r($rss);
echo '</pre>';
// end dump

$channel = $rss->channel;
echo "Blog Title: " . $channel['title'];

// Display links to recent blog entries.
echo "Latest blog additions:\n<ul>\n";
foreach ($rss->items as $item) {
    $href    = $item['link'];
    $title   = $item['title'];
    $author  = $item['author_name'];
    $created = $item['created'];
    $content = $item['atom_content'];
    echo "<li><a href=\"$href\">$title</a> created by $author on $created<br />\n $content</li>\n";
}
echo "</ul>\n";
?>
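The walkthrough notes that RSS feeds use different item names from Atom feeds. One way to cope is to try the Atom-style keys first and fall back to RSS-style ones. This is only a sketch, assuming PHP 7 or later: `summarise_item` is a hypothetical helper, and the RSS-side key names (`pubdate`, `description`, and the nested `dc` and `content` arrays) are assumptions, so confirm them against the print_r dump of your own feed.

```php
<?php
// Hypothetical helper: pick display fields from one parsed feed item,
// preferring the Atom-style keys used in the walkthrough and falling
// back to RSS-style keys (assumed names; verify against your own dump).
function summarise_item(array $item): array
{
    return [
        'title'   => $item['title'] ?? '(untitled)',
        'link'    => $item['link'] ?? '#',
        'author'  => $item['author_name'] ?? $item['dc']['creator'] ?? $item['author'] ?? 'unknown',
        'created' => $item['created'] ?? $item['pubdate'] ?? $item['dc']['date'] ?? '',
        'content' => $item['atom_content'] ?? $item['content']['encoded'] ?? $item['description'] ?? '',
    ];
}

// Hand-made sample items standing in for entries of $rss->items.
$atomItem = [
    'title'        => 'Atom post',
    'link'         => 'http://example.com/atom-post',
    'author_name'  => 'ian',
    'created'      => '2006-03-25T09:43:00Z',
    'atom_content' => '<p>Hello from Atom</p>',
];
$rssItem = [
    'title'       => 'RSS post',
    'link'        => 'http://example.com/rss-post',
    'pubdate'     => 'Sat, 25 Mar 2006 09:43:00 GMT',
    'description' => 'Hello from RSS',
    'dc'          => ['creator' => 'ian'],
];

foreach ([$atomItem, $rssItem] as $item) {
    $s = summarise_item($item);
    echo "{$s['title']} created by {$s['author']} on {$s['created']}\n";
}
```

In the script above, the body of the foreach loop could call this helper on each `$item` instead of reading the Atom keys directly, so the same page works for either kind of feed.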
The sample files can be downloaded here: magpierss_blog_scrape.zip
1 comment
(cc) ian grant some rights reserved
Hey, awesome site!
check out SimplePie for working with RSS moving forward. Magpie doesn’t seem to have moved forward in a long time.
comment by steve cooley — March 20, 2007 #