  #1  
Old 11-29-2005, 02:35 PM
3GM-Media 3GM-Media is offline
Junior Member
 
Join Date: Nov 2005
Posts: 14
3GM-Media is on a distinguished road
Thumbs up What is RSS?

What is RSS?
RSS is a format for syndicating news and the content of news-like sites, including major news sites like Wired, news-oriented community sites like Slashdot, and personal weblogs. But it's not just for news. Pretty much anything that can be broken down into discrete items can be syndicated via RSS: the "recent changes" page of a wiki, a changelog of CVS checkins, even the revision history of a book. Once information about each item is in RSS format, an RSS-aware program can check the feed for changes and react to the changes in an appropriate way.
RSS-aware programs called news aggregators are popular in the weblogging community. Many weblogs make content available in RSS. A news aggregator can help you keep up with all your favorite weblogs by checking their RSS feeds and displaying new items from each of them.
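To make "checking a feed for changes" concrete, here is a minimal sketch of how an aggregator might decide which items are new. It assumes items have already been parsed into dictionaries with a link key; the new_items function and the example.com links are hypothetical, chosen only for illustration, and the code targets Python 3.

```python
# Hypothetical sketch: remember which item links have been seen so that
# only unseen items are reported on the next check.

def new_items(items, seen_links):
    """Return items whose link is not in seen_links, recording them as seen."""
    fresh = []
    for item in items:
        if item["link"] not in seen_links:
            seen_links.add(item["link"])
            fresh.append(item)
    return fresh


seen = set()
first_check = [{"title": "A", "link": "http://example.com/a"}]
second_check = first_check + [{"title": "B", "link": "http://example.com/b"}]

print([i["title"] for i in new_items(first_check, seen)])   # ['A']
print([i["title"] for i in new_items(second_check, seen)])  # ['B']
```

Using the link as the identity of an item is a simplifying assumption; a real aggregator would probably need something more forgiving.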

What does RSS look like?
Imagine you want to write a program that reads RSS feeds, so that you can publish headlines on your site, build your own portal or homegrown news aggregator, or whatever. What does an RSS feed look like? That depends on which version of RSS you're talking about. Here's a sample RSS 0.91 feed (adapted from XML.com's RSS feed):
PHP Code:
<rss version="0.91">
  <channel>
    <title>XML.com</title>
    <link>http://www.xml.com/</link>
    <description>XML.com features a rich mix of information and services for the XML community.</description>
    <language>en-us</language>
    <item>
      <title>Normalizing XML, Part 2</title>
      <link>http://www.xml.com/pub/a/2002/12/04/normalizing.html</link>
      <description>In this second and final look at applying relational normalization techniques to W3C XML Schema data modeling, Will Provost discusses when not to normalize, the scope of uniqueness and the fourth and fifth normal forms.</description>
    </item>
    <item>
      <title>The .NET Schema Object Model</title>
      <link>http://www.xml.com/pub/a/2002/12/04/som.html</link>
      <description>Priya Lakshminarayanan describes in detail the use of the .NET Schema Object Model for programmatic manipulation of W3C XML Schemas.</description>
    </item>
    <item>
      <title>SVG's Past and Promising Future</title>
      <link>http://www.xml.com/pub/a/2002/12/04/svg.html</link>
      <description>In this month's SVG column, Antoine Quint looks back at SVG's journey through 2002 and looks forward to 2003.</description>
    </item>
  </channel>
</rss>
Simple, right? A feed comprises a channel, which has a title, link, description, and (optional) language, followed by a series of items, each of which has a title, link, and description.
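As a quick check on that structure, here is a small sketch that walks a trimmed copy of the RSS 0.91 sample using Python's xml.dom.minidom. It assumes Python 3; the child_text helper is a hypothetical name introduced here, and the item description has been shortened for space.

```python
from xml.dom import minidom

# A trimmed copy of the RSS 0.91 sample (one item only).
SAMPLE_091 = """<rss version="0.91">
  <channel>
    <title>XML.com</title>
    <link>http://www.xml.com/</link>
    <description>XML.com features a rich mix of information and services for the XML community.</description>
    <language>en-us</language>
    <item>
      <title>Normalizing XML, Part 2</title>
      <link>http://www.xml.com/pub/a/2002/12/04/normalizing.html</link>
      <description>Trimmed for brevity.</description>
    </item>
  </channel>
</rss>"""

def child_text(parent, tag):
    """Text of the first direct child element named tag, or '' if absent."""
    for child in parent.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == tag:
            return "".join(n.data for n in child.childNodes
                           if n.nodeType == n.TEXT_NODE)
    return ""

doc = minidom.parseString(SAMPLE_091)
channel = doc.getElementsByTagName("channel")[0]
channel_title = child_text(channel, "title")
item_titles = [child_text(item, "title")
               for item in channel.getElementsByTagName("item")]

print(channel_title)  # XML.com
print(item_titles)    # ['Normalizing XML, Part 2']
```

Looking only at direct children matters here because both the channel and each item have a title element; a descendant search on the channel would return all of them.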

Now look at the RSS 1.0 version of the same information:
PHP Code:
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
>
  <channel rdf:about="http://www.xml.com/cs/xml/query/q/19">
    <title>XML.com</title>
    <link>http://www.xml.com/</link>
    <description>XML.com features a rich mix of information and services for the XML community.</description>
    <language>en-us</language>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.xml.com/pub/a/2002/12/04/normalizing.html"/>
        <rdf:li rdf:resource="http://www.xml.com/pub/a/2002/12/04/som.html"/>
        <rdf:li rdf:resource="http://www.xml.com/pub/a/2002/12/04/svg.html"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.xml.com/pub/a/2002/12/04/normalizing.html">
    <title>Normalizing XML, Part 2</title>
    <link>http://www.xml.com/pub/a/2002/12/04/normalizing.html</link>
    <description>In this second and final look at applying relational normalization techniques to W3C XML Schema data modeling, Will Provost discusses when not to normalize, the scope of uniqueness and the fourth and fifth normal forms.</description>
    <dc:creator>Will Provost</dc:creator>
    <dc:date>2002-12-04</dc:date>
  </item>
  <item rdf:about="http://www.xml.com/pub/a/2002/12/04/som.html">
    <title>The .NET Schema Object Model</title>
    <link>http://www.xml.com/pub/a/2002/12/04/som.html</link>
    <description>Priya Lakshminarayanan describes in detail the use of the .NET Schema Object Model for programmatic manipulation of W3C XML Schemas.</description>
    <dc:creator>Priya Lakshminarayanan</dc:creator>
    <dc:date>2002-12-04</dc:date>
  </item>
  <item rdf:about="http://www.xml.com/pub/a/2002/12/04/svg.html">
    <title>SVG's Past and Promising Future</title>
    <link>http://www.xml.com/pub/a/2002/12/04/svg.html</link>
    <description>In this month's SVG column, Antoine Quint looks back at SVG's journey through 2002 and looks forward to 2003.</description>
    <dc:creator>Antoine Quint</dc:creator>
    <dc:date>2002-12-04</dc:date>
  </item>
</rdf:RDF>
Quite a bit more verbose. People familiar with RDF will recognize this as an XML serialization of an RDF document; the rest of the world will at least recognize that we're syndicating essentially the same information. In fact, we're including a bit more information: item-level authors and publishing dates, which RSS 0.91 does not support.

Despite being RDF/XML, RSS 1.0 is structurally similar to previous versions of RSS -- similar enough that we can simply treat it as XML and write a single function to extract information out of either an RSS 0.91 or RSS 1.0 feed. However, there are some significant differences that our code will need to be aware of:
  1. The root element is rdf:RDF instead of rss. We'll either need to handle both explicitly or just ignore the name of the root element altogether and blindly look for useful information inside it.
  2. RSS 1.0 uses namespaces extensively. The RSS 1.0 namespace is http://purl.org/rss/1.0/, and it's defined as the default namespace. The feed also uses http://www.w3.org/1999/02/22-rdf-syntax-ns# for the RDF-specific elements (which we'll simply be ignoring for our purposes) and http://purl.org/dc/elements/1.1/ (Dublin Core) for the additional metadata of article authors and publishing dates.
    We can go in one of two ways here: if we don't have a namespace-aware XML parser, we can blindly assume that the feed uses the standard prefixes and default namespace and look for item elements and dc:creator elements within them. This will actually work in a large number of real-world cases; most RSS feeds use the default namespace and the same prefixes for common modules like Dublin Core. This is a horrible hack, though. There's no guarantee that a feed won't use a different prefix for a namespace (which would be perfectly valid XML and RDF). If or when it does, we'll miss it.
    If we have a namespace-aware XML parser at our disposal, we can construct a more elegant solution that handles both RSS 0.91 and 1.0 feeds. We can look for items in no namespace; if that fails, we can look for items in the RSS 1.0 namespace. (Not shown here: RSS 0.90 feeds also use a namespace, though not the same one as RSS 1.0. So what we really need is a list of namespaces to search.)
  3. Less obvious but still important, the item elements are outside the channel element. (In RSS 0.91, the item elements were inside the channel. In RSS 0.90, they were outside; in RSS 2.0, they're inside. Whee.) So we can't be picky about where we look for items.
  4. Finally, you'll notice there is an extra items element within the channel. It's only useful to RDF parsers, and we're going to ignore it and assume that the order of the items within the RSS feed is given by the order of the item elements.
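The namespace trade-off in point 2 can be sketched in a few lines. The feed fragment below is hypothetical: it binds Dublin Core to the prefix "dublin" instead of the usual "dc", which is perfectly valid XML. A prefix-based search misses the author, while a namespace-aware search finds it. This assumes Python 3 and the standard xml.dom.minidom parser.

```python
from xml.dom import minidom

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical feed fragment that uses an unconventional prefix
# for the Dublin Core namespace.
ODD_PREFIX = """<rss version="2.0" xmlns:dublin="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>Normalizing XML, Part 2</title>
      <dublin:creator>Will Provost</dublin:creator>
    </item>
  </channel>
</rss>"""

doc = minidom.parseString(ODD_PREFIX)

# Prefix-based lookup: matches the literal qualified name, so it misses.
by_prefix = doc.getElementsByTagName("dc:creator")

# Namespace-aware lookup: matches on namespace URI plus local name.
by_namespace = doc.getElementsByTagNameNS(DC_NS, "creator")

print(len(by_prefix))                   # 0
print(len(by_namespace))                # 1
print(by_namespace[0].firstChild.data)  # Will Provost
```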
But what about RSS 2.0? Luckily, once we've written code to handle RSS 0.91 and 1.0, RSS 2.0 is a piece of cake. Here's the RSS 2.0 version of the same feed:
PHP Code:
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>XML.com</title>
    <link>http://www.xml.com/</link>
    <description>XML.com features a rich mix of information and services for the XML community.</description>
    <language>en-us</language>
    <item>
      <title>Normalizing XML, Part 2</title>
      <link>http://www.xml.com/pub/a/2002/12/04/normalizing.html</link>
      <description>In this second and final look at applying relational normalization techniques to W3C XML Schema data modeling, Will Provost discusses when not to normalize, the scope of uniqueness and the fourth and fifth normal forms.</description>
      <dc:creator>Will Provost</dc:creator>
      <dc:date>2002-12-04</dc:date>
    </item>
    <item>
      <title>The .NET Schema Object Model</title>
      <link>http://www.xml.com/pub/a/2002/12/04/som.html</link>
      <description>Priya Lakshminarayanan describes in detail the use of the .NET Schema Object Model for programmatic manipulation of W3C XML Schemas.</description>
      <dc:creator>Priya Lakshminarayanan</dc:creator>
      <dc:date>2002-12-04</dc:date>
    </item>
    <item>
      <title>SVG's Past and Promising Future</title>
      <link>http://www.xml.com/pub/a/2002/12/04/svg.html</link>
      <description>In this month's SVG column, Antoine Quint looks back at SVG's journey through 2002 and looks forward to 2003.</description>
      <dc:creator>Antoine Quint</dc:creator>
      <dc:date>2002-12-04</dc:date>
    </item>
  </channel>
</rss>
As this example shows, RSS 2.0 uses namespaces like RSS 1.0, but it's not RDF. Like RSS 0.91, there is no default namespace and items are back inside the channel. If our code is liberal enough to handle the differences between RSS 0.91 and 1.0, RSS 2.0 should not present any additional wrinkles.
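To see those differences from code, here is a rough sketch that looks at the root element of a feed and reports which family it belongs to and whether its items sit inside the channel. The describe_feed function and the two tiny feeds are hypothetical, written only for this illustration; the heuristic is not a complete format detector. It assumes Python 3.

```python
from xml.dom import minidom

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def describe_feed(xml_text):
    """Return (family, items_inside_channel) for a feed given as a string.

    family is 'rss' for the <rss version="..."> formats, 'rdf' for the
    RDF-based ones, or 'unknown'. This is an illustrative heuristic only.
    """
    root = minidom.parseString(xml_text).documentElement
    if root.namespaceURI == RDF_NS and root.localName == "RDF":
        family = "rdf"
    elif root.namespaceURI is None and root.localName == "rss":
        family = "rss"
    else:
        family = "unknown"
    # Look for item elements in any namespace ("*"), then check whether
    # the first one sits directly inside a channel element.
    items = root.getElementsByTagNameNS("*", "item")
    inside = bool(items) and items[0].parentNode.localName == "channel"
    return family, inside

# Two tiny illustrative feeds, not the full samples.
RSS_20 = ('<rss version="2.0"><channel><title>t</title>'
          '<item><title>a</title></item></channel></rss>')
RSS_10 = ('<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
          'xmlns="http://purl.org/rss/1.0/">'
          '<channel rdf:about="http://www.xml.com/"><title>t</title></channel>'
          '<item rdf:about="http://www.xml.com/a"><title>a</title></item>'
          '</rdf:RDF>')

print(describe_feed(RSS_20))  # ('rss', True)
print(describe_feed(RSS_10))  # ('rdf', False)
```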

Last edited by 3GM-Media : 11-29-2005 at 02:49 PM.
  #2  
Old 11-29-2005, 02:52 PM
3GM-Media 3GM-Media is offline
Junior Member
 
Join Date: Nov 2005
Posts: 14
3GM-Media is on a distinguished road
Default

How can I read RSS?
Now let's get down to actually reading these sample RSS feeds from Python. The first thing we'll need to do is download some RSS feeds. This is simple in Python; most distributions come with both a URL retrieval library and an XML parser. (Note to Mac OS X 10.2 users: your copy of Python does not come with an XML parser; you will need to install PyXML first.)
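from xml.dom import minidom
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2, as in the original article

def load(rssURL):
    # Download the feed and parse it into a DOM tree
    return minidom.parse(urlopen(rssURL))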

This takes the URL of an RSS feed and returns a parsed representation of the DOM, as native Python objects.
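For experimenting without a network connection, a string-based counterpart can be handy. The sketch below is an assumption layered on top of the article's code: loadString is a hypothetical helper, and it relies on minidom.parseString from the standard library (Python 3). It also shows what the returned DOM object looks like.

```python
from xml.dom import minidom

def loadString(rssText):
    """Hypothetical offline counterpart to load(): parse feed text directly."""
    return minidom.parseString(rssText)

doc = loadString('<rss version="0.91"><channel><title>XML.com</title></channel></rss>')
root = doc.documentElement

print(root.tagName)                  # rss
print(root.getAttribute("version"))  # 0.91
print(root.firstChild.tagName)       # channel
```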
The next bit is the tricky part. To compensate for the differences in RSS formats, we'll need a function that searches for specific elements in any number of namespaces. Python's XML library includes a getElementsByTagNameNS which takes a namespace and a tag name, so we'll use that to make our code general enough to handle RSS 0.9x/2.0 (which has no default namespace), RSS 1.0 and even RSS 0.90. This function will find all elements with a given name, anywhere within a node. That's a good thing; it means that we can search for item elements within the root node and always find them, whether they are inside or outside the channel element.
DEFAULT_NAMESPACES = (
    None,                                      # RSS 0.91, 0.92, 0.93, 0.94, 2.0
    'http://purl.org/rss/1.0/',                # RSS 1.0
    'http://my.netscape.com/rdf/simple/0.9/',  # RSS 0.90
)

def getElementsByTagName(node, tagName, possibleNamespaces=DEFAULT_NAMESPACES):
    # Try each namespace in turn and return the first non-empty match
    for namespace in possibleNamespaces:
        children = node.getElementsByTagNameNS(namespace, tagName)
        if len(children):
            return children
    return []
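Here is a small self-contained check of that claim, assuming Python 3. The function is repeated so the sketch runs on its own, and the two miniature feeds are hypothetical stand-ins for the full samples: one keeps its item inside the channel (RSS 0.91 style), the other keeps it outside (RSS 1.0 style).

```python
from xml.dom import minidom

DEFAULT_NAMESPACES = (
    None,                                      # RSS 0.91-0.94, 2.0
    'http://purl.org/rss/1.0/',                # RSS 1.0
    'http://my.netscape.com/rdf/simple/0.9/',  # RSS 0.90
)

def getElementsByTagName(node, tagName, possibleNamespaces=DEFAULT_NAMESPACES):
    # Same logic as above, repeated so this sketch runs on its own.
    for namespace in possibleNamespaces:
        children = node.getElementsByTagNameNS(namespace, tagName)
        if len(children):
            return children
    return []

# Two tiny illustrative feeds (not the full samples).
FEED_091 = ('<rss version="0.91"><channel>'
            '<item><title>inside</title></item>'
            '</channel></rss>')
FEED_10 = ('<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
           'xmlns="http://purl.org/rss/1.0/">'
           '<channel rdf:about="http://www.xml.com/"/>'
           '<item rdf:about="http://www.xml.com/a"><title>outside</title></item>'
           '</rdf:RDF>')

results = []
for feed in (FEED_091, FEED_10):
    items = getElementsByTagName(minidom.parseString(feed), 'item')
    results.append((len(items), items[0].parentNode.tagName))

print(results)  # [(1, 'channel'), (1, 'rdf:RDF')]
```

In both cases the search starts at the document node, so it does not matter where in the tree the item elements live.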

Finally, we need two utility functions to make our lives easier. First, our getElementsByTagName function will return a list of elements, but most of the time we know there's only going to be one. An item only has one title, one link, one description, and so on. We'll define a first function that returns the first element of a given name (again, searching across several different namespaces). Second, Python's XML libraries are great at parsing an XML document into nodes, but not that helpful at putting the data back together again. We'll define a textOf function that returns the entire text of a particular XML element.
def first(node, tagName, possibleNamespaces=DEFAULT_NAMESPACES):
    # First matching element in any of the given namespaces, or None
    children = getElementsByTagName(node, tagName, possibleNamespaces)
    return len(children) and children[0] or None

def textOf(node):
    # Concatenated text of the node's children, or "" if node is None
    return node and "".join([child.data for child in node.childNodes]) or ""
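One caveat worth sketching: textOf reads the data attribute of every child, and element children (for example inline markup inside a description) do not have one, so it would raise AttributeError on such input. The variant below is a hypothetical alternative, not part of the original code; it keeps only text and CDATA children. The sample description is made up for the illustration, and the code assumes Python 3.

```python
from xml.dom import minidom

def textOfSafe(node):
    """Hypothetical variant of textOf that ignores non-text children."""
    if node is None:
        return ""
    return "".join(child.data for child in node.childNodes
                   if child.nodeType in (child.TEXT_NODE,
                                         child.CDATA_SECTION_NODE))

# A made-up description mixing plain text, a CDATA section and a nested element.
doc = minidom.parseString(
    '<item><description>See <![CDATA[<b>bold</b>]]><i>nested</i></description></item>')
description = doc.getElementsByTagName('description')[0]

print(textOfSafe(description))  # See <b>bold</b>
print(repr(textOfSafe(None)))   # ''
```

Dropping the nested element's text is a design choice made for brevity here; recursing into child elements would be another reasonable option.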

That's it. The actual parsing is easy. We'll take a URL on the command line, download it, parse it, get the list of items, and then get some useful information from each item:
DUBLIN_CORE = ('http://purl.org/dc/elements/1.1/',)

if __name__ == '__main__':
    import sys
    rssDocument = load(sys.argv[1])
    for item in getElementsByTagName(rssDocument, 'item'):
        # Single-argument print(...) behaves the same in Python 2 and 3
        print('title: ' + textOf(first(item, 'title')))
        print('link: ' + textOf(first(item, 'link')))
        print('description: ' + textOf(first(item, 'description')))
        print('date: ' + textOf(first(item, 'date', DUBLIN_CORE)))
        print('author: ' + textOf(first(item, 'creator', DUBLIN_CORE)))
        print('')
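If we wanted to publish headlines on a site instead of printing them, it would probably be more convenient to collect each item into a dictionary. The sketch below is one possible way to do that, assuming Python 3; find_all, first_text and parse_items are hypothetical names that mirror the helpers above, and the sample is a single item taken from the RSS 2.0 feed with its description left out.

```python
from xml.dom import minidom

RSS_NAMESPACES = (None, 'http://purl.org/rss/1.0/',
                  'http://my.netscape.com/rdf/simple/0.9/')
DUBLIN_CORE = ('http://purl.org/dc/elements/1.1/',)

def find_all(node, tagName, namespaces=RSS_NAMESPACES):
    # Return matches from the first namespace that yields any
    for namespace in namespaces:
        children = node.getElementsByTagNameNS(namespace, tagName)
        if children:
            return children
    return []

def first_text(node, tagName, namespaces=RSS_NAMESPACES):
    # Text of the first matching element, or '' when there is none
    children = find_all(node, tagName, namespaces)
    if not children:
        return ""
    return "".join(c.data for c in children[0].childNodes
                   if c.nodeType in (c.TEXT_NODE, c.CDATA_SECTION_NODE))

def parse_items(document):
    """Return one dict per item; missing fields come back as ''."""
    return [{
        'title': first_text(item, 'title'),
        'link': first_text(item, 'link'),
        'description': first_text(item, 'description'),
        'date': first_text(item, 'date', DUBLIN_CORE),
        'author': first_text(item, 'creator', DUBLIN_CORE),
    } for item in find_all(document, 'item')]

SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
<item>
<title>The .NET Schema Object Model</title>
<link>http://www.xml.com/pub/a/2002/12/04/som.html</link>
<dc:creator>Priya Lakshminarayanan</dc:creator>
<dc:date>2002-12-04</dc:date>
</item>
</channel>
</rss>"""

items = parse_items(minidom.parseString(SAMPLE))
print(items[0]['title'])   # The .NET Schema Object Model
print(items[0]['author'])  # Priya Lakshminarayanan
print(items[0]['date'])    # 2002-12-04
```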

Running it with our sample RSS 0.91 feed prints only title, link, and description (since the feed didn't include any other information on dates or authors):
$ python rss1.py http://www.xml.com/2002/12/18/examples/rss091.xml.txt
title: Normalizing XML, Part 2
link: http://www.xml.com/pub/a/2002/12/04/normalizing.html
description: In this second and final look at applying relational normalization techniques to W3C XML Schema data modeling, Will Provost discusses when not to normalize, the scope of uniqueness and the fourth and fifth normal forms.
date:
author:

title: The .NET Schema Object Model
link: http://www.xml.com/pub/a/2002/12/04/som.html
description: Priya Lakshminarayanan describes in detail the use of the .NET Schema Object Model for programmatic manipulation of W3C XML Schemas.
date:
author:

title: SVG's Past and Promising Future
link: http://www.xml.com/pub/a/2002/12/04/svg.html
description: In this month's SVG column, Antoine Quint looks back at SVG's journey through 2002 and looks forward to 2003.
date:
author:

For both the sample RSS 1.0 feed and sample RSS 2.0 feed, we also get dates and authors for each item. We reuse our custom getElementsByTagName function, but pass in the Dublin Core namespace and appropriate tag name. We could reuse this same function to extract information from any of the basic RSS modules. (There are a few advanced modules specific to RSS 1.0 that would require a full RDF parser, but they are not widely deployed in public RSS feeds.)
Here's the output against our sample RSS 1.0 feed:
$ python rss1.py http://www.xml.com/2002/12/18/examples/rss10.xml.txt
title: Normalizing XML, Part 2
link: http://www.xml.com/pub/a/2002/12/04/normalizing.html
description: In this second and final look at applying relational normalization techniques to W3C XML Schema data modeling, Will Provost discusses when not to normalize, the scope of uniqueness and the fourth and fifth normal forms.
date: 2002-12-04
author: Will Provost

title: The .NET Schema Object Model
link: http://www.xml.com/pub/a/2002/12/04/som.html
description: Priya Lakshminarayanan describes in detail the use of the .NET Schema Object Model for programmatic manipulation of W3C XML Schemas.
date: 2002-12-04
author: Priya Lakshminarayanan

title: SVG's Past and Promising Future
link: http://www.xml.com/pub/a/2002/12/04/svg.html
description: In this month's SVG column, Antoine Quint looks back at SVG's journey through 2002 and looks forward to 2003.
date: 2002-12-04
author: Antoine Quint

Running against our sample RSS 2.0 feed produces the same results.
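The same namespace-list approach should carry over to other basic modules. As a sketch, the code below reads two elements from the RSS 1.0 syndication module (updatePeriod and updateFrequency). The feed fragment and its values are made up for illustration, moduleText is a hypothetical helper, and the code assumes Python 3.

```python
from xml.dom import minidom

# Namespace of the RSS 1.0 syndication module
SYNDICATION = ('http://purl.org/rss/1.0/modules/syndication/',)

def moduleText(node, tagName, namespaces):
    """Text of the first tagName element found in any of the namespaces."""
    for namespace in namespaces:
        found = node.getElementsByTagNameNS(namespace, tagName)
        if found:
            return "".join(c.data for c in found[0].childNodes
                           if c.nodeType == c.TEXT_NODE)
    return ""

# Hypothetical channel with syndication hints; the values are invented.
FEED = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
  xmlns:sy="http://purl.org/rss/1.0/modules/syndication/">
  <channel rdf:about="http://www.xml.com/">
    <title>XML.com</title>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>2</sy:updateFrequency>
  </channel>
</rdf:RDF>"""

doc = minidom.parseString(FEED)
period = moduleText(doc, 'updatePeriod', SYNDICATION)
frequency = moduleText(doc, 'updateFrequency', SYNDICATION)

print(period, frequency)  # hourly 2
```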
This technique will handle about 90% of the RSS feeds out there; the rest are ill-formed in a variety of interesting ways, mostly caused by non-XML-aware publishing tools building feeds out of templates and not respecting basic XML well-formedness rules. Next month we'll tackle the thorny problem of how to handle RSS feeds that are almost, but not quite, well-formed XML.
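Until then, a parser can at least notice when a feed is not well-formed instead of crashing. The sketch below only detects the problem; it does not attempt any repair. It assumes Python 3, where minidom reports such errors as xml.parsers.expat.ExpatError, and tryParse is a hypothetical helper name. The "bad" feed contains an unescaped ampersand, one plausible kind of template mistake.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

def tryParse(rssText):
    """Return (document, None) on success or (None, message) on failure."""
    try:
        return minidom.parseString(rssText), None
    except ExpatError as error:
        return None, str(error)

good = '<rss version="0.91"><channel><title>AT&amp;T News</title></channel></rss>'
# An unescaped ampersand makes this feed ill-formed.
bad = '<rss version="0.91"><channel><title>AT&T News</title></channel></rss>'

good_doc, good_error = tryParse(good)
bad_doc, bad_error = tryParse(bad)

print(good_error is None)  # True
print(bad_doc is None)     # True
```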
  #3  
Old 07-22-2008, 12:51 PM
Sakari Sakari is offline
Member
 
Join Date: Jul 2008
Posts: 34
Sakari is on a distinguished road
Default

RSS is an XML-based format for content distribution. Webmasters create an RSS file containing headlines and descriptions of specific information.
  #4  
Old 12-12-2008, 07:33 AM
megalead11 megalead11 is offline
Junior Member
 
Join Date: Dec 2008
Posts: 5
megalead11 is on a distinguished road
Default

Your post is quite informative. You have explained RSS in simple words, and I liked it. I have seen 'RSS feeds' written on many websites but never understood what they were. Your post has cleared up all my doubts about them.
  #5  
Old 01-19-2009, 03:47 PM
mikecriss mikecriss is offline
Junior Member
 
Join Date: Jan 2009
Posts: 6
mikecriss is on a distinguished road
Default

How can I add an RSS feed to my blog? And how do people submit their feed URLs to others? Any ideas?

Last edited by mikecriss : 01-23-2009 at 04:31 PM.
  #6  
Old 02-04-2009, 09:05 AM
ajit22 ajit22 is offline
Member
 
Join Date: Jan 2009
Posts: 54
ajit22 is on a distinguished road
Default

That is very good information on RSS.
  #7  
Old 04-01-2009, 12:48 PM
vks87 vks87 is offline
Member
 
Join Date: Mar 2009
Posts: 85
vks87 is on a distinguished road
Default

Nice article...
Thanks
  #8  
Old 08-19-2009, 06:53 AM
gkumar gkumar is offline
Senior Member
 
Join Date: Jun 2009
Posts: 203
gkumar is on a distinguished road
Default What is RSS?

RSS (Rich Site Summary) is a format for delivering regularly changing web content. Many news-related sites, weblogs and other online publishers syndicate their content as an RSS Feed to whoever wants it.
  #9  
Old 01-18-2010, 01:34 PM
Mardyth Mardyth is offline
Junior Member
 
Join Date: Jan 2010
Posts: 3
Mardyth is on a distinguished road
Default

RSS is a format for syndicating news and the content of news-like sites, including major news sites like Wired, news-oriented community sites like Slashdot, and personal weblogs. RSS solves a problem for people who regularly use the web.
  #10  
Old 02-04-2010, 04:26 AM
netbugmonk netbugmonk is offline
Junior Member
 
Join Date: Feb 2010
Posts: 4
netbugmonk is on a distinguished road
Default

RSS is an efficient way to check for news and updates to your favourite websites without having to visit each site individually. In this article, we compare the various RSS readers and try to determine the best way to read your subscribed RSS feeds.