Introduction to RSS Feed Processing Using CaRP
© 2002-7 Gecko Tribe, LLC


Where to find more information
The "extras" Directory
Installing CaRP
Displaying a feed
The parts of an RSS feed
RSS vs. Atom
Introduction to CaRP's configuration system
Using CaRP with Grouper
Using CaRP in PHP, HTML, ASP and other webpages
Question marks and garbage characters
XML Errors - what to do?
Displaying images
Turning off the "Newsfeed display by CaRP" message
CaRP-compatible web hosting


Thank you for choosing CaRP, especially if you chose CaRP Evolution! :-) This document will help you get CaRP installed and running on your website, and will answer a few of the most common questions that CaRP users ask. If you have other questions, don't hesitate to contact support or visit the CaRP User Forum (links for both can be found below).
- Antone Roundy   
Where to find more information
Online Documentation
Display formatting
Index
Function reference
Example code

CaRP User Forum - special thanks to those users who are so helpful in answering others' questions in the forum

Technical Support

CaRP Tips Weblog
Weblog
RSS Feed
The "extras" Directory
If the computer that you edit your webpages on runs the Windows operating system, you can probably skip this section -- come back to it only if you have trouble editing "carpconf.php".

If you edit your webpages on Mac OS, UNIX, Linux, or BSD, you may wish to substitute the copy of "carpconf.php" from the appropriate subdirectory of the "extras" directory for the one in the "carp" directory. This will only be necessary if, when you open "carpconf.php" in your editor, the display is messed up or it displays "garbage characters" wherever there's a line break. The only difference between the files is the invisible characters that are used for the line breaks.
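
To illustrate what "different line breaks" means, here is a minimal PHP sketch that rewrites the line breaks in a piece of text. It is not part of CaRP, and the function name "convert_line_breaks" is made up for this example.

```php
<?php
// Hypothetical helper, not part of CaRP: rewrite every line break in $text
// as $newline ("\r\n" = Windows, "\n" = UNIX/Linux/BSD, "\r" = classic Mac OS).
function convert_line_breaks($text, $newline = "\n")
{
    // Match "\r\n" first so a Windows line break is not counted as two breaks.
    return preg_replace('/\r\n|\r|\n/', $newline, $text);
}

// Example: convert a Windows-style string to UNIX-style line breaks.
$converted = convert_line_breaks("line one\r\nline two\r\n", "\n");
```

To convert a whole file, you could read it with file_get_contents, pass the contents through the function, and write the result back with file_put_contents. Keep a backup of the original file if you try this.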

You shouldn't need to edit any of the other files, so alternative versions of them are not provided. However, if for some reason you do wish to convert the linebreaks in any of the other files, you may find the script "ConvertLineBreaks" useful. Please note that no technical support will be provided for this script other than to say that:
Installing CaRP
If you have already installed CaRP 3.5 or later (Free, Koi or Evolution), you may simply overwrite the old files with the files from the "carp" directory in this archive. Note that carpsetupinc.php, PHPFTP.php and PHPTelnet.php are used only during installation, and need not be uploaded when upgrading.

For a fresh installation, the following process is usually the simplest.
  1. Decide where you want to put the CaRP scripts on your webserver, and create a new folder for them if necessary.
    Recommendations:
    • We recommend that the folder be named "carp". This will make it more likely that the setup assistant will be able to locate it automatically. If you are going to name it "carp", you may create it by simply uploading the "carp" folder from the installation archive and its contents (see next step).
    • It is good security practice to install scripts like CaRP in a location that cannot be accessed directly by a web browser. For example, if your web folder is located at "/home/joe/public_html", a good location to install CaRP would be "/home/joe/carp" rather than "/home/joe/public_html/carp". However, if you see an error message about "safe mode", you may need to install CaRP inside your web directory.
  2. Upload all of the contents of the "carp" folder from the installation archive to the location chosen in the previous step.
  3. Upload carpsetup.php to your webserver (in a location where you'll be able to load it in your web browser).
    Recommendation: For simplicity, we recommend uploading it to your web root folder. For example, if your web folder is located at "/home/joe/public_html", upload it to there.
  4. Load carpsetup.php in your web browser. For example, if you uploaded it to "/home/joe/public_html/carpsetup.php" and your website address is http://www.webhost.com/~joe/, load http://www.webhost.com/~joe/carpsetup.php in your web browser.
  5. Follow the directions that are shown in that page. If you see a message about unsupported functions, that means that your web host has turned off support for functions which are necessary for CaRP to function. In that case, you will either need to contact your web host to see if they will re-enable those functions for you or install CaRP on a different server.
  6. Once you have successfully completed setup, delete carpsetup.php and carpsetupinc.php from your server.
  7. You may also delete PHPFTP.php and PHPTelnet.php--they are not used by CaRP. However, if you are a PHP programmer, you may find them useful. See the PHP FTP and PHP Telnet homepages for more information.
If you wish to use directory-based caching, but the installation script is unable to create cache directories, or is unable to create files in the cache directories after creating them, you will need to create them manually. To do so, create three subdirectories inside the directory where "carp.php" is located with the names "aggregatecache", "autocache", and "manualcache". Set the access permissions for these subdirectories to allow any user to read, write and "execute" them, or whatever settings are necessary on your server to enable CaRP to create files in them. For more information about how to do that, read here. (Note: if the installation script successfully creates these directories, it will be able to do so with more secure settings, so that method is preferred.)
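
If you create the cache directories manually, a short PHP script can confirm that they exist and that PHP is able to write to them. The following is only a sketch: the function name "check_cache_dirs" is made up for this example, and the script reports on the directories without creating them or changing their permissions.

```php
<?php
// Hypothetical helper, not part of CaRP: report whether each cache
// subdirectory exists inside $carpDir and whether PHP can write to it.
function check_cache_dirs($carpDir)
{
    $status = array();
    foreach (array('aggregatecache', 'autocache', 'manualcache') as $name) {
        $path = rtrim($carpDir, '/') . '/' . $name;
        if (!is_dir($path)) {
            $status[$name] = 'missing';
        } elseif (!is_writable($path)) {
            $status[$name] = 'not writable';
        } else {
            $status[$name] = 'ok';
        }
    }
    return $status;
}

// Replace the path with the directory where "carp.php" is located.
foreach (check_cache_dirs('/YOUR/PATH/TO/carp') as $name => $result) {
    echo $name . ': ' . $result . "\n";
}
```

Upload the script to your web directory, load it in your web browser, and delete it when you are done.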
Displaying a feed
When carpsetup.php has finished installing CaRP, it displays a little piece of PHP code that looks something like this (there will be more code if you select the MySQL caching option):
<?php
require_once '/YOUR/PATH/TO/carp/carp.php';
// Add any desired configuration settings before CarpCacheShow
// using "CarpConf" and other functions

CarpCacheShow('http://www.geckotribe.com/press/rss/pr.rss');
?>
NOTE: Do not use the exact code shown here -- use the code given to you by the installation script.

To display a feed, paste that code into a PHP webpage in the place where you want the feed displayed (if you want to display the feed in a webpage whose name doesn't end with ".php", see Using CaRP in PHP, HTML, ASP and other webpages). Then change the URL to the URL of the feed you wish to display. If you don't have a feed in mind already, you can search for one at Chordata.

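Here's what each line of the code does:
  • The "require_once" line loads the CaRP script from the location where you installed it.
  • The two lines beginning with "//" are comments. They mark the place where you will add configuration code.
  • The "CarpCacheShow" line displays the feed found at the URL inside the parentheses.

That's all it takes to display an RSS feed using CaRP, but you'll want to add some configuration code to get the feed to look right for your website. Before we get into configuration, it will be helpful to understand just a little bit about how RSS feeds are structured.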
The parts of an RSS feed
You don't need to have any technical understanding of XML or the RSS format to use CaRP. But knowing a few basic things about how RSS feeds are structured will make it easier to understand CaRP's configuration system.

Channel and Items:
Let's compare an RSS feed to a newspaper. At the top of the newspaper, you'll see the paper's name, the date it was published, and a few other bits of information about the paper itself. An RSS feed is similar--the information that talks about the RSS feed itself is called the "channel" data. A newspaper also contains a number of stories. The stories in an RSS feed are called "items". An RSS feed is made up of some channel data followed by any number of items.
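
If you'd like to see the channel and items for yourself, the following sketch reads a tiny made-up feed using PHP's SimpleXML extension. It illustrates the structure only and is not how CaRP itself processes feeds; the feed contents and variable names are invented for this example.

```php
<?php
// A tiny made-up RSS feed: one channel containing two items.
$rss = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>First story</title>
      <link>http://example.com/first</link>
      <description>Text of the first story</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/second</link>
      <description>Text of the second story</description>
    </item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($rss);

// Channel data: information about the feed itself.
$channelTitle = (string) $feed->channel->title;

// Items: the individual stories.
$itemTitles = array();
foreach ($feed->channel->item as $item) {
    $itemTitles[] = (string) $item->title;
}

echo $channelTitle . ' has ' . count($itemTitles) . " items\n";
```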

The following sections introduce the names that CaRP uses to refer to various pieces of data within the channel and items. We'll discuss how you'll use those names a little further down the page.

Channel Data:
The channel section of an RSS feed contains a number of different pieces of data. Some feeds contain more channel data than others. The following table lists the names that CaRP uses to refer to some of the most important pieces of channel data. If you're curious, it also lists the names of the elements in the RSS feed in which the data is found.
CaRP's name | Description | RSS element
title | The name of the RSS feed | title
desc | A description of or introduction to the RSS feed | description
url | A URL associated with the feed--usually the address of a webpage that the data in the feed is taken from | link
link | CaRP uses the name "link" to refer to a combination of the title and link URL formatted as an HTML link (i.e. <a href="RSS LINK">RSS TITLE</a>) | link + title
date | The date the RSS feed was last updated or published (requires CaRP Koi or CaRP Evolution) | lastBuildDate, pubDate or dc:date
image | An image associated with the RSS feed--usually a logo (requires CaRP Koi or CaRP Evolution) | image, media:thumbnail, or enclosure
NOTE: enclosure is only used by CaRP as an image if its "type" attribute starts with "image/".
NOTE: CaRP's name "image" refers to a combination of various pieces of data from the RSS feed, including the image's URL, height, width, alternate text, etc.
As you can see, CaRP simplifies things by using a single name to refer to whichever of a variety of pieces of data a particular feed contains, or even a combination of multiple pieces of data.

Item Data:
Each item in an RSS feed can contain a variety of pieces of data. Some feeds contain more pieces of item data than others--for example, many feeds do not specify a date or image for each item. The following table lists CaRP's names for each piece of data, a description of the data, and the RSS elements that the data can be found in.
CaRP's name | Description | RSS element
title | The title or headline of the item | title
desc | The main content of the item--it may be the full story or just an introduction to the page that the item links to | description or (if using CaRP Koi or CaRP Evolution) content:encoded
url | A URL associated with the item--usually the address of a webpage that the data in the item is taken from, but sometimes a link to a webpage that the item is talking about | link
link | A combination of the title and link URL formatted as an HTML link (i.e. <a href="RSS LINK">RSS TITLE</a>) | link + title
date | The date the item was last updated or published (requires CaRP Koi or CaRP Evolution) | pubDate or dc:date
author | The author of the item (requires CaRP Koi or CaRP Evolution) | author or dc:creator
image | An image associated with the item (requires CaRP Koi or CaRP Evolution) | image, media:thumbnail, media:item, media:content, or enclosure
podcast | An audio file linked to from the item (requires CaRP Koi or CaRP Evolution) | enclosure (with a type beginning with "audio/")
NOTE: enclosure is only used by CaRP as an image if its "type" attribute starts with "image/".
NOTE: CaRP's name "image" refers to a combination of various pieces of data from the item, including the image's URL, height, width, alternate text, etc.
RSS vs. Atom
The name "RSS" is shared by two different but similar newsfeed data formats (there are a few versions of each, but essentially it's two formats). CaRP can handle feeds in either of the formats (and any of the variants of either).

Another newsfeed format named "Atom" has been gaining popularity recently. It was created to address some of the shortcomings of RSS. Although it is similar to RSS in many ways (and in fact, some people refer to Atom feeds as "RSS"), there are differences which prevent CaRP from processing Atom feeds. If you want to use Atom feeds with CaRP, you'll need to convert them to RSS first. The section Using CaRP with Grouper below discusses how to use Grouper Evolution to convert Atom feeds to RSS for use with CaRP.
Introduction to CaRP's configuration system
CaRP is configured using a function named "CarpConf". You can use CarpConf to tell CaRP to make headlines bold or bigger, to only show the first 150 characters of the description, to only show the headline of each item, and many other things. Each time you call CarpConf, you tell CaRP which of its configuration settings to change and what to change it to, like this:
CarpConf('iorder','link,desc');
That code, for example, tells CaRP that you want to display the link (the title and URL combined in a hyperlink) and description for each item with the link first and the description second. "iorder" means "item order". It specifies which pieces of data from each item to display and in what order.

CaRP's configuration settings use these abbreviations:
  • a = after
  • b = before
  • c = channel
  • i = item

For example, you saw that the ORDER of the pieces of ITEM data is specified by the setting "iorder". Similarly, the order of the pieces of channel data used to be specified by the setting "corder". (Now, you specify the order of pieces of channel data to display before the list of items with the setting "cborder" ("channel before order") and the order of pieces of channel data to display after all the items have been displayed with the setting "caorder" ("channel after order")).
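
As a sketch of how the three order settings fit together, the following assumes that "cborder" and "caorder" accept a comma-separated list of CaRP's names in the same way that "iorder" does. Check the function reference before relying on that. The path and feed URL are the placeholders used earlier in this document.

```php
<?php
require_once '/YOUR/PATH/TO/carp/carp.php';

// Before the items: the channel title as a link.
CarpConf('cborder','link');
// After the items: the channel description.
CarpConf('caorder','desc');
// For each item: the linked title, then the description.
CarpConf('iorder','link,desc');

CarpCacheShow('http://www.geckotribe.com/press/rss/pr.rss');
?>
```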

In the section The parts of an RSS feed above, there is a list of CaRP's names for the various pieces of data. You can specify text and HTML code to display before and after each piece of data like this:
CarpConf('bilink','<b>');
CarpConf('ailink','</b>');
That code tells CaRP to put a "<b>" tag before the item link ("b i link"), and a "</b>" tag after the item link ("a i link"), which would make the title bold. Similarly, you could use the following code to display the item date in italics with the words "Posted on" before it:
CarpConf('bidate','<i>Posted on ');
CarpConf('aidate','</i>');
If you want to change the color of the item description text, you could do this:
CarpConf('bidesc','<div style="color:#036;">');
CarpConf('aidesc','</div>');
The same pattern applies to any of the pieces of data listed in the section above. There are two exceptions: there are no "bclink" and "aclink" settings (use "bctitle" and "actitle" for both channel link and channel title), and there are no "bititle" and "aititle" settings (use "bilink" and "ailink" for both item link and item title).
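
For example, applying the pattern to the item author and the channel description would presumably give the setting names shown below. These names are derived from the pattern described above rather than copied from the function reference, so treat them as assumptions and confirm them there. The code belongs after the "require_once" line and before "CarpCacheShow".

```php
<?php
// Item author: "b" (before) or "a" (after) + "i" (item) + "author".
// The author field requires CaRP Koi or CaRP Evolution.
CarpConf('biauthor','<i>by ');
CarpConf('aiauthor','</i>');

// Channel description: "b" or "a" + "c" (channel) + "desc".
CarpConf('bcdesc','<p>');
CarpConf('acdesc','</p>');

// Channel link and channel title share "bctitle" and "actitle".
CarpConf('bctitle','<h3>');
CarpConf('actitle','</h3>');
?>
```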

You can also tell CaRP to display things before and after each item ("bi" = "before item" and "ai" = "after item"):
CarpConf('bi','<li>');
CarpConf('ai','</li>');
...and once before and after the entire list of items:
CarpConf('bitems','<ul>');
CarpConf('aitems','</ul>');
Those last two pieces of code would tell CaRP to display the items in an "unordered list", with each item being a "list item".

Similarly, you can tell CaRP what to display before and after the channel sections which appear before and after the items:
CarpConf('bcb','<font size="3">');
CarpConf('acb','</font>');
CarpConf('bca','<font size="1">');
CarpConf('aca','</font>');
That code would display any channel data appearing before the items (as specified by "cborder") in font size 3 ("bcb" = "before channel before" or "BEFORE the CHANNEL data that appears BEFORE the items", and so on), and any channel data appearing after the items (as specified by "caorder") in font size 1.
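
Putting several of the settings from this section together, a complete block of code might look something like the following sketch. The path and feed URL are placeholders; use the "require_once" line given to you by the installation script.

```php
<?php
require_once '/YOUR/PATH/TO/carp/carp.php';

// Show the linked title and the description of each item.
CarpConf('iorder','link,desc');

// Wrap the whole list in <ul>...</ul> and each item in <li>...</li>.
CarpConf('bitems','<ul>');
CarpConf('aitems','</ul>');
CarpConf('bi','<li>');
CarpConf('ai','</li>');

// Make each item's linked title bold.
CarpConf('bilink','<b>');
CarpConf('ailink','</b>');

// Color the description text.
CarpConf('bidesc','<div style="color:#036;">');
CarpConf('aidesc','</div>');

CarpCacheShow('http://www.geckotribe.com/press/rss/pr.rss');
?>
```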

We've covered the most important concepts to understand for configuring CaRP. Below are a few links to the documentation for other useful configuration settings and other important documentation pages:
Using CaRP with Grouper
CaRP can be used to convert the RSS output by Grouper to HTML and display it on a webpage. There are two possible approaches: putting the code all in one file, or splitting the CaRP and Grouper code into separate files.

All in one file:
The basic code for using CaRP and Grouper together in a single file is as follows:
<?php
require_once "/YOUR/PATH/TO/grouper/grouper.php";
// add Grouper configuration code here
GrouperSearch("search terms","name_of_grouper_cache_file",0);
require_once "/YOUR/PATH/TO/carp/carp.php";
// add CaRP configuration code here
CarpShow("grouper:name_of_grouper_cache_file");
?>
Put this code in the webpage where you want the data displayed. A few points of note:

Separate files:
You can also put the CaRP and Grouper code into separate files (which is useful if you're going to use the same Grouper output on multiple webpages, if you want anyone to be able to subscribe to Grouper's output in a feed reader, and also to improve performance slightly). Note that this method will not work on all servers, because some servers will not allow CaRP to access URLs on the same server. If you try this and CaRP fails to access the Grouper URL, try the all-in-one-file method shown above.

The basic Grouper code for this method is as follows. This and any configuration code you wish to add must be the entire contents of the file (i.e., no <html>, <head>, <body> tags, etc.):
<?php
require_once "/YOUR/PATH/TO/grouper/grouper.php";
// add Grouper configuration code here
GrouperSearch("search terms","name_of_grouper_cache_file");
?>
The only difference between this and the Grouper code shown above is that the GrouperSearch call does not have the third argument ("0"). Because the zero is not there, Grouper will send its RSS output to whoever loads the file.

The CaRP code for this method, which you will put into the webpage where you want the feed to appear, is as follows:
<?php
require_once "/YOUR/PATH/TO/carp/carp.php";
// add CaRP configuration code here
CarpCacheShow("http://www.your-domain.com/path/to/grouper-code.php");
?>
Replace the URL "http://www.your-domain.com/path/to/grouper-code.php" with the URL of the file containing the Grouper code shown above. Note that to avoid making unnecessary HTTP connections to your own server to reload the Grouper output, you should use CarpCacheShow rather than CarpShow, and/or add a cache file name to the last line. For example, you could do either of the following:
CarpShow("http://www.your-domain.com/path/to/grouper-code.php","name_of_carp_cache_file");
CarpCacheShow("http://www.your-domain.com/path/to/grouper-code.php","name_of_carp_cache_file");
"name_of_carp_cache_file" can be any valid filename, and is the name of the file in which CaRP caches its output.

Converting Atom to RSS:
To convert an Atom feed to RSS for use with CaRP, you just need to make some small changes to the Grouper code shown above. Here's the all-in-one code for an Atom 1.0 feed (note that this code requires Grouper Evolution):
<?php
require_once "/YOUR/PATH/TO/grouper/grouper.php";
GrouperLoadPlugin('xml.php');
GrouperLoadPlugin('xml-atom-1.0.php');
// add Grouper configuration code here
GrouperConvert("http://example.com/feed.atom","name_of_grouper_cache_file",0);
require_once "/YOUR/PATH/TO/carp/carp.php";
// add CaRP configuration code here
CarpShow("grouper:name_of_grouper_cache_file");
?>
Replace "http://example.com/feed.atom" with the URL of the Atom feed. Here's the Grouper section of the split-into-two-files code (this time for an Atom 0.3 feed):
<?php
require_once "/YOUR/PATH/TO/grouper/grouper.php";
GrouperLoadPlugin('xml.php');
GrouperLoadPlugin('xml-atom-0.3.php');
// add Grouper configuration code here
GrouperConvert("http://example.com/feed.atom","name_of_grouper_cache_file");
?>
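The CaRP half of the two-file method should be the same for a converted Atom feed as for a Grouper search, since Grouper's output is RSS in both cases. A sketch under that assumption (the URL and cache file name are placeholders, and the file name "grouper-atom-code.php" is made up for this example):

```php
<?php
require_once "/YOUR/PATH/TO/carp/carp.php";
// add CaRP configuration code here
// Point CaRP at the file containing the GrouperConvert code shown above.
CarpCacheShow("http://www.your-domain.com/path/to/grouper-atom-code.php","name_of_carp_cache_file");
?>
```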
Using CaRP in PHP, HTML, ASP and other webpages
Because CaRP is a PHP script, it is easiest to use in PHP webpages (webpages whose names end with ".php"). However, CaRP can be used in a variety of types of webpages using the following instructions:
Question marks and garbage characters
What do you do if question marks and garbage characters are showing up in your feeds where things like apostrophes, quote marks, hyphens, etc. should be?

Question marks appear when the feed contains characters that can't be represented in the encoding specified in the "encodingout" configuration setting. Changing the output encoding to "UTF-8" should get rid of the question marks...but may replace them with garbage characters as described next.

Garbage characters appear when CaRP's output encoding does not match the encoding of your webpage. Here's how to solve the problem:
  1. First, find out what the character encoding of your page is. Look for a tag like one of the following near the top of your webpage source:
    • <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    • <?xml version="1.0" encoding="ISO-8859-1" ?>
    In both examples, "ISO-8859-1" is the encoding. If you don't see one of those, look for the place in your web browser menus (usually something like View > Character Encoding) where you can select the encoding of the page. Whatever is checked in that menu is probably the correct encoding.
  2. Next, try setting "encodingout" to that encoding as described in the encodingout documentation. If the garbage characters have been replaced by question marks, then the feed contains data which cannot be displayed using the encoding of your webpage. To display the feed properly, you will have to change the encoding of your webpage.
  3. Finally, if you decide to change the encoding of your webpage, do the following:
    1. First, select an encoding. I recommend choosing "UTF-8" for a few reasons:
      • If you're using CaRP Free, UTF-8 is the only available option that will solve the problem.
      • If you're using CaRP Koi or Evolution and your server doesn't offer iconv support, UTF-8 is the only available option that will solve the problem.
      • PHP's XML parser can output UTF-8, which means that the data won't have to be transcoded after it is parsed, which will improve performance.
      There are two exceptions to the UTF-8 recommendation:
      • If your web authoring tools don't support UTF-8, and you need to enter "hi-ASCII" or other non-roman characters.
      • If the language of your website uses multibyte characters and there's another encoding available that uses fewer bytes per character, you might consider saving bandwidth by using the other encoding.
    2. Set CaRP's "encodingout" setting to that encoding.
    3. Make sure your webpage contains a META tag as shown above and that it specifies the new encoding (or if your webpage is XHTML, an XML declaration as shown above).
    4. Reload your webpage, and make sure that your browser has the new encoding checked in its menu. Then check the page to see if any of your content has turned into garbage characters. If so, you'll need to convert the data in your webpage to the new encoding. Generally this won't be necessary unless you have non-roman characters in your content. How to do this depends on the program you use to edit your webpage.
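
As a sketch of the end result, here is a minimal webpage whose META tag and "encodingout" setting both specify UTF-8. It assumes that "encodingout" is set with CarpConf in the same way as the other settings in this document (see the encodingout documentation to confirm). The page title, path and feed URL are placeholders.

```php
<html>
<head>
<!-- The page declares its own encoding here... -->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<title>My newsfeed page</title>
</head>
<body>
<?php
require_once '/YOUR/PATH/TO/carp/carp.php';
// ...and CaRP's output encoding is set to match it.
CarpConf('encodingout','UTF-8');
CarpCacheShow('http://www.geckotribe.com/press/rss/pr.rss');
?>
</body>
</html>
```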
XML Errors - what to do?
Occasionally when attempting to process a feed, CaRP may display an error message beginning with the words "XML Error". There are a few possible causes for this, some of which are problems in the feed, and some of which are limitations in PHP's XML parser. Here's what to do when you encounter an XML error:
  1. If you are setting CaRP's "encodingin" configuration setting, remove that setting to allow CaRP to attempt to automatically determine the proper encoding.
  2. If that doesn't help and you are using CaRP Free, check the error message and see if it tells you what the character encoding of the feed is. If it does, then CaRP Free will not be able to process that feed. Either choose another feed or upgrade to CaRP Evolution.
  3. Go to Feed Validator.org and submit the URL of the feed. If the feed validator tells you that the feed is invalid, then there's probably nothing CaRP can do about it. But there is one thing you can try: add this line of code just before the line that says "CarpCacheShow":
     
    CarpConf('fixentities',1);
     
    With luck, that will enable CaRP to fix the broken part of the feed. With luck, it will do so without breaking parts of the feed that weren't already broken! If that doesn't work, then the only thing to do is to notify the publisher that their feed is broken (and point them to the Feed Validator so they can see for themselves) and hope that they'll fix it.
  4. If the Feed Validator says that the feed is valid, try adding this line of code just before the line that says "CarpCacheShow":
     
    CarpConf('encodingin','ISO-8859-1');
     
    That will tell CaRP that the feed is encoded as ISO-8859-1 (don't worry if that means nothing to you!). By default, CaRP expects feeds to be encoded as UTF-8. If the feed isn't encoded as UTF-8 and doesn't indicate its encoding in the XML prologue (the first line of the feed source), CaRP will need you to tell it what the encoding is using the "encodingin" configuration setting.
  5. If that doesn't solve the problem, then the feed is encoded using an encoding other than UTF-8 or ISO-8859-1, but doesn't explicitly specify its encoding. If you are using CaRP Free, you will not be able to use that feed. Either choose another feed or upgrade to CaRP Evolution. If you are already using CaRP Koi or CaRP Evolution, you will need to somehow find out what the encoding of the feed is and specify it in CaRP's encodingin setting.
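
To check whether a feed indicates its encoding in the XML prologue, you can view the feed source in your browser, or use a few lines of PHP like the sketch below. The function name "declared_xml_encoding" is made up for this example and is not part of CaRP.

```php
<?php
// Hypothetical helper, not part of CaRP: return the encoding named in the
// XML prologue of $xml, or null if the prologue does not name one.
function declared_xml_encoding($xml)
{
    $pattern = '/^\s*<\?xml[^>]*\bencoding\s*=\s*["\']([^"\']+)["\']/i';
    if (preg_match($pattern, $xml, $matches)) {
        return $matches[1];
    }
    return null;
}

$withEncoding = declared_xml_encoding('<?xml version="1.0" encoding="ISO-8859-1" ?><rss></rss>');
$withoutEncoding = declared_xml_encoding('<?xml version="1.0"?><rss></rss>');
```

If the function returns null for a feed's source, the feed does not declare its encoding, and you will need to find out the encoding some other way (for example, by asking the publisher).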
Displaying images
Images may appear in feeds in one of two ways:
  • The feed's <image> or <enclosure> element may point to the image. In that case, the free version of CaRP will not be able to display it, but CaRP Koi and CaRP Evolution will, as long as it's a GIF or JPEG image (see the online documentation for details).
  • There may be an <img> tag in the HTML code in the description (or equivalent) element of the feed. If it is in the <description> element, any version of CaRP can display the image (but by default, none do). If it is in another element like <content:encoded>, CaRP Koi and CaRP Evolution can (but by default don't) display it.

To make images from <img> tags show up, add this line of code just before the line that says "CarpCacheShow":

CarpConfAdd('descriptiontags','|img');

CaRP keeps a list of HTML tags that it allows in the description in its "descriptiontags" setting. The default list does not contain "img" (since unexpected images would very likely disrupt the formatting of your webpage). The line of code shown above adds "img" tags to the list.

If the <img> tag is in <content:encoded> rather than <description> and the feed contains both elements, add this line of code too (Koi and Evolution only):

CarpMapField('desc','content:encoded',0,10);

Similarly, if you want hyperlinks within the description to show up, add this line of code to add the <a> and </a> tags to the list of allowed HTML:

CarpConfAdd('descriptiontags','|a|/a');

...or do images and links at once like this:

CarpConfAdd('descriptiontags','|img|a|/a');
Turning off the "Newsfeed display by CaRP" message
As much as I like having that message displayed at the end of each feed, I also understand that sometimes your site would look better without it. If you do disable that message, I only ask that you either link to the CaRP homepage from somewhere on your site or purchase one of the commercial versions.

To disable the message, add the following line of code somewhere before the line that says "CarpCacheShow":

CarpConf('poweredby','');
CaRP-compatible web hosting
If you are looking for a web host, or your current host does not meet the requirements for running CaRP, Gecko Tribe recommends iPowerWeb. We have verified that CaRP installs easily and works properly with iPowerWeb's "Web Hosting" service. (NOTE: we have not verified it on their "Windows Hosting" option, but it may work there too.) iPowerWeb is a top-notch provider offering full-featured web and email hosting with excellent prices. I personally selected iPowerWeb for another business of which I am a part owner, and I continue to recommend them without reservation. In fact, if you have any difficulty installing CaRP on iPowerWeb's "Web Hosting" service, I will personally complete the installation** at no charge, even if you are using the free version of CaRP. Request installation on iPowerWeb

** Offer includes completion of the installation script, but not additional configuration (for which technical support is available in the CaRP User Forum or our support system).
Thank you for using CaRP!