Saturday, October 13, 2007

[PHP] Finding the latest RSS item date

I’ve been podcasting for about a year. Recently, I thought of adding something like “(Last update: 12/10/06)” next to the link to my podcast page. I also thought that it’d be neat to have this generated automatically (dynamically). After a few hours of experimentation in PHP, I came up with the following function.

I’ll list the whole code first and then go through it bit by bit.

To use this code, put the following function definition in the header part of an HTML file.

<?php

function Find_Latest_From_RSS($filename) {
    $str=file_get_contents($filename);
    $p = xml_parser_create();
    xml_parse_into_struct($p, $str, $vals, $index);
    xml_parser_free($p);
    for ($i=0; $i<count($vals); $i++) { // skip everything before the first item
        if ($vals[$i]['tag']=="ITEM") {
            break;
        }
    }
    $latest=0; // initialize to Unix timestamp 0 (January 1 1970 00:00:00 GMT)
    for (; $i<count($vals); $i++) { // start looking for PUBDATEs
        if ($vals[$i]['level']==4 && $vals[$i]['tag']=="PUBDATE") {
            $dateTime=strtotime($vals[$i]['value']);
            if ($latest<$dateTime) {
                $latest=$dateTime;
            }
        }
    }
    return date("n/j/y",$latest);
}

?>

Then, in the body, where you want the date to appear, put

<?php echo Find_Latest_From_RSS("your_RSS_file"); ?>

For example,

(Last update: <?php echo Find_Latest_From_RSS("your_RSS_file"); ?>)

will give you

(Last update: 12/11/06)

If you want to change the date format, modify this line:

date("n/j/y",$latest)
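As a rough illustration of what other format strings produce, here is a small sketch. It uses a fixed, made-up timestamp and gmdate() instead of date() so that the result does not depend on the server’s time zone; both of those choices are my assumptions for the example, not part of the function above.

```php
<?php
// A fixed timestamp for 11 Dec 2006 08:00:00 GMT (made up for this example).
// gmmktime() takes hour, minute, second, month, day, year.
$latest = gmmktime(8, 0, 0, 12, 11, 2006);

// gmdate() accepts the same format characters as date(), but always formats in GMT.
echo gmdate("n/j/y", $latest), "\n";   // month/day/two-digit year, no leading zeros
echo gmdate("Y-m-d", $latest), "\n";   // four-digit year, zero-padded month and day
echo gmdate("F j, Y", $latest), "\n";  // full month name, day, four-digit year
```

With date() the same format strings apply, but the result is expressed in the time zone PHP is configured to use.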

Brief documentation on the code follows. First, the function declaration.

function Find_Latest_From_RSS($filename) {

You pass the name of an RSS file as the argument. This means that it is possible to process multiple RSS files in one HTML file. The function returns the latest item publication date.

$str=file_get_contents($filename);

First, the entire RSS file is read in.

$p=xml_parser_create();
xml_parse_into_struct($p, $str, $vals, $index);
xml_parser_free($p);

These three lines go together. The first line creates a PHP XML parser. The second line parses the RSS file content. The third line frees the parser. The parsed data are put in $vals, an array with one entry per opening tag, closing tag, or complete element.
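To get a feel for what ends up in $vals, here is a small sketch that parses a made-up feed string (not a real podcast feed) and prints the level, tag, type, and value of each entry. The feed content and the loop variables are assumptions for illustration only.

```php
<?php
// A tiny made-up feed with one channel-level pubDate and one item-level pubDate.
$str = '<rss version="2.0"><channel><title>Demo</title>'
     . '<pubDate>Tue, 12 Dec 2006 09:00:00 GMT</pubDate>'
     . '<item><title>Episode 1</title>'
     . '<pubDate>Mon, 11 Dec 2006 08:00:00 GMT</pubDate></item>'
     . '</channel></rss>';

$p = xml_parser_create();
xml_parse_into_struct($p, $str, $vals, $index);

// Each entry of $vals is an array with 'tag', 'type', and 'level' keys,
// plus 'value' when the element contains text.
foreach ($vals as $v) {
    $value = isset($v['value']) ? $v['value'] : '';
    echo $v['level'], ' ', $v['tag'], ' ', $v['type'], ' ', $value, "\n";
}
```

Tag names come out in upper case because the parser folds case by default, and the two PUBDATE entries show up at different levels: the channel’s own date is one level shallower than the item’s.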

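for ($i=0; $i<count($vals); $i++) {
    if ($vals[$i]['tag']=="ITEM") {
        break;
    }
}

This loop looks for the first <item> element and exits when it finds one. (The parser converts tag names to upper case by default, which is why the code compares against "ITEM".) The purpose of this is to skip the material that precedes the item definitions.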

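$latest=0; // Unix timestamp 0, i.e. January 1 1970 00:00:00 GMT

In order to find the latest publication date, initialize $latest to some old date. The dates are compared later as integer timestamps, so the initial value should be an integer too; 0 stands for January 1 1970 00:00:00 GMT.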

for (; $i<count($vals); $i++) {

Here’s another for loop. Notice that $i is not re-initialized, so it starts at the value it had when the previous loop ended.

if ($vals[$i]['level']==4 && $vals[$i]['tag']=="PUBDATE") {

Look at each element in the array; if its level is 4 and its tag is PUBDATE, use its value. What’s level 4? An item’s pubDate comes under <rss><channel><item>. Therefore, it’s at level 4. There’s another pubDate at level 3, but it’s the publication date of the feed itself. Because the first loop skips over everything before the first item tag, we don’t really need to check the level of each element here, as long as the feed puts its channel-level elements before the items. This piece of code is there for the purpose of illustrating the data structure.

$dateTime=strtotime($vals[$i]['value']);

strtotime is used here to convert a string representing a date into a Unix timestamp, an integer number of seconds since January 1 1970 00:00:00 GMT. Once this is done, you can use ordinary comparison operators to compare dates.
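Here is a minimal sketch of that conversion, using two made-up date strings in the style RSS feeds use for pubDate. The variable names $older and $newer are just for this example.

```php
<?php
// Two made-up pubDate strings, one day apart.
$older = strtotime('Sun, 10 Dec 2006 08:00:00 GMT');
$newer = strtotime('Mon, 11 Dec 2006 08:00:00 GMT');

// strtotime() returns an integer timestamp, or false if it cannot parse the string.
var_dump(is_int($older));   // the result is a plain integer
var_dump($older < $newer);  // so the usual comparison operators work
echo $newer - $older, "\n"; // difference between the two dates, in seconds
```

Because strtotime() returns false for a string it cannot parse, a more defensive version of the function might check for that before comparing.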

if ($latest<$dateTime) {
    $latest=$dateTime;
}

A comparison is made here. Whenever a more recent date is found, it is stored in $latest.

This way, at the end of the loop, $latest should contain the latest item publication date.

return date("n/j/y",$latest);

The last thing to do is to convert the value in $latest into a string and return it as the return value of the whole function.
