Reading and creating XML, RSS and CSV files is essential knowledge when it comes to using PHP because all three formats are commonly used for a wide range of purposes.
I'm just in the middle of creating a WordPress plugin that downloads affiliate scripts from a remote server. The files can be in any of the three formats.

Previously I've used the XMLReader and XMLWriter classes, the SimplePie library and dabbled a little with SOAP, but this was my first introduction to dealing with CSV.
Below are some functions that I created for this project, with explanations. In time I'll refactor the code into class methods and make use of the appropriate built-in WordPress functions.
The first step in downloading a remote source is to check whether it actually exists. It may be the case that the user has wrongly typed the URL or maybe the server is temporarily down.
This is the function I created for this purpose:
    function remote_file_exists($url)
    {
        // IMPORTANT: check whether the file exists on the remote server.
        // Return true if the HTTP status code is 200, false for everything else.
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);
        curl_exec($ch);
        $retcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        return $retcode === 200;
    }
It uses the Client URL (cURL) library which has been a component of PHP since version 4.0.2. CURLOPT_NOBODY used in the curl_setopt function excludes the body from download as we are only after the header details, while CURLINFO_HTTP_CODE in curl_getinfo returns the HTTP status code. A 200 HTTP status code means the file is okay to download.
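One limitation of a bare status check is that a redirected file (a 301 or 302) counts as missing, and a slow server can hold the script up. Here is a minimal sketch of a variation that follows redirects and gives up after a few seconds. The function name remote_file_status and the timeout values are my own assumptions for illustration, not part of the plugin.

```php
<?php
// Hypothetical helper: returns the final HTTP status code for $url,
// or 0 if the request could not be completed at all.
function remote_file_status($url, $timeout = 5)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_NOBODY         => true,     // headers only, skip the body
        CURLOPT_FOLLOWLOCATION => true,     // follow 301/302 redirects
        CURLOPT_MAXREDIRS      => 5,        // but not forever
        CURLOPT_CONNECTTIMEOUT => $timeout, // seconds allowed to connect
        CURLOPT_TIMEOUT        => $timeout, // seconds allowed in total
        CURLOPT_RETURNTRANSFER => true,     // never echo anything
    ));
    curl_exec($ch);

    // After redirects this is the status code of the last response.
    // The handle is released when $ch goes out of scope.
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

A caller would then compare the result with 200, for example `if (remote_file_status($url) === 200) { ... }`.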
PHP & ZIP
The first step in accessing a remote ZIP file is to download it onto the server with the following code:
    function grab_file($url, $new_file)
    {
        // Download $url and save it as zip/$new_file
        $fp = fopen("zip/$new_file", "w");
        if ($fp === false) {
            return false;
        }

        $ch = curl_init();
        $options = array(
            CURLOPT_URL            => $url,
            CURLOPT_HEADER         => 0,
            CURLOPT_FAILONERROR    => 1,
            CURLOPT_AUTOREFERER    => 1,
            CURLOPT_BINARYTRANSFER => 1,
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_FOLLOWLOCATION => 1,
            CURLOPT_TIMEOUT        => 5,
            // Set after CURLOPT_RETURNTRANSFER so the data goes to the file
            CURLOPT_FILE           => $fp,
        );
        curl_setopt_array($ch, $options);
        $file = curl_exec($ch);
        curl_close($ch);
        fclose($fp);

        return $file !== false;
    }
Again, back to cURL. This component of PHP can initially look deceptively simple but is a Swiss army knife with a multitude of options. For the full list, take a look at this page.
CURLOPT_FILE is the file handle the contents will be written to, CURLOPT_TIMEOUT limits the whole transfer to five seconds, CURLOPT_URL is the resource to be downloaded, CURLOPT_RETURNTRANSFER normally returns the response as a string, and CURLOPT_BINARYTRANSFER was once needed to return raw binary data (it has had no effect since PHP 5.1.3). Because CURLOPT_FILE is set after CURLOPT_RETURNTRANSFER here, the data is written to the file and curl_exec simply returns true or false.
I had a lot of trouble setting the right options in the function above but thankfully a thread on Stack Overflow pointed me in the right direction.
Now that there is a copy of the ZIP file on your server it is time to unzip it to reveal the goodies underneath. This can be done with the PHP ZipArchive class:
    function get_zip($data)
    {
        $zip = new ZipArchive();
        $filename = $data;

        // open() returns true on success and an error code otherwise, so compare
        // strictly. No flags are needed to read an existing archive.
        if ($zip->open($filename) === true) {
            $zip->extractTo('feeds');
            $zip->close();
            return true;
        } else {
            return false;
        }
    }
The above code should be self-explanatory. However, I have been informed that some shared servers have the PHP ZIP extension switched off. I'm not sure if this is still the case in 2012, but I decided to use a third-party class called PclZip instead, which is released under a GPL and LGPL free software license.
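    require_once 'pclzip.lib.php'; // adjust the path to wherever PclZip lives

    function unzip($file)
    {
        $path = "zip/$file";
        $archive = new PclZip($path);
        $list = $archive->extract(PCLZIP_OPT_PATH, "feeds", PCLZIP_OPT_REMOVE_ALL_PATH);

        // extract() returns 0 on failure, otherwise an array describing each file
        if (empty($list)) {
            die("Error : " . $archive->errorInfo(true));
        }

        unlink($path); // remove the downloaded archive once it has been extracted
        return $list[0]['filename'];
    }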
There are a number of optional arguments that can be used when extracting the file. For a full list, read this page.
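If your server does have ZipArchive available, the following is a minimal sketch for trying it end to end without downloading anything: it builds a throwaway archive in the temp directory and extracts it again. The function name, paths and file contents are all assumptions made up for the demonstration.

```php
<?php
// Build a small archive, extract it again and return the extracted contents.
// Returns false if any step fails. All names here are illustrative only.
function zip_round_trip()
{
    $base    = sys_get_temp_dir() . '/zipdemo_' . uniqid();
    $zipPath = $base . '.zip';
    $outDir  = $base . '_out';

    $zip = new ZipArchive();
    // When creating, flags are combined with the bitwise |, not the logical ||.
    if ($zip->open($zipPath, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        return false;
    }
    $zip->addFromString('feed.csv', "id,name\n1,Widget\n");
    $zip->close();

    $zip = new ZipArchive();
    // open() returns true on success and an integer error code otherwise,
    // so a strict comparison is needed.
    if ($zip->open($zipPath) !== true) {
        return false;
    }
    mkdir($outDir);
    $ok = $zip->extractTo($outDir);
    $zip->close();

    $contents = $ok ? file_get_contents($outDir . '/feed.csv') : false;

    // Tidy up the temporary files.
    @unlink($outDir . '/feed.csv');
    @rmdir($outDir);
    @unlink($zipPath);

    return $contents;
}
```

Calling `class_exists('ZipArchive')` first is a quick way to find out whether the extension is enabled on a given host.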
PHP & GZIP
The difference between the Zip and the Zlib modules is that Zlib can read a compressed resource directly as a stream, which is why I haven't used cURL here.
    function get_gzip($url, $new_file, $ext)
    {
        $new_file = "feeds/$new_file.$ext";
        $remote = gzopen($url, "rb");
        if ($remote === false) {
            return false;
        }
        $home = fopen($new_file, "w");
        if ($home === false) {
            gzclose($remote);
            return false;
        }

        // Copy the decompressed data in 4 KB chunks to keep memory usage low
        while (!gzeof($remote)) {
            $string = gzread($remote, 4096);
            if ($string === false) {
                break;
            }
            fwrite($home, $string);
        }

        gzclose($remote);
        fclose($home);

        return $new_file;
    }
Thanks go to a user on Stack Overflow who pointed out a far more memory-efficient way of grabbing massive files by downloading them in chunks.
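To see the chunked approach working without a remote server, here is a minimal sketch that compresses some sample data locally and then decompresses it chunk by chunk. The function name gunzip_in_chunks, the chunk size and the sample data are assumptions for the sake of the example.

```php
<?php
// Decompress $source (a .gz file) into $target, reading $chunkSize bytes of
// decompressed data at a time so memory usage stays small.
function gunzip_in_chunks($source, $target, $chunkSize = 4096)
{
    $in = gzopen($source, 'rb');
    if ($in === false) {
        return false;
    }
    $out = fopen($target, 'wb');
    if ($out === false) {
        gzclose($in);
        return false;
    }

    while (!gzeof($in)) {
        $chunk = gzread($in, $chunkSize);
        if ($chunk === false) {
            break; // read error, stop copying
        }
        fwrite($out, $chunk);
    }

    gzclose($in);
    fclose($out);
    return true;
}

// Demonstration with made-up data written to the temp directory.
$data = str_repeat("line of feed data\n", 1000);
$gz   = sys_get_temp_dir() . '/sample_' . uniqid() . '.gz';
$txt  = $gz . '.txt';

file_put_contents($gz, gzencode($data));
$done   = gunzip_in_chunks($gz, $txt);
$copied = file_get_contents($txt);

unlink($gz);
unlink($txt);
```

Only one chunk is ever held in memory during the copy, no matter how large the source file is.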
RSS & PHP
There are several different RSS formats which are all XML but are radically different from each other. I would strongly suggest that you don't try to be a PHP superhero and attempt to write your own code for parsing RSS but instead use a third-party script.
The leading industry favourite for this job is SimplePie, originally created by Ryan Parman, Ryan McCue and Geoffrey Sneddon. Seriously, use SimplePie – the alternative may be premature ageing or madness.
Here is the most basic SimplePie example:
    $feed = new SimplePie();
    $feed->set_feed_url($url); // $url holds the address of the feed
    $success = $feed->init();
After creating an instance of the class, set the feed URL and then run the init method. Easy!
CSV & PHP
Comma-separated values (CSV) files are wildly popular amongst a whole range of software applications but it's not a format that finds common use amongst web developers. Unlike XML, CSV is not a mark-up language but plain text which uses commas to separate the data. PHP does have its own CSV function, fgetcsv, but it's probably best if you use the PEAR class File_CSV_DataSource. Created by Kazuyoshi Tlacaelel, it is released under the permissive MIT license.
Here is File_CSV_DataSource used in a function that returns the entire CSV file in an array:
    require_once 'File/CSV/DataSource.php'; // assumes the PEAR package is on the include path

    function parse_csv($file)
    {
        $csv = new File_CSV_DataSource;

        if ($csv->load($file)) {
            return $csv->getRawArray();
        } else {
            return false;
        }
    }
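For comparison, here is a minimal sketch of the same job done with nothing but PHP's built-in fgetcsv function. It assumes the first line of the file is a header row, and the function name parse_csv_native and the sample data are invented for the example.

```php
<?php
// Read a CSV file into an array of associative rows keyed by the header line.
// Returns false if the file cannot be opened.
function parse_csv_native($file)
{
    $handle = fopen($file, 'r');
    if ($handle === false) {
        return false;
    }

    $rows   = array();
    $header = fgetcsv($handle, 0, ',', '"', '\\');

    if (is_array($header)) {
        while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            // Skip blank lines and rows that do not match the header width.
            if (count($fields) !== count($header)) {
                continue;
            }
            $rows[] = array_combine($header, $fields);
        }
    }

    fclose($handle);
    return $rows;
}

// Demonstration with a temporary file; note the quoted field containing a comma.
$sample = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents(
    $sample,
    "sku,name,price\nA1,\"Widget, large\",9.99\nB2,Gadget,4.50\n"
);
$rows = parse_csv_native($sample);
unlink($sample);
```

Every value comes back as a string, so prices and other numbers need casting before you do any arithmetic with them.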
XML & PHP
For XML use simplexml_load_file, a SimpleXML function.
In its most basic form we can load a file into a SimpleXMLElement object like so:
    function parse_xml($file)
    {
        // Returns a SimpleXMLElement object, or false if the file cannot be parsed
        $xml = simplexml_load_file($file);
        return $xml;
    }
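What comes back from SimpleXML is an object rather than a plain array, so here is a minimal sketch of how the values might be read out. It uses simplexml_load_string on an invented product feed so that it runs on its own. The JSON round trip at the end is one common trick for getting a plain array, not the only way to do it.

```php
<?php
// A made-up feed, used only for this demonstration.
$source = <<<XML
<products>
  <product id="1"><name>Widget</name><price>9.99</price></product>
  <product id="2"><name>Gadget</name><price>4.50</price></product>
</products>
XML;

$xml = simplexml_load_string($source);
if ($xml === false) {
    exit("Could not parse the XML\n");
}

// Child elements are reached as properties and can be looped over.
$names = array();
foreach ($xml->product as $product) {
    $names[] = (string) $product->name; // cast to get the text content
}

// Attributes are reached with array syntax.
$firstId = (string) $xml->product[0]['id'];

// Encoding to JSON and decoding again yields a nested plain array.
$asArray = json_decode(json_encode($xml), true);
```

The casts to string matter: without them you are still holding SimpleXMLElement objects, which can give surprising results in comparisons.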
If you are a developer who has experience in the above areas then please feel free to leave a comment, because in PHP there are many different ways to skin a cat.