High Performance Industrial Solutions

Advanced PHP, MySQL and web security related topics brought to you by PHP Guruji.

How do I… Recursively scan directories with PHP’s DirectoryIterators?

One of PHP5’s most interesting new features is the addition of Iterators, a collection of ready-made interfaces designed to help in navigating and processing hierarchical data structures. These Iterators significantly reduce the amount of code required to process an XML document tree or a file collection. A number of Iterators are available, including the ArrayIterator, CachingIterator, LimitIterator, RecursiveIterator, SimpleXMLIterator and DirectoryIterator.

It’s this last Iterator that’s the subject of this How do I… tutorial. The DirectoryIterator provides a quick and efficient way of processing the files in a directory; with a little creative coding, it can also be used to recursively process a nested directory tree. Both these tasks can be accomplished using just a few lines of code, representing a significant improvement over the “standard” way of doing things.

Processing a single-level directory

Let’s begin with something simple: processing a single-level directory. Type (or copy) the following script (Listing A), altering the directory path to reflect your local configuration:

Listing A

<?php
// List every entry in the directory except "." and "..".
$it = new DirectoryIterator("/tmp/mystuff");
foreach ($it as $file) {
    if (!$file->isDot()) {
        echo $file . "\n";
    }
}
?>

When you run this script, you should see a list of the entries in the named directory (in a browser, view the page source or wrap the output in <pre> tags, because "\n" is not rendered as a line break in HTML). How did this happen? Well, the DirectoryIterator class provides a pre-built interface for iterating over the contents of a directory; once instantiated with the location of the target directory, it can be traversed with foreach as though it were a standard PHP array, with each element representing one entry (a file or a subdirectory) in the directory. Note the use of the isDot() method to filter out the “.” and “..” entries.
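Each element handed to the loop is more than a name: it is an object that also exposes the SplFileInfo methods. The following is a minimal sketch of that idea, not part of the original listing. It builds a throwaway directory under the system temp directory (the names di_demo_, notes.txt and drafts are made up for illustration) and labels each entry as a file or a directory.

```php
<?php
// Build a small throwaway directory so the example is self-contained.
// All directory and file names below are hypothetical.
$dir = sys_get_temp_dir() . "/di_demo_" . uniqid();
mkdir($dir);
file_put_contents($dir . "/notes.txt", "hello");
mkdir($dir . "/drafts");

$entries = array();
foreach (new DirectoryIterator($dir) as $file) {
    if ($file->isDot()) {
        continue; // skip "." and ".."
    }
    // $file offers the SplFileInfo methods, such as isDir() and getFilename().
    $type = $file->isDir() ? "dir" : "file";
    $entries[$file->getFilename()] = $type;
    echo $type . "  " . $file->getFilename() . "\n";
}

// Clean up the demo directory.
unlink($dir . "/notes.txt");
rmdir($dir . "/drafts");
rmdir($dir);
```

The order in which entries are returned depends on the filesystem, so sort the results yourself if order matters.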

Processing a nested directory tree

Recursively processing a nested directory tree is almost as simple. In this case, the DirectoryIterator needs to check each object it encounters within the first-level directory, determine whether it is a file or directory, and, if a directory, drill one level deeper to examine the next level of contents. This sounds fairly complex, and in the past could easily add up to 15-plus lines of code.
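For comparison, here is one way the manual approach might look with the classic opendir() and readdir() functions. This is only a sketch: the helper name listFilesManually() and the demo directory layout are assumptions made for illustration.

```php
<?php
// Hypothetical helper: walk $dir by hand and return every file path below it.
function listFilesManually($dir)
{
    $found = array();
    $handle = opendir($dir);
    if ($handle === false) {
        return $found; // unreadable directory: nothing to report
    }
    while (($entry = readdir($handle)) !== false) {
        if ($entry === "." || $entry === "..") {
            continue;
        }
        $path = $dir . "/" . $entry;
        if (is_dir($path)) {
            // Drill one level deeper and merge the results.
            $found = array_merge($found, listFilesManually($path));
        } else {
            $found[] = $path;
        }
    }
    closedir($handle);
    return $found;
}

// Demo on a throwaway tree (hypothetical names).
$root = sys_get_temp_dir() . "/manual_demo_" . uniqid();
mkdir($root . "/sub", 0777, true);
file_put_contents($root . "/a.txt", "a");
file_put_contents($root . "/sub/b.txt", "b");

$files = listFilesManually($root);
sort($files);
echo implode("\n", $files) . "\n";

// Clean up.
unlink($root . "/a.txt");
unlink($root . "/sub/b.txt");
rmdir($root . "/sub");
rmdir($root);
```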

With PHP5, though, all you need are two new Iterators: the RecursiveDirectoryIterator and the RecursiveIteratorIterator, which together incorporate all the above functionality. Take a look at Listing B:

Listing B

<?php
// Walk the whole tree under /tmp; the outer iterator flattens the recursion.
$it = new RecursiveDirectoryIterator("/tmp");
foreach (new RecursiveIteratorIterator($it) as $file) {
    echo $file . "\n";
}
?>

In this case, the output lists every file under the starting directory, at any depth. By default the RecursiveIteratorIterator visits only the leaves of the tree, so directories are not listed as separate entries; pass RecursiveIteratorIterator::SELF_FIRST as its second argument if you want them as well. Needless to say, this kind of built-in recursive interface is very handy for situations that require you to process all the files under a particular directory level: for example, when recursively compressing a directory tree, or altering group/owner permissions on a series of nested files.
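As a rough illustration of processing a whole tree, the sketch below counts subdirectories and adds up file sizes. It assumes PHP 5.3 or later, where the FilesystemIterator::SKIP_DOTS flag is available to leave out the “.” and “..” entries, and it uses RecursiveIteratorIterator::SELF_FIRST so that directories are visited as well as files. The demo directory and file names are invented for the example.

```php
<?php
// Throwaway tree with hypothetical names: one subdirectory, two files.
$root = sys_get_temp_dir() . "/rdi_demo_" . uniqid();
mkdir($root . "/sub", 0777, true);
file_put_contents($root . "/a.txt", "12345"); // 5 bytes
file_put_contents($root . "/sub/b.txt", "123"); // 3 bytes

// SKIP_DOTS (PHP 5.3+) omits "." and ".."; SELF_FIRST yields each
// directory before its contents instead of yielding leaves only.
$dirIt = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);
$it = new RecursiveIteratorIterator($dirIt, RecursiveIteratorIterator::SELF_FIRST);

$dirs = 0;
$bytes = 0;
foreach ($it as $file) {
    if ($file->isDir()) {
        $dirs++;
    } else {
        $bytes += $file->getSize();
    }
    echo $file->getPathname() . "\n";
}
echo "directories: $dirs, bytes in files: $bytes\n";

// Clean up.
unlink($root . "/a.txt");
unlink($root . "/sub/b.txt");
rmdir($root . "/sub");
rmdir($root);
```

For this demo tree the final line reports one directory and eight bytes.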

A real-world application: Printing a directory tree

A common application of directory recursion involves printing a graphical directory tree. With Iterators, this task is a snap, because included within the Iterator class documentation is an example class written specifically for this purpose. The DirectoryTreeIterator (credit: Marcus Boerger) provides additional enhancements to the RecursiveIteratorIterator discussed previously, most notably ASCII markers that represent depth and location within the tree structure.

You can examine the source code for this example class on the php.net Web site.

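Listing C shows how the DirectoryTreeIterator can be used. Because it is an example class rather than a built-in one, its source must be included in your script first.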

Listing C

<?php
// DirectoryTreeIterator is an example class, not a built-in: load its source first.
$it = new DirectoryTreeIterator("/tmp/cookbook/");
foreach ($it as $path) {
    echo $path . "\n";
}
?>

And here’s a brief snippet of the output you might see:
|-ch01
| |-recipe01
| | |-example01.php
| | \-example02.php
| |-recipe02
| | |-example01.php
| | \-example02.php
| |-recipe03
| | \-example01.php
...
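If the DirectoryTreeIterator source is not at hand, a rougher tree can be approximated with built-in classes alone. The sketch below is not the DirectoryTreeIterator implementation; it swaps in the getDepth() method of RecursiveIteratorIterator to decide how far to indent each entry, and it assumes PHP 5.3 or later for FilesystemIterator::SKIP_DOTS. The demo directory names are made up, and unlike the output above, this simple version does not draw a different marker for the last entry in a directory.

```php
<?php
// Throwaway tree with hypothetical names: ch01/recipe01/example01.php
$root = sys_get_temp_dir() . "/tree_demo_" . uniqid();
mkdir($root . "/ch01/recipe01", 0777, true);
file_put_contents($root . "/ch01/recipe01/example01.php", "");

$it = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
    RecursiveIteratorIterator::SELF_FIRST
);

$lines = array();
foreach ($it as $file) {
    // getDepth() is 0 for entries directly under $root.
    $lines[] = str_repeat("| ", $it->getDepth()) . "|-" . $file->getFilename();
}
echo implode("\n", $lines) . "\n";

// Clean up, deepest entries first.
unlink($root . "/ch01/recipe01/example01.php");
rmdir($root . "/ch01/recipe01");
rmdir($root . "/ch01");
rmdir($root);
```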

To better understand the value-add of these various DirectoryIterators, try coding the three applications demonstrated in this tutorial using standard file and directory functions. Once you’re done, you’ll have a new appreciation for the simplicity and ease of use the DirectoryIterators bring to PHP5. Happy coding!

Farewell, PHP 4

The end appears to be in sight for the beloved version 4 of PHP, the open-source scripting language that allows seasoned programmers and beginners alike to quickly and easily write code for the World Wide Web.

According to the terse announcement on the main PHP Web site:

The PHP development team hereby announces that support for PHP 4 will continue until the end of this year only. After 2007-12-31 there will be no more releases of PHP 4.4. We will continue to make critical security fixes available on a case-by-case basis until 2008-08-08. Please use the rest of this year to make your application suitable to run on PHP 5.

The announcement came on the third anniversary of PHP 5’s launch. Project developers say they want to devote their finite resources to the upcoming PHP 6 instead.

According to Rasmus Lerdorf, the original PHP author and now a Yahoo programmer:

“Ending PHP 4 support is driven by practical necessity. We are an open-source project with limited resources. With PHP 6 on the way, we don’t have the resources to support three different versions of PHP at the same time.”

Critics of the decision have less kind words for it. According to Matt Mullenweg, the founder of the WordPress blogging software and site, which uses PHP:

“PHP 5 has been, from an adoption point of view, a complete flop. Most estimates place it in the single-digit percentages or at best the low tens.”

“Now the PHP core team seems to have decided that the boost their failing product needs is to kill off their successful one instead of asking the hard questions: What was it that made PHP 4 so successful?…Why wasn’t PHP 5 compelling to that same audience? Are the things we’re doing in PHP 6 crucial to our core audience or simply ‘good’ language problems to solve?”

Zend refreshes PHP platform

Zend Technologies has announced the availability of version 3.6 of Zend Platform. The new version of the PHP Web application server delivers new and enhanced functionality in three major areas: PHP intelligence, performance management, and cluster management.

PHP Intelligence covers the monitoring of HTTP, Apache, and Java events, and offers better diagnostics to improve the overall reliability and stability of a PHP-based infrastructure. Downtime is reduced by recording the full context of each reported problem so that “root cause” diagnosis and resolution can happen quickly. The other two aspects are pretty much self-explanatory.

Excerpt from eWeek:

Performance has remained an issue for dynamic languages such as PHP and Ruby. Zend Platform 3.6 improves the performance of PHP applications by caching pre-optimized PHP byte code, Zend officials said. The product features support for file- or URL-based page caching, client-side caching and in-memory or disk-based data caching.

Zend also introduced Zend Studio for Eclipse, an IDE for PHP developers. It has seen more than 250,000 downloads since its launch in September 2007.

PHP 5.2.6 Released

The PHP development team would like to announce the immediate availability of PHP 5.2.6. This release focuses on improving the stability of the PHP 5.2.x branch with over 120 bug fixes, several of which are security related. All users of PHP are encouraged to upgrade to this release.

Further details about the PHP 5.2.6 release can be found in the release announcement for 5.2.6; the full list of changes is available in the ChangeLog for PHP 5.

Security Enhancements and Fixes in PHP 5.2.6:

  • Fixed possible stack buffer overflow in the FastCGI SAPI identified by Andrei Nigmatulin.
  • Fixed integer overflow in printf() identified by Maksymilian Arciemowicz.
  • Fixed security issue detailed in CVE-2008-0599 identified by Ryan Permeh.
  • Fixed a safe_mode bypass in cURL identified by Maksymilian Arciemowicz.
  • Properly address incomplete multibyte chars inside escapeshellcmd() identified by Stefan Esser.
  • Upgraded bundled PCRE to version 7.6.