Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /share/CACHEDEV1_DATA/Web/www/libraries/UBBcode/text_parser.class.php on line 228

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /share/CACHEDEV1_DATA/Web/www/libraries/UBBcode/text_parser.class.php on line 228

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /share/CACHEDEV1_DATA/Web/www/libraries/UBBcode/text_parser.class.php on line 228
Finding the highest index in an array

Comments Blog About Development Research Sites

Finding the highest index in an array

Jan 8, 2010
It is a well known fact that micro-optimalizations are generally a waste of time. As such, I rarely bother with them - much, much more performance is to be gained from improving the overal design of an application, in those rare cases these days when performance really is a problem. For to be frank, with todays modern hardware even a well used webapplication should rarely take more than a second to parse and that tiny fraction of a second you could spare is insignificant compared to simple network factors as latency and bandwidth.

However, sometimes it is simply interesting to look at possible solutions to a simple problem, and determine what the best solution might be. Thus, today I will discuss how to find the highest index in an randomly-indexed array. One obvious way is to simply loop over all indexes and store a number if it is higher than the previous one:
Code (php) (nieuw venster):
1
2
3
4
5
6
7
function useForeach (& $array) {
  $max = 0;
  foreach (
$array as $index => $value)
    if (
$index > $max)
      $max = $index;
  return
$max;
}

Fairly straightforward, yet it does require a lot of PHP operations - could there be a gain in using the generally faster internal functions to work for us? For example, by using PHP's ksort method (based on quicksort) and simply retrieving the last entry of the array?
Code (php) (nieuw venster):
1
2
3
4
5
6
function useSort (& $array) {
  ksort($array);
  end($array);
  $last = each($array);
  return
$last[0];
}

Or, perhaps even sneakier, by counting back from the maximum index you expect and checking if it exists?
Code (php) (nieuw venster):
1
2
3
4
5
6
function useWalk (& $array) {
  for (
$i = 50000; $i > 0; $i--)
    if (isset(
$array[$i]))
      return
$i;
  return
$i;
}

Notice the isset I used there. You are absolutely correct in stating that PHP has a method to determine whether an array key exists, called array_key_exists, but I found this method to be significantly slower than a simple isset.

On to our results. All figures are averages over a 1000 runs, for an array containing 10.000 entries with a random index distribution.

useForeach 0.00578s
useSort 0.01534s
useWalk (end at 10.000) 0.00000904s
useWalk (end at 20.000) 0.00417s
useWalk (end at 50.000) 0.01824s

Somewhat surprisingly, the sort method, which uses internal C code to sort the array, is very slow - using foreach is three times as fast here. The additional qsort is definately not worth skimming off a few operations in PHP for. More interesting though is our 'cheating' method: if we simply keep guessing, and start close enough to the actual value, this is (by far) the fastest method! Only once we use an initial guess twice the size of the original array does the foreach method come close to its performance, so if a decent guess is available it will probably outperform any 'neat' way of doing this!

On a sidenote: even the fastest method here would only result in a 0.015 second decrease in page loading time as compared to the slowest method - even your latency to this site is probably ten times as high - and this case was specifically designed to generate high parsing times. For example, a more realistic case with 'only' a thousand entries in the array takes 1 millisecond to parse using the slowest method and 0.6 milliseconds using foreach - so congratulations, you just spend an hour working on a 0.4 millisecond improvement!

FragFrog out!

Jan 8, 2010 Willem Stuursma

Of course the sort operation is slower - you have to sort the array, which requires quite a lot of operations.

What about a very simple [php]return max(array_keys($array));[/php]? That would move almost all operations to the internal C code, as you call it. It also requires very few operations (only one comparison per index). Could you benchmark that as a comparison?

I totally agree on not micro-optimizing.

Feb 28, 2010 Matthijs

With 'better late then never' in mind: it runs in 0.01324s - slightly faster than sort then.

Strangely enough, the foreach scenario now also runs in roughly that time - with exactly the same code. Probably some underlying caching mechanism at work here, and once again a good example of the futility of these kind of optimalizations :-)

New comment

Your name:
Comment: