Iterate over each line in a string in PHP


Php Problem Overview


I have a form that allows the user to either upload a text file or copy/paste the contents of the file into a textarea. I can easily differentiate between the two and put whichever one they entered into a string variable, but where do I go from there?

I need to iterate over each line of the string (preferably not worrying about newlines on different machines), make sure that it has exactly one token (no spaces, tabs, commas, etc.), sanitize the data, then generate an SQL query based off of all of the lines.

I'm a fairly good programmer, so I know the general idea about how to do it, but it's been so long since I worked with PHP that I feel I am searching for the wrong things and thus coming up with useless information. The key problem I'm having is that I want to read the contents of the string line-by-line. If it were a file, it would be easy.

I'm mostly looking for useful PHP functions, not an algorithm for how to do it. Any suggestions?

Php Solutions


Solution 1 - Php

preg_split the variable containing the text, and iterate over the returned array:

foreach(preg_split("/((\r?\n)|(\r\n?))/", $subject) as $line){
    // do stuff with $line
} 
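To connect this with the "exactly one token" requirement from the question, here is a minimal sketch. The function name singleTokenLines and the token rule (no whitespace, no commas) are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: returns only the lines that consist of exactly one token.
function singleTokenLines(string $subject): array
{
    $tokens = [];
    // Split on Windows (\r\n), old Mac (\r) and Unix (\n) line endings.
    foreach (preg_split('/\r\n|\r|\n/', $subject) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // ignore blank lines
        }
        // Assumed token rule: no whitespace or commas anywhere in the line.
        if (preg_match('/^[^\s,]+$/', $line) === 1) {
            $tokens[] = $line;
        }
    }
    return $tokens;
}

$input = "alpha\r\nbeta gamma\n\ndelta,epsilon\rzeta";
print_r(singleTokenLines($input)); // keeps "alpha" and "zeta"
```

Sanitizing and building the SQL query would still have to happen afterwards; this only covers the splitting and validation step.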

Solution 2 - Php

I would like to propose a significantly faster (and memory efficient) alternative: strtok rather than preg_split.

$separator = "\r\n";
$line = strtok($subject, $separator);

while ($line !== false) {
    # do something with $line
    $line = strtok( $separator );
}

Testing the performance, I iterated 100 times over a test file with 17 thousand lines: preg_split took 27.7 seconds, whereas strtok took 1.4 seconds.

Note that although $separator is defined as "\r\n", strtok treats each character in it as a delimiter, so it splits on either one. As of PHP 4.1.0 it also skips empty tokens, which means blank lines are silently dropped.

See the strtok manual entry: http://php.net/strtok
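A small comparison may make the blank-line behaviour concrete. This is only a sketch with a made-up input string; the variable names are illustrative.

```php
<?php
$subject = "first\r\n\r\nsecond\n\n\nthird";

// strtok: every character in the separator string is a delimiter,
// and empty tokens between delimiters are skipped.
$viaStrtok = [];
$line = strtok($subject, "\r\n");
while ($line !== false) {
    $viaStrtok[] = $line;
    $line = strtok("\r\n");
}

// preg_split keeps the blank lines as empty strings.
$viaPregSplit = preg_split('/\r\n|\r|\n/', $subject);

echo count($viaStrtok), "\n";    // 3
echo count($viaPregSplit), "\n"; // 6
```

If blank lines matter to you (for example, to report line numbers in error messages), strtok is probably not the right tool.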

Solution 3 - Php

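If you need to handle newlines in different systems, you can use the predefined PHP constant PHP_EOL (http://php.net/manual/en/reserved.constants.php) with explode to avoid the overhead of the regular expression engine. Note that PHP_EOL is the line ending of the platform PHP is running on, so this splits correctly only when the input uses that same line ending.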

$lines = explode(PHP_EOL, $subject);
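When the input may come from a different platform than the server, one option that still avoids regular expressions is to normalize the line endings first and then explode on "\n". A minimal sketch, with splitLines as a hypothetical helper name:

```php
<?php
// Hypothetical helper: split a string into lines regardless of line-ending style.
function splitLines(string $subject): array
{
    // Replace "\r\n" first so it does not turn into two line breaks,
    // then convert any remaining lone "\r".
    $normalized = str_replace(["\r\n", "\r"], "\n", $subject);
    return explode("\n", $normalized);
}

print_r(splitLines("a\r\nb\rc\nd"));
```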

Solution 4 - Php

It's overly-complicated and ugly but in my opinion this is the way to go:

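$fp = fopen("php://memory", 'r+');
fputs($fp, $data);
rewind($fp);
// compare against false explicitly, otherwise a final line of "0" ends the loop early
while(($line = fgets($fp)) !== false){
  // deal with $line (it still includes its trailing newline)
}
fclose($fp);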
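The same idea can be wrapped in a generator so the caller just uses foreach. This is a sketch under a few assumptions: linesFromString is a made-up name, and because fgets breaks on "\n", input that uses a lone "\r" as its line ending will not be split.

```php
<?php
// Hypothetical helper: yields the lines of a string one at a time
// by reading it back through an in-memory stream.
function linesFromString(string $data): Generator
{
    $fp = fopen('php://memory', 'r+');
    fwrite($fp, $data);
    rewind($fp);
    try {
        while (($line = fgets($fp)) !== false) {
            // fgets keeps the line ending, so strip "\n" and "\r\n" here.
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($fp); // runs even if the caller stops iterating early
    }
}

foreach (linesFromString("a\r\nb\n0") as $line) {
    echo $line, "\n";
}
```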

Solution 5 - Php

Potential memory issues with strtok:

The strtok answer above claims to be memory efficient, but it does not mention a potential memory issue. According to the manual:

> Note that only the first call to strtok uses the string argument.
> Every subsequent call to strtok only needs the token to use, as it
> keeps track of where it is in the current string.

It does this by keeping the string in memory between calls. If you're working with large files, you need to release it once you're done looping through the contents.

<?php
function process($str) {
    $line = strtok($str, PHP_EOL);

    while ($line !== FALSE) {
        /*do something with $line here...*/

        // get the next line; the loop ends when strtok returns FALSE
        $line = strtok(PHP_EOL);
    }
    //the bit that frees up memory
    strtok('', '');
}

If you're only concerned with physical files (e.g. data mining):

According to the manual, for the file upload part you can use the file function, which reads a whole file into an array of lines:

 //Create the array
 $lines = file( $some_file );

 foreach ( $lines as $line ) {
   //do something here.
 }
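By default each element returned by file still ends with its newline. The flags below are documented options of file; the temporary demo file is only there to make the sketch self-contained, and in an upload handler the path would be the uploaded file's temporary path instead.

```php
<?php
// Demo file created just for this example.
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "alpha\n\nbeta\ngamma\n");

// FILE_IGNORE_NEW_LINES strips the line ending from each element;
// FILE_SKIP_EMPTY_LINES then drops the lines that are left empty.
$lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

print_r($lines);
```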

Solution 6 - Php

foreach(preg_split('~[\r\n]+~', $text) as $line){
    if($line === '' or ctype_space($line)) continue; // skip empty and whitespace-only lines (empty() would also skip "0")
    // if(!strlen($line = trim($line))) continue; // or trim by force and skip empty
    // use $line here (it is trimmed only if you use the second check instead)
}

^ this is how you break lines properly, cross-platform compatible with Regexp :)

Solution 7 - Php

Kyril's answer is best considering you need to be able to handle newlines on different machines.

> "I'm mostly looking for useful PHP functions, not an algorithm for how
> to do it. Any suggestions?"

I use these a lot:

  • [explode()][1] can be used to split a string into an array, given a single delimiter.
  • implode() is explode's counterpart, to go from array back to string.

[1]: http://php.net/explode "explode()"

Solution 8 - Php

Similar to @pguardiario's answer, but using a more "modern" (OOP) interface:

$fileObject = new \SplFileObject('php://memory', 'r+');
$fileObject->fwrite($content);
$fileObject->rewind();

while ($fileObject->valid()) {
    $line = $fileObject->current(); // includes the trailing newline
    // deal with $line
    $fileObject->next();
}
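A variation on the same approach, sketched with SplTempFileObject (a built-in subclass intended for temporary in-memory data) and the SplFileObject flags that strip newlines and skip empty lines. The function name iterateLines is hypothetical.

```php
<?php
// Hypothetical helper: collects the non-empty lines of a string
// using the SPL file iterator.
function iterateLines(string $content): array
{
    $file = new SplTempFileObject();
    $file->fwrite($content);

    // DROP_NEW_LINE removes the line ending from each line;
    // SKIP_EMPTY skips empty lines and needs READ_AHEAD to work as expected.
    $file->setFlags(
        SplFileObject::DROP_NEW_LINE
        | SplFileObject::READ_AHEAD
        | SplFileObject::SKIP_EMPTY
    );

    $lines = [];
    foreach ($file as $line) { // foreach rewinds the file before iterating
        $lines[] = $line;
    }
    return $lines;
}

print_r(iterateLines("a\n\nb\nc\n"));
```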

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
--- | --- | ---
Question | Topher Fangio | View Question on Stackoverflow
Solution 1 - Php | Kyril | View Answer on Stackoverflow
Solution 2 - Php | Erwin Wessels | View Answer on Stackoverflow
Solution 3 - Php | FerCa | View Answer on Stackoverflow
Solution 4 - Php | pguardiario | View Answer on Stackoverflow
Solution 5 - Php | AbsoluteƵERØ | View Answer on Stackoverflow
Solution 6 - Php | CodeAngry | View Answer on Stackoverflow
Solution 7 - Php | Joe Kiley | View Answer on Stackoverflow
Solution 8 - Php | Fabien Sa | View Answer on Stackoverflow