Remove Top Line of Text File with PowerShell

Powershell Problem Overview

I am trying to just remove the first line of about 5000 text files before importing them.

I am still very new to PowerShell so not sure what to search for or how to approach this. My current concept using pseudo-code:

set-content file (get-content unless line contains amount)

However, I can't seem to figure out how to do something like contains.

Powershell Solutions

Solution 1 - Powershell

While I really admire the answer from @hoge both for a very concise technique and a wrapper function to generalize it and I encourage upvotes for it, I am compelled to comment on the other two answers that use temp files (it gnaws at me like fingernails on a chalkboard!).

Assuming the file is not huge, you can force the pipeline to operate in discrete sections--thereby obviating the need for a temp file--with judicious use of parentheses:

(Get-Content $file | Select-Object -Skip 1) | Set-Content $file

... or in short form:

(gc $file | select -Skip 1) | sc $file

Solution 2 - Powershell

It is not the most efficient in the world, but this should work:

get-content $file |
    select -Skip 1 |
    set-content "$file-temp"
move "$file-temp" $file -Force

Solution 3 - Powershell

Using variable notation, you can do it without a temporary file:

${C:\file.txt} = ${C:\file.txt} | select -skip 1

function Remove-Topline ( [string[]]$path, [int]$skip=1 ) {
  if ( -not (Test-Path $path -PathType Leaf) ) {
    throw "invalid filename"
  }

  ls $path |
    % { iex "`${$($_.fullname)} = `${$($_.fullname)} | select -skip $skip" }
}

Solution 4 - Powershell

I just had to do the same task, and gc | select ... | sc took over 4 GB of RAM on my machine while reading a 1.6 GB file. It didn't finish for at least 20 minutes after reading the whole file in (as reported by Read Bytes in Process Explorer), at which point I had to kill it.

My solution was to use a more .NET approach: StreamReader + StreamWriter. See this answer for a great answer discussing the perf: https://stackoverflow.com/questions/2159763/in-powershell-whats-the-most-efficient-way-to-split-a-large-text-file-by-recor

Below is my solution. Yes, it uses a temporary file, but in my case, it didn't matter (it was a freaking huge SQL table creation and insert statements file):

PS> (measure-command{
    $i = 0
    $ins = New-Object System.IO.StreamReader "in/file/pa.th"
    $outs = New-Object System.IO.StreamWriter "out/file/pa.th"
    while( !$ins.EndOfStream ) {
        $line = $ins.ReadLine();
        if( $i -ne 0 ) {
            $outs.WriteLine($line);
        }
        $i = $i+1;
    }
    $outs.Close();
    $ins.Close();
}).TotalSeconds

It returned:

188.1224443

Solution 5 - Powershell

Inspired by AASoft's answer, I went out to improve it a bit more:

Avoid the loop variable $i and the comparison with 0 in every loop
Wrap the execution into a try..finally block to always close the files in use
Make the solution work for an arbitrary number of lines to remove from the beginning of the file
Use a variable $p to reference the current directory

These changes lead to the following code:

$p = (Get-Location).Path

(Measure-Command {
    # Number of lines to skip
    $skip = 1
    $ins = New-Object System.IO.StreamReader ($p + "\test.log")
    $outs = New-Object System.IO.StreamWriter ($p + "\test-1.log")
    try {
        # Skip the first N lines, but allow for fewer than N, as well
        for( $s = 1; $s -le $skip -and !$ins.EndOfStream; $s++ ) {
            $ins.ReadLine()
        }
        while( !$ins.EndOfStream ) {
            $outs.WriteLine( $ins.ReadLine() )
        }
    }
    finally {
        $outs.Close()
        $ins.Close()
    }
}).TotalSeconds

The first change brought the processing time for my 60 MB file down from 5.3s to 4s. The rest of the changes is more cosmetic.

Solution 6 - Powershell

$x = get-content $file
$x[1..$x.count] | set-content $file

Just that much. Long boring explanation follows. Get-content returns an array. We can "index into" array variables, as demonstrated in this and other Scripting Guys posts.

For example, if we define an array variable like this,

$array = @("first item","second item","third item")

so $array returns

first item
second item
third item

then we can "index into" that array to retrieve only its 1st element

$array[0]

or only its 2nd

$array[1]

or a range of index values from the 2nd through the last.

$array[1..$array.count]

Solution 7 - Powershell

I just learned from a website:

Get-ChildItem *.txt | ForEach-Object { (get-Content $_) | Where-Object {(1) -notcontains $_.ReadCount } | Set-Content -path $_ }

Or you can use the aliases to make it short, like:

gci *.txt | % { (gc $_) | ? { (1) -notcontains $_.ReadCount } | sc -path $_ }

Solution 8 - Powershell

Another approach to remove the first line from file, using multiple assignment technique. Refer Link

 $firstLine, $restOfDocument = Get-Content -Path $filename 
 $modifiedContent = $restOfDocument 
 $modifiedContent | Out-String | Set-Content $filename

Solution 9 - Powershell

skip` didn't work, so my workaround is

$LinesCount = $(get-content $file).Count
get-content $file |
    select -Last $($LinesCount-1) | 
    set-content "$file-temp"
move "$file-temp" $file -Force

Solution 10 - Powershell

For smaller files you could use this:

& C:\windows\system32\more +1 oldfile.csv > newfile.csv | out-null

... but it's not very effective at processing my example file of 16MB. It doesn't seem to terminate and release the lock on newfile.csv.

Content Type	Original Author	Original Content on Stackoverflow
Question	Buddy Lindsey	View Question on Stackoverflow
Solution 1 - Powershell	Michael Sorens	View Answer on Stackoverflow
Solution 2 - Powershell	Richard Berg	View Answer on Stackoverflow
Solution 3 - Powershell	hoge	View Answer on Stackoverflow
Solution 4 - Powershell	AASoft	View Answer on Stackoverflow
Solution 5 - Powershell	Oliver	View Answer on Stackoverflow
Solution 6 - Powershell	noam	View Answer on Stackoverflow
Solution 7 - Powershell	Luke Du	View Answer on Stackoverflow
Solution 8 - Powershell	Venkataraman R	View Answer on Stackoverflow
Solution 9 - Powershell	Mariusz Biesiekierski	View Answer on Stackoverflow
Solution 10 - Powershell	danielmbarlow	View Answer on Stackoverflow