Split on different newlines

Ruby, Regex, Split, Newline

Ruby Problem Overview


Right now I'm doing a split on a string and assuming that the newline from the user is \r\n like so:

string.split(/\r\n/)

What I'd like to do is split on either \r\n or just \n.

So what would the regex be to split on either of those?

Ruby Solutions


Solution 1 - Ruby

Did you try /\r?\n/ ? The ? makes the \r optional.

Example usage: http://rubular.com/r/1ZuihD0YfF
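
A minimal sketch of that pattern in use. The helper name split_lines and the sample input are made up for illustration:

```ruby
# Hypothetical helper: split text on either "\r\n" or a bare "\n".
def split_lines(text)
  text.split(/\r?\n/)
end

sample = "first\r\nsecond\nthird"
p split_lines(sample)  # => ["first", "second", "third"]
```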

Solution 2 - Ruby

Ruby has the methods String#each_line and String#lines.

each_line returns an enumerator when called without a block: http://www.ruby-doc.org/core-1.9.3/String.html#method-i-each_line

lines returns an array: http://www.ruby-doc.org/core-2.1.2/String.html#method-i-lines

I didn't test it against your scenario but I bet it will work better than manually choosing the newline chars.
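
One thing worth checking before relying on these: both methods keep the line terminator attached to each line, so a "\r\n" ending survives unless it is stripped afterwards. A small sketch, with hypothetical helper names, that shows what comes back:

```ruby
# String#lines returns an Array; each element keeps its terminator.
def raw_lines(text)
  text.lines
end

# String#each_line without a block returns an Enumerator.
def line_enumerator(text)
  text.each_line
end

p raw_lines("a\r\nb\nc")            # => ["a\r\n", "b\n", "c"]
p line_enumerator("a\r\nb").class   # => Enumerator
```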

Solution 3 - Ruby

# Split on \r\n or just \n
string.split( /\r?\n/ )

Although it doesn't help with this question (where you do need a regex), note that String#split does not require a regex argument. Your original code could also have been string.split( "\r\n" ).
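
To illustrate the difference, here is a short sketch (helper names are hypothetical) comparing a string separator with the regex on input that mixes both line endings. The string form only matches the literal "\r\n" sequence:

```ruby
# Splits only where the literal two-character sequence "\r\n" occurs.
def split_crlf_only(text)
  text.split("\r\n")
end

# Splits on "\r\n" or on a bare "\n".
def split_either(text)
  text.split(/\r?\n/)
end

mixed = "a\r\nb\nc"
p split_crlf_only(mixed)  # => ["a", "b\nc"]
p split_either(mixed)     # => ["a", "b", "c"]
```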

Solution 4 - Ruby

\n is the Unix format
\r is the classic Mac OS format (before OS X)
\r\n is the Windows format

To be safe across operating systems, I would use /\r?\n|\r\n?/

"1\r2\n3\r\n4\n\n5\r\r6\r\n\r\n7".split(/\r?\n|\r\n?/)
=> ["1", "2", "3", "4", "", "5", "", "6", "", "7"]
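
An alternative sketch, not part of the answer above: normalize every line ending to "\n" first, then split on a plain string. The helper name normalize_newlines is hypothetical.

```ruby
# Hypothetical helper: convert "\r\n" and lone "\r" to "\n".
# Existing "\n" characters are left untouched.
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

input = "1\r2\n3\r\n4\n\n5"
p normalize_newlines(input).split("\n")  # => ["1", "2", "3", "4", "", "5"]
```

Normalizing first can be convenient when the same text is processed in several places, since later code only has to deal with "\n".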

Solution 5 - Ruby

The alternation operator in Ruby Regexp is the same as in standard regular expressions: |

So, the obvious solution would be

/\r\n|\n/

which is the same as

/\r?\n/

i.e. an optional \r followed by a mandatory \n.
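
A quick way to convince yourself that the two patterns split identically is to compare their results on the same input. This is only a sketch; the constant and method names are invented for the example:

```ruby
ALTERNATION = /\r\n|\n/   # "\r\n" or "\n", written with |
OPTIONAL_CR = /\r?\n/     # optional "\r" followed by "\n"

# True when both patterns produce the same pieces for the given text.
def same_split?(text)
  text.split(ALTERNATION) == text.split(OPTIONAL_CR)
end

p same_split?("a\r\nb\nc\r\n\r\nd")  # => true
```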

Solution 6 - Ruby

Are you reading from a file, or from standard in?

If you're reading from a file on Windows, and the file is opened in text mode rather than binary mode, or you're reading from standard input, you won't have to deal with \r\n: it will already look like \n.

C:\Documents and Settings\username>irb
irb(main):001:0> gets
foo
=> "foo\n"

Solution 7 - Ruby

Perhaps do a split on only '\n' and remove the '\r' if it exists?
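
That idea could look something like the following sketch (the method name is hypothetical). String#chomp with an argument removes that suffix only if it is present:

```ruby
# Split on "\n", then drop a trailing "\r" from each piece if there is one.
def split_then_strip_cr(text)
  text.split("\n").map { |line| line.chomp("\r") }
end

p split_then_strip_cr("a\r\nb\nc")  # => ["a", "b", "c"]
```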

Solution 8 - Ruby

Another option is to use String#chomp, which also handles newlines intelligently by itself.

You can accomplish what you are after with something like:

lines = string.lines.map(&:chomp)

Or if you are dealing with something large enough that memory use is a concern:

# source is either a String or an IO object
source.each_line do |line|
  line.chomp!
  #  do work..
end
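
Note that the chomp approach and the regex split do not always return the same array. A small sketch with hypothetical helper names: String#split without a limit drops trailing empty strings, while lines plus chomp keeps a trailing blank line.

```ruby
def chomp_lines(text)
  text.lines.map(&:chomp)
end

def regex_lines(text)
  text.split(/\r?\n/)
end

text = "a\r\nb\n\n"
p chomp_lines(text)  # => ["a", "b", ""]
p regex_lines(text)  # => ["a", "b"]
```

Which behavior is preferable depends on whether trailing blank lines matter for your input.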

Performance isn't always the most important thing when solving this kind of problem, but it is worth noting the chomp solution is also a bit faster than using a regex.

On my machine (i7, ruby 2.1.9):

Warming up --------------------------------------
           map/chomp    14.715k i/100ms
  split custom regex    12.383k i/100ms
Calculating -------------------------------------
           map/chomp    158.590k (± 4.4%) i/s -    794.610k in   5.020908s
  split custom regex    128.722k (± 5.1%) i/s -    643.916k in   5.016150s
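
The benchmark code itself is not shown above. If you want a rough comparison on your own machine, the following is a minimal sketch using only the monotonic clock from the standard library; the method names, sample text, and iteration count are all assumptions, and the timings it prints will differ from the figures quoted.

```ruby
# Run the block and return the elapsed wall-clock time in seconds.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Time both approaches over the same text for the given number of iterations.
def compare_timings(text, iterations)
  {
    map_chomp:   elapsed { iterations.times { text.lines.map(&:chomp) } },
    split_regex: elapsed { iterations.times { text.split(/\r?\n/) } }
  }
end

sample = "line one\r\nline two\nline three\r\n" * 10
compare_timings(sample, 10_000).each do |label, seconds|
  puts format("%-12s %.4fs", label, seconds)
end
```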

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Shpigford | View Question on Stackoverflow
Solution 1 - Ruby | NickAldwin | View Answer on Stackoverflow
Solution 2 - Ruby | 23inhouse | View Answer on Stackoverflow
Solution 3 - Ruby | Phrogz | View Answer on Stackoverflow
Solution 4 - Ruby | Clark | View Answer on Stackoverflow
Solution 5 - Ruby | Jörg W Mittag | View Answer on Stackoverflow
Solution 6 - Ruby | Andrew Grimm | View Answer on Stackoverflow
Solution 7 - Ruby | SjoerdRavn | View Answer on Stackoverflow
Solution 8 - Ruby | Matt Sanders | View Answer on Stackoverflow