Regular Expression to match cross platform newline characters

Python, Regex, Cross Platform, Eol

Python Problem Overview


My program can accept data that uses \n, \r\n or \r as its newline characters (Unix, PC or Mac styles, respectively).

What is the best way to construct a regular expression that will match a newline whichever of these styles is used?

Alternatively, I could use universal_newline support on input, but now I'm interested to see what the regex would be.

Python Solutions


Solution 1 - Python

The regex I use when I want to be precise is "\r\n?|\n".

When I'm not concerned about consistency or empty lines, I use "[\r\n]+", which matches a whole run of newline characters at once and so swallows any empty lines. I imagine it makes my programs somewhere on the order of 0.2% faster.
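To make the difference between the two patterns concrete, here is a minimal sketch using Python's `re` module. The sample string and the variable names (`text`, `precise`, `greedy`) are made up for illustration; only the two patterns come from the answer.

```python
import re

# Hypothetical sample mixing Unix (\n), PC (\r\n) and old Mac (\r) endings,
# with one empty line between "b" and "c".
text = "a\nb\r\n\r\nc\rd"

precise = re.compile(r"\r\n?|\n")  # one match per line ending
greedy = re.compile(r"[\r\n]+")    # one match per run of line endings

print(precise.split(text))  # ['a', 'b', '', 'c', 'd']
print(greedy.split(text))   # ['a', 'b', 'c', 'd']
```

The precise pattern keeps the empty line as an empty string, while the character-class version collapses it, which is why it only suits cases where empty lines don't matter.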

Solution 2 - Python

The pattern can be simplified to \r?\n for a little performance gain, as you probably don't have to deal with the old Mac style (OS 9 has been unsupported since February 2002). Note that this version no longer matches a lone \r.
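As a rough illustration of that trade-off, the sketch below splits the same made-up string with the simplified pattern and with the three-style pattern. The names `text`, `simplified` and `precise` are assumptions for the example, not part of the original answers.

```python
import re

# Hypothetical sample: Unix (\n), PC (\r\n) and old Mac (\r) line endings.
text = "a\nb\r\nc\rd"

simplified = re.compile(r"\r?\n")  # handles \n and \r\n only
precise = re.compile(r"\r\n?|\n")  # handles \n, \r\n and a lone \r

print(simplified.split(text))  # ['a', 'b', 'c\rd']
print(precise.split(text))     # ['a', 'b', 'c', 'd']
```

With the simplified pattern the lone \r stays inside the third piece, so it is only safe when old Mac style input can be ruled out.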

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Alan | View Question on Stackoverflow
Solution 1 - Python | too much php | View Answer on Stackoverflow
Solution 2 - Python | Diego V | View Answer on Stackoverflow