40 – OS-agnostic line splitting

40 – OS-agnostic line splitting#

Do you want to split a piece of text into its lines?

You’d think text.split("\n") would do the trick, right..?

It kind of does, but not very well. If you’re working with the contents of Windows files, that might not work perfectly because of carriage return characters.

The most robust way to split a string into its lines is with the method str.splitlines! If you want to preserve the line-ending characters, use .splitlines(keepends=True).

For example, assume the file some_file.txt lives on a Windows machine:

windows_string = WindowsPath("some_file.txt").read_text()

The lines of this file might be terminated with "\r\n" instead of just "\n". If that’s the case, using .split("\n") will leave the invisible carriage returns:

print(windows_string.split("\n"))
# ['This is a\r',
#  'multiline string\r',
#  'from a Windows machine.']

Using .splitlines fixes that:

print(windows_string.splitlines())
# ['This is a',
#  'multiline string',
#  'from a Windows machine.']

If you set keepends=True, the line-ending characters are left on the lines:

print(windows_string.splitlines(keepends=True))
# ['This is a\r\n',
#   'multiline string\r\n',
#   'from a Windows machine.']