DEV Community

Python64

String split in Python

In this article we look at two simple ways to parse text in Python. Suppose we are given a string like

>>> line = 'aaa bbb ccc'

and we want to split it into substrings, creating new strings from the original one.

Slice string

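The first method extracts the string piece by piece. Work out the offsets of the part you want, then extract it with slice notation, [start:end]. The start offset is included, the end offset is not, and leaving out the end means "up to the end of the string". Example: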

>>> line = 'aaa bbb ccc'
>>> col1 = line[0:3]
>>> col3 = line[8:]
>>> col1
'aaa'
>>> col3
'ccc'
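
If a line has several fixed-width columns, the same slicing idea can be applied to each one in turn. Here is a small sketch, assuming every column is three characters wide and separated by a single space; the `columns` list of offsets is made up for this example and would have to match your own data.

```python
line = 'aaa bbb ccc'

# Hypothetical layout: (start, end) offsets of each column in the line.
columns = [(0, 3), (4, 7), (8, 11)]

# Slice out every column using its offsets.
fields = [line[start:end] for start, end in columns]
print(fields)  # ['aaa', 'bbb', 'ccc']
```

The offsets have to be known in advance, and they break as soon as a field changes length.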

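However, this is impractical for long strings or for fields whose length varies, because every offset must be worked out by hand. That is why many developers use the .split() method instead.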

Split function

The split() method turns a string into a list of strings. Called without arguments, it splits on whitespace, so every word in a sentence becomes a list item.

>>> line = 'aaa bbb ccc'
>>> a = line.split()
>>> a
['aaa', 'bbb', 'ccc']
>>> a[0]
'aaa'
>>> a[1]
'bbb'
>>> a[2]
'ccc'
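
It is worth knowing how split() behaves when the words are separated by more than one space. The sketch below compares calling it with no argument, with an explicit single space, and with the optional `maxsplit` parameter; the sample string is made up for illustration.

```python
line = 'aaa  bbb   ccc'  # note the repeated spaces

# No argument: any run of whitespace counts as one separator.
by_whitespace = line.split()
print(by_whitespace)  # ['aaa', 'bbb', 'ccc']

# Explicit ' ': every single space is a separator, so empty strings appear.
by_space = line.split(' ')
print(by_space)  # ['aaa', '', 'bbb', '', '', 'ccc']

# maxsplit=1: split only once, keep the rest of the string together.
first_and_rest = line.split(maxsplit=1)
print(first_and_rest)  # ['aaa', 'bbb   ccc']
```

Note that an empty separator, as in line.split(''), is not allowed and raises ValueError.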

You can also split on a specific character by passing it to split() as the separator. This can be a comma, a dash, a semicolon, or even a period (to break text into sentences).

>>> line = 'aaa,bbb,ccc'
>>> a = line.split(',')
>>> a
['aaa', 'bbb', 'ccc']
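
When the separator is followed by a space, as in 'aaa, bbb, ccc', splitting on ',' alone leaves that space at the start of the later items. One way to handle this, shown here as a minimal sketch, is to strip each item after splitting; the variable names are only for illustration.

```python
line = 'aaa, bbb, ccc'

# Splitting on ',' keeps the space that follows each comma.
raw = line.split(',')
print(raw)  # ['aaa', ' bbb', ' ccc']

# Strip surrounding whitespace from every item.
clean = [part.strip() for part in raw]
print(clean)  # ['aaa', 'bbb', 'ccc']
```

If the separator is always exactly a comma plus one space, line.split(', ') gives the same result in a single step.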

Top comments (1)

Kamonwan Achjanis

Just don't use split to break text into words. It will not work well with punctuation or Asian languages like Chinese or Japanese. In JavaScript there is a special object for this use case:
dev.to/kamonwan/the-right-way-to-b...

Perhaps, in Python there is something similar?