On this page: .split(), .join(), and list().

Splitting a Sentence into Words: .split()

Below, mary is a single string. Even though it is a sentence, the words are not represented as discrete units. For that, you need a different data type: a list of strings where each string corresponds to a word. .split() is the method to use:

    >>> mary = 'Mary had a little lamb'
    >>> mary.split()
    ['Mary', 'had', 'a', 'little', 'lamb']

.split() splits mary on whitespace, and the returned result is a list of the words in mary. This list contains 5 items, as the len() function demonstrates. len() on mary, by contrast, returns the number of characters in the string (including the spaces).

    >>> mwords = mary.split()
    >>> mwords
    ['Mary', 'had', 'a', 'little', 'lamb']
    >>> len(mwords)     # number of items in mwords
    5
    >>> len(mary)       # number of characters
    22

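The contrast between the two len() calls can also be checked in a short script. This is a minimal sketch; the helper name count_words is hypothetical and only used for illustration here.

```python
def count_words(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())


mary = 'Mary had a little lamb'
n_words = count_words(mary)   # length of the list returned by .split()
n_chars = len(mary)           # length of the string itself, spaces included
print(n_words, n_chars)
```
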
Whitespace characters include the space ' ', the newline character '\n', and the tab '\t', among others. .split() separates on any combined sequence of those characters:

    >>> chom = ' colorless green \n\tideas\n'     # ' ', '\n', '\t' bunched up
    >>> print(chom)
     colorless green
            ideas

    >>> chom.split()
    ['colorless', 'green', 'ideas']

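Two consequences of splitting on runs of whitespace are worth seeing directly: leading and trailing whitespace never produce empty items, and a string made only of whitespace splits into an empty list. The strings below are made up for this sketch.

```python
padded = '  \t colorless\n\ngreen \t ideas \n'
words = padded.split()      # whitespace at either end is ignored
blank = ' \n\t '.split()    # nothing but whitespace: no words at all

print(words)
print(blank)
```
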
Splitting on a Specific Substring

By providing an optional parameter, .split('x') can be used to split a string on a specific substring 'x'. Without 'x' specified, .split() simply splits on all whitespace, as seen above.

    >>> mary = 'Mary had a little lamb'
    >>> mary.split('a')       # splits on 'a'
    ['M', 'ry h', 'd ', ' little l', 'mb']
    >>> hi = 'Hello mother,\nHello father.'
    >>> print(hi)
    Hello mother,
    Hello father.
    >>> hi.split()            # no parameter given: splits on whitespace
    ['Hello', 'mother,', 'Hello', 'father.']
    >>> hi.split('\n')        # splits on '\n' only
    ['Hello mother,', 'Hello father.']

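One difference between the two forms is easy to miss: with an explicit separator, repeated separators are not merged, so empty strings can appear in the result. The sketch below uses made-up strings to show this, along with a separator longer than one character.

```python
spaced = 'a  b'                       # two spaces between 'a' and 'b'
default_parts = spaced.split()        # runs of whitespace count as one break
explicit_parts = spaced.split(' ')    # every single ' ' is a break

row = 'lamb::sheep::ewe'
fields = row.split('::')              # the separator may be several characters

print(default_parts)
print(explicit_parts)
print(fields)
```
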
String into a List of Characters: list()

But what if you want to split a string into a list of characters? In Python, characters are simply strings of length 1. The list() function turns a string into a list of its individual characters, spaces included:

    >>> list('hello world')
    ['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']

More generally, list() is a built-in function that turns an iterable Python object into a list. When a string is given, what's returned is a list of the characters in it. When other iterable types are given, the specifics vary but the returned type is always a list. See this tutorial for details.

Joining a List of Strings: .join()

If you have a list of words, how do you put them back together into a single string? .join() is the method to use. Called on a "separator" string 'x', 'x'.join(y) joins every element in the list y, separated by 'x'. Below, the words in mwords are joined back into the sentence string with a space in between:

    >>> mwords
    ['Mary', 'had', 'a', 'little', 'lamb']
    >>> ' '.join(mwords)
    'Mary had a little lamb'

Joining can be done on any separator string. Below, '--' and the tab character '\t' are used.

    >>> '--'.join(mwords)
    'Mary--had--a--little--lamb'
    >>> '\t'.join(mwords)
    'Mary\thad\ta\tlittle\tlamb'
    >>> print('\t'.join(mwords))
    Mary    had     a       little  lamb

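.join() expects every element to be a string. A list of numbers, for instance, raises a TypeError unless each item is converted with str() first. The list counts below is an assumed example, not part of the original notes.

```python
counts = [3, 1, 4]

error_message = None
try:
    ', '.join(counts)              # fails: the items are ints, not strings
except TypeError as err:
    error_message = str(err)

joined = ', '.join(str(n) for n in counts)   # convert each item first

print(error_message)
print(joined)
```
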
The method can also be called on the empty string '' as the separator. The effect is that the elements of the list are joined together with nothing in between. Below, a list of characters is put back together into the original string:

    >>> hi = 'hello world'
    >>> hichars = list(hi)
    >>> hichars
    ['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
    >>> ''.join(hichars)
    'hello world'

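Combining the two methods gives a common idiom: splitting on whitespace and joining with a single space tidies up irregular spacing. The sketch below reuses the chom and hi strings from this page; the variable names normalized and roundtrip are introduced only for illustration.

```python
chom = ' colorless green \n\tideas\n'
normalized = ' '.join(chom.split())   # split into words, rejoin with one space

hi = 'hello world'
roundtrip = ''.join(list(hi))         # characters joined with nothing between

print(normalized)
print(roundtrip == hi)
```
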