Strings in Python

***DRAFT*** This is a work in progress!

String Methods

capitalize()

The capitalize() method returns a copy of the string with only its first character capitalized. Example:

s = "this is a sentence"

print s.capitalize()

center()

The center() method returns a string centered in a string of length width.

Usage: center(width[,fillchar])

Padding is done using the specified fillchar (default is a space).

The following example centers "this is a title" within a field of 30 characters, using the "=" character as fill:

s = "this is a title"

print s.center(30,'=')

count()

The count() method returns the number of occurrences of a substring in a string.

Usage: count(sub[, start[, end]])

The following example returns the value 3:

s = "this is a string with 'is' in it"

print s.count("is")

decode() and encode()

The encode() and decode() methods are used to transform string data to and from a specific character encoding.

decode([encoding[,errors]])

Decodes the string using the codec registered for encoding. encoding defaults to the default string encoding. errors may be given to set a different error handling scheme. The default is 'strict', meaning that encoding errors raise UnicodeError. Other possible values are 'ignore', 'replace' and any other name registered via codecs.register_error. New in version 2.2. Changed in version 2.3: Support for other error handling schemes added.

encode([encoding[,errors]])

Return an encoded version of the string. Default encoding is the current default string encoding. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' and any other name registered via codecs.register_error. For a list of possible encodings, see section 4.9.2. New in version 2.0. Changed in version 2.3: Support for 'xmlcharrefreplace' and 'backslashreplace' and other error handling schemes added.
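
As a rough illustration of the round trip described above, the sketch below encodes a Unicode string to UTF-8 and decodes it back, then shows how the errors argument changes the outcome when a character cannot be represented. The variable names are illustrative, and the u"..." literal is used on the assumption that the snippet should run under Python 2 as well as Python 3.3 or later.

```python
text = u"caf\xe9"                      # Unicode text with one non-ASCII character

data = text.encode("utf-8")            # encode to a UTF-8 byte string
round_trip = data.decode("utf-8")      # decode back to Unicode text

# ASCII cannot represent the accented letter, so the error scheme matters.
replaced = text.encode("ascii", "replace")            # bad character becomes '?'
xml_safe = text.encode("ascii", "xmlcharrefreplace")  # bad character becomes '&#233;'

strict_failed = False
try:
    text.encode("ascii")               # errors defaults to 'strict'
except UnicodeError:
    strict_failed = True

print(round_trip == text)
print(strict_failed)
```

With the default 'strict' scheme the failure surfaces as an exception, which is usually the safer choice unless lossy output is acceptable.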

endswith()

The endswith() method returns True if the string ends with the specified suffix; otherwise it returns False.

Usage: endswith(suffix[,start[,end]])

The following example returns "True":

s = "this is a string"

print s.endswith("ing") # returns True

expandtabs()

The expandtabs() method returns a copy of the string with all tab characters replaced by one or more spaces, depending on the current column and the tab size. If the tab size argument is not given, a tab size of 8 characters is assumed.

Usage: expandtabs([tabsize])

The following example expands each tab to the next column that is a multiple of 8, so the number of spaces inserted varies from tab to tab:

s = "this\tis\ta\tstring"

print s.expandtabs()

find()

The find() method returns the lowest index in a string where a substring is found. You may optionally specify the start and end points of the string to search. The value -1 is returned if the substring is not found.

Usage: find(sub[,start[,end]])

The following example returns the value 2:

s = "this is a string with 'is' in it"

print s.find("is")

Recall that the first character is at position zero, which is why the value 2 is returned for this example.

index()

Usage: index(sub[, start[, end]])

Like find(), but raises ValueError when the substring is not found.
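
A minimal sketch of the difference from find(): a successful search gives the same index, while a failed search has to be caught as an exception. The flag name missing is only illustrative.

```python
s = "this is a string"

pos = s.index("is")        # same result as s.find("is")

missing = False
try:
    s.index("xyz")         # not present, so ValueError is raised
except ValueError:
    missing = True

print(pos)
print(missing)
```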

isalnum()

The isalnum() method returns True if all the characters in the string are alphanumeric and there is at least one character, otherwise it returns False.

The first two examples below return True; the last two return False:

s = "aaa"

print s.isalnum() # returns True

s = "123"

print s.isalnum() # returns True

s = "123!"

print s.isalnum() # returns False

s = "*"

print s.isalnum() # returns False

isalpha()

The isalpha() method returns True if all the characters in a string are alphabetic and there is at least one character; otherwise it returns False.

In the following example, only the first code segment returns True:

s = "aaa"

print s.isalpha() # returns True

s = "123"

print s.isalpha() # returns False

s = "ABC!"

print s.isalpha() # returns False

s = "*"

print s.isalpha() # returns False

isdigit()

The isdigit() method returns True if all the characters in a string are digits and there is at least one character; otherwise it returns False.

In the following example, only the first code segment returns True:

s = "123"

print s.isdigit() # returns True

s = "aaa"

print s.isdigit() # returns False

s = "123!"

print s.isdigit() # returns False

s = "*"

print s.isdigit() # returns False

islower()

The islower() method returns True if all the cased characters in a string are lowercase and there is at least one cased character; otherwise it returns False. Uncased characters such as digits and punctuation are ignored.

In the following example, only the first two code segments return True:

s = "aaa"

print s.islower() # returns True

s = "abc!"

print s.islower() # returns True

s = "ABC"

print s.islower() # returns False

s = "*"

print s.islower() # returns False

isspace()

The isspace() method returns True if there is only whitespace in a string and there is at least one character; otherwise it returns False. Recall that whitespace includes spaces, tabs, and newlines.

In the following example, only the first two code segments return True:

s = " "

print s.isspace() # returns True

s = " \t\n "

print s.isspace() # returns True

s = "a c"

print s.isspace() # returns False

istitle()

The istitle() method returns True if the string is titlecased and there is at least one character; that is, uppercase characters may only follow uncased characters, and lowercase characters may only follow cased ones. Otherwise, it returns False.

In the following example, only the first code segment returns True:

s = "This Is A Sentence"

print s.istitle() # returns True

s = "This is a sentence"

print s.istitle() # returns False

isupper()

The isupper() method returns True if all cased characters in the string are uppercase and there is at least one cased character; otherwise it returns False.

In the following example, only the first two code segments return True:

s = "ABC"

print s.isupper() # returns True

s = "ABC!"

print s.isupper() # returns True

s = "aaa"

print s.isupper() # returns False

s = "*"

print s.isupper() # returns False

join()

join(seq)

Return a string which is the concatenation of the strings in the sequence seq. The separator between elements is the string providing this method.
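
Note that the separator is the object the method is called on, and the sequence is the argument. A short sketch with made-up data:

```python
words = ["alpha", "beta", "gamma"]

joined = ", ".join(words)                   # separator goes between the elements
path = "/".join(["usr", "local", "bin"])
glued = "".join(words)                      # empty separator concatenates directly

print(joined)
print(path)
print(glued)
```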

ljust()

ljust(width[, fillchar])

Return the string left justified in a string of length width. Padding is done using the specified fillchar (default is a space). The original string is returned if width is less than len(s). Changed in version 2.4: Support for the fillchar argument.
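
For instance, padding a short label out to a fixed width might look like this (the label and the widths are arbitrary choices for illustration):

```python
name = "abc"

dotted = name.ljust(6, ".")   # padded on the right with the fill character
spaced = name.ljust(6)        # default fill is a space
unchanged = name.ljust(2)     # width is less than len(name), so nothing is added

print(dotted)
print(repr(spaced))
print(unchanged)
```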

lower()

lower()

Return a copy of the string converted to lowercase.

For 8-bit strings, this method is locale-dependent.
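
One common use is a case-insensitive comparison. The sketch below assumes plain ASCII text, where locale does not affect the result:

```python
a = "Hello World"
b = "HELLO world"

lowered = a.lower()
same = a.lower() == b.lower()   # compare without regard to case

print(lowered)
print(same)
```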

lstrip()

lstrip([chars])

Return a copy of the string with leading characters removed. If chars is omitted or None, whitespace characters are removed. If given and not None, chars must be a string; the characters in the string will be stripped from the beginning of the string this method is called on. Changed in version 2.2.2: Support for the chars argument.
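
The chars argument is treated as a set of characters to remove, not as a prefix, which can be surprising. The following sketch uses made-up host names to show both the intended and the surprising result:

```python
s = "   indented text  "
left = s.lstrip()                          # leading whitespace only

host = "www.example.com".lstrip("w.")      # strips any leading 'w' or '.'
surprise = "www.wiki.org".lstrip("w.")     # also eats the 'w' of 'wiki'

print(repr(left))
print(host)
print(surprise)
```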

replace()

replace(old, new[, count])

Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
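
A small sketch of the effect of the count argument, using an arbitrary sample sentence:

```python
s = "one fish, two fish, red fish"

all_replaced = s.replace("fish", "cat")      # every occurrence
first_only = s.replace("fish", "cat", 1)     # only the first occurrence

print(all_replaced)
print(first_only)
print(s)                                     # the original string is unchanged
```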

rfind()

rfind(sub [,start [,end]])

Return the highest index in the string where substring sub is found, such that sub is contained within s[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 on failure.
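
Reusing the sample string from the find() example, this sketch contrasts the lowest and highest matching positions and shows a search restricted to a slice:

```python
s = "this is a string with 'is' in it"

first = s.find("is")             # lowest index
last = s.rfind("is")             # highest index
bounded = s.rfind("is", 0, 10)   # search only within s[0:10]
absent = s.rfind("xyz")          # -1 when not found

print(first)
print(last)
print(bounded)
print(absent)
```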

rindex()

rindex(sub[, start[, end]])

Like rfind() but raises ValueError when the substring sub is not found.
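
One possible use is pulling the last component out of a path-like string. The path below is just an example value, and the code assumes the separator is present (otherwise ValueError is raised):

```python
path = "/usr/local/bin/python"

slash = path.rindex("/")        # position of the last slash
filename = path[slash + 1:]     # everything after it

no_slash = False
try:
    "python".rindex("/")        # no slash at all
except ValueError:
    no_slash = True

print(slash)
print(filename)
```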

rjust()

rjust(width[, fillchar])

Return the string right justified in a string of length width. Padding is done using the specified fillchar (default is a space). The original string is returned if width is less than len(s). Changed in version 2.4: Support for the fillchar argument.
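
Right justification is handy for lining up numeric columns. A sketch with invented prices and an arbitrary column width of 8:

```python
prices = ["3.50", "12.00", "125.25"]

# Pad each price on the left so the decimal points line up.
column = [p.rjust(8) for p in prices]
zero_padded = "42".rjust(5, "0")

for row in column:
    print(row)
print(zero_padded)
```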

rsplit()

rsplit([sep [,maxsplit]])

Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done, the rightmost ones. If sep is not specified or None, any whitespace string is a separator. Except for splitting from the right, rsplit() behaves like split() which is described in detail below. New in version 2.4.
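
The difference from split() only shows when maxsplit is given. The sketch below, using a made-up file name, splits once from each end:

```python
name = "archive.tar.gz"

from_right = name.rsplit(".", 1)   # split at the last dot
from_left = name.split(".", 1)     # split at the first dot
no_limit = name.rsplit(".")        # without maxsplit the result matches split()

print(from_right)
print(from_left)
print(no_limit)
```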

rstrip()

rstrip([chars])

Return a copy of the string with trailing characters removed. If chars is omitted or None, whitespace characters are removed. If given and not None, chars must be a string; the characters in the string will be stripped from the end of the string this method is called on. Changed in version 2.2.2: Support for the chars argument.
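
A typical case is removing the newline from a line of input. The sample values here are illustrative:

```python
line = "some text   \n"

clean = line.rstrip()                # trailing whitespace, including the newline
trimmed = "value;;;".rstrip(";")     # trailing semicolons only
leading_kept = "  x  ".rstrip()      # leading whitespace is left alone

print(repr(clean))
print(trimmed)
print(repr(leading_kept))
```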

split()

split([sep [,maxsplit]])

Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified, there is no limit on the number of splits (all possible splits are made). Consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, "'1,,2'.split(',')" returns "['1', '', '2']"). The sep argument may consist of multiple characters (for example, "'1, 2, 3'.split(', ')" returns "['1', '2', '3']"). Splitting an empty string with a specified separator returns "['']".

If sep is not specified or is None, a different splitting algorithm is applied. First, whitespace characters (spaces, tabs, newlines, returns, and formfeeds) are stripped from both ends. Then, words are separated by arbitrary length strings of whitespace characters. Consecutive whitespace delimiters are treated as a single delimiter ("'1 2 3'.split()" returns "['1', '2', '3']"). Splitting an empty string or a string consisting of just whitespace returns an empty list ("[]").
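
The two splitting modes can be compared side by side. This is only a sketch with arbitrary sample strings:

```python
with_sep = "1,,2".split(",")          # empty field between the two commas is kept
multi_char = "1, 2, 3".split(", ")    # separator may be longer than one character
limited = "a b c d".split(" ", 2)     # at most two splits, so three elements
no_sep = "  1   2 3 ".split()         # runs of whitespace count as one delimiter

print(with_sep)
print(multi_char)
print(limited)
print(no_sep)
```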

splitlines()

splitlines([keepends])

Return a list of the lines in the string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true.
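
To see what keepends changes, consider a string with mixed line endings (the text is invented for the example):

```python
text = "first\nsecond\r\nthird"

lines = text.splitlines()          # line breaks dropped
kept = text.splitlines(True)       # line breaks preserved on each element

print(lines)
print(kept)
```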

startswith()

startswith(prefix[, start[, end]])

Return True if string starts with the prefix, otherwise return False. With optional start, test string beginning at that position. With optional end, stop comparing string at that position.
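
The optional positions behave like slice bounds. A brief sketch, with variable names chosen only for readability:

```python
s = "this is a string"

plain = s.startswith("this")           # tested from position 0
offset = s.startswith("is", 2)         # tested from position 2
wrong_offset = s.startswith("this", 1)
too_short = s.startswith("is a", 5, 7) # s[5:7] is only "is", so no match

print(plain)
print(offset)
print(wrong_offset)
print(too_short)
```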

strip()

strip([chars])

Return a copy of the string with leading and trailing characters removed. If chars is omitted or None, whitespace characters are removed. If given and not None, chars must be a string; the characters in the string will be stripped from both ends of the string this method is called on. Changed in version 2.2.2: Support for the chars argument.
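
Here is a quick sketch of both forms. Only the ends are affected; matching characters in the middle of the string stay put:

```python
padded = "  padded  ".strip()            # whitespace from both ends
title = "--==title==--".strip("-=")      # any '-' or '=' from both ends
inner = "xxa x bxx".strip("x")           # the 'x' in the middle survives

print(repr(padded))
print(title)
print(inner)
```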

swapcase()

swapcase()

Return a copy of the string with uppercase characters converted to lowercase and vice versa.

For 8-bit strings, this method is locale-dependent.
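
For ASCII text the effect is easy to see, and applying the method twice gives back the original:

```python
s = "Hello World 123"

swapped = s.swapcase()          # digits and spaces are unaffected
restored = swapped.swapcase()

print(swapped)
print(restored)
```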

title()

title()

Return a titlecased version of the string: words start with uppercase characters, all remaining cased characters are lowercase.

For 8-bit strings, this method is locale-dependent.
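
Because a word is taken to start after any uncased character, an apostrophe begins a new word as far as title() is concerned. The sketch below shows the ordinary case and that quirk:

```python
heading = "this is a title".title()
mixed = "hELLO wORLD".title()            # remaining cased characters are lowered
apostrophe = "they're here".title()      # the letter after the apostrophe is raised

print(heading)
print(mixed)
print(apostrophe)
```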

translate()

translate(table[, deletechars])

Return a copy of the string where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table, which must be a string of length 256.

For Unicode objects, the translate() method does not accept the optional deletechars argument. Instead, it returns a copy of the string in which all characters have been mapped through the given translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings, or None. Unmapped characters are left untouched. Characters mapped to None are deleted. Note that a more flexible approach is to create a custom character mapping codec using the codecs module (see encodings.cp1251 for an example).
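
The Unicode form can be sketched with a small dictionary as the table. The particular substitutions are invented for the example, and the snippet covers only the mapping-based variant, not the 256-character table used by 8-bit strings:

```python
# Keys are Unicode ordinals; values are replacement strings or None (delete).
table = {
    ord(u"a"): u"4",
    ord(u"e"): u"3",
    ord(u"!"): None,
}

result = u"leet speak!".translate(table)   # unmapped characters pass through

print(result)
```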

upper()

upper()

Return a copy of the string converted to uppercase.

For 8-bit strings, this method is locale-dependent.
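
A short illustration with ASCII input; uncased characters such as digits and punctuation pass through unchanged:

```python
shout = "quiet please".upper()
mixed = "abc123!".upper()

print(shout)
print(mixed)
```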

zfill()

zfill(width)

Return the numeric string left filled with zeros in a string of length width. The original string is returned if width is less than len(s). New in version 2.2.2.
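
The sketch below pads a few numeric strings to width 5. A leading sign, if present, stays in front of the inserted zeros:

```python
plain = "42".zfill(5)
negative = "-42".zfill(5)      # zeros go after the sign
too_long = "12345".zfill(3)    # already wider than the requested width

print(plain)
print(negative)
print(too_long)
```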

(Original material is from http://docs.python.org/release/2.4.1/lib/string-methods.html)

String Constants

Note: to access these constants you will need to put:

import string

at the top of your program.

string.ascii_letters

The concatenation of the ascii_lowercase and ascii_uppercase constants described below. This value is not locale-dependent.

string.ascii_lowercase

The lowercase letters 'abcdefghijklmnopqrstuvwxyz'. This value is not locale-dependent and will not change.

string.ascii_uppercase

The uppercase letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. This value is not locale-dependent and will not change.

string.digits

The string '0123456789'.

string.hexdigits

The string '0123456789abcdefABCDEF'.

string.letters

The concatenation of the locale-dependent strings string.lowercase and string.uppercase (not described in this section). The specific value is locale-dependent, and will be updated when locale.setlocale() is called.

string.octdigits

The string '01234567'.

string.punctuation

String of ASCII characters which are considered punctuation characters in the C locale.

string.printable

String of characters which are considered printable. This is a combination of digits, letters, punctuation, and whitespace.

string.whitespace

A string containing all characters that are considered whitespace. On most systems this includes the characters space, tab, linefeed, return, formfeed, and vertical tab.
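
These constants are convenient for membership tests. The helper is_hex below is a hypothetical function written for this example, not part of the string module, and it sticks to the locale-independent constants:

```python
import string

def is_hex(text):
    """Return True if text is non-empty and made up only of hex digits."""
    return len(text) > 0 and all(ch in string.hexdigits for ch in text)

# ascii_letters is simply the two ASCII alphabets joined together.
letters_match = string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase

print(is_hex("1aF"))
print(is_hex("xyz"))
print(letters_match)
```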

(Original material is from http://docs.python.org/library/string.html)