3. Methods by class
See section 1.6 for general information on methods.
3.1. String Methods
find()
str_obj.find(sub) searches for the string sub in the string object str_obj. If found, this method returns the starting index of the first occurrence of sub in str_obj. If not found, this method returns -1. If str_obj = "One Two Three Five Two", then str_obj.find("Two") returns 4.
join()
str_obj.join(seq) concatenates (joins) the strings in seq, separated by the string str_obj. If str_obj = '_' and seq = ['One', 'Two', 'Three'], then str_obj.join(seq) returns 'One_Two_Three'.
lower()
str_obj.lower() returns a copy of str_obj with all letters converted to lowercase; str_obj itself is not changed. If str_obj = "ONE two THREE", then str_obj.lower() returns "one two three".
isalpha()
str_obj.isalpha() returns True if all characters in str_obj are alphabetic. If str_obj = 'abcd', then str_obj.isalpha() returns True. If str_obj = 'abc1.', then str_obj.isalpha() returns False.
isdigit()
str_obj.isdigit() returns True if all characters in str_obj are digits. If str_obj = '314', then str_obj.isdigit() returns True. If str_obj = '3.14', then str_obj.isdigit() returns False, because the decimal point is not a digit.
islower()
str_obj.islower() returns True if all alphabetic characters in str_obj are lowercase. If str_obj = 'pi is 3.14!', then str_obj.islower() returns True. If str_obj = 'Pi is 3.14!', then str_obj.islower() returns False.
replace()
str_obj.replace(old, new, count) returns a copy of str_obj in which the first count occurrences of old are replaced with new. Note: count is optional. If left out, all occurrences are replaced. If str_obj = 'a apple, a orange, a banana', then str_obj.replace('a ', 'an ', 2) returns 'an apple, an orange, a banana'.
split()
str_obj.split(sep) returns a list of the substrings of str_obj, using sep as the delimiter. If str_obj = '1, 2, 3', then str_obj.split(',') returns the list ['1', ' 2', ' 3']. Note that the spaces after each comma are kept.
upper()
str_obj.upper() returns a copy of str_obj with all letters converted to uppercase; str_obj itself is not changed. If str_obj = 'ONE two THREE', then str_obj.upper() returns 'ONE TWO THREE'.
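The string methods above can be tried together in one short script. This is a minimal sketch that reuses the example values from this section; the variable names (found_at, parts, cleaned, and so on) are assumptions chosen for illustration.

```python
# Sketch of the string methods in section 3.1 (variable names are illustrative).
str_obj = "One Two Three Five Two"

found_at = str_obj.find("Two")     # index of the first occurrence
missing = str_obj.find("Four")     # -1 because "Four" is not in str_obj

joined = "_".join(["One", "Two", "Three"])
lowered = "ONE two THREE".lower()
uppered = "ONE two THREE".upper()

# The is...() methods return booleans and never change the string.
alpha_ok = "abcd".isalpha()
alpha_bad = "abc1.".isalpha()
digit_ok = "314".isdigit()
digit_bad = "3.14".isdigit()
lower_ok = "pi is 3.14!".islower()
lower_bad = "Pi is 3.14!".islower()

# count=2 limits the replacement to the first two matches.
replaced = "a apple, a orange, a banana".replace("a ", "an ", 2)

# split() keeps the space after each comma; strip() removes it.
parts = "1, 2, 3".split(",")
cleaned = [part.strip() for part in parts]

print(found_at, missing, joined, lowered, uppered)
print(replaced)
print(parts, cleaned)
```

Because strings are immutable, each of these calls returns a new value; assign the result to a variable if you want to keep it.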
3.2. List Methods
append()
list_obj.append(s) adds element s to the end of the list list_obj. If list_obj = ['a', 'b', 'c'], then list_obj.append('d') updates list_obj to ['a', 'b', 'c', 'd'].
extend()
list_obj.extend(iter) adds the items in the iterable object iter to the end of the list list_obj. If list_obj = ['a', 'b', 'c'] and str1 = 'def', then list_obj.extend(str1) updates list_obj to ['a', 'b', 'c', 'd', 'e', 'f'].
index()
list_obj.index(s) returns the index of the first element in list_obj that is equal to s. If list_obj = [1, 2, 'a', 'a'], then list_obj.index('a') returns 2. (Recall that indices start at 0.)
insert(i,s)
list_obj.insert(i, s) inserts element s at index i. If list_obj = ['a', 'c', 'd'], then list_obj.insert(1, 'b') updates list_obj to ['a', 'b', 'c', 'd'].
pop()
list_obj.pop(i) removes and returns the element at index i in list_obj. If list_obj = ['a', 'a', 'b', 'c'], then list_obj.pop(1) returns 'a' and updates list_obj to ['a', 'b', 'c'].
sort()
list_obj.sort() sorts the elements in the list list_obj in place, in ascending order. If list_obj = ['a', 'd', 'c', 'b'], then list_obj.sort() updates list_obj to ['a', 'b', 'c', 'd'].
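The list methods above can be chained into one short script. This is a minimal sketch using example values like those in this section; the variable names (first_c, removed, shuffled, result) are assumptions for illustration.

```python
# Sketch of the list methods in section 3.2 (variable names are illustrative).
list_obj = ["a", "c", "d"]

list_obj.insert(1, "b")          # ['a', 'b', 'c', 'd']
list_obj.append("e")             # ['a', 'b', 'c', 'd', 'e']
list_obj.extend("fg")            # each character of the string is added separately

first_c = list_obj.index("c")    # position of the first 'c'
removed = list_obj.pop(0)        # removes and returns the element at index 0

shuffled = ["a", "d", "c", "b"]
result = shuffled.sort()         # sorts in place; the return value is None

print(list_obj, first_c, removed)
print(shuffled, result)
```

Unlike the string methods, append(), extend(), insert(), and sort() change the list itself and return None, so writing list_obj = list_obj.sort() would replace the list with None.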
3.3. Dictionary Methods
Adding items
Unlike lists, dictionaries have no insert() or append() method. To add data to an existing dictionary, name the dictionary and the key, and set it equal to the value. Ex.
dict_obj['New Key'] = value adds value to the dictionary object dict_obj with key 'New Key'. If the key already exists, its value is overwritten.
get()
dict_obj.get(s) returns the value for the key s. If dict_obj = {'a' : 2, 'b' : 4, 'c' : 6}, then dict_obj.get('b') returns 4. If the key does not exist, the method returns None.
items()
dict_obj.items() returns a view of the current dictionary elements as (key, value) pairs. If dict_obj = {'a' : 1, 'b' : 2, 'c' : 3}, then dict_obj.items() returns dict_items([('a', 1), ('b', 2), ('c', 3)]). Useful for looping over lists within dictionaries (see section 6.1.3).
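The dictionary operations above can be sketched as follows. The values and variable names (value_b, fallback, total) are assumptions for illustration; the second argument to get() is an optional default that is returned instead of None when the key is missing.

```python
# Sketch of the dictionary operations in section 3.3 (names are illustrative).
dict_obj = {"a": 2, "b": 4, "c": 6}

dict_obj["d"] = 8                 # adds a new key-value pair
dict_obj["a"] = 10                # assigning to an existing key overwrites its value

value_b = dict_obj.get("b")       # value stored under 'b'
missing = dict_obj.get("z")       # None, because 'z' is not a key
fallback = dict_obj.get("z", 0)   # optional default used when the key is missing

# items() yields (key, value) pairs, which unpack neatly in a for loop.
total = 0
for key, value in dict_obj.items():
    total += value

pairs = list(dict_obj.items())
print(value_b, missing, fallback, total)
print(pairs)
```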
3.4. Pandas (DataFrames and Series) Methods
drop_duplicates()
df.drop_duplicates() drops duplicate rows in the pandas DataFrame object df.
dropna()
df.dropna() drops all rows with at least 1 missing value in the pandas DataFrame object df.
duplicated()
df.duplicated() returns a boolean (True or False) pandas Series indicating duplicate rows. The first occurrence of a duplicated row is marked False and all following copies are marked True. The length of the returned pandas Series is equal to the number of rows in the pandas DataFrame object df.
fillna(value=s)
df.fillna(value=s) fills NaN/NA values with the specified value s across all columns and rows in the pandas DataFrame object df.
groupby('columnName')
df.groupby('columnName') groups data by the unique values in column columnName in the pandas DataFrame object df. Can be used for the first stage of grouping (split) before applying some operation to the grouped data.
head(n)
df.head(n) returns the first n rows of the pandas DataFrame object df. If n is not specified, it returns the first 5 rows by default.
isna()
df.isna() returns a boolean object the same size as the pandas DataFrame object df, indicating whether each value is NA.
loc[]
df.loc[] is used to access a group of rows and columns in the pandas DataFrame object df. (See section 4.4 for details.)
max()
df.max() returns the maximum value per column in the pandas DataFrame object df.
mean()
df.mean() returns the mean value per column in the pandas DataFrame object df.
median()
df.median() returns the median value per column in the pandas DataFrame object df.
min()
df.min() returns the minimum value per column in the pandas DataFrame object df.
read_csv()
df = pd.read_csv('fileName') reads data from a csv file and creates a DataFrame object df. Note: pd is the alias assigned to pandas when importing the library, i.e., import pandas as pd.
rename(columns=dict_obj)
df.rename(columns=dict_obj) renames columns in the pandas DataFrame object df using the data in the dictionary object dict_obj. Create dict_obj with keys = old column names and values = new column names to rename the columns in df. (See section 1.3.2 for info on dictionary objects.)
replace(this_Value, that_Value)
df.replace(this_Value, that_Value) finds and replaces all instances of this_Value with that_Value in the pandas DataFrame object df.
reset_index()
df.reset_index() resets the index of the pandas DataFrame object df to consecutive numbers and creates a new column that stores the old index values (from before the method is applied). This method is typically used after data processing when rows are removed.
sort_values(by='column_name', ascending=True)
df.sort_values(by='column_name') sorts the rows in the pandas DataFrame object df by column column_name. By default, ascending is True. Set ascending=False to sort in descending order.
sum()
df.sum() returns the sum of each column in the pandas DataFrame object df.
tail(n)
df.tail(n) returns the last n rows of the pandas DataFrame object df. If n is not specified, it returns the last 5 rows by default.
unique()
df['Col'].unique() returns the unique values in the column Col in the pandas DataFrame object df. Note: unique() must be called on a pandas Series object, which is created when referencing a single column in the DataFrame object.
Attributes (general information about the data in the DataFrame)
df.dtypes - returns the data type for each column in df.
df.columns - returns the column names for each column in df.
df.shape - returns the size (# rows, # columns) of df.
df.info() - a method rather than an attribute; prints information about the pandas DataFrame object df, including data structure info, index info, column names, number of non-null values, data types, and memory usage.
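Several of the DataFrame methods above are often used together when cleaning data. The following is a minimal sketch, assuming pandas is installed; the DataFrame contents and the column names city, temp, and temp_c are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical data: row 2 repeats row 0, and row 3 has a missing value.
df = pd.DataFrame({
    "city": ["Austin", "Boston", "Austin", "Denver", "Boston"],
    "temp": [30.0, 20.0, 30.0, None, 22.0],
})

dup_mask = df.duplicated()            # True only for the repeated row
deduped = df.drop_duplicates()        # keeps rows 0, 1, 3, 4

filled = deduped.fillna(value=0)      # one option: replace the missing value
dropped = deduped.dropna()            # another option: drop the incomplete row

# reset_index() renumbers the rows and stores the old labels in an 'index' column.
clean = dropped.reset_index()
renamed = clean.rename(columns={"temp": "temp_c"})

sorted_df = renamed.sort_values(by="temp_c", ascending=False)
cities = renamed["city"].unique()     # called on a single column (a Series)
means = renamed.groupby("city")["temp_c"].mean()   # split by city, then average

print(renamed.shape)
print(renamed.dtypes)
print(sorted_df.head())
print(means)
```

Each of these methods returns a new object and leaves df unchanged, which is why every result is assigned to its own variable here.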