Getting Data From Files
Reading CSV Files
A comma-separated values (CSV) file is a text-based format that typically has the following structure:
- The first line often represents the header or the names of fields.
- The other lines are records.
- Values are separated by a delimiter. Common delimiters are the comma (hence the name of the format), the semicolon, and the tab character, but other symbols are also possible.
A CSV file can be loaded into a DataFrame with the pandas read_csv() function (see the pandas API reference).
import pandas as pd

df = pd.read_csv('file.csv')
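To make the structure described above concrete, here is a minimal sketch that parses a small CSV text held in memory. The sample data and the variable names are made up for illustration; io.StringIO simply stands in for a file on disk, since read_csv() accepts file-like objects as well as paths.

```python
import io

import pandas as pd

# Hypothetical sample data: one header line followed by two records,
# with values separated by commas.
csv_text = "name,age,city\nAlice,30,Paris\nBob,25,Berlin\n"

# StringIO wraps the text in a file-like object that read_csv can consume.
df = pd.read_csv(io.StringIO(csv_text))

print(df.columns.tolist())  # column names taken from the header line
print(len(df))              # number of records (lines after the header)
```

By default the first line is treated as the header and the comma is used as the delimiter, so no extra arguments are needed here.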
Most useful additional arguments:
header: the row number of the header; can be set to None if the file has no header
sep: the delimiter symbol, that is, the character that marks the end of one column and the beginning of the next
Other useful arguments:
decimal: the character used as the decimal separator (for example, a comma in files where numbers are written as 3,14)
Example of using the additional arguments:
import pandas as pd

df = pd.read_csv('file.csv', header=None, sep=';', decimal=',')
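The effect of these three arguments is easier to see on actual data. The following sketch assumes a small, made-up file with no header line, semicolons as delimiters, and commas as decimal separators; the text is held in memory with io.StringIO rather than read from disk.

```python
import io

import pandas as pd

# Hypothetical data: no header line, ';' between columns, ',' as the decimal mark.
raw = "1,5;2,25\n3,0;4,75\n"

df = pd.read_csv(io.StringIO(raw), header=None, sep=';', decimal=',')

print(df)         # two rows, two numeric columns
print(df.dtypes)  # both columns are parsed as floating-point numbers
```

Because header=None is passed, pandas does not consume the first line as column names and labels the columns with the integers 0 and 1 instead. Without decimal=',', values such as 1,5 would be read as text rather than numbers.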
Reading Excel Files
An Excel file can be loaded into a DataFrame with the pandas read_excel() function (see the pandas API reference). It is similar to read_csv() but takes a different set of arguments.
Useful unique arguments:
sheet_name: the name of the sheet to read. If no sheet_name argument is given, the function reads the first sheet by default.
Reading the first sheet.
import pandas as pd

df = pd.read_excel('file.xlsx')
Reading a certain sheet.
import pandas as pd

df = pd.read_excel('file.xlsx', sheet_name='Sheet 1')
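The difference between the default behaviour and an explicit sheet_name can be tried without an existing workbook. The sketch below builds a two-sheet workbook in memory and reads it back; the sheet names and data are invented for illustration, and it assumes the openpyxl package is installed, which pandas needs for .xlsx files.

```python
import io

import pandas as pd

# Build a small two-sheet workbook in memory (hypothetical names and data).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"a": [1, 2]}).to_excel(writer, sheet_name="Sheet 1", index=False)
    pd.DataFrame({"b": [3, 4, 5]}).to_excel(writer, sheet_name="Sheet 2", index=False)

workbook_bytes = buffer.getvalue()

# No sheet_name: the first sheet is read.
first = pd.read_excel(io.BytesIO(workbook_bytes))

# Explicit sheet_name: the named sheet is read.
second = pd.read_excel(io.BytesIO(workbook_bytes), sheet_name="Sheet 2")

print(first.columns.tolist())
print(second.columns.tolist())
```

With a real file, the io.BytesIO(...) argument would simply be replaced by the path to the workbook.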