Getting Data From Files
Reading CSV Files
A comma-separated values (CSV) file is a text-based format with the following typical structure:
- The first line often represents the header or the names of fields.
- The other lines are records.
- Values are separated by delimiters. Common delimiters are the comma "," (hence the name of the format), the semicolon ";", and the tab character, but other symbols are also possible.
A CSV file can be loaded into a DataFrame with the pd.read_csv() function (API).
import pandas as pd

df = pd.read_csv('file.csv')
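To try this without a file on disk, the same call can be pointed at an in-memory text buffer. The following is a minimal sketch; the CSV content and the name csv_text are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content, held in memory so the sketch needs no file on disk.
csv_text = "name,age\nAlice,30\nBob,25\n"

# read_csv accepts a path or a file-like object.
# By default the first line is used as the header and "," as the delimiter.
df = pd.read_csv(io.StringIO(csv_text))

print(df.columns.tolist())  # ['name', 'age']
print(len(df))              # 2
```

The header line becomes the column names, and the two remaining lines become the records.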
Most useful additional arguments:
- header: the row number of the header; can be set to None if there is no header
- sep: the delimiter symbol, that is, the character that marks the end of one column and the beginning of the next
Another useful argument:
- decimal: the symbol used as the decimal separator
Example of using the additional arguments:
import pandas as pd

df = pd.read_csv('file.csv', header=None, sep=';', decimal=',')
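The effect of these three arguments is easier to see on concrete data. The sketch below assumes a small, made-up headerless input that uses semicolons as delimiters and commas as decimal separators; the name raw is illustrative only.

```python
import io

import pandas as pd

# Hypothetical headerless data: ";" separates columns, "," is the decimal mark.
raw = "1,5;2,25\n3,0;4,75\n"

df = pd.read_csv(io.StringIO(raw), header=None, sep=';', decimal=',')

# With header=None, pandas labels the columns with integers instead of names.
print(df.columns.tolist())  # [0, 1]

# With decimal=',', "1,5" is parsed as the float 1.5 rather than kept as text.
print(df.iloc[0, 0])        # 1.5
```

Without header=None, the first data line would be consumed as column names, so one record would be silently lost.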
Reading Excel Files
An Excel file can be loaded into a DataFrame with the pd.read_excel() function (API). It is similar to read_csv() but accepts a different subset of arguments.
A useful argument specific to read_excel():
- sheet_name: the name of the sheet to read. If no sheet_name argument is given, the function reads the first sheet by default.
Reading the first sheet:
import pandas as pd

df = pd.read_excel('file.xlsx')
Reading a certain sheet:
import pandas as pd

df = pd.read_excel('file.xlsx', sheet_name='Sheet 1')
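To check the sheet_name behaviour without an existing workbook, one option is to build a small two-sheet workbook in memory and read it back. This is a sketch under assumptions: the helper names build_demo_workbook and read_demo_sheets, the sheet contents, and the use of the openpyxl engine (which must be installed separately) are all illustrative and not part of the original text.

```python
import io

import pandas as pd


def build_demo_workbook() -> io.BytesIO:
    """Write a hypothetical two-sheet workbook into memory (assumes openpyxl is installed)."""
    buffer = io.BytesIO()
    with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
        pd.DataFrame({"x": [1, 2]}).to_excel(writer, sheet_name="Sheet 1", index=False)
        pd.DataFrame({"y": [3, 4]}).to_excel(writer, sheet_name="Sheet 2", index=False)
    buffer.seek(0)
    return buffer


def read_demo_sheets():
    """Read the default (first) sheet and a sheet selected by name."""
    first = pd.read_excel(build_demo_workbook())                         # no sheet_name: first sheet
    second = pd.read_excel(build_demo_workbook(), sheet_name="Sheet 2")  # sheet chosen by name
    return first, second
```

Calling first, second = read_demo_sheets() should give a DataFrame with column x for the default read and one with column y for the named sheet.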