Getting Data From Files

Reading CSV Files

A comma-separated values (CSV) file is a plain-text file that typically has the following structure:

  • The first line often represents the header or the names of fields.
  • The other lines are records.
  • Values are separated by a delimiter. The most common delimiters are the comma (hence the name of the format), the semicolon, and the tab character, but other symbols can be used as well.

A CSV file can be loaded into a DataFrame with the pd.read_csv() function (API).

import pandas as pd

df = pd.read_csv('file.csv')
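
To try this without a file on disk, the following is a minimal sketch that feeds a small in-memory CSV to read_csv() through io.StringIO. The sample data (the name and age columns) is hypothetical and stands in for the contents of 'file.csv'.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of 'file.csv'.
csv_text = "name,age\nAlice,30\nBob,25\n"

# read_csv accepts a file path or a file-like object such as StringIO.
# With the default settings, the first line is used as the header
# and the comma is used as the delimiter.
df = pd.read_csv(io.StringIO(csv_text))

print(df.columns.tolist())  # column names taken from the first line
print(len(df))              # number of records
```

With the default settings, the first line becomes the column names and the remaining lines become the rows of the DataFrame.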

The most useful additional arguments:

  • header: the zero-based row number to use for the column names; set it to None if the file has no header row
  • sep: the delimiter, that is, the character that marks the end of one column and the beginning of the next (a comma by default)

Another useful argument:

  • decimal: the character used as the decimal separator (a period by default)

Example of using the additional arguments:

import pandas as pd

df = pd.read_csv('file.csv', header=None, sep=';', decimal=',')
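
The effect of these arguments can be checked on a small in-memory example. This is a sketch under stated assumptions: the two-line, semicolon-separated sample with decimal commas is hypothetical and is read through io.StringIO instead of a real file.

```python
import io

import pandas as pd

# Hypothetical sample: no header row, ';' as the delimiter,
# ',' as the decimal separator.
raw = "1;2,5\n2;3,75\n"

df = pd.read_csv(io.StringIO(raw), header=None, sep=';', decimal=',')

# With header=None, pandas numbers the columns 0, 1, ... itself.
print(df.columns.tolist())

# With decimal=',', the second column is parsed as floating-point numbers
# rather than being kept as text.
print(df[1].tolist())
```

Without decimal=',', values such as 2,5 would be read as strings, so numeric operations on that column would not work as expected.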

Reading Excel Files

An Excel file can be loaded into a DataFrame with the pd.read_excel() function (API). It is similar to read_csv() but accepts a different set of arguments.

A useful argument specific to read_excel():

  • sheet_name: the name of the sheet to read. If sheet_name is omitted, the first sheet is read by default.

Reading the first sheet:

import pandas as pd

df = pd.read_excel('file.xlsx')

Reading a specific sheet:

import pandas as pd

df = pd.read_excel('file.xlsx', sheet_name='Sheet 1')
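
The difference between the two calls can be seen in a self-contained round trip. This is a sketch, assuming the openpyxl package is installed (pandas uses it to read and write .xlsx files); the workbook is built in memory, and the sheet name 'First' and the column names are hypothetical.

```python
import io

import pandas as pd

# Build a small two-sheet workbook in memory (assumes openpyxl is installed).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"a": [1, 2]}).to_excel(writer, sheet_name="First", index=False)
    pd.DataFrame({"b": [3, 4]}).to_excel(writer, sheet_name="Sheet 1", index=False)
data = buffer.getvalue()

# Without sheet_name, the first sheet is read.
first = pd.read_excel(io.BytesIO(data))

# With sheet_name, the named sheet is read instead.
named = pd.read_excel(io.BytesIO(data), sheet_name="Sheet 1")

print(first.columns.tolist())
print(named.columns.tolist())
```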