Knowledge Base

Viewing Data

Getting General Information of DataFrame

The info() method prints a general summary of a DataFrame, including:

  • Number of rows
  • Number of columns
  • The name of each column (Column)
  • Number of values in each column that are not missing (Non-Null Count)
  • The data type of each column (Dtype)

In short, info() lists the columns, their types, and the number of non-null values in each.

df.info()
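As a minimal sketch of what the summary contains, the example below builds a small DataFrame and captures the report. The sample data and the variable names (`df`, `buffer`, `report`) are made up for illustration. Note that info() prints its report rather than returning it, so the `buf` parameter is used here to capture the text.

```python
import io

import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid"],
    "age": [31.0, None, 45.0],  # one missing value
})

# info() writes to stdout by default; buf redirects the report to a text buffer.
buffer = io.StringIO()
df.info(buf=buffer)
report = buffer.getvalue()

# The report lists the row count, each column, its non-null count and its dtype.
print(report)
```

In an interactive session, calling df.info() on its own is usually enough; capturing the text is only needed when the summary has to be logged or inspected programmatically.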

Getting Shape of DataFrame

The shape of a DataFrame is a tuple with two elements: the number of rows and the number of columns.

df.shape
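Because shape is a plain tuple, it can be unpacked directly. The following sketch uses made-up data and hypothetical variable names (`n_rows`, `n_cols`).

```python
import pandas as pd

# Hypothetical sample data: 3 rows, 2 columns.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# shape is an attribute, not a method, so there are no parentheses.
n_rows, n_cols = df.shape
print(n_rows, n_cols)  # 3 2
```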

Getting Head/Random/Last Rows

Each method below accepts the number of rows to return. By default, head() and tail() return 5 rows and sample() returns 1. Because sample() picks rows at random, it gives a more diverse preview than head() or tail().

df.head()
df.sample()
df.tail()
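A short sketch of passing the row count explicitly, using made-up data. The `random_state` argument is an optional addition here that makes the random sample reproducible. The variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: a single column holding 0..9.
df = pd.DataFrame({"value": range(10)})

first_three = df.head(3)   # rows with value 0, 1, 2
last_three = df.tail(3)    # rows with value 7, 8, 9

# n sets the sample size; random_state fixes the seed so reruns match.
random_three = df.sample(n=3, random_state=0)

print(first_three)
print(random_three)
print(last_three)
```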

Getting Descriptive Statistics

The describe() method returns typical summary statistics:

  • For numerical columns: count, mean, std, min, and max, as well as the lower, 50th, and upper percentiles. By default the lower percentile is the 25th and the upper percentile is the 75th. The 50th percentile is the same as the median.
  • For object columns (e.g. strings): count, unique, top, and freq. The top is the most common value. The freq is the most common value’s frequency. In pandas versions before 2.0, timestamp columns also report the first and last items; newer versions summarize datetime columns like numeric ones instead.

For a DataFrame with mixed data types, describe() by default returns statistics for the numeric columns only.

df.describe()
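The sketch below illustrates the default behavior on a mixed DataFrame and shows how the lower and upper percentiles can be changed through the `percentiles` parameter. The data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data with one numeric and one text column.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "city": ["Oslo", "Rome", "Oslo", "Kyiv"],
})

# Default call: only the numeric column "price" is summarized.
stats = df.describe()
print(stats)

# Custom percentiles are given as fractions between 0 and 1.
custom = df.describe(percentiles=[0.1, 0.5, 0.9])
print(custom)
```

The result is itself a DataFrame, so a single statistic can be read with, for example, stats.loc["mean", "price"].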

Getting descriptive statistics only for columns of certain types

df.describe(include='object')

Getting descriptive statistics for all columns, regardless of their data types

df.describe(include='all')
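A minimal sketch of both `include` variants on made-up data. The text column is created with an explicit `dtype=object`, which is an assumption made here so that the 'object' selection matches the column even if the installed pandas version infers a dedicated string dtype by default.

```python
import pandas as pd

# Hypothetical sample data; "city" is explicitly stored as object dtype.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "city": pd.Series(["Oslo", "Rome", "Oslo", "Kyiv"], dtype=object),
})

# Only object columns: count, unique, top, freq.
text_stats = df.describe(include="object")
print(text_stats)

# Every column: statistics that do not apply to a column are shown as NaN.
all_stats = df.describe(include="all")
print(all_stats)
```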

Counting Values

Number of non-missing values in a column

df['column'].count()

Number of unique values in a column (missing values are not counted by default)

df['column'].nunique()

List of unique values with their counts, most frequent first

df['column'].value_counts()
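The three counting methods can be compared side by side on a column that contains a missing value. This is a sketch on made-up data with hypothetical variable names. The `dropna=False` call at the end is an optional extra that shows how to include missing values in the tally.

```python
import pandas as pd

# Hypothetical sample data with one missing value.
df = pd.DataFrame({"column": ["a", "b", "a", None, "a"]})

non_null = df["column"].count()      # 4: the missing value is skipped
distinct = df["column"].nunique()    # 2: "a" and "b"
counts = df["column"].value_counts() # "a" occurs 3 times, "b" once

# dropna=False adds a separate entry for missing values.
with_missing = df["column"].value_counts(dropna=False)

print(non_null, distinct)
print(counts)
print(with_missing)
```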