Pandas – Count True Values in a Dataframe Column

 2 years ago
source link: https://thispointer.com/pandas-count-true-values-in-a-dataframe-column/

In this article, we will discuss different ways to count True values in a Dataframe Column.

First, we will create a DataFrame from a list of tuples:

import pandas as pd
import numpy as np

# List of Tuples
list_of_tuples = [  (False, False, True,  False, True, True),
                    (True,  False, True,  False, True, np.nan),
                    (False, True,  False, False, True, True),
                    (True,  True,  True,  False, True, np.nan),
                    (True,  True,  False, True,  True, True),
                    (False, False, True,  True,  True, np.nan)]


# Create a DataFrame object
df = pd.DataFrame(  list_of_tuples, 
                    columns=['A', 'B', 'C', 'D', 'E', 'F'])

print(df)

Output

       A      B      C      D     E     F
0  False  False   True  False  True  True
1   True  False   True  False  True   NaN
2  False   True  False  False  True  True
3   True   True   True  False  True   NaN
4   True   True  False   True  True  True
5  False  False   True   True  True   NaN

This Dataframe contains either boolean values or NaN values, and it has six columns. Now let’s see how to get the count of True values in any column of this Dataframe.

Count True values in a Dataframe Column using Series.sum()

Select the DataFrame column by name with the subscript operator, i.e. df['C']. This returns column 'C' as a Series of bool values. Calling sum() on a boolean Series treats True as 1 and False as 0, so the result is the count of True values in the column.

Let’s understand with an example, where we will get the count of True values in column C,

# Get count of True values in column 'C' 
count = df['C'].sum()

print('Count of True values in Column  C : ', count)


# Get count of True values in column 'F' 
count = df['F'].sum()

print('Count of True values in Column  F : ', count)

Output:

Count of True values in Column  C :  4
Count of True values in Column  F :  3
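Column 'F' mixes booleans with NaN, so it is worth checking how the count behaves there. The sketch below rebuilds that column as a standalone Series (the variable names are illustrative, not from the article) and compares the plain sum() with an explicit comparison against True. It assumes the default skipna=True behaviour of Series.sum().

```python
import numpy as np
import pandas as pd

# Same values as column 'F' above: booleans mixed with NaN
f = pd.Series([True, np.nan, True, np.nan, True, np.nan], name='F')

# Mixing bool and float values gives the column the generic object dtype
print(f.dtype)

# sum() skips NaN by default (skipna=True), so only the True values add up
count_default = f.sum()

# Alternative: build a boolean mask first, then sum it.
# NaN compared with True yields False, so missing values are never counted.
count_explicit = int(f.eq(True).sum())

# Number of missing entries, for reference
missing = int(f.isna().sum())

print('True values (sum)        :', count_default)
print('True values (eq + sum)   :', count_explicit)
print('Missing values           :', missing)
```

The explicit mask is a little more verbose, but it makes the handling of NaN visible instead of relying on a default argument.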

Columns ‘C’ & ‘F’ had 4 and 3 True values respectively. We can achieve the same thing using another technique too. Let’s see that in practice,

Count True values in a Dataframe Column using numpy.count_nonzero()

Select the DataFrame column by its name, i.e. df['D']. This returns column 'D' as a Series of bool values. Pass this Series to NumPy's count_nonzero() function; since False counts as zero and True as non-zero, it returns the count of True values in the column.

Let’s understand with an example, where we will get the count of True values in column ‘D’,

# Get count of True values in column 'D' 
count = np.count_nonzero(df['D'])

print('Count of True values in Column  D : ', count)

Output:

Count of True values in Column  D :  2
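One caveat with this technique: count_nonzero() tests truthiness, and NaN is truthy in Python, so a column that contains NaN (such as 'F' in this DataFrame) may be over-counted. The following is a minimal sketch of the issue and one possible workaround; the Series and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same values as column 'F': three True values and three NaN values
col = pd.Series([True, np.nan, True, np.nan, True, np.nan])

# NaN is "non-zero", so it is counted together with the True values
raw_count = np.count_nonzero(col.to_numpy())

# Workaround: compare with True first, which turns NaN into False
mask = col.eq(True).to_numpy()
true_count = np.count_nonzero(mask)

print('count_nonzero on raw column :', raw_count)
print('count_nonzero on mask       :', true_count)
```

For columns that hold only bool values, such as 'D', both calls give the same result.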

Count True values in a Dataframe Column using Series.value_counts()

Select the DataFrame column by its name, i.e. df['D']. This returns column 'D' as a Series of bool values. Then call the value_counts() function on this Series. It returns the occurrence count of each distinct value in the column (NaN values are excluded by default). Finally, fetch the occurrence count of the value True. For example,

# Get count of True values in column 'D' 
count = df['D'].value_counts()[True]

print('Count of True values in Column  D : ', count)

Output:

Count of True values in Column  D :  2

It returned the count of True values in column ‘D’ of the Dataframe.
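Indexing the result with [True] assumes that the column contains at least one True value; otherwise the label is missing and the lookup raises a KeyError. A hedged sketch of a more defensive variant, using Series.get() with a default of 0 (the example Series here are hypothetical):

```python
import pandas as pd

# Same values as column 'D' above
d = pd.Series([False, False, False, False, True, True])

# A hypothetical column without a single True value
all_false = pd.Series([False, False, False])

# get() returns the default (0) when the label True is absent
count_d = d.value_counts().get(True, 0)
count_none = all_false.value_counts().get(True, 0)

print('Count of True values in d         :', count_d)
print('Count of True values in all_false :', count_none)
```

With value_counts()[True], the second lookup would fail instead of returning 0.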

The complete example is as follows:

import pandas as pd
import numpy as np

# List of Tuples
list_of_tuples = [  (False, False, True,  False, True, True),
                    (True,  False, True,  False, True, np.nan),
                    (False, True,  False, False, True, True),
                    (True,  True,  True,  False, True, np.nan),
                    (True,  True,  False, True,  True, True),
                    (False, False, True,  True,  True, np.nan)]


# Create a DataFrame object
df = pd.DataFrame(  list_of_tuples, 
                    columns=['A', 'B', 'C', 'D', 'E', 'F'])

print(df)

## Technique 1 ##

# Get count of True values in column 'C' 
count = df['C'].sum()

print('Count of True values in Column  C : ', count)


# Get count of True values in column 'F' 
count = df['F'].sum()

print('Count of True values in Column  F : ', count)

## Technique 2 ##

# Get count of True values in column 'D' 
count = np.count_nonzero(df['D'])

print('Count of True values in Column  D : ', count)

## Technique 3 ##

# Get count of True values in column 'D' 
count = df['D'].value_counts()[True]

print('Count of True values in Column  D : ', count)

Output:

       A      B      C      D     E     F
0  False  False   True  False  True  True
1   True  False   True  False  True   NaN
2  False   True  False  False  True  True
3   True   True   True  False  True   NaN
4   True   True  False   True  True  True
5  False  False   True   True  True   NaN

Count of True values in Column  C :  4
Count of True values in Column  F :  3
Count of True values in Column  D :  2
Count of True values in Column  D :  2

Summary:

We learned three different ways to count only True values in any Dataframe column in Pandas.
