
Pandas Indexing: loc, iloc, and ix in Python

 2 years ago
source link: https://www.journaldev.com/52996/pandas-indexing-in-python

Pandas is a robust data manipulation library for Python. Whatever your data wrangling needs, pandas probably has a function for it :P. Today, we will focus on pandas indexing. Simply put, indexing means selecting particular rows and columns of data from a data frame.

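For this purpose, pandas has offered three indexers: loc, iloc, and ix. Note that ix is deprecated and no longer available in recent pandas versions, so loc and iloc are the ones to use today. Let’s discuss each of them.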


A bit about Pandas

  • Named after ‘panel data’, an econometrics term for multidimensional structured datasets.
  • Wes McKinney started developing pandas while working as a researcher at AQR Capital Management.
  • Pandas is one of the most important libraries for data manipulation and analysis in Python.
  • Major applications of pandas include data wrangling, statistical analysis, data normalization, and data cleaning.
  • Slicing and indexing data is straightforward with pandas.
  • Offers the Series and DataFrame structures for 1D and 2D data.

1. Pandas loc

The loc attribute in pandas slices data by explicit index, that is, by row and column labels. In other words, it is label-based indexing.

To try these indexing methods, let’s first import a dataset.

#Import the data
import pandas as pd
data = pd.read_csv('mtcars.csv', index_col = 'model')
data
Mtcars Data

Well, we got the ‘mtcars’ data for indexing purposes. Let’s see how we can make use of the pandas loc attribute to index the data.
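If mtcars.csv is not at hand, a few rows can be typed in by hand to follow along. The sketch below is an assumption made for illustration: it holds only the first five models and four columns of mtcars, and the variable name rows is hypothetical.

```python
import pandas as pd

# Hand-typed subset of mtcars for illustration only (not the full dataset).
rows = {
    "model": ["Mazda RX4", "Mazda RX4 Wag", "Datsun 710",
              "Hornet 4 Drive", "Hornet Sportabout"],
    "mpg": [21.0, 21.0, 22.8, 21.4, 18.7],
    "cyl": [6, 6, 4, 6, 8],
    "disp": [160.0, 160.0, 108.0, 258.0, 360.0],
    "hp": [110, 110, 93, 110, 175],
}

# set_index("model") plays the role of index_col='model' in read_csv
data = pd.DataFrame(rows).set_index("model")

# Label-based access now works with model names as row labels
print(data.loc["Datsun 710", "disp"])
```

With the model names as the index, every loc call in this section addresses rows by name rather than by number.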

  1. Here, let’s index every row for a particular column.
#Index all rows for a particular column
indexing = data.loc[:,'disp']
indexing
model
Mazda RX4              160.0
Mazda RX4 Wag          160.0
Datsun 710             108.0
Hornet 4 Drive         258.0
Hornet Sportabout      360.0
Valiant                225.0
Duster 360             360.0
Merc 240D              146.7
Merc 230               140.8
Merc 280               167.6
Merc 280C              167.6
Merc 450SE             275.8
Merc 450SL             275.8
Merc 450SLC            275.8
Cadillac Fleetwood     472.0
Lincoln Continental    460.0
Chrysler Imperial      440.0
Fiat 128                78.7
Honda Civic             75.7
Toyota Corolla          71.1
Toyota Corona          120.1
Dodge Challenger       318.0
AMC Javelin            304.0
Camaro Z28             350.0
Pontiac Firebird       400.0
Fiat X1-9               79.0
Porsche 914-2          120.3
Lotus Europa            95.1
Ford Pantera L         351.0
Ferrari Dino           145.0
Maserati Bora          301.0
Volvo 142E             121.0
Name: disp, dtype: float64

2. Now, let’s index all rows for multiple columns.

#Indexing all rows for multiple columns
indexing = data.loc[:,['disp','hp']]
indexing
Pandas Loc
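One detail worth knowing here is that the shape of the result depends on how the columns are named. A minimal sketch, assuming a small hand-typed frame in place of the full mtcars data: a single label returns a Series, while a list of labels (even a list of one) returns a DataFrame.

```python
import pandas as pd

# Small illustrative frame; values copied from the mtcars rows shown above
data = pd.DataFrame(
    {"disp": [160.0, 108.0, 258.0], "hp": [110, 93, 110]},
    index=pd.Index(["Mazda RX4", "Datsun 710", "Hornet 4 Drive"], name="model"),
)

one_col = data.loc[:, "disp"]            # single label -> Series
one_col_frame = data.loc[:, ["disp"]]    # list with one label -> DataFrame
two_cols = data.loc[:, ["disp", "hp"]]   # list of labels -> DataFrame

print(type(one_col).__name__)
print(type(one_col_frame).__name__)
print(type(two_cols).__name__)
```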

3. Particular rows for all columns

#Particular rows for all columns
#The index holds model names, so rows are sliced by label (both ends included)
data.loc['Valiant':'Merc 280C', :]
Pandas Iloc

In this way, you can access particular rows and columns of the data by their labels using pandas loc.
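Row selection by label can be sketched on a smaller frame. This is an illustration under assumptions: the frame below is hand-typed with disp values copied from the output earlier in this section, and the names sliced and picked are hypothetical. Note that a label slice in loc includes both end labels, unlike ordinary Python slicing.

```python
import pandas as pd

# Five hand-typed rows with model names as the index
data = pd.DataFrame(
    {"disp": [360.0, 225.0, 360.0, 146.7, 140.8]},
    index=pd.Index(
        ["Hornet Sportabout", "Valiant", "Duster 360", "Merc 240D", "Merc 230"],
        name="model",
    ),
)

# Label slice: both 'Valiant' and 'Merc 240D' are included
sliced = data.loc["Valiant":"Merc 240D", :]

# List of labels: rows come back in the order requested
picked = data.loc[["Merc 230", "Valiant"], :]

print(sliced)
print(picked)
```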


2. Pandas iloc

The pandas iloc indexer slices by integer position, in the same implicit, zero-based style as plain Python. Let’s look at some examples to understand more.

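  1. Accessing a particular value
#accessing a particular value
#df is the same file loaded with the default integer index, so 'model' is a regular column
df = pd.read_csv('mtcars.csv')
df.iloc[0,1]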

21.0

You can see that iloc extracts the value in the first row (position 0) of the second column (position 1), mpg, which is 21.0.
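The same lookup can be tried without the CSV file. A minimal sketch, assuming a three-row hand-typed frame with the default integer index; the names first, last, and same are hypothetical.

```python
import pandas as pd

# Default integer index, 'model' kept as an ordinary column
df = pd.DataFrame({
    "model": ["Mazda RX4", "Mazda RX4 Wag", "Datsun 710"],
    "mpg": [21.0, 21.0, 22.8],
    "cyl": [6, 6, 4],
})

first = df.iloc[0, 1]    # row position 0, column position 1 (mpg)
last = df.iloc[-1, 1]    # negative positions count from the end
same = df.loc[0, "mpg"]  # label-based equivalent on a default integer index

print(first, last, same)
```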


2. Accessing particular rows of particular column

#accessing exact rows of an exact column
df.iloc[1:5, 2]
1    6
2    4
3    6
4    8
Name: cyl, dtype: int64

Here, iloc extracted the four rows at positions 1 through 4 from the third column (position 2), i.e. cyl. The stop position 5 is not included.
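The difference in slice boundaries between iloc and loc is easy to check side by side. The sketch below assumes a five-row hand-typed frame with the default integer index: iloc stops before position 5, while loc includes the label 4.

```python
import pandas as pd

df = pd.DataFrame({
    "model": ["Mazda RX4", "Mazda RX4 Wag", "Datsun 710",
              "Hornet 4 Drive", "Hornet Sportabout"],
    "mpg": [21.0, 21.0, 22.8, 21.4, 18.7],
    "cyl": [6, 6, 4, 6, 8],
})

by_position = df.iloc[1:5, 2]   # positions 1, 2, 3, 4 (stop excluded)
by_label = df.loc[1:4, "cyl"]   # labels 1 through 4 (stop included)

print(by_position.tolist())
print(by_label.tolist())
```

Both calls select the same four rows here only because the labels of a default index happen to match the positions.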


3. Accessing particular rows of all columns

#some rows of all columns
df.iloc[1:5, ]
               model   mpg  cyl   disp   hp  drat     wt   qsec  vs  am  gear  carb
1      Mazda RX4 Wag  21.0    6  160.0  110  3.90  2.875  17.02   0   1     4     4
2         Datsun 710  22.8    4  108.0   93  3.85  2.320  18.61   1   1     4     1
3     Hornet 4 Drive  21.4    6  258.0  110  3.08  3.215  19.44   1   0     3     1
4  Hornet Sportabout  18.7    8  360.0  175  3.15  3.440  17.02   0   0     3     2

You can see that we have accessed 4 rows of all columns of the data.
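Positions and integer labels stop matching as soon as rows are filtered or reordered, which is where the choice between iloc and loc starts to matter. A small sketch under assumptions: the frame is hand-typed and the names six, third_by_position, by_label, and has_label_2 are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "model": ["Mazda RX4", "Mazda RX4 Wag", "Datsun 710",
              "Hornet 4 Drive", "Hornet Sportabout"],
    "cyl": [6, 6, 4, 6, 8],
})

# Filtering keeps the original labels, so the index is now 0, 1, 3
six = df[df["cyl"] == 6]

third_by_position = six.iloc[2]["model"]  # third remaining row
by_label = six.loc[3, "model"]            # row whose label is 3
has_label_2 = 2 in six.index              # label 2 was filtered out

print(third_by_position, by_label, has_label_2)
```

Asking for six.loc[2] would raise a KeyError, because the label 2 no longer exists, while six.iloc[2] still returns the third row.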


3. Pandas ix

The loc indexer uses explicit, label-based slicing, and the iloc indexer uses Python’s implicit, position-based style. The ix indexer is a hybrid of both approaches.

#using ix
df.ix[:3, : 'mpg']
           model   mpg
0      Mazda RX4  21.0
1  Mazda RX4 Wag  21.0
2     Datsun 710  22.8
On older pandas versions, this gives the same kind of result as loc and iloc, since ix mixes both. If you get an AttributeError, it is because ix was deprecated in pandas 0.20.0 and removed in pandas 1.0.0, so upgrading pandas will not bring it back. On current versions, use loc for labels and iloc for positions instead.
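On pandas versions without ix, a mixed selection such as "the first three rows by position, columns up to mpg by label" can be written with loc or iloc alone. The following is one possible sketch rather than an official migration recipe; the frame is hand-typed and the names via_loc, via_iloc, and stop are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "model": ["Mazda RX4", "Mazda RX4 Wag", "Datsun 710", "Hornet 4 Drive"],
    "mpg": [21.0, 21.0, 22.8, 21.4],
    "cyl": [6, 6, 4, 6],
})

# Label route: turn row positions into labels, then slice columns by label
via_loc = df.loc[df.index[:3], :"mpg"]

# Position route: turn the column label into a position, then use iloc
stop = df.columns.get_loc("mpg") + 1
via_iloc = df.iloc[:3, :stop]

print(via_loc)
print(via_iloc)
```

Either route keeps the intent explicit: each axis is addressed purely by label or purely by position.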


Pandas Indexing – Conclusion

Pandas is the go-to library in Python for data manipulation and analysis. When it comes to indexing data, loc and iloc cover label-based and position-based selection, while ix survives only in legacy code. Try accessing particular data in your own dataset as shown above. I hope you now have a better understanding of pandas indexing in Python.

That’s all for now. Happy Python!!!

Further reading: Data indexing using pandas

