.set_index & .reset_index() in Pandas
60. [Hindi]Machine Learning : .set_index & .reset_index() in Pandas| 2018 |Python 3
All right, in this lesson I'll introduce the set_index and reset_index methods. Let's begin by executing our code to import our James Bond data set. As a reminder, there is an index_col parameter available on the read_csv method. "col" stands for column, and it accepts a string with the name of the column that you'd like to serve as the index. So as an example, if I want to set Film, which is my very first column, as the index, I can simply write "Film". Make sure it's spelled exactly as it appears in the file, including the case. That will allow us to import the file and place the Film column directly as our index. I'm going to actually take this code out.
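As a quick illustration of that import option, here is a minimal sketch. The real jamesbond.csv is not reproduced here, so the inline CSV text, its three rows, and the Actor column are stand-ins assumed for the example.

```python
import io

import pandas as pd

# Stand-in for jamesbond.csv: a few illustrative rows, not the real file.
csv_text = """Film,Year,Actor
Dr. No,1962,Sean Connery
From Russia with Love,1963,Sean Connery
Goldfinger,1964,Sean Connery
"""

# index_col takes the column name (case-sensitive) to use as the index.
bond = pd.read_csv(io.StringIO(csv_text), index_col="Film")

print(bond.index.name)      # name of the index
print(list(bond.columns))   # remaining regular columns
```

With a file on disk, the same call would take the file path in place of the StringIO object.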
So let's pretend that we have a regular DataFrame like this. Let's say this is our starting point. We have a set_index method that we can call directly on a DataFrame after it's been created, and set_index accepts a key. That key can be either a string of a column name or a list of strings, which will create a MultiIndex. We'll dive into that much later. For now let's stick to a single column, the Film column. So I'm going to write in "Film" as my string. This argument goes to the keys parameter, and what it does is move my Film column from being a regular column in the DataFrame to being my index on the left. We can see it distinctly because it's highlighted in bold here.
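The following sketch shows set_index on a small hand-built DataFrame. The three rows and the Actor column are assumed sample data standing in for the lesson's file. It also shows that, without inplace=True, the call returns a new DataFrame and leaves the original untouched.

```python
import pandas as pd

# Illustrative sample rows standing in for the lesson's data set.
bond = pd.DataFrame({
    "Film": ["Dr. No", "From Russia with Love", "Goldfinger"],
    "Year": [1962, 1963, 1964],
    "Actor": ["Sean Connery", "Sean Connery", "Sean Connery"],
})

# The string goes to the keys parameter; Film becomes the index of the result.
indexed = bond.set_index("Film")

print(indexed.index.name)        # the new index is named after the column
print("Film" in indexed.columns) # Film is no longer a regular column
print("Film" in bond.columns)    # the original DataFrame is unchanged
```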
Now if we want to make that operation permanent, we do have to add that little inplace=True component. And so now if I preview the first three rows again, you can compare the old DataFrame that we had right here with our new one, where the Film column has been moved over to serve as the index on the left. Now there is a complementary method called reset_index. It's going to be bond.reset_index, and all it does is move the current index back into a column position and then create a brand new numeric Pandas index. So if I call this method right now on our current DataFrame, you can see that our Film column, which was our index, has been moved back into the first column position, and we've generated a brand new index following the standard numeric scheme that starts at 0. Now we do have an additional parameter available here called drop. It's set to False by default, which means the current index is not dropped. If, for example, we wanted to drop the Film index as we replace it with our standard default one, we can write drop=True. That removes this index without bringing it back as one of the columns and just creates our standard numeric index by itself.
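To make the effect of the drop parameter concrete, here is a small sketch that compares the two calls side by side. The sample rows and the Actor column are assumed for illustration.

```python
import pandas as pd

# Illustrative sample rows, already indexed by Film.
bond = pd.DataFrame({
    "Film": ["Dr. No", "From Russia with Love", "Goldfinger"],
    "Year": [1962, 1963, 1964],
    "Actor": ["Sean Connery", "Sean Connery", "Sean Connery"],
}).set_index("Film")

# Default (drop=False): Film moves back to the first column position.
restored = bond.reset_index()

# drop=True: the Film labels are discarded instead of becoming a column.
dropped = bond.reset_index(drop=True)

print(list(restored.columns))
print(list(dropped.columns))
print(list(dropped.index))  # fresh numeric index starting at 0
```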
So there we have it. If we want to perform that in a single step, we can use the drop parameter. Let's say that I actually do want to keep my Film column. So there it is. And I want to bring the DataFrame back to its original state by making this operation permanent. The reset_index method includes an inplace parameter as well, which we can set to True. Now if I preview my first three rows, I have my DataFrame back in its original state.
So there's one additional thing that I'd like to conclude this lesson with, and this is where it gets a little bit funky. Let's say I bring back the Film index: I call bond.set_index, pass it "Film", and add inplace=True. So now we're back to having Film as the index. And now let's say I decide that I actually don't want Film, I want Year to serve as my index. When we call set_index again and pass it something like "Year", it's not going to move the current index, which is Film, back into a column. What it actually does is entirely replace it, which wipes out the Film labels in the index and replaces them with the Year column values. So in order to keep our Film values, we first have to reset the index to ensure that Film is brought back as a column in our DataFrame, and only after that operation is complete can we set the index to Year. And now if we preview the first three rows, you can see that the Film column has not been lost. It's been moved back into a regular column position from the index, and the regular numeric index that took its place was then replaced by the Year column in this second line right here. All of these operations have to be in place (or the result has to be assigned back to the variable). Otherwise the change is only temporary, and you might run into bugs and errors.
So when you're dealing with removing an index and setting a new one, I recommend not method chaining and instead running your commands one line at a time, one bit of code at a time.
So that's the set_index and reset_index methods.
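The index-replacement pitfall can be sketched as follows. The sample rows and the Actor column are assumed for illustration, and the sketch reassigns results to new variables rather than using inplace=True, which is an alternative way of making each step stick.

```python
import pandas as pd

# Illustrative sample rows, already indexed by Film.
bond = pd.DataFrame({
    "Film": ["Dr. No", "From Russia with Love", "Goldfinger"],
    "Year": [1962, 1963, 1964],
    "Actor": ["Sean Connery", "Sean Connery", "Sean Connery"],
}).set_index("Film")

# Pitfall: setting a new index directly discards the current Film labels.
replaced = bond.set_index("Year")
print("Film" in replaced.columns)  # Film is gone from the result

# Safer, one step at a time: restore Film as a column first...
flat = bond.reset_index()
# ...then promote Year to the index.
by_year = flat.set_index("Year")
print("Film" in by_year.columns)   # Film survives as a regular column
```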
Code Link : ML_60
Code :
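#!/usr/bin/env python
# coding: utf-8

import pandas as pd

# Load the data set with the default numeric index.
bond = pd.read_csv("jamesbond.csv")
bond.head()

# Move the Film column into the index, modifying bond in place.
bond.set_index("Film", inplace=True)
bond.head()

# Preview only: Year replaces Film as the index and the Film labels are discarded.
# bond itself is unchanged because inplace is not set.
bond.set_index("Year").head()

# Bring Film back as a regular column before choosing a new index.
bond.reset_index(inplace=True)

# Now Year can become the index without losing the Film column.
bond.set_index("Year", inplace=True)
bond.head()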