How to delete the last row of data of a pandas dataframe

I think this should be simple, but I tried a few ideas and none of them worked:

last_row = len(DF)
DF = DF.drop(DF.index[last_row]) #<-- fail!

I tried using negative indices, but that also led to errors. I must still be misunderstanding something basic.

10 Answers

To drop last n rows:

df.drop(df.tail(n).index,inplace=True) # drop last n rows

In the same vein, you can drop the first n rows:

df.drop(df.head(n).index,inplace=True) # drop first n rows
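A minimal sketch of the two calls above on a small made-up frame (the column name and values are assumptions for illustration). Because drop removes rows by label, a duplicated index label removes every row carrying that label, so this approach relies on a unique index.

```python
import pandas as pd

# Hypothetical sample data; any DataFrame with a unique index behaves the same way.
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]})

n = 2
trimmed_tail = df.drop(df.tail(n).index)   # new frame without the last n rows
trimmed_head = df.drop(df.head(n).index)   # new frame without the first n rows

print(trimmed_tail["value"].tolist())
print(trimmed_head["value"].tolist())

# Caveat: drop works on labels, so rows sharing a label are removed together.
dup = pd.DataFrame({"value": [1, 2, 3]}, index=["a", "b", "a"])
remaining = dup.drop(dup.tail(1).index)    # both rows labelled "a" are dropped
print(len(remaining))
```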

DF[:-n]

where n is the number of rows to drop from the end.

To drop the last row:

DF = DF[:-1]
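One thing worth knowing about the slice form: since -0 equals 0, DF[:-n] returns an empty frame when n is 0. A small sketch on made-up data, with one possible workaround (slicing by an explicit end position, which is my assumption about what a caller would want for n == 0):

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30, 40]})  # hypothetical sample data

without_last = df[:-1]          # all rows except the last one
print(len(without_last))

# Pitfall: -0 == 0, so df[:-0] is df[:0], which is empty.
n = 0
empty = df[:-n]
print(len(empty))

# A variant that also handles n == 0: slice up to an explicit end position.
safe = df.iloc[: len(df) - n]
print(len(safe))
```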

Since index positioning in Python is 0-based, there won't actually be an element in index at the location corresponding to len(DF). You need that to be last_row = len(DF) - 1:

In [49]: dfrm
Out[49]:
          A         B         C
0  0.120064  0.785538  0.465853
1  0.431655  0.436866  0.640136
2  0.445904  0.311565  0.934073
3  0.981609  0.695210  0.911697
4  0.008632  0.629269  0.226454
5  0.577577  0.467475  0.510031
6  0.580909  0.232846  0.271254
7  0.696596  0.362825  0.556433
8  0.738912  0.932779  0.029723
9  0.834706  0.002989  0.333436
[10 rows x 3 columns]
In [50]: dfrm.drop(dfrm.index[len(dfrm)-1])
Out[50]:
          A         B         C
0  0.120064  0.785538  0.465853
1  0.431655  0.436866  0.640136
2  0.445904  0.311565  0.934073
3  0.981609  0.695210  0.911697
4  0.008632  0.629269  0.226454
5  0.577577  0.467475  0.510031
6  0.580909  0.232846  0.271254
7  0.696596  0.362825  0.556433
8  0.738912  0.932779  0.029723
[9 rows x 3 columns]

However, it's much simpler to just write DF[:-1].
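To make the off-by-one concrete, here is a short sketch on a made-up three-row frame. It reproduces the out-of-range lookup from the question and then the corrected position:

```python
import pandas as pd

DF = pd.DataFrame({"A": [1.0, 2.0, 3.0]})  # hypothetical sample data

try:
    DF.index[len(DF)]          # position 3 does not exist in a 3-row index
except IndexError as exc:
    print("IndexError:", exc)

last_label = DF.index[len(DF) - 1]   # valid: the final position
same_label = DF.index[-1]            # a negative position reaches the same label

result = DF.drop(last_label)
print(result)
```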

Surprised nobody brought this one up:

# To remove last n rows
df.head(-n)
# To remove first n rows
df.tail(-n)
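A quick illustration of negative arguments to head and tail, using a made-up single-column frame:

```python
import pandas as pd

df = pd.DataFrame({"value": list(range(6))})  # hypothetical sample data
n = 2

all_but_last = df.head(-n)    # everything except the last n rows
all_but_first = df.tail(-n)   # everything except the first n rows

print(all_but_last["value"].tolist())
print(all_but_first["value"].tolist())
```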

Running a speed test on a DataFrame of 1000 rows shows that slicing and head/tail are ~6 times faster than using drop:

>>> %timeit df[:-1]
125 µs ± 132 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit df.head(-1)
129 µs ± 1.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit df.drop(df.tail(1).index)
751 µs ± 20.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
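For anyone without IPython, a rough equivalent of the comparison using the standard timeit module might look like this. The random 1000-row frame is an assumption (the data behind the numbers above is not shown), and absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical 1000-row frame; the original benchmark's data is not shown.
df = pd.DataFrame(np.random.rand(1000, 3), columns=["A", "B", "C"])

candidates = {
    "df[:-1]": lambda: df[:-1],
    "df.head(-1)": lambda: df.head(-1),
    "df.drop(df.tail(1).index)": lambda: df.drop(df.tail(1).index),
}

runs = 200
for label, func in candidates.items():
    seconds = timeit.timeit(func, number=runs)
    print(f"{label}: {seconds / runs * 1e6:.1f} µs per call")
```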

Just use indexing

df.iloc[:-1,:]

That's why iloc exists. You can also use head or tail.
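Since iloc selects purely by position, it behaves the same whatever the index labels are. A tiny sketch with made-up string labels:

```python
import pandas as pd

# Hypothetical frame with string labels to show that iloc ignores labels.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["x", "y", "z"])

trimmed = df.iloc[:-1, :]   # rows by position (all but the last), all columns
print(trimmed.index.tolist())
```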

The nicest solution I've found that doesn't (necessarily?) make a full copy is

df.drop(df.index[-1], inplace=True)

Of course, you can simply omit inplace=True to create a new dataframe, and you can also easily delete the last N rows by simply taking slices of df.index (df.index[-N:] to drop the last N rows). So this approach is not only concise but also very flexible.
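A brief sketch of both variants on made-up data. Note that with inplace=True the call returns None, so its result should not be assigned back to the variable:

```python
import pandas as pd

df = pd.DataFrame({"value": [1, 2, 3, 4, 5]})  # hypothetical sample data
N = 2

shorter = df.drop(df.index[-N:])      # new frame; df itself is unchanged
print(len(df), len(shorter))

returned = df.drop(df.index[-1], inplace=True)  # modifies df, returns None
print(returned, len(df))
```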

stats = pd.read_csv("C:\\py\\programs\\second pandas\\ex.csv")

The output of stats:

          A         B         C
0  0.120064  0.785538  0.465853
1  0.431655  0.436866  0.640136
2  0.445904  0.311565  0.934073
3  0.981609  0.695210  0.911697
4  0.008632  0.629269  0.226454
5  0.577577  0.467475  0.510031
6  0.580909  0.232846  0.271254
7  0.696596  0.362825  0.556433
8  0.738912  0.932779  0.029723
9  0.834706  0.002989  0.333436

Just pass skipfooter=1 to read_csv:

skipfooter : int, default 0

Number of lines at bottom of file to skip (unsupported with engine='c')

stats_2 = pd.read_csv("C:\\py\\programs\\second pandas\\ex.csv", skipfooter=1, engine='python')

The output of stats_2:

          A         B         C
0  0.120064  0.785538  0.465853
1  0.431655  0.436866  0.640136
2  0.445904  0.311565  0.934073
3  0.981609  0.695210  0.911697
4  0.008632  0.629269  0.226454
5  0.577577  0.467475  0.510031
6  0.580909  0.232846  0.271254
7  0.696596  0.362825  0.556433
8  0.738912  0.932779  0.029723
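The same idea can be tried without a file on disk. In this sketch the CSV text is made up and fed through io.StringIO. The Python engine is requested explicitly because the default C parser does not support skipfooter.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the file on disk.
csv_text = "A,B,C\n1,2,3\n4,5,6\n7,8,9"

full = pd.read_csv(io.StringIO(csv_text))
# skipfooter is not supported by the default C parser, hence engine="python".
trimmed = pd.read_csv(io.StringIO(csv_text), skipfooter=1, engine="python")

print(len(full), len(trimmed))
```

This only helps when the unwanted row is known at read time. For a frame that is already in memory, the slicing approaches from the other answers apply.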

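drop returns a new DataFrame rather than modifying the original, so its result has to be assigned back. (The original post did assign it; the failure there came from the out-of-range position len(DF).) I had a similar requirement: renaming some column headers and deleting some rows after converting an ill-formed CSV file to a DataFrame. After reading this post I used: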

newList = pd.DataFrame(newList)
newList.columns = ['Area', 'Price']
print(newList)
# newList = newList.drop(0)
# newList = newList.drop(len(newList))
newList = newList[1:-1]
print(newList)

It worked great. As the two commented-out lines above show, I also tried the drop() method, and it works, but it is not as readable as slicing with [n:-n]. Hope that helps someone.

For more complex DataFrames that have a MultiIndex (say "Stock" and "Date"), where you want to remove the last row of each Stock rather than just the last row of the last Stock, the solution reads:

# To remove the last row of each Stock (use head(-n) for the last n rows)
df = df.groupby(level='Stock').apply(lambda x: x.head(-1)).reset_index(0, drop=True)
# To remove the first row of each Stock (use tail(-n) for the first n rows)
df = df.groupby(level='Stock').apply(lambda x: x.tail(-1)).reset_index(0, drop=True)

Because groupby().apply() adds an additional level to the MultiIndex, we drop it at the end with reset_index(). The resulting df keeps the same type of MultiIndex as before the operation.
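A small end-to-end sketch of the per-group trim, assuming a made-up two-stock frame whose index levels are named "Stock" and "Date" as in the description above (the tickers, dates, and price column are illustrative only):

```python
import pandas as pd

# Hypothetical two-level index; "Stock" and "Date" follow the names used above.
index = pd.MultiIndex.from_product(
    [["AAA", "BBB"], pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"])],
    names=["Stock", "Date"],
)
df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]}, index=index)

trimmed = (
    df.groupby(level="Stock")
    .apply(lambda group: group.head(-1))   # drop the last row within each Stock
    .reset_index(level=0, drop=True)       # remove the level added by groupby
)
print(trimmed)
```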

You just need to subtract 1 in the first line, like this:

last_row = len(DF) - 1
DF = DF.drop(DF.index[last_row])
