Python KeyError: How to fix and avoid key errors
A KeyError occurs when Python attempts to fetch a non-existent key from a dictionary.
This error commonly occurs in dict operations and when accessing Pandas Series or DataFrame values.
In the example below, we made a dictionary with keys 1–3 mapped to different fruit. We get a KeyError: 0
because the 0 key doesn't exist.
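A minimal sketch of that situation follows. The specific fruit names are assumptions chosen for illustration, and the failing lookup is left commented out so the snippet runs to completion:

```python
# Keys 1-3 mapped to fruit (the fruit names are illustrative)
fruit = {1: "apple", 2: "banana", 3: "orange"}

print(fruit[1])  # key 1 exists, so this lookup succeeds

# Key 0 does not exist; uncommenting the next line raises KeyError: 0
# print(fruit[0])
```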
There are a handful of other places where we might see the KeyError (the os and zipfile modules, for example). The main reason for the error stays the same, though: the key being looked up is not there.
The quickest general-purpose fix for a key error is to wrap the value-fetching portion of the code in a try-except block, as the code below does:
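One way this could look, reusing an assumed fruit dictionary with keys 1 to 3 and looping over a range that includes keys with no match:

```python
# Illustrative dictionary; the fruit names are assumptions
fruit = {1: "apple", 2: "banana", 3: "orange"}

found = []
for key in range(5):  # keys 0 and 4 are not in the dictionary
    try:
        found.append(fruit[key])
    except KeyError:
        # The lookup failed; report it and move on to the next key
        print(f"Couldn't find a match for the key: {key}")

print(found)
```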
The try-except
construct saved our program from terminating, allowing us to avoid the keys that have no match.
In the next section, we'll use more nuanced solutions, one of which is the _proper_ way of adding and removing dictionary elements.
Generic Solutions
Solution 1: Verifying the key using 'in'
While working with dictionaries, Series and DataFrames, we can use the in
keyword to check whether a key exists.
Below you can see how we can use in
with a conditional statement to check the existence of a dictionary key.
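A short sketch of the check, using an assumed fruit dictionary and a hypothetical helper named describe:

```python
# Illustrative dictionary; the fruit names are assumptions
fruit = {1: "apple", 2: "banana", 3: "orange"}

def describe(key):
    """Return the value for key, or a message if the key is missing."""
    if key in fruit:  # membership test never raises KeyError
        return fruit[key]
    return f"Key {key} is not in the dictionary"

print(describe(1))
print(describe(0))
```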
This method does not change in the slightest when applying to a Pandas Series, as you can see below:
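For instance, with an assumed Series labeled 1 to 3 (the values are illustrative). Note that in tests the index labels of a Series, not its values:

```python
import pandas as pd

# Illustrative Series with integer labels 1-3
ser = pd.Series(["apple", "banana", "orange"], index=[1, 2, 3])

has_one = 1 in ser          # label 1 exists
has_zero = 0 in ser         # label 0 does not exist
has_apple = "apple" in ser  # values are not checked, only labels

if has_zero:
    print(ser[0])
else:
    print("Label 0 is not in the Series")
```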
We can use the same if key in collection
structure when verifying DataFrame column names. However, we have to add a bit more if we want to check a row name.
Let's start by building a DataFrame to work with:
| | col1 | col2 |
|---|---|---|
| row1 | 1 | 3 |
| row2 | 2 | 4 |
Now we can check whether a column name is in df
or a row name is in df.index
:
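A sketch that builds a DataFrame with the values shown in the table and performs both checks (the variable names are assumptions):

```python
import pandas as pd

# Two columns and two labeled rows, matching the table above
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]}, index=["row1", "row2"])

col_exists = "col1" in df        # `in` on a DataFrame checks column names
row_in_df = "row1" in df         # row labels are not columns
row_exists = "row1" in df.index  # check the index for row labels

print(col_exists, row_in_df, row_exists)
```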
Solution 2: Assigning a fall-back value using get()
We can use the get()
method to fetch dictionary elements, Series values, and DataFrame columns (only _columns_, unfortunately).
The get() method does not raise a KeyError when it fails to find the given key. Instead, it returns None, so the program keeps running and can handle the missing value explicitly.
Take a look at the code below, where fetching the non-existent key3
returns None
:
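A minimal illustration; the dictionary contents are assumed, and only the missing key3 comes from the text:

```python
# Illustrative dictionary that has key1 and key2 but no key3
values = {"key1": "value1", "key2": "value2"}

existing = values.get("key1")  # the key is present, so its value is returned
missing = values.get("key3")   # the key is absent, so None is returned

print(existing)
print(missing)
```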
get()
also allows us to define our own default values by specifying a second parameter.
For example, say we have a website with a few URLs and want to fall back to a 404 page:
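A sketch of that idea follows. The paths, file names, and the resolve helper are all hypothetical:

```python
# Hypothetical mapping of URL paths to page files
urls = {"/": "index.html", "/about": "about.html"}

def resolve(path):
    """Return the page for a path, falling back to the 404 page."""
    return urls.get(path, "404.html")

print(resolve("/about"))
print(resolve("/missing"))
```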
The get()
method also works on Pandas DataFrames.
Let's define one like so:
| | Name | Age | Job |
|---|---|---|---|
| 0 | John | 34 | Engineer |
| 1 | Jane | 19 | Engineer |
We can try and grab two columns by name and provide a default value if one doesn't exist:
Because one of the requested columns does not exist, get() returned the default value, 'Non-Existent', instead of a partial result.
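A sketch of such a call, using the DataFrame from the table. The missing column name "Salary" is an assumption made for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane"],
    "Age": [34, 19],
    "Job": ["Engineer", "Engineer"],
})

# "Salary" is a hypothetical column that is not in df
result = df.get(["Name", "Salary"], default="Non-Existent")
print(result)

# When every requested column exists, get() returns those columns
subset = df.get(["Name", "Age"], default="Non-Existent")
print(subset)
```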
Accessing Items in Pandas: The loc-iloc Mishap
Programmers learning Pandas often mistake loc
for iloc
, and while they both fetch items, there is a slight difference in mechanics:
- loc uses row and column names as identifiers
- iloc uses integer location, hence the name
Let's create a Series to work with:
How would we retrieve the name "John" from this Series?
We can see John lies in the "a" row, which we can target using loc
, like so:
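A sketch under the assumption that the Series holds three names labeled "a" to "c"; only "John" in row "a" comes from the text:

```python
import pandas as pd

# Illustrative Series; "John" sits at the label "a"
names = pd.Series(["John", "Jane", "Joe"], index=["a", "b", "c"])

first = names.loc["a"]  # look up by row label
print(first)
```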
If we were to use iloc
for the same purpose, we'd have to use the row's integer index. Since it's the first row, and Series are 0-indexed, we need to do the following:
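Using the same assumed Series of names labeled "a" to "c", the positional lookup might look like this:

```python
import pandas as pd

# Illustrative Series; "John" is the first row
names = pd.Series(["John", "Jane", "Joe"], index=["a", "b", "c"])

first = names.iloc[0]  # look up by integer position, starting at 0
print(first)
```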
If we used an integer for loc
we would get a KeyError
, as you can see below:
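A sketch of the failure, again with an assumed Series labeled "a" to "c". The lookup is wrapped in try-except only so the snippet runs to completion:

```python
import pandas as pd

# Illustrative Series with string labels
names = pd.Series(["John", "Jane", "Joe"], index=["a", "b", "c"])

raised = False
try:
    names.loc[0]  # 0 is a position, not one of the labels "a", "b", "c"
except KeyError as err:
    raised = True
    print(f"KeyError: {err}")
```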
Note that this only happens when the integer is not one of the row labels. With the default integer index (0, 1, 2, and so on), the labels and positions coincide, so loc[0] succeeds.
Dictionary-specific solutions
Now we'll look closer at the operations that may cause KeyError
and offer good practices to help us avoid it.
1. Avoiding KeyError when populating a dictionary
Let's give an example of how this may go wrong:
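A sketch of the kind of code that fails this way: counting items with a plain, empty dictionary. The list of letters is hypothetical, and the loop is wrapped in try-except only so the snippet runs to completion:

```python
# Hypothetical data to count
letters = ["a", "b", "a", "c", "b", "a"]

counts = {}
failed_key = None
try:
    for letter in letters:
        # += reads counts[letter] before writing it, so the very
        # first iteration raises KeyError on the empty dictionary
        counts[letter] += 1
except KeyError as err:
    failed_key = err.args[0]
    print(f"KeyError: {err}")
```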
It's clear this is a mistake since the code is trying to fetch items from an empty dictionary, but this example demonstrates the problem of wanting to use a dictionary as if it already had the keys present.
We could write another loop at the start that initializes each value to zero, but Python offers defaultdict (in the collections module) for such situations. A defaultdict takes a factory function, such as int or list, and calls it to create the value for any key that is accessed before it exists.
Take a look:
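A sketch of a counting loop built on defaultdict(int); the list of letters is hypothetical:

```python
from collections import defaultdict

# Hypothetical data to count
letters = ["a", "b", "a", "c", "b", "a"]

counts = defaultdict(int)  # missing keys start at int(), which is 0
for letter in letters:
    counts[letter] += 1    # no KeyError on the first occurrence

print(dict(counts))
```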
The only change needed is swapping in defaultdict for the empty brackets. The defaultdict is created with int as its default factory, meaning that accessing any new key auto-creates that key with an initial value of 0.
This also works for more complex scenarios, like if you want a default value to be a list
. In the following example, we generate ten random numbers and store them as either even or odd:
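One possible version is sketched below. The range of the random numbers (1 to 100) is an assumption:

```python
import random
from collections import defaultdict

# Ten random integers; the 1-100 range is an arbitrary choice
numbers = [random.randint(1, 100) for _ in range(10)]

groups = defaultdict(list)  # missing keys start as an empty list
for n in numbers:
    key = "even" if n % 2 == 0 else "odd"
    groups[key].append(n)   # no need to create the list first

print(dict(groups))
```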
Using defaultdict(list) we're able to immediately append to the "even" or "odd" keys without needing to initialize the lists beforehand.
2. Avoiding KeyError when deleting dictionary items
Deleting dictionary keys runs into the same problem as accessing keys: the del statement first looks up the key using [], so a missing key raises a KeyError.
We can always check whether the key exists before attempting to delete the value assigned to it, like so:
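A sketch of that check; the dictionary contents and the missing key "bird" are hypothetical:

```python
# Hypothetical dictionary of animals and their sounds
animals = {"cat": "meow", "dog": "woof"}

key = "bird"  # not in the dictionary
if key in animals:
    del animals[key]
else:
    print(f"{key!r} is not in the dictionary, nothing to delete")

print(animals)
```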
A quicker way, however, would be to pop() the value out of the dictionary. pop() removes the key and returns its value, so if we ignore the return value it simply acts as a delete.
pop()
takes the desired key as its first parameter and, similar to get()
, allows us to assign a fall-back value as the second parameter.
Take a look:
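A sketch with a hypothetical dictionary, a key that is not present ("bird"), and an assumed fall-back string:

```python
# Hypothetical dictionary of animals and their sounds
animals = {"cat": "meow", "dog": "woof"}

# "bird" is missing, so pop() returns the fall-back instead of raising
result = animals.pop("bird", "not found")

print(result)
print(animals)  # unchanged, since nothing was removed
```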
Since Python couldn't find the key, pop()
returned the default value we assigned.
If the key exists, Python will remove it. Let's run pop()
one more time with a key we know exists:
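A sketch with the same hypothetical dictionary; only the 'cat' key comes from the text:

```python
# Hypothetical dictionary of animals and their sounds
animals = {"cat": "meow", "dog": "woof"}

# "cat" exists, so pop() removes it and returns its value
removed = animals.pop("cat", "not found")

print(removed)
print(animals)
```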
The 'cat' key was found and removed.
Summary
KeyError
occurs when searching for a key that does not exist. Dictionaries, Pandas Series, and DataFrames can trigger this error.
Wrapping the key-fetching code in a try-except
block or simply checking whether the key exists with the in
keyword before using it are common solutions to this error. One can also employ get()
to access elements from a dictionary, Series or DataFrame without risking a KeyError
.