Python Forum
Is NaN a float?
#1

My girlfriend works for a company that exports agrochemicals. She sent me an Excel file with details of about 250 customers, mostly in Africa and South America. I want to extract the customer number, company name, contact person and email, then batch-send emails with the latest info to all the customers.

I open the Excel file:

import pandas as pd

df = pd.read_excel(path2XL)  # path2XL is the path to the Excel file
print(df.loc[2, :])
The above gives, for example:
Output:
客户编码    11820122.0
客户名称    WILLOWOOD FZE (Free Zone Establishment)
国家      美国
网址      NaN
联系人     NaN
邮箱      NaN
电话      NaN
公司地址    NaN
主要产品    NaN
Name: 2, dtype: object
To make things easier, I used .to_dict() on each row, extracted the bits I want and saved them to JSON.

I tried to identify NaN values, so I could skip customers whose email is not recorded in the Excel. In the dictionary, NaN is nan:

d = df.iloc[2, :].to_dict() 
d['邮箱'] # '邮箱' = 'email'
Output:
nan
and
type(d['邮箱'])
gives:

Output:
<class 'float'>
Is NaN a float? Seems weird!

I can call str(d['邮箱']) and get "nan", so I used that to skip customers without an email record.
#2
(Jun-08-2024, 06:19 AM)Pedroski55 Wrote: Is NaN a float?
Here is what the official documentation says about NaN.

So nan is a float in Python because Python's floats are based on the IEEE 754 standard, which includes NaN values. Note that this is not specific to Python: Python just uses the hardware implementation of floating-point arithmetic.
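To make the point above concrete, here is a small sketch using only the standard library. It shows that a NaN created in plain Python is a float, and that it has the IEEE 754 property of never comparing equal to anything, itself included. The variable names are just for illustration.

```python
import math

nan = float("nan")                 # a NaN made without pandas or NumPy

is_float = isinstance(nan, float)  # NaN lives inside the float type
equals_itself = (nan == nan)       # IEEE 754: NaN is never equal to itself
detected = math.isnan(nan)         # the reliable way to test for NaN
as_text = str(nan)                 # the text form used in the str() check

# An undefined floating-point result also produces NaN
undefined = 0.0 * float("inf")

print(type(nan), is_float)
print(equals_itself)
print(detected, math.isnan(undefined))
print(as_text)
```

Because `nan == nan` is False, a test such as `d['邮箱'] == float('nan')` can never succeed, which is why a dedicated check is needed.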
« We can solve any problem by introducing an extra level of indirection »
#3
Thanks!

My exclusion condition, to leave out rows with no email, is:

if str(d['邮箱']) == 'nan':
    continue
Seems to work!
#4
(Jun-08-2024, 08:58 AM)Pedroski55 Wrote: My exclusion condition
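There is also math.isnan(). It only accepts numbers and raises TypeError for a string, so check the type first:
from math import isnan

email = d['邮箱']
if isinstance(email, float) and isnan(email):
    continue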
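As a rough illustration of why the type matters here, the sketch below wraps the check in a helper and applies it to a few made-up rows. The helper name `is_missing`, the keys and the example.com addresses are all hypothetical; the rows only imitate what `row.to_dict()` might return.

```python
import math

def is_missing(value):
    """Return True for None or a float NaN, False for anything else."""
    if value is None:
        return True
    return isinstance(value, float) and math.isnan(value)

# Calling math.isnan() directly on a string fails
try:
    math.isnan("someone@example.com")
    raised = False
except TypeError:
    raised = True

# Hypothetical rows, shaped like the result of row.to_dict()
rows = [
    {"name": "Customer A", "email": float("nan")},
    {"name": "Customer B", "email": "b@example.com"},
    {"name": "Customer C", "email": None},
]

with_email = [r for r in rows if not is_missing(r["email"])]
print(raised)
print([r["name"] for r in with_email])
```

Unlike the `str(...) == 'nan'` comparison, this does not misfire if a cell happens to contain the literal text "nan".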
#5
In the context of pandas and NumPy, NaN (Not a Number) is a float because both follow the IEEE 754 floating-point standard.
pandas has its own ways of dealing with NaN values, for example dropna():
import pandas as pd
import json
pd.set_option('display.expand_frame_repr', False)

data = {
    '客户编码': [11820122.0, 11820123.0, 11820124.0],
    '客户名称': ['WILLOWOOD FZE (Free Zone Establishment)', 'AgroChem Inc', 'Green Fields'],
    '国家': ['美国', '巴西', '南非'],
    '网址': [pd.NA, 'www.agrochem.com', 'www.greenfields.co.za'],
    '联系人': [pd.NA, 'John Doe', 'Jane Smith'],
    '邮箱': [pd.NA, '[email protected]', '[email protected]'],
    '电话': [pd.NA, '555-1234', '555-5678'],
    '公司地址': [pd.NA, '123 AgroChem Road, Brazil', '789 Green Fields Ave, South Africa'],
    '主要产品': [pd.NA, 'Fertilizers', 'Pesticides']
}

df = pd.DataFrame(data)
# Filter out rows where the '邮箱' (email) column is NaN
df_filtered = df.dropna(subset=['邮箱'])

# Save the customer list to a JSON file
customer_list = df_filtered[['客户编码', '客户名称', '联系人', '邮箱']].to_dict(orient='records')
with open('customers.json', 'w', encoding='utf-8') as fp:
    json.dump(customer_list, fp, ensure_ascii=False, indent=4)

print(f'Customers with email addresses have been saved to {fp.name}')
Output:
[
    {
        "客户编码": 11820123.0,
        "客户名称": "AgroChem Inc",
        "联系人": "John Doe",
        "邮箱": "[email protected]"
    },
    {
        "客户编码": 11820124.0,
        "客户名称": "Green Fields",
        "联系人": "Jane Smith",
        "邮箱": "[email protected]"
    }
]
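The example above builds its missing values with pd.NA, whereas the empty cells in the original question showed up as float NaN. As far as I know, dropna() treats both the same way, along with None. A minimal sketch, assuming pandas is installed and using a hypothetical `email` column with made-up addresses:

```python
import pandas as pd

# Hypothetical data: three different spellings of "missing" plus one real value
df_demo = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "email": [float("nan"), None, pd.NA, "info@example.com"],
})

# isna() flags every kind of missing value
missing_flags = df_demo["email"].isna().tolist()

# dropna(subset=...) keeps only rows where the email is present
kept = df_demo.dropna(subset=["email"])

# A float NaN taken out of a row dictionary is still a plain Python float
first_email = df_demo.iloc[0].to_dict()["email"]

print(missing_flags)
print(kept["name"].tolist())
print(type(first_email))
```

So the same filtering should work on a frame loaded with read_excel, without converting anything to strings first.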