Python Forum
Trying to clean the selected columns
#1
I call astype(int), yet the columns still come up as object.
#2
Post your code, its output, and any error messages here, using the proper tags.
#3
The whole notebook?
#4
Yes, or at least the relevant part of the code that is causing problems, along with the actual output versus the desired output.
#5
ValueError Traceback (most recent call last)
<ipython-input-97-460999a4b9d1> in <module>()
5 df.Budget = df.Budget.str.replace(',', '')
6 df.Budget = df.Budget.str.replace('$', '')
----> 7 df.Budget = df.Budget.astype(int)
8 # Budget column is ready.

C:\ProgramData\Anaconda3\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
175 else:
176 kwargs[new_arg_name] = new_arg_value
--> 177 return func(*args, **kwargs)
178 return wrapper
179 return _deprecate_kwarg

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors, **kwargs)
4995 # else, only a single dtype is given
4996 new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 4997 **kwargs)
4998 return self._constructor(new_data).__finalize__(self)
4999

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals.py in astype(self, dtype, **kwargs)
3712
3713 def astype(self, dtype, **kwargs):
-> 3714 return self.apply('astype', dtype=dtype, **kwargs)
3715
3716 def convert(self, **kwargs):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
3579
3580 kwargs['mgr'] = self
-> 3581 applied = getattr(b, f)(**kwargs)
3582 result_blocks = _extend_blocks(applied, result_blocks)
3583

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals.py in astype(self, dtype, copy, errors, values, **kwargs)
573 def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
574 return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 575 **kwargs)
576
577 def _astype(self, dtype, copy=False, errors='raise', values=None,

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals.py in _astype(self, dtype, copy, errors, values, klass, mgr, **kwargs)
662
663 # _astype_nansafe works fine with 1-d only
--> 664 values = astype_nansafe(values.ravel(), dtype, copy=True)
665 values = values.reshape(self.shape)
666

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy)
707 # work around NumPy brokenness, #1987
708 if np.issubdtype(dtype.type, np.integer):
--> 709 return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
710
711 # if we have a datetime/timedelta array of objects

pandas\_libs\lib.pyx in pandas._libs.lib.astype_intsafe()

pandas/_libs/src\util.pxd in util.set_value_at_unsafe()

ValueError: invalid literal for int() with base 10: 'Production Co: Path BBC Films Proud Films See more'

df.Budget = df.Budget.astype(int)

It keeps coming up as "object".
#6
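Probably some rows in that column cannot be converted: the ValueError shows that at least one Budget cell holds text ('Production Co: ...') rather than a number. You may convert only those rows that contain nothing but digits, in the way shown below. Note that the column as a whole will still report the object dtype, because the remaining rows are still strings.

# s is the column being cleaned, e.g. s = df.Budget
digits = s.str.isdigit().fillna(False)
s[digits] = s[digits].astype(int)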
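
If the goal is a column with a genuinely numeric dtype, one alternative (not what the snippet above does) is to coerce unparseable values to missing instead of skipping them. The following is only a sketch under assumptions: the sample DataFrame is made up for illustration, the variable names cleaned, budget and bad_rows are hypothetical, and it assumes a pandas version that supports the regex=False argument of str.replace and the nullable "Int64" dtype.

```python
import pandas as pd

# Made-up sample data; the last row imitates the stray text seen in the traceback.
df = pd.DataFrame({
    "Budget": ["$1,000,000", "$250,000", "Production Co: Path BBC Films See more"]
})

# Remove ',' and '$' as literal characters. regex=False matters for '$',
# which would otherwise be read as the end-of-string anchor of a regex.
cleaned = (
    df["Budget"]
    .str.replace(",", "", regex=False)
    .str.replace("$", "", regex=False)
)

# errors="coerce" turns values that cannot be parsed into NaN
# instead of raising ValueError.
budget = pd.to_numeric(cleaned, errors="coerce")

# Keep the original text of the rows that failed, so they can be inspected.
bad_rows = df.loc[budget.isna(), "Budget"]

# "Int64" (capital I) is the nullable integer dtype; it can hold missing values.
df["Budget"] = budget.astype("Int64")

print(df.dtypes)
print(bad_rows)
```

Inspecting bad_rows before overwriting the column shows which cells need to be fixed or dropped at the source; whether missing values are acceptable in Budget depends on the analysis.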
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you an advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you an advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.