Python Forum
Pandas dataframes and numpy arrays
#1
Hello,

I have some basic questions about using Pandas:

-- Structured and Unstructured Data: my understanding is that Pandas deals only with tabular 2D data (DataFrames) or 1D data (Series), and can only convert data that is structured and tabular (CSV tables, Excel files, etc.) into one of these two data structures. I guess a JSON file is an example of semi-structured data and can be converted into a DataFrame too, correct? Pandas cannot handle unstructured data at all...

-- Pandas and NumPy: my understanding is that Pandas does not have the mathematical functionality of NumPy, in the sense that NumPy is more "mathematical". However, NumPy requires its elements to be all numbers, while pandas DataFrames are more heterogeneous (except for requiring the same data type within a column).
If both pandas and NumPy are imported, is it possible to apply pandas methods directly to NumPy arrays and NumPy functions to pandas DataFrames? Or do we need to first convert the DataFrames into NumPy arrays, perform the mathematical calculations, and then convert the results back into DataFrames?

-- Categorical Data: In pandas, must all categorical data always be converted into a numerical form, even if that numerical form is not a real number but just a code, using approaches like one-hot encoding? Plain categorical data can only be used in visualizations (box plots, where the x-axis shows the categorical labels), in which case we can keep the categorical data as string data types... Is that correct?

Thank you!
bytecrunch
#2
bytecrunch Wrote: -- Structured and Unstructured Data: my understanding is that Pandas deals only with tabular 2D data (DataFrames) or 1D data (Series), and can only convert data that is structured and tabular (CSV tables, Excel files, etc.) into one of these two data structures. I guess a JSON file is an example of semi-structured data and can be converted into a DataFrame too, correct? Pandas cannot handle unstructured data at all...
Pandas handles JSON files very well and can handle quite complicated structures. See https://python-forum.io/thread-38263-pos...#pid161978 about normalizing JSON content.
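As a rough illustration of what normalizing JSON content can look like, here is a minimal sketch using pandas.json_normalize. The records, field names and values below are made up for the example and are not taken from the linked thread.

```python
import pandas as pd

# Hypothetical nested records, e.g. as returned by json.load() on a file.
records = [
    {"id": 1, "user": {"name": "Ann", "city": "Oslo"},
     "tags": [{"tag": "a"}, {"tag": "b"}]},
    {"id": 2, "user": {"name": "Bob", "city": "Rome"},
     "tags": [{"tag": "c"}]},
]

# Nested dicts are flattened into dotted column names (user.name, user.city);
# the list under "tags" is left as-is in a single column.
flat = pd.json_normalize(records)

# To get one row per element of the nested list, point record_path at it
# and carry selected parent fields along with meta.
tags = pd.json_normalize(
    records,
    record_path="tags",
    meta=["id", ["user", "name"]],
)

print(flat)
print(tags)
```

The second call is the one that usually matters for "complicated structures": each entry of the inner list becomes its own row, with the parent's id and user.name repeated next to it.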