Python Forum

Full Version: Advices to use class or pure functions
Hello folks!

I'm totally new to Python (coming from the Java/Scala world) and I'm refactoring a project that does a lot of ETL work using pandas. The project has some classes that extract data from sources like Google Adwords, Analytics, etc.

Right now, I'm starting to refactor the data extraction from Google Adwords. The intern who wrote this part wrote it as a class plus a lot of functions that are not used by the class; they are used in other parts of the modules.

For a Java developer it's common to write classes everywhere and use fancy names as namespaces, but I haven't figured out why I should use a class in Python. I can write small, pure functions instead of fancy classes that I'd need to instantiate just to use functions like "format my timestamp as string".
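To make the "format my timestamp as string" case concrete, here is a minimal sketch of what the plain-function version could look like. The name `format_timestamp` and the default format string are assumptions for illustration, not code from the project.

```python
from datetime import datetime, timezone


def format_timestamp(ts: datetime, fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Return ts formatted as a string; no state and no side effects."""
    return ts.strftime(fmt)


# Usage: import the function and call it directly, nothing to instantiate.
stamp = format_timestamp(datetime(2019, 3, 9, 14, 19, tzinfo=timezone.utc))
```

In Python the module itself already acts as the namespace, so a helper like this does not need a class wrapped around it.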

Going back to my problem, I need to write some code that will do the following thing:
  • Get a CSV report from Google Analytics;
  • Return a dataframe with new columns;
  • Write the data into Redshift (I have a class that encapsulates a lot of the Redshift details).
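The three steps above could be sketched as small functions, roughly like this. Everything here is an assumption for illustration: the column names (`clicks`, `cost`), the derived column, and the `write` callable that stands in for whatever the existing Redshift class exposes.

```python
import io
from typing import Callable

import pandas as pd


def parse_report(csv_text: str) -> pd.DataFrame:
    """Parse the raw CSV text of a report into a DataFrame."""
    return pd.read_csv(io.StringIO(csv_text))


def add_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame with derived columns; the input is not modified."""
    # Hypothetical derived column: cost per click.
    return df.assign(cost_per_click=df["cost"] / df["clicks"])


def run_pipeline(csv_text: str, write: Callable[[pd.DataFrame], None]) -> pd.DataFrame:
    """Parse, transform, then hand the result to the injected writer."""
    df = add_columns(parse_report(csv_text))
    write(df)
    return df


# Usage with an in-memory CSV and a fake writer that just collects frames.
sample_csv = "campaign,clicks,cost\nspring,10,25.0\nsummer,4,10.0\n"
written = []
result = run_pipeline(sample_csv, written.append)
```

Passing the writer in as an argument keeps the transformation functions independent of Redshift, so they can be tested with a plain list as shown; in real use `write` might be a bound method of the Redshift wrapper.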

Based on that, I've been looking into the right approach. Should I write pure functions or keep using a class?

I'm trying to avoid the tight coupling that can be common in the Java world.

Any advice, please?
My first piece of advice would be to not program Python like it's Java. The two languages have very different philosophies, especially when it comes to classes.

(Mar-09-2019, 02:19 PM) Jackhammer wrote: The intern who wrote this part, wrote as a class and a lot of functions that are not being used by class, they are used on other parts of modules.

Do you mean functions or methods? Methods are functions that are part of a class; functions stand on their own. If the functions are not being used by the class, they should be functions. But if they are used by the class (and not elsewhere), they should be methods. Your comment about formatting timestamps made me confused about what you meant here.
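A minimal sketch of that distinction, with made-up names (`format_date` and `ReportRequest` are hypothetical, not taken from the project):

```python
from datetime import date


def format_date(d: date) -> str:
    """Module-level function: useful anywhere, tied to no class."""
    return d.strftime("%Y%m%d")


class ReportRequest:
    """Hypothetical class: holds state that its methods operate on."""

    def __init__(self, account_id: str, start: date, end: date) -> None:
        self.account_id = account_id
        self.start = start
        self.end = end

    def date_range(self) -> str:
        """Method: only meaningful for a ReportRequest, so it lives on the class."""
        return f"{format_date(self.start)},{format_date(self.end)}"


request = ReportRequest("123-456", date(2019, 3, 1), date(2019, 3, 9))
range_text = request.date_range()
```

Here `format_date` can be imported by any module, while `date_range` needs the start and end dates stored on the instance, which is why it is a method.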

Beyond that, it's hard to advise on classes vs. functions without a better understanding of the problem. If you have data that is at least somewhat complex and common functions that apply only to it, I would use classes. If your data is reasonably generic, I wouldn't.
I'm definitely interested in this post, being a Python enthusiast who mostly writes Scala at work (and used to use Java). On top of what ichabhod801 said, I think if you're still unsure or need help, being more specific about an example you're asking about would help ground us in your problem.