 Help for technical choice
#1
Hello everyone,
Could you please help me with a technical choice?

For context: we have a PHP app that parses log files and then analyzes the data to recommend solutions to customers (everything, both the REST API and the data analysis, is in PHP).
We now want to switch to Python for machine learning and data science.
Our first need is to classify our solutions according to certain filters (type, date, ...).
Currently we plan to use TF-IDF in Python (https://github.com/hrs/python-tf-idf), PCA (https://etav.github.io/python/scikit_pca.html), and classification with k-means. I have also come across PyTorch and TensorFlow.
Do you recommend using Flask or Django?
What tools do you recommend in our case?

Thank you so much
#2
Are you aware of the Python package repository, PyPI: https://pypi.org/
It's a great place to search for existing packages that might save you a lot of time.

211,850 projects
394,834 users
#3
Thanks for your reply,

I just want to know what stack for data science and deep learning you would use if you had a problem like parsing big log files and web scraping, then analysing the data to decide on better actions. What tools would you use (Flask, Django, pandas, TensorFlow, Goutte)?

Thanks, it's just to get an idea.
#4
Flask is good for developing a web interface for your application. There are a variety of desktop GUI toolkits available as well. Tkinter is very common; I like wxPython.
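
To give an idea of how small a Flask web interface can be, here is a minimal sketch. The in-memory `SOLUTIONS` list and the `/solutions/<solution_type>` route are hypothetical names made up for illustration; in a real app the data would come from your log analysis.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory data standing in for the output of the log analysis.
SOLUTIONS = [
    {"id": 1, "type": "network", "date": "2019-01-10"},
    {"id": 2, "type": "disk", "date": "2019-02-03"},
]


@app.route("/solutions/<solution_type>")
def solutions_by_type(solution_type):
    """Return every solution whose type matches the URL segment, as JSON."""
    matches = [s for s in SOLUTIONS if s["type"] == solution_type]
    return jsonify(matches)
```

Saved as `app.py`, this can be started with `flask run` and queried at `/solutions/disk`.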

That out of the way, you will want to use Pandas as it is foundational to a lot of the other packages. Get very comfortable with the notation, slicing, and manipulating of dataframes.
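
As a rough illustration of the kind of slicing and manipulation I mean, here is a small sketch. The column names (`type`, `date`, `duration_s`) and the values are invented stand-ins for whatever your parsed logs contain.

```python
import pandas as pd

# Hypothetical records standing in for parsed log output.
df = pd.DataFrame(
    {
        "type": ["network", "disk", "network", "cpu"],
        "date": pd.to_datetime(
            ["2019-01-10", "2019-02-03", "2019-03-15", "2019-03-20"]
        ),
        "duration_s": [12.5, 48.0, 7.25, 30.0],
    }
)

# Boolean mask: rows of one type on or after a given date.
cutoff = pd.Timestamp("2019-03-01")
recent_network = df[(df["type"] == "network") & (df["date"] >= cutoff)]

# Label-based selection with .loc: filter rows and pick columns in one step.
subset = df.loc[df["duration_s"] > 10, ["type", "duration_s"]]

# Group and aggregate: mean duration per type.
mean_by_type = df.groupby("type")["duration_s"].mean()
```

Filtering by type and date like this is plain pandas and needs no machine learning at all, so it may already cover part of the "classify by filters" requirement.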

You can do a lot with scikit-learn, and that is where I would start for data analysis. Get comfortable with linear regression, polynomial regression, k-means, k-nearest neighbors, and the other techniques there.
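
Tying this back to the TF-IDF / PCA / k-means plan from the first post, here is a minimal sketch of that pipeline done entirely in scikit-learn. Two things are my own assumptions: the four example sentences are made up, and I swapped plain PCA for `TruncatedSVD`, which works directly on the sparse matrix that TF-IDF produces.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Made-up log messages; real input would be your parsed log lines.
docs = [
    "disk full error on volume",
    "disk quota exceeded on volume",
    "network timeout connecting to host",
    "network connection refused by host",
]

pipeline = make_pipeline(
    TfidfVectorizer(),                                # text -> sparse TF-IDF matrix
    TruncatedSVD(n_components=2, random_state=0),     # reduce to 2 dimensions
    KMeans(n_clusters=2, n_init=10, random_state=0),  # group into 2 clusters
)

# One cluster label per document; the label numbers themselves are arbitrary.
labels = pipeline.fit_predict(docs)
```

Note that k-means is unsupervised: it groups similar messages but does not name the groups. If you already have labelled examples, a supervised classifier from the same library may fit better.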

Moving on to deep learning, I'd use TensorFlow, though there are others. Keras plays nicely with TensorFlow and gives you a higher-level API for building models.

Some of this depends on where you plan to do your data analysis. Reasonably sized datasets on your own system can be handled with just about any of the available packages. If you have a massive dataset (all Medicare claims for 2018, all taxi rides in NYC 2010-2012, etc.) you will likely use a cloud system with virtual machines. Correct me if I'm wrong, but IBM/Watson uses PyTorch more than TensorFlow. Google Cloud Services uses TensorFlow.

Please ask for more specifics if you need them; I will do my best to answer or point you in the right direction.
J

Oh, and pandas can do basic web scraping for the right formats: it will pull all the tables on a web page into a list of DataFrames. For more sophisticated web scraping, Beautiful Soup seems to be the most popular right now.
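
For the table scraping, the function is `pandas.read_html`. A small sketch, assuming an HTML parser such as lxml is installed; the HTML string is invented so the example runs offline, whereas normally you would pass a URL or downloaded page source.

```python
from io import StringIO

import pandas as pd

# Invented page content containing a single table.
html = """
<html><body>
<table>
  <tr><th>type</th><th>count</th></tr>
  <tr><td>network</td><td>12</td></tr>
  <tr><td>disk</td><td>3</td></tr>
</table>
</body></html>
"""

# read_html returns a list with one DataFrame per <table> it finds.
tables = pd.read_html(StringIO(html))
first = tables[0]
```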