Python Forum
Customizing an sklearn submodule with cython
#1
I'd like to create a custom DecisionTreeRegressor for use with sklearn's RandomForestRegressor. To get the desired effect, I also need a custom Splitter, which determines how the training data is divided into leaf nodes and which sklearn implements in Cython. Here are the relevant sklearn files:

RandomForestRegressor (Python): https://github.com/scikit-learn/scikit-l..._forest.py
DecisionTreeRegressor (Python): https://github.com/scikit-learn/scikit-l...classes.py
Splitter (Cython): https://github.com/scikit-learn/scikit-l...litter.pyx and https://github.com/scikit-learn/scikit-l...litter.pxd

The approach that makes intuitive sense to me is to copy the _forest.py file and the entire tree submodule, edit the copies to customize the relevant classes, and then recompile. However, I want to make sure that I compile in a manner consistent with the rest of sklearn. The problem is that I'm not sure exactly how sklearn compiles its Cython files, and I can't replicate the compilation using standard methods (https://cython.readthedocs.io/en/latest/...orial.html) without getting errors. Inspecting the local sklearn module folder, I see a number of .so files that are not present in the GitHub repo. These appear to be generated by the setup.py file within the tree submodule (https://github.com/scikit-learn/scikit-l...e/setup.py).
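
For reference, a minimal sketch of what a build script for a copied tree submodule might look like. It is not sklearn's actual build configuration: the package name custom_tree and the helper names are hypothetical, and it assumes the copied Cython code only needs NumPy's headers to compile.

```python
from pathlib import Path


def module_name_for(pyx_path):
    """Map a path such as 'custom_tree/_splitter.pyx' to 'custom_tree._splitter'."""
    return ".".join(Path(pyx_path).with_suffix("").parts)


def build_extensions(pyx_paths):
    """Return cythonized Extension objects for the given .pyx files."""
    # Imported lazily so module_name_for works without the build dependencies.
    import numpy
    from Cython.Build import cythonize
    from setuptools import Extension

    extensions = [
        Extension(
            module_name_for(path),
            sources=[str(path)],
            # Assumption: the code cimports numpy and needs its C headers.
            include_dirs=[numpy.get_include()],
        )
        for path in pyx_paths
    ]
    return cythonize(extensions, compiler_directives={"language_level": "3"})


# In a setup.py next to the (hypothetical) custom_tree package you would call:
#     from setuptools import setup
#     setup(name="custom_tree",
#           ext_modules=build_extensions(["custom_tree/_splitter.pyx"]))
```

Running such a setup.py with the build_ext --inplace command places the compiled extension modules (the .so files on Linux and macOS) next to their .pyx sources, which would explain why those files exist in an installed package but not in the repository.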

One thing worth mentioning is that someone using sklearn doesn't have to go through a manual compilation step. With that said, is anyone aware of a way to compile customized Cython code from within a Python file (i.e., without additional command-line operations, similar to how sklearn apparently does it) that allows for easy recompilation whenever a Cython file is edited?
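
One possible way to get compile-on-import behaviour is pyximport, which ships with Cython and builds a .pyx file when it is first imported. The sketch below is only an illustration: the function names, module_name, and search_dir are hypothetical, and it assumes the .pyx file compiles on its own with nothing beyond NumPy's headers, which may not hold for code that cimports from sklearn's tree submodule.

```python
import importlib
import os
import sys


def is_stale(source_path, built_path):
    """Return True if built_path is missing or older than source_path."""
    if not os.path.exists(built_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(built_path)


def import_pyx(module_name, search_dir="."):
    """Import a .pyx module, letting pyximport compile it on demand.

    module_name and search_dir are hypothetical placeholders.
    """
    # Imported lazily so is_stale works even without Cython installed.
    import numpy
    import pyximport

    # Assumption: the module needs NumPy's C headers at compile time.
    pyximport.install(
        setup_args={"include_dirs": numpy.get_include()},
        language_level=3,
    )
    if search_dir not in sys.path:
        sys.path.insert(0, search_dir)
    return importlib.import_module(module_name)
```

Here is_stale shows the kind of timestamp comparison you could use if you trigger a build yourself (for example, before calling a setup.py through subprocess). With pyximport, an edited .pyx file should be picked up on the next fresh import, so restarting the interpreter after an edit is the simplest way to be sure the new code is loaded.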


