Hi everyone, I'm starting to work with ScikitLearn in Julia.
As I understand it, a few models and tools are currently implemented in Julia, and the rest of the code consists of bindings to Python functions. All of scikit-learn's functions can be used via the @sk_import macro.
If I'm not mistaken, the scalers have not been ported yet to Julia.
I wrote a quick workaround by implementing my own Scaler type in Julia, and it seems to be much faster (which is why we are using Julia, after all). My Scaler is far from complete (no keyword arguments), but it seems that such scalers already exist in JuliaML.
Is there a reason why Scalers (and the question could be extended to a lot of other tools) are not currently in ScikitLearn.jl?
Thanks again for your time.
using Statistics, ScikitLearn, ScikitLearnBase, BenchmarkTools
import ScikitLearnBase: fit!, transform, inverse_transform
@sk_import preprocessing: StandardScaler
"""
A Julia standardScaler
"""
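mutable struct JStandardScaler <: BaseEstimator
    epsilon::Float64   # lower bound applied to the standard deviation
    mean_::Matrix      # column means, set by fit!
    std_::Matrix       # column standard deviations clamped to epsilon, set by fit!
    real_std_::Matrix  # unclamped column standard deviations, set by fit!
    # The fitted fields stay undefined until fit! is called
    JStandardScaler(; epsilon=0.001) = new(epsilon)
end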
function fit!(model::JStandardScaler, X, y=nothing)
    model.mean_ = mean(X, dims=1)
    model.real_std_ = std(X, dims=1)
    # Clamp small standard deviations to epsilon to avoid dividing by (near) zero
    model.std_ = map(model.real_std_) do x
        x > model.epsilon && return x
        return model.epsilon
    end
    return model
end

function transform(model::JStandardScaler, X)
    return @. (X - model.mean_) / model.std_
end

function inverse_transform(model::JStandardScaler, X)
    return @. X * model.std_ + model.mean_
end
n = 10_000_000  # same value as Int(10e6), i.e. 1e7 rows
X = rand(Int, n, 12)

# Round-trip check for the Julia scaler
julia_scaler = JStandardScaler()
fit!(julia_scaler, X)
X_ = transform(julia_scaler, X)
X__ = inverse_transform(julia_scaler, X_)
@assert isapprox(X, X__)

# Round-trip check for the Python scaler
python_scaler = StandardScaler()
fit!(python_scaler, X)
X_ = transform(python_scaler, X)
X__ = inverse_transform(python_scaler, X_)
@assert isapprox(X, X__)
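# Benchmark fit + transform + inverse_transform for each scaler.
# X_ is used without `$` so that inverse_transform receives the result computed
# inside the benchmark rather than the stale global X_ from the checks above.
julia_scaler = JStandardScaler()
@btime begin
    fit!($julia_scaler, $X)
    X_ = transform($julia_scaler, $X)
    X__ = inverse_transform($julia_scaler, X_)
end

python_scaler = StandardScaler()
@btime begin
    fit!($python_scaler, $X)
    X_ = transform($python_scaler, $X)
    X__ = inverse_transform($python_scaler, X_)
end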
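
One detail to keep in mind when comparing the outputs of the two scalers above: Julia's `std` applies Bessel's correction by default (it divides by n - 1), whereas scikit-learn's `StandardScaler` uses the population standard deviation (it divides by n). With ten million rows the difference is negligible, but on small inputs the scaled values will not match exactly. The snippet below is only a sketch using `Statistics` and a small made-up matrix; the variable names are illustrative.

```julia
using Statistics

# Small illustrative matrix (made-up data): 4 samples, 2 features
X = [1.0 10.0;
     2.0 20.0;
     3.0 30.0;
     4.0 40.0]
n = size(X, 1)

# Julia default: sample standard deviation (divides by n - 1)
sample_std = std(X, dims=1)

# Population standard deviation (divides by n), the convention StandardScaler uses
population_std = std(X, dims=1, corrected=false)

# The two differ by a factor of sqrt(n / (n - 1))
@assert isapprox(sample_std, population_std .* sqrt(n / (n - 1)))

# Scaling with the population std gives columns with unit population variance
Z = (X .- mean(X, dims=1)) ./ population_std
@assert all(isapprox.(var(Z, dims=1, corrected=false), 1.0))
```

Passing `corrected=false` in `fit!` would be one way to make the Julia scaler reproduce the Python results on small datasets.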
> Is there a reason why Scalers (and the question could be extended to a lot of other tools) are not currently in ScikitLearn.jl?
Just lack of time! If you would like to contribute them, that would be a very nice PR.
Meanwhile, as happy as I am to see interest in ScikitLearn.jl... Have you checked out MLJ.jl? It is very actively developed. Unless someone steps up to push it further, ScikitLearn.jl will continue its life as a "gateway package", easing Python users into a new ecosystem.
> If you would like to contribute them, that would be a very nice PR.
I don't know yet if I will have the time to make a nicer Scaler object. For now, I just need the very basic one.
I checked out MLJ, but it seems like just another wrapper/interface for third-party machine learning packages. If it is more efficient performance-wise I might switch to it, but for now I prefer using ScikitLearn's algorithms.
Thanks a lot for your help, I might ask new questions soon.
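
For anyone wondering what a "nicer" scaler with keyword arguments might look like, here is a rough sketch. It assumes the `with_mean` and `with_std` options of scikit-learn's `StandardScaler` are the ones worth mirroring. The type `KwStandardScaler` and the functions `fit_scaler!`, `scale` and `unscale` are hypothetical names chosen for illustration; they are not part of ScikitLearn.jl, and the sketch deliberately avoids ScikitLearnBase so that it runs on its own.

```julia
using Statistics

# Hypothetical scaler; the name and keyword set are illustrative only
mutable struct KwStandardScaler
    with_mean::Bool
    with_std::Bool
    epsilon::Float64
    mean_::Matrix{Float64}
    std_::Matrix{Float64}
    # The fitted fields stay undefined until fit_scaler! is called
    KwStandardScaler(; with_mean=true, with_std=true, epsilon=1e-3) =
        new(with_mean, with_std, epsilon)
end

function fit_scaler!(model::KwStandardScaler, X::AbstractMatrix)
    d = size(X, 2)
    # A disabled step falls back to the identity (mean 0 or std 1)
    if model.with_mean
        model.mean_ = mean(X, dims=1)
    else
        model.mean_ = zeros(1, d)
    end
    if model.with_std
        # Population std (divides by n), clamped to epsilon to avoid division by zero
        model.std_ = max.(std(X, dims=1, corrected=false), model.epsilon)
    else
        model.std_ = ones(1, d)
    end
    return model
end

scale(model::KwStandardScaler, X) = (X .- model.mean_) ./ model.std_
unscale(model::KwStandardScaler, Z) = Z .* model.std_ .+ model.mean_

# Usage: scale without centering; the second column is constant
X = [1.0 5.0; 2.0 5.0; 3.0 5.0]
model = fit_scaler!(KwStandardScaler(with_mean=false), X)
Z = scale(model, X)
@assert isapprox(unscale(model, Z), X)
```

Plugging a type like this into ScikitLearn.jl would presumably mean subtyping `BaseEstimator` and extending `fit!`, `transform` and `inverse_transform` from ScikitLearnBase, as in the original post.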