New spectral clustering algorithm #31648
Replies: 1 comment
Hi, I'm working on something similar: I'm developing an algorithm for anomaly detection. I can't tell you exactly how to submit your algorithm to scikit-learn (I think your paper probably needs to be published first), but if you plan to keep working on it, you should follow this guide to make it compatible with the scikit-learn environment. There are a number of conventions and validation checks you need to respect.
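To give an idea of what those conventions look like, here is a minimal sketch of a clustering estimator. The class name `ToyClusterer` is hypothetical and the body simply delegates to `KMeans`; it is only meant to show the usual shape (parameters stored unchanged in `__init__`, learned attributes ending in an underscore, `fit` returning `self`), not your algorithm.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClusterMixin, clone
from sklearn.cluster import KMeans
from sklearn.utils import check_array


class ToyClusterer(ClusterMixin, BaseEstimator):
    """Hypothetical estimator that only illustrates scikit-learn conventions."""

    def __init__(self, n_clusters=2, random_state=None):
        # __init__ only stores the parameters, with no validation or logic.
        self.n_clusters = n_clusters
        self.random_state = random_state

    def fit(self, X, y=None):
        # Input validation happens in fit, not in __init__.
        X = check_array(X)
        self.n_features_in_ = X.shape[1]
        km = KMeans(
            n_clusters=self.n_clusters, n_init=10, random_state=self.random_state
        )
        # Learned attributes end with an underscore.
        self.labels_ = km.fit_predict(X)
        # fit returns self so calls can be chained.
        return self


X_demo = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
est = ToyClusterer(n_clusters=2, random_state=0)
demo_labels = est.fit_predict(X_demo)  # fit_predict is provided by ClusterMixin
est_copy = clone(est)  # works because parameters are stored as given
```

Because the parameters are stored exactly as passed, `get_params`, `set_params` and `clone` work without any extra code.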
Hey everyone,
As part of a paper in Machine Learning Research Journal, our research team is trying to submit a new clustering algorithm to scikit-learn.
It is a "generalization" of Spectral Clustering to directed graphs. Spectral clustering is already present in sklearn, along with two variations.
This implementation is random-walk based, and in practice the main difference from standard SC is the computation of new (so-called "generalized", for now) Laplacian matrices.
The rest of the pipeline is pretty much the same as standard SC: k-neighbors graph of the data, adjacency matrix, eigenvectors of the Laplacian (spectral embedding), then assigning labels.
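For reference, the standard pipeline described above can be sketched roughly as follows. This is only an illustration of classical (undirected) SC, not our generalized Laplacian: the function name `spectral_pipeline`, the symmetrization step and the parameter values are assumptions made for the example.

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.neighbors import kneighbors_graph


def spectral_pipeline(X, n_clusters=2, n_neighbors=10, random_state=0):
    """Hypothetical helper showing the classical spectral clustering steps."""
    # 1. k-nearest-neighbors graph of the data (asymmetric in general).
    knn = kneighbors_graph(X, n_neighbors=n_neighbors, include_self=False)
    # 2. Symmetrized adjacency matrix, since standard SC assumes an
    #    undirected graph.
    adjacency = 0.5 * (knn + knn.T)
    # 3. Normalized graph Laplacian. A directed variant would replace
    #    steps 2 and 3.
    lap = laplacian(adjacency, normed=True).toarray()
    # 4. Spectral embedding: eigenvectors of the smallest eigenvalues
    #    (eigh returns eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(lap)
    embedding = eigvecs[:, :n_clusters]
    # 5. Assign labels by running k-means on the embedded points.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    return km.fit_predict(embedding)


X, _ = make_blobs(
    n_samples=60, centers=[[-5, -5], [5, 5]], cluster_std=0.5, random_state=0
)
labels = spectral_pipeline(X, n_clusters=2)
```

Only the Laplacian step would change in the directed version, which is why the two estimators would end up sharing most of their code.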
The research is still ongoing, especially the search for good default parameters, but the results are promising. They are at least as good as those of classical spectral clustering (which is a special case of our method) and often better.
We are unsure how to submit this. Based on the other spectral algorithms already implemented, we think the best option would be to build a new estimator (probably called DiSpectral, for directed spectral clustering).
However, this estimator would be very similar to the existing SpectralClustering one, except that it would source the Laplacian from an external utils file.
We are therefore also considering adding it directly to the SpectralClustering estimator.
We are looking for help with the development of this new method, whether on scikit-learn's contribution rules or on performance optimization.
Thanks in advance!