Functions to apply class-prior-preserving univariate transforms to data
trainOOBClassifier
trainOOBClassifier(X, y, modelFactory=<lambda>, n_estimators=100, n_jobs=10)
Train an ensemble of models created by modelFactory and score each instance out-of-bag
Required Arguments:
- X : ndarray shape (n,d) : feature matrix
- y : ndarray shape (n,) : positive v. unlabeled component assignments for each instance
Optional Arguments:
- modelFactory : lambda function returning sklearn-style model instance (has fit, fit_predict, predict_proba, ... functions) : default DecisionTreeRegressor
- n_estimators : size of the ensemble : default 100
Returns:
- transform_scores : ndarray (n,) : probability that each instance came from labeled positive set, calculated using out-of-bag scores
- auc_pu : float : the AUROC of this non-traditional classifier
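The out-of-bag scoring described above can be illustrated with a minimal sketch. It is not the library's implementation: the function name `oob_scores_sketch`, the `seed` parameter, the synthetic data, and the convention that label 1 marks labeled positives are all assumptions made for illustration. Each model is fit on a bootstrap sample, and an instance's score is the average prediction of only those models that did not see it during training.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.tree import DecisionTreeRegressor


def oob_scores_sketch(X, y, model_factory=lambda: DecisionTreeRegressor(),
                      n_estimators=100, seed=0):
    """Hypothetical out-of-bag scoring sketch; not the library's own code."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    score_sum = np.zeros(n)   # running sum of out-of-bag predictions
    oob_count = np.zeros(n)   # number of models for which each instance was out-of-bag
    for _ in range(n_estimators):
        boot = rng.integers(0, n, size=n)        # bootstrap sample, drawn with replacement
        oob = np.setdiff1d(np.arange(n), boot)   # instances left out of this sample
        if oob.size == 0:
            continue
        model = model_factory().fit(X[boot], y[boot])
        score_sum[oob] += model.predict(X[oob])
        oob_count[oob] += 1
    # Average only over the models that never saw the instance;
    # instances that were never out-of-bag keep a score of 0.
    scores = np.divide(score_sum, oob_count,
                       out=np.zeros(n), where=oob_count > 0)
    return scores, roc_auc_score(y, scores)


# Synthetic data (assumption): label 1 = labeled positive, 0 = unlabeled
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y_demo = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])
scores_demo, auc_demo = oob_scores_sketch(X_demo, y_demo, n_estimators=50)
```

Because every score is an average of held-out predictions, no instance is scored by a model that was trained on it.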
trainKFoldClassifier
trainKFoldClassifier(X, y, modelFactory=<lambda>, KFoldValue=10)
Train model using K-fold cross-validation
Required Arguments:
- X : ndarray shape (n,d) : feature matrix
- y : ndarray shape (n,) : positive v. unlabeled component assignments for each instance
Optional Arguments:
- modelFactory : lambda function returning sklearn-style model instance (has fit, fit_predict, predict_proba, ... functions) : default SVC
- KFoldValue : number of folds to use in k-fold cross-validation : default 10
Returns:
- transform_scores : ndarray (n,) : probability that each instance came from labeled positive set
- auc_pu : float : the AUROC of this non-traditional classifier
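A minimal sketch of K-fold scoring follows, under stated assumptions: the name `kfold_scores_sketch`, the `seed` parameter, the use of stratified shuffled folds, and the synthetic data are illustrative choices and may differ from what the library does. Each instance is scored by the model trained on the folds that exclude it.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC


def kfold_scores_sketch(X, y, model_factory=lambda: SVC(probability=True),
                        k=10, seed=0):
    """Hypothetical K-fold scoring sketch; not the library's own code."""
    scores = np.zeros(X.shape[0])
    # Stratified, shuffled folds are an assumption of this sketch
    folds = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(X, y):
        model = model_factory().fit(X[train_idx], y[train_idx])
        # Column 1 of predict_proba is the probability of label 1
        scores[test_idx] = model.predict_proba(X[test_idx])[:, 1]
    return scores, roc_auc_score(y, scores)


# Synthetic data (assumption): label 1 = labeled positive, 0 = unlabeled
rng = np.random.default_rng(2)
X_kf = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y_kf = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])
scores_kf, auc_kf = kfold_scores_sketch(X_kf, y_kf, k=10)
```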
Test k-fold and oob transform functions
getOptimalTransform
getOptimalTransform(X, y)
Train the 6 univariate transforms from (Zeiberg 2020) and return the transform scores and auc_pu for the best transform
Required Arguments:
- X : ndarray shape (n,d) : feature matrix
- y : ndarray shape (n,) : positive v. unlabeled component assignments for each instance
Returns:
- transform_scores : ndarray (n,) : probability that each instance came from labeled positive set
- auc_pu : float : the AUROC of this non-traditional classifier
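The selection step can be sketched as follows, assuming that "best" means the candidate with the highest auc_pu. The helper `pick_best_transform`, the candidate names, and the toy score vectors are hypothetical; the six transforms themselves are not reproduced here.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def pick_best_transform(candidates, y):
    """Return (name, scores, auc) for the candidate with the highest AUROC.

    candidates: dict mapping a hypothetical transform name to its score vector.
    """
    aucs = {name: roc_auc_score(y, s) for name, s in candidates.items()}
    best = max(aucs, key=aucs.get)
    return best, candidates[best], aucs[best]


# Toy labels and scores (assumption): label 1 = labeled positive, 0 = unlabeled
y_toy = np.array([0, 0, 0, 1, 1, 1])
candidates_toy = {
    "transform_a": np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.9]),  # ranks every positive above every unlabeled
    "transform_b": np.array([0.6, 0.2, 0.5, 0.4, 0.7, 0.3]),   # ranks the two groups poorly
}
best_name, best_scores, best_auc = pick_best_transform(candidates_toy, y_toy)
```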
trainOOBClassifier(X,y)
(array([0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.19512195, 0. , 0.03030303, 0. , 0.02631579, 0. , 0. , 0. , 0.03030303, 0.05882353, 0. , 0. , 0. , 0.09375 , 0.03225806, 0.58823529, 0.58333333, 0. , 0.02631579, 0. , 0. , 0.025 , 0. , 0.24137931, 0.02380952, 0. , 0. , 0. , 0. , 0.45652174, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.27027027, 0. , 0.04761905, 0. , 0. , 0. , 0. , 0. , 0.08823529, 0.90625 , 0.97142857, 0.15384615, 0.55555556, 0.96969697, 1. , 0.38888889, 0.93103448, 1. , 0.875 , 1. , 0.95 , 0.88888889, 0.575 , 0.74285714, 0.97058824, 0.91666667, 0.94285714, 1. , 1. , 1. , 1. , 0.175 , 1. , 0.07692308, 1. , 1. , 1. , 1. , 1. , 1. , 1. , 0.84782609, 1. , 1. , 1. , 0.98 , 0.97560976, 1. , 0.86111111, 1. , 1. , 0.97058824, 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 0.97777778, 1. , 0.42105263, 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 0.38461538, 1. , 0.86666667, 0.97727273, 1. , 0.43589744, 1. , 1. , 1. , 0.11904762, 0. , 0. , 0.08571429, 0.84210526, 0. , 0.23529412, 0.12820513, 0. , 0. , 0. , 0. , 0.05 , 0. , 0.10869565, 0. , 0.59375 , 0.33333333, 0. , 0. , 0.04444444, 0.05555556, 0.47619048, 0.02857143, 0.25806452, 0.02857143, 0.03125 , 0.08333333, 0. , 0. , 0.15789474, 0. , 0.03125 , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.14705882, 0.3125 , 0. , 0. , 0. , 0. , 0. , 0. ]), 0.9892062656311702)