Functions to apply class-prior-preserving univariate transforms to data



Apply z-score normalization to an (n, d) feature matrix.
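As an illustration of column-wise z-score normalization, here is a minimal sketch using NumPy. The function name `zscore_normalize` and the `eps` guard for constant columns are assumptions made for this example, not names or behavior taken from this module.

```python
import numpy as np

def zscore_normalize(X, eps=1e-12):
    """Standardize each column of an (n, d) matrix to zero mean and unit variance.

    Hypothetical helper for illustration; the module's own function may differ.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Guard against constant columns, which would otherwise divide by zero.
    std = np.where(std < eps, 1.0, std)
    return (X - mean) / std

# Small demo: the second column is constant, so it maps to all zeros.
X_demo = np.array([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
Z_demo = zscore_normalize(X_demo)
```

Statistics are computed per feature (axis 0), so each column is rescaled independently of the others.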


trainOOBClassifier(X, y, modelFactory=<lambda>, n_estimators=100, n_jobs=10)

Train an ensemble of models that predict the probability that each instance came from the labeled positive set rather than the unlabeled mixture.

Required Arguments:

- X : ndarray shape (n,d) : feature matrix
- y : ndarray shape (n,)  : positive v. unlabeled component assignments for each instance

Optional Arguments:

- modelFactory : lambda function returning sklearn-style model instance (has fit, fit_predict, predict_proba, ... functions) : default DecisionTreeRegressor
- n_estimators : size of the ensemble : default 100
- n_jobs : number of jobs to run in parallel : default 10


Returns:

- transform_scores : ndarray (n,) : probability that each instance came from labeled positive set, calculated using out-of-bag scores
- auc_pu : float : the AUROC of this non-traditional classifier
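The out-of-bag idea described above can be sketched roughly as follows. This is not the module's implementation: the function name `oob_scores_sketch`, the `seed` parameter, and the sequential bootstrap loop are assumptions for illustration, and the sketch ignores `n_jobs`. Each model is fit on a bootstrap sample and scores only the instances it did not see; the per-instance scores are then averaged.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import roc_auc_score

def oob_scores_sketch(X, y, model_factory=lambda: DecisionTreeRegressor(),
                      n_estimators=100, seed=0):
    """Hypothetical sketch: average out-of-bag predictions over a bootstrap ensemble."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    score_sum = np.zeros(n)
    score_count = np.zeros(n)
    for _ in range(n_estimators):
        # Bootstrap sample: draw n indices with replacement.
        boot = rng.integers(0, n, size=n)
        oob_mask = np.ones(n, dtype=bool)
        oob_mask[boot] = False  # instances not drawn are out of bag
        if not oob_mask.any():
            continue
        model = model_factory().fit(X[boot], y[boot])
        score_sum[oob_mask] += model.predict(X[oob_mask])
        score_count[oob_mask] += 1
    # Average over the models for which each instance was out of bag
    # (an instance that was never out of bag keeps a score of 0).
    scores = score_sum / np.maximum(score_count, 1)
    return scores, roc_auc_score(y, scores)

# Synthetic demo with two well-separated clusters (1 = labeled positive, 0 = unlabeled).
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(2.0, 1.0, size=(50, 2)),
                    rng.normal(-2.0, 1.0, size=(50, 2))])
y_demo = np.concatenate([np.ones(50, dtype=int), np.zeros(50, dtype=int)])
scores_demo, auc_demo = oob_scores_sketch(X_demo, y_demo, n_estimators=25)
```

In real positive-unlabeled data the unlabeled set is a mixture that contains positives, so the AUROC of such a classifier would typically be lower than on this cleanly separated toy data.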


trainKFoldClassifier(X, y, modelFactory=<lambda>, KFoldValue=10)

Train a model using K-fold cross-validation.

Required Arguments:

- X : ndarray shape (n,d) : feature matrix
- y : ndarray shape (n,)  : positive v. unlabeled component assignments for each instance

Optional Arguments:

- modelFactory : lambda function returning sklearn-style model instance (has fit, fit_predict, predict_proba, ... functions) : default SVC
- KFoldValue : number of folds to use in k-fold cross-validation : default 10


Returns:

- transform_scores : ndarray (n,) : probability that each instance came from labeled positive set
- auc_pu : float : the AUROC of this non-traditional classifier
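A rough sketch of the K-fold variant is shown below, assuming each instance is scored by the model trained on the folds that exclude it. The function name `kfold_scores_sketch`, the `seed` parameter, and the use of `StratifiedKFold` are assumptions for illustration. `LogisticRegression` is swapped in here as a simple stand-in for the documented SVC default; any sklearn-style classifier exposing `predict_proba` could be passed instead.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def kfold_scores_sketch(X, y, model_factory=lambda: LogisticRegression(),
                        k=10, seed=0):
    """Hypothetical sketch: score every instance with a model that never saw it."""
    scores = np.zeros(X.shape[0])
    folds = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(X, y):
        model = model_factory().fit(X[train_idx], y[train_idx])
        # With labels {0, 1}, column 1 of predict_proba is P(y == 1).
        scores[test_idx] = model.predict_proba(X[test_idx])[:, 1]
    return scores, roc_auc_score(y, scores)

# Synthetic demo with two well-separated clusters (1 = labeled positive, 0 = unlabeled).
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(2.0, 1.0, size=(50, 2)),
                    rng.normal(-2.0, 1.0, size=(50, 2))])
y_demo = np.concatenate([np.ones(50, dtype=int), np.zeros(50, dtype=int)])
scores_demo, auc_demo = kfold_scores_sketch(X_demo, y_demo, k=5)
```

Because every score comes from a held-out fold, the resulting AUROC is less optimistic than one computed on the training data.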

Test the k-fold and OOB transform functions.


getOptimalTransform(X, y)

Train the six univariate transforms from Zeiberg (2020) and return the transform scores and auc_pu for the best transform.

Required Arguments:

- X : ndarray shape (n,d) : feature matrix
- y : ndarray shape (n,)  : positive v. unlabeled component assignments for each instance


Returns:

- transform_scores : ndarray (n,) : probability that each instance came from labeled positive set
- auc_pu : float : the AUROC of this non-traditional classifier
(array([0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.19512195,
        0.        , 0.03030303, 0.        , 0.02631579, 0.        ,
        0.        , 0.        , 0.03030303, 0.05882353, 0.        ,
        0.        , 0.        , 0.09375   , 0.03225806, 0.58823529,
        0.58333333, 0.        , 0.02631579, 0.        , 0.        ,
        0.025     , 0.        , 0.24137931, 0.02380952, 0.        ,
        0.        , 0.        , 0.        , 0.45652174, 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.27027027, 0.        , 0.04761905, 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.08823529, 0.90625   ,
        0.97142857, 0.15384615, 0.55555556, 0.96969697, 1.        ,
        0.38888889, 0.93103448, 1.        , 0.875     , 1.        ,
        0.95      , 0.88888889, 0.575     , 0.74285714, 0.97058824,
        0.91666667, 0.94285714, 1.        , 1.        , 1.        ,
        1.        , 0.175     , 1.        , 0.07692308, 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 0.84782609, 1.        , 1.        , 1.        ,
        0.98      , 0.97560976, 1.        , 0.86111111, 1.        ,
        1.        , 0.97058824, 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        0.97777778, 1.        , 0.42105263, 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 0.38461538, 1.        , 0.86666667, 0.97727273,
        1.        , 0.43589744, 1.        , 1.        , 1.        ,
        0.11904762, 0.        , 0.        , 0.08571429, 0.84210526,
        0.        , 0.23529412, 0.12820513, 0.        , 0.        ,
        0.        , 0.        , 0.05      , 0.        , 0.10869565,
        0.        , 0.59375   , 0.33333333, 0.        , 0.        ,
        0.04444444, 0.05555556, 0.47619048, 0.02857143, 0.25806452,
        0.02857143, 0.03125   , 0.08333333, 0.        , 0.        ,
        0.15789474, 0.        , 0.03125   , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.14705882, 0.3125    , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        ]),