pylipid.func.calculate_scores
- pylipid.func.calculate_scores(dist_matrix, kde_bw=0.15, pca_component=0.9, score_weights=None)
Calculate scores based on probability density.
This function first lowers the dimension of dist_matrix using PCA. The distribution of the distance vectors for each atom is then estimated using KDEMultivariate.
The score of a lipid pose is calculated based on the probability density function of the atom positions in the binding site and weights given to the atoms:
\[\text{score} = \sum_{i} W_{i} \cdot \hat{f}_{i, H}(D)\]
where \(W_{i}\) is the weight given to atom i of the lipid molecule, H is the bandwidth, and \(\hat{f}_{i, H}(D)\) is a multivariate kernel density estimate of the position of atom i in the specified binding site. \(\hat{f}_{i, H}(D)\) is calculated from all the bound lipid poses in that binding site.
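The weighted sum above can be illustrated with a minimal NumPy sketch. This is not pylipid's implementation: a hand-written Gaussian product kernel stands in for KDEMultivariate, the PCA step is skipped, and the names gaussian_kde_pdf and sketch_scores, as well as the default weight of 1.0 for atoms without an entry, are assumptions made for illustration only.

```python
import numpy as np


def gaussian_kde_pdf(train, query, bw):
    """Fixed-bandwidth Gaussian product-kernel density estimate.

    train: (n_train, d) points used to build the estimate.
    query: (n_query, d) points at which the density is evaluated.
    Returns an array of shape (n_query,).
    """
    n_train, d = train.shape
    # Scaled pairwise differences, shape (n_query, n_train, d).
    diff = (query[:, None, :] - train[None, :, :]) / bw
    kernel = np.exp(-0.5 * np.sum(diff ** 2, axis=2))
    norm = n_train * (bw * np.sqrt(2.0 * np.pi)) ** d
    return kernel.sum(axis=1) / norm


def sketch_scores(dist_matrix, bw=0.15, weights=None):
    """score = sum_i W_i * f_i(D), with one density estimate per lipid atom.

    dist_matrix: (n_lipid_atoms, n_poses, n_features).
    weights: optional {idx_atom: weight}; a missing entry is treated
    as 1.0 here, which is an assumption of this sketch.
    """
    n_atoms, n_poses, _ = dist_matrix.shape
    scores = np.zeros(n_poses)
    for i in range(n_atoms):
        w = 1.0 if weights is None else weights.get(i, 1.0)
        # The density for atom i is built from all poses and then
        # evaluated at each pose.
        scores += w * gaussian_kde_pdf(dist_matrix[i], dist_matrix[i], bw)
    return scores
```

Poses whose atoms sit in densely populated regions of the distance space receive higher scores, and the weights let selected atoms (for example, a headgroup) dominate the ranking.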
- Parameters
dist_matrix (numpy.ndarray, shape=(n_lipid_atoms, n_poses, n_binding_site_residues)) – The distance vectors describing the positions of bound poses in the binding site. This dist_matrix can be generated by vectorize_poses().
kde_bw (scalar, default=0.15) – The bandwidth for kernel density estimation. Used by KDEMultivariate. By default, the bandwidth is set to 0.15 nm, which roughly corresponds to the vdW radius of MARTINI 2 beads.
pca_component (scalar, default=0.9) – The number of components to keep. If 0 < pca_component < 1, the number of components is selected such that the fraction of variance explained is greater than this value (n_components in PCA). Used by PCA.
score_weights (None or dict) – A dictionary that contains the weights for the lipid atoms, {idx_atom: weight}.
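To make the fractional pca_component setting concrete, the sketch below shows one way a variance threshold can be turned into a component count. It uses a plain SVD in NumPy and is not the PCA class itself; the helper name n_components_for_variance is hypothetical.

```python
import numpy as np


def n_components_for_variance(X, fraction):
    """Smallest number of principal components whose cumulative
    explained-variance ratio exceeds `fraction` (0 < fraction < 1).

    X: (n_samples, n_features) data matrix.
    """
    Xc = X - X.mean(axis=0)                      # centre each feature
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    explained_ratio = s ** 2 / np.sum(s ** 2)    # variance share per component
    cumulative = np.cumsum(explained_ratio)
    k = int(np.searchsorted(cumulative, fraction, side="right")) + 1
    # Guard against floating-point rounding in the cumulative sum.
    return min(k, len(s))
```

With the default of 0.9, only as many components are kept as are needed to explain more than 90% of the variance in the distance vectors, which keeps the subsequent density estimation in a low-dimensional space.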
- Returns
scores – Scores for bound poses.
- Return type
numpy.ndarray, shape=(n_poses,)
See also
pylipid.func.collect_bound_poses
Collect bound poses from trajectories.
pylipid.func.vectorize_poses
Convert bound poses to distance vectors.