pylipid.api.LipidInteraction.compute_binding_nodes

LipidInteraction.compute_binding_nodes(threshold=4, print_data=True)

Calculate binding sites.

Binding sites are identified by community analysis of a protein residue-interaction network built from the lipid interaction correlation matrix. A lipid binding site is defined as a cluster of residues that bind to the same lipid molecule at the same time. To capture this, PyLipID creates a distance vector for each residue that records its distances to all lipid molecules as a function of time, and calculates the Pearson correlation matrix of these vectors across protein residues. This correlation matrix is calculated by collect_residue_contacts() and stored in the class variable interaction_corrcoef.
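The correlation step can be illustrated with a minimal, standard-library sketch. The residue names and distance values below are hypothetical, and the helper `pearson` is written for this example only; it is not PyLipID's implementation, which operates on the full set of lipid molecules and trajectory frames.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical distances (nm) from three residues to one lipid over six frames.
distances = {
    "res_A": [0.40, 0.45, 0.90, 1.20, 0.50, 0.42],
    "res_B": [0.42, 0.48, 0.95, 1.10, 0.55, 0.40],  # approaches the lipid together with res_A
    "res_C": [1.30, 1.10, 0.50, 0.45, 1.00, 1.25],  # approaches the lipid when res_A leaves
}

# Pairwise Pearson correlation matrix, rows and columns in the order of `residues`.
residues = list(distances)
corrcoef = [[pearson(distances[a], distances[b]) for b in residues] for a in residues]
```

In this toy data, res_A and res_B are strongly positively correlated and would be candidates for the same binding site, whereas res_C is anti-correlated with both.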

The protein residue interaction network is constructed from the Pearson correlation matrix. In this network, the nodes are the protein residues and the edge weights are the Pearson correlation coefficients of pairs of residues. The interaction network is then decomposed into sub-units, or communities, which are groups of nodes that are more densely connected internally than with the rest of the network.
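As a rough sketch of the network construction, a correlation matrix can be turned into a weighted edge list as follows. The matrix values and the function `build_network` are hypothetical, and the choice to keep only positive correlations and to skip self-loops is an assumption made for this example rather than a documented PyLipID behaviour.

```python
# Hypothetical 4x4 correlation matrix for residues 0..3 (symmetric, unit diagonal).
corrcoef = [
    [1.0, 0.8, 0.1, -0.3],
    [0.8, 1.0, 0.0, -0.2],
    [0.1, 0.0, 1.0, 0.7],
    [-0.3, -0.2, 0.7, 1.0],
]

def build_network(matrix):
    """Return {(i, j): weight} for i < j, keeping only positive correlations."""
    edges = {}
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, so no self-loops
            weight = matrix[i][j]
            if weight > 0:  # assumption: non-positive correlations give no edge
                edges[(i, j)] = weight
    return edges

edges = build_network(corrcoef)
```

Here residues 0 and 1, and residues 2 and 3, are joined by heavy edges, while the only link between the two pairs is the weak edge (0, 2).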

For the calculation of communities, the Louvain algorithm [1] is used to find high-modularity network partitions. Modularity, which measures the quality of a network partition, is defined as [2]

\[Q=\frac{1}{2 m} \sum_{i, j}\left[A_{i j}-\frac{k_{i} k_{j}}{2 m}\right] \delta\left(c_{i}, c_{j}\right)\]

where \(A_{i j}\) is the weight of the edge between node i and node j; \(k_{i}\) is the sum of the weights of the edges attached to node i, i.e. the (weighted) degree of the node; \(c_{i}\) is the community to which node i is assigned; \(\delta\left(c_{i}, c_{j}\right)\) is 1 if \(c_{i}=c_{j}\) and 0 otherwise; and \(m=\frac{1}{2} \sum_{i j} A_{i j}\) is the total edge weight (the number of edges in an unweighted network). During modularity optimization, the Louvain algorithm orders the nodes in the network and then, one by one, removes each node from its community and inserts it into a different community \(c_{i}\), until no significant increase in modularity is obtained. After modularity optimization, all nodes that belong to the same community are merged into a single node, whose edge weights are the sums of the edge weights of its constituent nodes. This optimization-aggregation loop is repeated until modularity can no longer be increased.
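The modularity formula can be evaluated directly for a small graph. The sketch below is a plain-Python transcription of the equation, not the routine PyLipID calls; the function name `modularity`, the toy graph (two triangles joined by one edge) and the partitions are all hypothetical.

```python
def modularity(adjacency, communities):
    """Weighted modularity Q of a partition.

    adjacency   -- symmetric matrix A of edge weights (list of lists)
    communities -- communities[i] is the community label c_i of node i
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]  # k_i
    two_m = sum(degree)                       # 2m = sum over i, j of A_ij
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:  # delta(c_i, c_j)
                q += adjacency[i][j] - degree[i] * degree[j] / two_m
    return q / two_m

# Toy graph: triangle 0-1-2 and triangle 3-4-5, bridged by the edge 2-3.
edge_list = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = [[0.0] * 6 for _ in range(6)]
for i, j in edge_list:
    adjacency[i][j] = adjacency[j][i] = 1.0

q_two_sites = modularity(adjacency, [0, 0, 0, 1, 1, 1])  # one community per triangle
q_single = modularity(adjacency, [0, 0, 0, 0, 0, 0])     # everything in one community
q_mixed = modularity(adjacency, [0, 1, 0, 1, 0, 1])      # communities cut across triangles
```

For this graph the per-triangle partition gives Q = 5/14, the single-community partition gives Q = 0, and the mixed partition gives a negative value, which matches the intuition that higher modularity indicates a better partition.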

By default, this method returns binding sites of at least 4 residues. This filtering step is particularly helpful when analysing a small number of trajectory frames, in which spurious correlations are more likely to occur among groups of 2 or 3 residues.
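The size filter can be sketched as follows, assuming the community detection step yields a mapping from residue index to community label. The function `filter_binding_sites` and the example partition are hypothetical, and the "at least" comparison follows the wording of the paragraph above.

```python
from collections import defaultdict

def filter_binding_sites(partition, threshold=4):
    """Group residues by community and keep communities with >= threshold members."""
    groups = defaultdict(list)
    for residue, community in partition.items():
        groups[community].append(residue)
    return [
        sorted(members)
        for _, members in sorted(groups.items())
        if len(members) >= threshold
    ]

# Hypothetical partition: residue index -> community label.
partition = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2, 9: 2, 10: 2}
node_list = filter_binding_sites(partition, threshold=4)
```

With these inputs, community 1 (residues 4 and 5) is discarded because it has only two members, and the two larger communities are kept.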

Parameters
  • threshold (int, default=4) – The minimum size of a binding site. Only binding sites with at least this many residues are returned.

  • print_data (bool, default=True) – If True, print a summary of binding site information.

Returns

  • node_list (list) – Binding site node list, i.e. a list of binding sites, each containing the residue indices of that binding site.

  • modularity (float or None) – The modularity of the network partition, which measures the quality of the partition. The value lies between -1 and 1; the larger the modularity, the better the partition.

See also

pylipid.func.get_node_list

Calculates community structures in interaction network.

References

[1] Blondel, V. D.; Guillaume, J.-L.; Lambiotte, R.; Lefebvre, E., Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment 2008, 2008 (10), P10008.

[2] Newman, M. E. J., Analysis of weighted networks. Physical Review E 2004, 70 (5), 056131.