pylipid.api

class pylipid.api.LipidInteraction(trajfile_list, cutoffs=[0.475, 0.7], lipid='CHOL', topfile_list=None, lipid_atoms=None, nprot=1, resi_offset=0, save_dir=None, timeunit='us', stride=1, dt_traj=None)[source]

The main class that handles calculation and controls workflow.

LipidInteraction reads trajectory information via mdtraj.load(), so it supports most of the trajectory formats. LipidInteraction calculates lipid interactions with both protein residues and the calculated binding sites, and provides a couple of assisting functions to plot data and present data in various forms.

The methods of LipidInteraction can be divided into three groups based on their roles: one for calculation of interaction with protein residues, one for binding site and the last that contains assisting functions for plotting and generating data. Each of the first two groups has a core function to collect/calculate the required data for the rest of the functions in that group, i.e. collect_residue_contacts that builds lipid index for residues as a function of time for residue analysis; and compute_binding_sites that calculates the binding sites using the interaction network of the residues. The rest of the methods in each group are independent of each other.

LipidInteraction also has an attribute, named dataset, which stores the calculation interaction data in a pandas.DataFrame object and updates automatically after calculation. It records interaction data for protein residues by rows, including, for each residue, the interaction residence times, averaged durations, occupancy and lipid count etc., and binding site IDs and the various interaction data of the belonging binding site.

For the computing-demanding functions of, i.e. compute_residue_koff, compute_site_koff, analyze_bound_poses, and compute_surface_area, PyLipID uses the python multiprocessing library to speed up the calculation. Users can specify the number of CPUs these functions can use, otherwise all the CPUs in the system will be used by default.

Parameters
  • trajfile_list (str or a list of str) – Trajectory filename(s). Read by mdtraj.load() to obtain trajectory information.

  • cutoffs (list of two scalar or a scalar, default=[0.475, 0.7]) – Cutoff value(s) for defining contacts. When a list of two scalar are provided, the dual-cutoff scheme will be used. A contact in the dual-cutoff scheme starts when a lipid gets closer than the lower cutoff, and ends when this lipid moves farther than the upper cutoff. The duration between the two time points is the duration of this contact.

  • lipid (str, default="CHOL") – Lipid name in topology.

  • topfile_list (str or a list of str, default=None) – Topology filename(s). Most trajectory formats do not contain topology information. Provide either the path to a RCSB PDB file, a trajectory, or a topology for each trajectory in trajfile_list for the topology information. See mdtraj.load(). for more information.

  • lipid_atoms (list of str, default=None) – Lipid atom names. Only interactions of the provided atoms will be considered for the calculation of contacts. If None, all atoms of the lipid molecule will be used.

  • nprot (int, default=1) – Number of protein copies in the system. If the system has N copies of the protein, ‘nprot=N’ will report averaged values from the N copies, but ‘nprot=1’ will report interaction values for each copy.

  • resi_offset (int, default=0) – Shift residue index in the reported results from what is shown in the topology. Can be useful for MARTINI force field.

  • save_dir (str, default=None) – The root directory to store the generated data. By default, a directory Interaction_{lipid} will be created in the current working directory, under which all the generated data are stored.

  • timeunit ({"us", "ns"}, default="us") – The time unit used for reporting results. “us” is micro-second and “ns” is nanosecond.

  • stride (int, default=1) – Only read every stride-th frame. The same stride in mdtraj.load().

  • dt_traj (float, default=None) – Timestep of trajectories. It is required when trajectories do not have timestep information. Not needed for trajectory formats of e.g. xtc, trr etc. If None, timestep information will take from trajectories.

Methods

collect_residue_contacts()

Create contacting lipid index for residues.

compute_residue_duration([residue_id])

Calculate lipid contact durations for residues

compute_residue_occupancy([residue_id])

Calculate the percentage of frames in which the specified residue formed lipid contacts for residues.

compute_residue_lipidcount([residue_id])

Calculate the average number of contacting lipids for residues.

compute_residue_koff([residue_id, …])

Calculate interaction koff and residence time for residues.

compute_binding_nodes([threshold, print_data])

Calculate binding sites.

compute_site_duration([binding_site_id])

Calculate interaction durations for binding sites.

compute_site_occupancy([binding_site_id])

Calculate the percentage of frames in which the specified lipid contacts are formed for binding sites.

compute_site_lipidcount([binding_site_id])

Calculate the average number of contacting lipids for binding site.

compute_site_koff([binding_site_id, …])

Calculate interactions koff and residence time for binding sites.

analyze_bound_poses([binding_site_id, …])

Analyze bound poses for binding sites.

compute_surface_area([binding_site_id, …])

Calculate binding site surface areas.

write_site_info([sort_residue, save_dir, fn])

Write a report on binding site with lipid interaction information.

show_stats_per_traj([write_log, print_log, …])

Show stats of lipid interaction per trajectory.

save_data(item[, save_dir])

Assisting function for saving data.

save_coordinate(item[, save_dir, fn_coord])

Save protein coordinates in PDB format with interaction data in the B factor column.

save_pymol_script(pdb_file[, save_dir])

Save a pymol script that maps interactions onto protein structure in PyMol.

plot(item[, save_dir, gap, fig_close, …])

Assisting function for plotting interaction data.

plot_logo(item[, letter_map, color_scheme, …])

Plot interactions using Logomaker.

Attributes

dataset()

Summary of lipid interaction stored in a pandas.DataFrame() object.

residue_list

A list of Residue names.

node_list

A list of binding site residue indices.

lipid

Lipid residue name.

lipid_atoms

Lipid atom names

cutoffs

Cutoffs used for calculating contacts.

nprot

Number of protein copies in system.

stride

Stride

dt_traj

Trajectory timestep

trajfile_list

Trajectory filenames

topfile_list

Topology filenames

resi_offset

Residue index offset

save_dir

Root directory for the generated data.

timeunit

Time unit used for reporting results.

koff([residue_id, residue_name])

Residue koff

res_time([residue_id, residue_name])

Residue residence time

koff_bs(bs_id)

Binding site koff

res_time_bs(bs_id)

Binding site residence time

residue([residue_id, residue_name, print_data])

Obtain the lipid interaction information for a residue

binding_site(binding_site_id[, print_data, …])

Obtain the lipid interaction information for a binding site.