pylipid.func.calculate_surface_area_wrapper¶

pylipid.func.calculate_surface_area_wrapper(trajfile, topfile, traj_idx, binding_site_map={}, nprot=1, timeunit='us', stride=1, dt_traj=None, radii=None)[source]¶

A wrapper function for calculating surface area.

A wrapper function is to assist the calculation using p_tqdm, which is a multiprocessing library incorporated with progress bars. So some of the setting or parameters are used to assist the use of multiprocessing.

In this function, the provided trajectory is read by mdtraj.read() and the solvent accessible surface of each residue is calculated by mdtraj.shrake_rupley. The binding site surface area is calculated as the sum of surface areas of its comprising residues. The binding sites are defined by binding_site_map which is a python dictionary object that includes binding site IDs as its keys and the residue indices of the binding site residues as the corresponding values.

The calculation of surface area requires a definition of the protein atom radius. MDtraj defines the radii for common atoms (see here). The radius of the BB bead in MARTINI2 is defined as 0.26 nm, the SC1/SC2/SC3 are defined as 0.23 nm in this function. Use the param radii to define or change of definition of atom radius.

The returned data surface_area is a list of pandas.DataFrame objects, and each of these pandas.DataFrame object contains surface area data for one copy of the proteins in the simulation system. That is, if the system has N copies of the receptor, surface_area will have N pandas.DataFrame objects. Each pandas.DataFrame contains the surface areas as a function of time for the selected binding site are shown by column with the column name of “Binding Site {idx}”. This pandas.DataFrame object also has a “Time” column which records the timestep at which the surface areas are measured. The other returned data data_keys contains a list of tuples of (traj_idx, protein_idx), which corresponds to the list of pandas.DataFrame objects of surface_area. The reason for having this arrangement is to assist the generation of a big pandas.DataFrame that includes surface area data from all trajectories in pylipid.api.LipidInteraction.compute_surface_area() when used in combination with p_tqdm.

Parameters

trajfile (str) – Trajectory filename.
topfile (str) – Topology filename.
traj_idx (idx) – The index of the given trajectory in the original trajectory list.
binding_site_map (dict) – A python dictionary to defines the selected binding site IDs and the residue indices of the binding site residues
nprot (int, default=1) – Number of copies of the receptor in the simulation system.
timeunit ({"us", "ns"}) – The time unit used for reporting the timestpes.
stride (int, default=1) – Analyze every stride-th frame from the trajectory.
dt_traj (int or None) – Timestep of trajectories. It is required when trajectories do not have timestep information. Not needed for trajectory formats of e.g. xtc, trr etc. If None, timestep information will take from trajectories.
radii (dict or None) – Protein atom radii {atom name: radius}. The radius is reported in nanometer.

Returns

surface_area (list) – A list of {nprot} pandas.DataFrame objects. Each pandas.DataFrame object records the surface areas as a function of time for the selected binding sites on one copy of the receptor in the system.
data_keys (list) – A list of tuples (traj_idx, protein_idx) which are the traj index and protein index information of the pandas.DataFrame objects in surface_area.