ihm_validation package
Submodules
ihm_validation.cx module
Crosslinking-MS validation for PDB-IHM
- class ihm_validation.cx.CxValidation(*args, **kwargs)[source]
Bases: GetInputInformation
- check_conditional_flag(data: DataFrame) None[source]
Check consistency of conditional flags in a restraint group
- get_cx_data() -> (<class 'pandas.core.frame.DataFrame'>, <class 'pandas.core.frame.DataFrame'>)[source]
Extract crosslinking-MS data from mmcif file
- static pyhmmer_alignment_to_map(hit) -> (<class 'dict'>, <class 'list'>)[source]
Convert HMMER alignment into residue map
ihm_validation.em module
3DEM validation for PDB-IHM
- class ihm_validation.em.EMValidation(*args, **kwargs)[source]
Bases: GetInputInformation
- VA_TIMEOUT = 900
- static get_em_reconstruction_method(map_metadata)[source]
Get EM reconstruction method from the controlled vocabulary
ihm_validation.excludedvolume module
Excluded volume assessment for PDB-IHM
ihm_validation.format_checker module
Detect file format and check residue and atom names in IHMCIF file.
This module provides functionality to:
- Detect file format (PDB, mmCIF, or IHMCIF)
- Validate CIF files against PDBx dictionary
- Validate IHMCIF files against combined PDBx+IHM dictionary (following the approach
- Check residue and atom names in IHMCIF files
- class ihm_validation.format_checker.FileFormat(value)[source]
Bases: Enum
Enumeration of supported file formats
- IHMCIF = 'IHMCIF'
- MMCIF = 'PDBx/mmCIF'
- PDB = 'PDB'
- UNKNOWN = 'UNKNOWN'
- ihm_validation.format_checker.check_all_exception(system: System)[source]
Perform all checks. Throw an exception if a check fails.
- ihm_validation.format_checker.check_all_log(system: System) int[source]
Perform all checks. Log a message if a check fails and return a non-zero exit code.
- ihm_validation.format_checker.check_atom_names_chimerax(cif_file, timeout=180)[source]
Run ChimeraX on a CIF file and extract warnings about atoms not in residue templates.
- Parameters:
cif_file – Path to the CIF file
timeout – Timeout in seconds (default: 180)
- Returns:
List of warning messages (empty list if no warnings)
- ihm_validation.format_checker.check_entities_histidines(system: System, histidines=frozenset({'HID', 'HIE', 'HIP'}))[source]
Find any non-standard histidine chemical components
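To make the idea concrete, the sketch below scans plain residue-name lists for the histidine variants named in the default argument. It is self-contained and does not use ihm.System: the function name `find_nonstandard_histidines` and the dict-of-lists input are assumptions made for this example, not part of ihm_validation.

```python
def find_nonstandard_histidines(entities, histidines=frozenset({'HID', 'HIE', 'HIP'})):
    """Return {entity_id: [(position, comp_id), ...]} for flagged residues.

    `entities` is a hypothetical stand-in for the real System object:
    a dict mapping an entity id to its list of component ids.
    """
    found = {}
    for entity_id, comp_ids in entities.items():
        # Positions are 1-based, as in sequence numbering.
        hits = [(i, c) for i, c in enumerate(comp_ids, start=1) if c in histidines]
        if hits:
            found[entity_id] = hits
    return found


example = {'1': ['MET', 'HIS', 'HIE', 'GLY'], '2': ['ALA', 'HIS']}
print(find_nonstandard_histidines(example))
```

Entities without any flagged residue are omitted from the result, so an empty dict means nothing was found.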
- ihm_validation.format_checker.check_file_exception(fname: str, check_format: bool = True, validate_dictionary: bool = True)[source]
Parse a file, do all checks, throw an exception if a check fails.
- Parameters:
fname – Path to the file to check
check_format – If True, verify the file format before checking
validate_dictionary – If True, validate CIF/IHMCIF files against dictionaries
- ihm_validation.format_checker.check_file_format(fname: str, validate_dictionary: bool = True, raise_on_error: bool = True)[source]
Check file format and validate that it is IHMCIF.
- Parameters:
fname – Path to the file to check
validate_dictionary – If True, validate IHMCIF files against dictionaries
raise_on_error – If True, raise ValueError on format errors; if False, return error message
- Returns:
If raise_on_error is False: (success: bool, error_msg: str or None). If raise_on_error is True: None (raises ValueError on error)
- Raises:
ValueError – If file format is not IHMCIF (when raise_on_error is True)
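The two calling conventions can be easy to mix up, so here is a minimal sketch of a caller that uses the non-raising mode. To keep it runnable without ihm_validation installed, the checker is passed in as a parameter; `summarize_format_check` and `fake_checker` are hypothetical names, and `fake_checker` only imitates the documented contract with a trivial file-extension test.

```python
def summarize_format_check(checker, fname):
    """Call a check_file_format-like callable in non-raising mode.

    With raise_on_error=False the callable is expected to return
    a (success, error_msg) pair, as described in the entry above.
    """
    success, error_msg = checker(fname, validate_dictionary=True, raise_on_error=False)
    if success:
        return f"{fname}: OK"
    return f"{fname}: FAILED ({error_msg})"


def fake_checker(fname, validate_dictionary=True, raise_on_error=True):
    # Hypothetical stand-in for the real function; not the library's logic.
    ok = fname.endswith('.cif')
    msg = None if ok else 'file format is not IHMCIF'
    if raise_on_error:
        if not ok:
            raise ValueError(msg)
        return None
    return ok, msg


print(summarize_format_check(fake_checker, 'model.cif'))
print(summarize_format_check(fake_checker, 'model.pdb'))
```

In a real script, the actual ihm_validation.format_checker.check_file_format would presumably be passed in place of `fake_checker`.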
- ihm_validation.format_checker.check_file_log(fname: str, check_format: bool = True, validate_dictionary: bool = True) int[source]
Parse a file, do all checks, log a message if a check fails, and return a non-zero exit code.
- Parameters:
fname – Path to the file to check
check_format – If True, verify the file format before checking
validate_dictionary – If True, validate PDBx/mmCIF or IHMCIF files against dictionaries
- Returns:
Exit code: 0 for success, non-zero for failure
- ihm_validation.format_checker.check_models(system: System)[source]
Find any non-standard histidine chemical components
- ihm_validation.format_checker.detect_format(file_path: str, max_lines: int = 3000) Tuple[FileFormat, str][source]
Detect the format of a structural biology file.
- Parameters:
file_path – Path to the file to analyze
max_lines – Maximum number of lines to read for detection (default: 3000)
- Returns:
Tuple of (FileFormat enum, reason string)
- Raises:
FileNotFoundError – If the file does not exist
IOError – If the file cannot be read
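The detection rules themselves are not described here. As a rough illustration of how a line-based detector with a `max_lines` cap could work, the sketch below re-declares the enumeration and applies a few simple prefix checks. The function name `detect_format_sketch` and all of its heuristics are assumptions for this example and should not be read as the library's actual logic.

```python
from enum import Enum
from typing import Tuple


class FileFormat(Enum):
    # Values copied from the enumeration documented above.
    IHMCIF = 'IHMCIF'
    MMCIF = 'PDBx/mmCIF'
    PDB = 'PDB'
    UNKNOWN = 'UNKNOWN'


def detect_format_sketch(file_path: str, max_lines: int = 3000) -> Tuple[FileFormat, str]:
    """Guess the format from the first max_lines lines (illustrative heuristics only)."""
    has_data_block = False
    has_cif_item = False
    has_pdb_record = False
    # open() raises FileNotFoundError for a missing file, as in the documented contract.
    with open(file_path, encoding='utf8', errors='replace') as fh:
        for i, line in enumerate(fh):
            if i >= max_lines:
                break
            if line.startswith('data_'):
                has_data_block = True
            elif line.startswith('_ihm_'):
                # Any IHM-dictionary category is taken as evidence of IHMCIF.
                return FileFormat.IHMCIF, f'found IHM category on line {i + 1}'
            elif line.startswith('_'):
                has_cif_item = True
            elif line.startswith(('ATOM  ', 'HETATM', 'HEADER', 'CRYST1')):
                has_pdb_record = True
    # CIF evidence takes precedence, since atom_site rows can also start with ATOM.
    if has_data_block and has_cif_item:
        return FileFormat.MMCIF, 'CIF data block without IHM categories'
    if has_pdb_record:
        return FileFormat.PDB, 'found PDB-style records'
    return FileFormat.UNKNOWN, 'no recognizable records'
```

Returning a reason string alongside the enum member, as the documented signature does, makes it straightforward to report why a file was rejected.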
- ihm_validation.format_checker.parse_ihm_cif(fname, encoding='utf8') tuple[source]
Parse an IHMCIF file using the ihm library
- ihm_validation.format_checker.validate_cif_against_dictionary(file_path: str, dictionary) None[source]
Validate a CIF file against a dictionary.
- Parameters:
file_path – Path to the CIF file to validate
dictionary – Dictionary object to validate against
- Raises:
ihm.dictionary.ValidatorError – If validation fails
IOError – If file cannot be read
- ihm_validation.format_checker.validate_ihmcif(file_path: str) None[source]
Validate an IHMCIF file against the combined PDBx/mmCIF+IHMCIF dictionary.
Deposited integrative models should conform to both the PDBx dictionary (used to define basic structural information such as residues and chains) and the IHM dictionary (used for information specific to integrative modeling). Some entries also use the FLRCIF dictionary for FRET/fluorescence data.
- Parameters:
file_path – Path to the IHMCIF file to validate
- Raises:
ihm.dictionary.ValidatorError – If validation fails
ihm_validation.futures module
This is a beta-version of a new validation framework
- class ihm_validation.futures.CXMSValidator[source]
Bases: Validator
- property ertypes
Get all extended restraint types
- property ertypes_df
- load_restraint(restraint: CrossLinkRestraint) None[source]
Extract crosslinking-MS data from mmcif file
- property number_of_restraint_groups: int
- property number_of_restraints: int
- property rtdtypes
Get all crosslink types
ihm_validation.generate_static_html_pages module
A script for generation of validation_help.html and about_validation.html
ihm_validation.get_plots module
Generate overview plots
- class ihm_validation.get_plots.Plots(*args, imageDirName, **kwargs)[source]
Bases: GetInputInformation
- plot_quality_at_glance(molprobity_data: dict | None = None, exv_data: dict | None = None, sas_data_quality: dict | None = None, sas_fit: dict | None = None, cx_data_quality: dict | None = None, cx_fit: dict | None = None, em_data_quality: dict | None = None, em_fit: dict | None = None) dict[source]
ihm_validation.ihm_validator module
Main running script
- ihm_validation.ihm_validator.write_html(prefix: str, template_dict: dict, template_list: list, dirName: str)[source]
- ihm_validation.ihm_validator.write_json(mmcif_file: str, template_dict: dict, dirName: str, dirName_Outputs: str)[source]
ihm_validation.images module
BASE64 encoded images for the report
ihm_validation.mmcif_io module
Read/write IHMCIF file
- class ihm_validation.mmcif_io.GetInputInformation(mmcif_file: str | None = None, system: System | None = None, encoding: str = 'utf-8', cache: str = '.', nocache: bool = False)[source]
Bases: object
- property atomic
- property cg
- check_sphere() int[source]
Check the resolution of the structure; returns 0 if it is atomic and 1 if the model is multi-resolution
- delete_extra_loops(some_text=[]) list[source]
Helper for rewriting the mmCIF file for MolProbity: cleans up extra loops in the CIF file
- property deposition_date
Return initial deposition date
- get_RB_flex_dict() -> (<class 'dict'>, <class 'dict'>, <class 'int'>, <class 'int'>)[source]
get RB and flexible segments from model information
- get_model_rep_dict() dict[source]
Map models to representations useful especially for multi-state systems
- get_representation()[source]
Get details on model composition based on the types of representation listed
- get_representation_details() dict[source]
Extract details about representation (atomic/coarse-grained)
- get_representation_scale(rep, chid=None) dict[source]
Extract details about representation (atomic/coarse-grained)
- property has_crosslinking_ms_dataset
- property has_em_dataset
- property has_sas_dataset
- mmcif_get_lists(filetemp=None) -> (<class 'list'>, <class 'dict'>, <class 'dict'>, <class 'list'>)[source]
Helper for rewriting the mmCIF file for MolProbity: reads the atom_site dictionary terms and returns a list
- property num_models
- remove_flr(some_text=[]) list[source]
Helper for rewriting the mmCIF file for MolProbity: deletes all FLR-related text and key-value pairs
- property sequences: dict
Return dictionary with sequences of polymeric entities from mmcif entry
- class ihm_validation.mmcif_io.IHMVAvailableModes(value)[source]
Bases: Enum
An enumeration.
- DEVELOPMENT = 1
- PRODUCTION = 0
- ihm_validation.mmcif_io.get_operational_mode() IHMVAvailableModes[source]
Check environment variables and set the operational mode
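Which environment variables are consulted is not stated here. As a sketch of the general pattern, the example below re-declares the enumeration and reads a single variable. The variable name `IHMV_MODE`, the function name `get_operational_mode_sketch`, and the fallback to PRODUCTION are all assumptions made for illustration.

```python
import os
from enum import Enum


class IHMVAvailableModes(Enum):
    # Members and values as documented above.
    DEVELOPMENT = 1
    PRODUCTION = 0


def get_operational_mode_sketch(environ=None) -> IHMVAvailableModes:
    """Pick a mode from the environment, defaulting to PRODUCTION.

    'IHMV_MODE' is a hypothetical variable name used only in this sketch.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get('IHMV_MODE', 'PRODUCTION').strip().upper()
    try:
        # Look the member up by name, e.g. 'DEVELOPMENT'.
        return IHMVAvailableModes[raw]
    except KeyError:
        # Unknown values fall back to production mode (an assumption of this sketch).
        return IHMVAvailableModes.PRODUCTION


print(get_operational_mode_sketch({'IHMV_MODE': 'development'}))
```

Accepting the environment as an optional mapping keeps the function easy to test without modifying os.environ.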
ihm_validation.molprobity module
Perform MolProbity assessment
- class ihm_validation.molprobity.AtomSiteVariant(model_id: int | None = None)[source]
Bases: Variant
Used to select typical PDBx/IHM file output. See write().
- class ihm_validation.molprobity.GetMolprobityInformation(*args, **kwargs)[source]
Bases: GetInputInformation
- convert = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/ihmvalidation/checkouts/v3.1/ihm_validation/molprobity_convert.py')
- data = {}
- get_internal_version(tool: str = 'molprobity.clashscore') str[source]
Get internal molprobity version. We assume that all tools belong to the same release.
- property models: list
- property molprobity_version: str
Get MolProbity version. We assume that all tools belong to the same release.
- class ihm_validation.molprobity.MyModelDumper(model_id: int | None = None)[source]
Bases: _ModelDumper
- dump(system, writer)[source]
Use writer to write information about system to mmCIF or BinaryCIF.
- Parameters:
system (ihm.System) – The ihm.System object containing all information about the system.
writer (ihm.format.CifWriter or ihm.format_bcif.BinaryCifWriter) – Utility class to write data to the output file.
- model_id = None
ihm_validation.molprobity_convert module
Convert MolProbity output from json to a pickled object
ihm_validation.precision module
Run local precision assessment (PrISM)
ihm_validation.report module
Generation of PDF and HTML reports
- class ihm_validation.report.WriteReport(mmcif_file: str, db: str = '.', cache: str = '.', nocache: bool = False, enable_sas: bool = False, enable_cx: bool = False, enable_em: bool = False, enable_prism: bool = False)[source]
Bases: object
- report_version = '3.1'
- run_cx_validation(Template_Dict: dict) dict[source]
Get cx validation information from mmcif files. NOTE: this function is incomplete; it currently evaluates satisfaction from mmcif files and not the entire ensemble.
- run_cx_validation_plots(Template_Dict: dict, imageDirName: str) None[source]
Create validation plots for cx datasets. NOTE: this function is incomplete; the plots are also ugly and need to be refined.
- run_entry_composition(Template_Dict: dict) dict[source]
get entry composition, relies on IHM library
- run_model_quality(Template_Dict: dict, csvDirName: str, htmlDirName: str) -> (<class 'dict'>, <class 'dict'>, <class 'dict'>, <class 'dict'>, <class 'dict'>)[source]
Get excluded volume for multiscale models and MolProbity info for atomic models. Exception: models with DNA; we need a way to assess models with DNA.
- run_quality_glance(molprobity_dict: dict, exv_data: dict, sas_data_quality: dict, sas_fit: dict, cx_data_quality: dict, cx_fit: dict, em_data_quality: dict, em_fit: dict, imageDirName: str) dict[source]
get quality at glance image; will be updated as validation report is updated
- run_sas_validation(Template_Dict: dict) -> (<class 'dict'>, <class 'dict'>, <class 'dict'>)[source]
get sas validation information from SASCIF or JSON files
- run_sas_validation_plots(Template_Dict: dict, imageDirName: str)[source]
get sas validation information from SASCIF or JSON files
- run_supplementary_table(Template_Dict, location='N/A', physics=['Information about physical principles was not provided'], method_details='N/A', sampling_validation=None, validation_input=['-'], cross_validation='N/A', Data_quality=['-'], clustering=None, resolution='N/A')[source]
get supplementary table, will be updated as validation report is updated
ihm_validation.sas module
SAS assessment for PDB-IHM
- class ihm_validation.sas.SasValidation(*args, **kwargs)[source]
Bases: GetInputInformation
- get_Guinier_data() -> (<class 'dict'>, <class 'dict'>)[source]
get Guinier plot data from JSON files
- get_sas_ids() list[source]
function to get all SASBDB codes used in the model, returns a list of SASBDB codes
ihm_validation.sas_plots module
Generate plots for SAS assessment
- class ihm_validation.sas_plots.SasValidationPlots(*args, imageDirName, **kwargs)[source]
Bases: SasValidation
- plot_intensities_log(sasbdb: str, df: DataFrame)[source]
plot intensities on a log scale with errors
ihm_validation.utility module
Various helper functions
- ihm_validation.utility.calc_optimal_range(counts: list) tuple[source]
heuristics to find optimal range for plots
>>> calc_optimal_range((10, 1567))
(9.0, 1568.5669999999998)
- ihm_validation.utility.check_for_dataset_type(dataset_list: list | None = None, dataset_type=None) bool[source]
check if the specific dataset type is present in the dataset list
- ihm_validation.utility.compress_cx_stats(cx_stats: dict) list[source]
Extract per-model satisfaction stats as a flat list
- ihm_validation.utility.dict_to_JSlist_rows(dict1: dict, dict2: dict) list[source]
format rigid and flexible segments
- ihm_validation.utility.format_RB_text(tex: list) str[source]
convert RB information to text for supp table
- ihm_validation.utility.format_flex_text(tex: list) str[source]
convert flex information to text for supp table
- ihm_validation.utility.format_pdbx_id(pdbx_id: str) str[source]
Format all PDB IDs to extended format
- ihm_validation.utility.format_wwpdb_url(pdb_id: str) str[source]
Generate a url to the wwPDB entry page
- ihm_validation.utility.get_RB(data_list: list) list[source]
format RB for supplementary/summary table
- ihm_validation.utility.get_alphafolddb_link(acc: str) str | None[source]
Format link for AlphaFold DB
- ihm_validation.utility.get_cx_data_fits(cx_dict: dict) list[source]
format crosslinking-MS data for supplementary/summary table
- ihm_validation.utility.get_datasets(data_dict: dict) list[source]
format datasets for supplementary/summary table
- ihm_validation.utility.get_datasets_summary(system: System) list[source]
Get counts for all data types used for modeling
- ihm_validation.utility.get_flex(data_list: list) list[source]
format flexible regions for supplementary/summary table
- ihm_validation.utility.get_hierarchy_from_atoms(atoms) dict[source]
Construct polymer hierarchy from a list of atoms
- ihm_validation.utility.get_hierarchy_from_model(model) dict[source]
Construct polymer hierarchy from atoms and beads in the model
- ihm_validation.utility.get_method_name(sample_dict: dict) str[source]
format method name for supplementary/summary table
- ihm_validation.utility.get_method_type(sample_dict: dict) str[source]
format method type for supplementary/summary table
- ihm_validation.utility.get_restraints_info(restraints: dict) list[source]
format restraints info for supplementary/summary table
- ihm_validation.utility.get_rg_data(rg_dict: dict) list[source]
format rg data for supplementary/summary table
- ihm_validation.utility.get_rg_data_fits(rg_dict: dict) list[source]
format sas model fits for supplementary/summary table
- ihm_validation.utility.get_software(data_dict: dict) list[source]
format software for supplementary/summary table
- ihm_validation.utility.get_subunits(sub_dict: dict) list[source]
format chains for supplementary/summary table
- ihm_validation.utility.get_unique_datasets(name: dict) list[source]
Get all data types that are yet to be validated. The ones that can’t be validated, or that have already been validated, are in the sub_data set.
- ihm_validation.utility.is_pdbx_id(pdbx_id: str) bool[source]
Check if the PDB ID is in extended format
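For readers unfamiliar with the extended form, the sketch below shows one way such a check and conversion might look. It assumes the extended format is `pdb_` followed by eight lowercase alphanumeric characters, with legacy four-character IDs left-padded with zeros. The function names `is_extended_pdb_id` and `to_extended_pdb_id` are hypothetical, and the library's own rules may differ.

```python
import re

# Assumed patterns: extended IDs look like pdb_00001abc, legacy IDs like 1abc.
_EXTENDED = re.compile(r'pdb_[0-9a-z]{8}')
_LEGACY = re.compile(r'[0-9][0-9a-z]{3}')


def is_extended_pdb_id(pdb_id: str) -> bool:
    """True if the whole string matches the assumed extended format."""
    return _EXTENDED.fullmatch(pdb_id.strip().lower()) is not None


def to_extended_pdb_id(pdb_id: str) -> str:
    """Convert a legacy 4-character ID to the assumed extended format."""
    pid = pdb_id.strip().lower()
    if is_extended_pdb_id(pid):
        return pid
    if _LEGACY.fullmatch(pid):
        # Left-pad the 4-character code with zeros to 8 characters.
        return 'pdb_' + pid.zfill(8)
    raise ValueError(f'not a recognizable PDB ID: {pdb_id!r}')


print(to_extended_pdb_id('1ABC'))
```

Raising ValueError on unrecognized input is a choice made for this sketch. The documented functions do not state how they treat malformed IDs.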
- ihm_validation.utility.mp_readable_format(mp: dict) list[source]
Format MolProbity results for supplementary/summary table
- ihm_validation.utility.order_of_magnitude(value: float) float[source]
calculate the order of magnitude for a given number
>>> order_of_magnitude(135)
2.0
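One common way to compute this quantity is to take the floor of the base-10 logarithm. The sketch below uses that definition and agrees with the doctest value for 135. It is an assumption that the library computes it this way, and `order_of_magnitude_sketch` is a hypothetical name.

```python
import math


def order_of_magnitude_sketch(value: float) -> float:
    """Return floor(log10(|value|)) as a float; value must be non-zero."""
    if value == 0:
        raise ValueError('order of magnitude is undefined for 0')
    return float(math.floor(math.log10(abs(value))))


print(order_of_magnitude_sketch(135))  # 2.0, the same value as the doctest above
```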
Module contents
__init__.py - Init file for the package