VirtualMicrobes.data_tools package¶
Submodules¶
VirtualMicrobes.data_tools.store module¶
- class VirtualMicrobes.data_tools.store.DataCollection(save_dir='', name='dummy', filename=None)[source]¶
Bases: object
- get_data_point(index)[source]¶
Get a data point at index.
Exception catching handles the case where, due to a code update, a loaded DataStore does not yet hold a particular DataCollection. In that case a dummy dict producing empty numpy arrays is returned.
Parameters: index – key to a dict-shaped data point
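The fallback described above can be sketched as follows. This is a hypothetical illustration (the names `EmptyArrayDict` and `data_points` are assumptions, not the library's own identifiers): a missing index or missing attribute yields a dummy dict whose every key maps to an empty numpy array.

```python
import numpy as np

class EmptyArrayDict(dict):
    """Dummy dict that yields an empty numpy array for any missing key."""
    def __missing__(self, key):
        return np.array([])

def get_data_point(collection, index):
    # A collection loaded from an older pickled DataStore may not hold
    # this data point yet; fall back to the dummy dict in that case.
    try:
        return collection.data_points[index]
    except (AttributeError, KeyError):
        return EmptyArrayDict()
```

Downstream plotting code can then iterate over the returned dict without special-casing missing collections, since every lookup produces a (possibly empty) array.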
- prune_data_file_to_time(min_tp, max_tp)[source]¶
Prune a data file by dropping lines that fall outside the range [min_tp, max_tp].
This function assumes that the first line contains column names and subsequent lines are time point data, each starting with a comma-separated (integer) time point.
Parameters: - min_tp – time point before which lines are dropped
- max_tp – time point after which lines are dropped
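Under the file layout described above (header line, then data lines led by an integer time point), the pruning logic can be sketched as a small helper. This is an illustrative stand-in, not the library's implementation:

```python
def prune_lines(lines, min_tp, max_tp):
    """Keep the header plus data lines whose leading integer time point
    falls within [min_tp, max_tp] (hypothetical helper)."""
    header, data = lines[0], lines[1:]
    kept = [line for line in data
            if min_tp <= int(line.split(',', 1)[0]) <= max_tp]
    return [header] + kept
```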
- class VirtualMicrobes.data_tools.store.DataStore(base_save_dir, name, utility_path, n_most_frequent_metabolic_types, n_most_frequent_genomic_counts, species_markers, reactions_dict, small_mols, clean=True, create=True)[source]¶
Bases: object
Storage object for simulation data.
Keep a store of simulation data that can be written to disk and retrieved for online plotting. Typically, the store will only hold the most recent data points before appending the data to the relevant on-disk storage files.
- add_ancestry_data_point(comp_dat, ref_lods, time_point, leaf_samples=100)[source]¶
Compare lines of descent in a reference tree to a population snapshot at a previous time point.
The comp_dat is a po
- best_stats_dir = 'best_dat'¶
- change_save_location(base_save_dir=None, name=None, clean=False, copy_orig=True, create=True, current_save_path=None)[source]¶
- class_version = '1.7'¶
- crossfeed_stats = ['crossfeeding', 'strict crossfeeding', 'exploitive crossfeeding']¶
- eco_diversity_stats_dict = {'consumer type': <function <lambda>>, 'export type': <function <lambda>>, 'genotype': <function <lambda>>, 'import type': <function <lambda>>, 'metabolic type': <function <lambda>>, 'producer type': <function <lambda>>, 'reaction genotype': <function <lambda>>}¶
- eco_stats_dir = 'ecology_dat'¶
- eco_type_stats = ['metabolic_type_vector', 'genotype_vector']¶
- fit_stats_names = ['toxicity', 'toxicity_change_rate', 'raw_production', 'raw_production_change_rate']¶
- functional_stats_names = ['conversions_type', 'genotype', 'reaction_genotype', 'metabolic_type', 'import_type', 'export_type', 'tf_sensed', 'consumes', 'produces']¶
- gain_loss_columns = ['gain', 'loss']¶
- genome_dist_stats = ['copy_numbers', 'copy_numbers_tfs', 'copy_numbers_enzymes', 'copy_numbers_inf_pumps', 'copy_numbers_eff_pumps']¶
- genome_simple_stats_names = ['tf_promoter_strengths', 'enz_promoter_strengths', 'pump_promoter_strengths', 'tf_ligand_differential_ks', 'enz_subs_differential_ks', 'pump_subs_differential_ks', 'pump_ene_differential_ks', 'tf_differential_reg', 'tf_k_bind_ops', 'enz_vmaxs', 'pump_vmaxs', 'tf_ligand_ks', 'enz_subs_ks', 'pump_ene_ks', 'pump_subs_ks']¶
- genome_simple_val_stats_names = ['genome_size', 'chromosome_count', 'tf_count', 'enzyme_count', 'eff_pump_count', 'inf_pump_count', 'tf_avrg_promoter_strengths', 'enz_avrg_promoter_strengths', 'pump_avrg_promoter_strengths', 'tf_sum_promoter_strengths', 'enz_sum_promoter_strengths', 'pump_sum_promoter_strengths']¶
- grid_stats = ['neighbor crossfeeding', 'strict neighbor crossfeeding', 'exploitive neighbor crossfeeding', 'grid production values', 'grid production rates', 'grid death rates', 'grid cell sizes', 'lineages', 'grid ages', 'grid genomesizes', 'grid divided', 'metabolic_type_grid']¶
- grn_edits = {'': <function <lambda>>, '_pruned_1._cT_iT': <function <lambda>>}¶
- init_dict_stats_store(save_dir, stats_name, column_names, index_name='time_point', **kwargs)[source]¶
- init_save_dirs(clean=False, create=True)[source]¶
Create the paths to store various data types.
Parameters: - clean – (bool) remove existing files in the path
- create – create the path within the file system
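The clean/create behaviour described above can be sketched as follows. This is an assumed reading of the parameter semantics, not the library's own code:

```python
import os
import shutil

def init_save_dir(path, clean=False, create=True):
    """Hypothetical sketch: optionally wipe an existing directory,
    then (re)create it within the file system."""
    if clean and os.path.isdir(path):
        shutil.rmtree(path)            # remove existing files in path
    if create:
        os.makedirs(path, exist_ok=True)
    return path
```

In the DataStore this would be applied to each of the per-category directories (e.g. 'best_dat', 'ecology_dat', 'population_dat') under the base save directory.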
- init_stats_column_names(n_most_freq_met_types, n_most_freq_genomic_counts, species_markers, reactions_dict, small_mols)[source]¶
Initialize column names for various data collections.
Parameters: - n_most_freq_met_types
- n_most_freq_genomic_counts
- lod_stats_dir = 'lod_dat'¶
- meta_stats_names = ['providing_count', 'strict_providing_count', 'exploiting_count', 'strict_exploiting_count', 'producing_count', 'consuming_count', 'importing_count', 'exporting_count']¶
- metabolic_categories = ['producer', 'consumer', 'import', 'export']¶
- mut_stats_names = ['point_mut_count', 'chromosomal_mut_count', 'stretch_mut_count', 'chromosome_dup_count', 'chromosome_del_count', 'chromosome_fuse_count', 'chromosome_fiss_count', 'sequence_mut_count', 'tandem_dup_count', 'stretch_del_count', 'stretch_invert_count', 'translocate_count', 'internal_hgt_count', 'external_hgt_count']¶
- network_stats_funcs = {'all_node_connectivities': <function <lambda>>, 'degree': <function <lambda>>, 'in_degree': <function <lambda>>, 'out_degree': <function <lambda>>}¶
- phy_dir = 'phylogeny_dat'¶
- pop_genomic_stats_dict = {'chromosome counts': <function <lambda>>, 'genome sizes': <function <lambda>>}¶
- pop_simple_stats_dict = {'ages': <function <lambda>>, 'cell sizes': <function <lambda>>, 'death rates': <function <lambda>>, 'differential regulation': <function <lambda>>, 'enz_subs_ks': <function <lambda>>, 'enzyme counts': <function <lambda>>, 'enzyme promoter strengths': <function <lambda>>, 'enzyme vmaxs': <function <lambda>>, 'exporter counts': <function <lambda>>, 'hgt neigh': <function <lambda>>, 'importer counts': <function <lambda>>, 'iterages': <function <lambda>>, 'offspring counts': <function <lambda>>, 'pos production': <function <lambda>>, 'production rates': <function <lambda>>, 'production values': <function <lambda>>, 'pump promoter strengths': <function <lambda>>, 'pump vmaxs': <function <lambda>>, 'pump_ene_ks': <function <lambda>>, 'pump_subs_ks': <function <lambda>>, 'regulator_score': <function <lambda>>, 'tf counts': <function <lambda>>, 'tf promoter strengths': <function <lambda>>, 'tf_k_bind_ops': <function <lambda>>, 'tf_ligand_ks': <function <lambda>>, 'times divided': <function <lambda>>, 'toxicity rates': <function <lambda>>, 'uptake rates': <function <lambda>>}¶
- pop_stats_dir = 'population_dat'¶
- save_dir¶
- simple_stats_columns = ['avrg', 'min', 'max', 'median', 'std', 'total']¶
- simple_value_column = ['value']¶
- snapshot_stats_names = ['historic_production_max']¶
- trophic_type_columns = ['fac-mixotroph', 'autotroph', 'heterotroph', 'obl-mixotroph']¶
- upgrade(odict)[source]¶
Upgrade an older pickled version of the class to the latest version. Version information is saved as a class variable and should be updated when class invariants (e.g. fields) change (see also __setstate__).
Adapted from the recipe at http://code.activestate.com/recipes/521901-upgradable-pickles/
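The upgradable-pickle pattern referenced above can be sketched as follows. The class and field names here are illustrative, modelled on the ActiveState recipe rather than copied from the library: the instance records the class version at construction, and __setstate__ compares the pickled version against the current class_version, calling upgrade() to fill in any fields introduced since the pickle was written.

```python
import pickle

class Store:
    class_version = '1.7'

    def __init__(self):
        self.version = self.class_version
        self.save_dir = 'data'

    def upgrade(self):
        # Add fields introduced after the pickled version was written.
        if not hasattr(self, 'save_dir'):
            self.save_dir = 'data'
        self.version = self.class_version

    def __setstate__(self, state):
        self.__dict__.update(state)
        if getattr(self, 'version', None) != self.class_version:
            self.upgrade()
```

Unpickling bypasses __init__, so __setstate__ is the natural hook to detect and repair a stale object state.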
- class VirtualMicrobes.data_tools.store.DictDataCollection(column_names, index_name, to_string=None, **kwargs)[source]¶
Bases: VirtualMicrobes.data_tools.store.DataCollection
Entry point for storing and retrieving structured data.