syconn.processing¶
syconn.processing.axoness¶
-
syconn.processing.axoness.
calc_distance2soma
(graph, nodes)[source]¶ Calculates the distance to a soma node for each node and stores it in place in node.data[‘dist2soma’]. Builds a depth-first search graph at each node for sorted node ordering.
Parameters: - graph (graph of SkeletonAnnotation) –
- nodes (SkeletonNodes) –
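The idea of a per-node soma distance can be sketched as follows. This is not SyConn's implementation: it assumes the skeleton is available as a plain adjacency dictionary with edge lengths, and it swaps in a multi-source Dijkstra search for the depth-first ordering mentioned above. The names `distance_to_soma`, `adjacency` and `soma_nodes` are hypothetical.

```python
import heapq


def distance_to_soma(adjacency, soma_nodes):
    """Return {node: path length to the nearest soma node}.

    adjacency: {node: [(neighbor, edge_length), ...]} (hypothetical layout).
    soma_nodes: iterable of node ids that are labeled as soma.
    """
    dist = {node: 0.0 for node in soma_nodes}
    heap = [(0.0, node) for node in soma_nodes]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, length in adjacency.get(node, []):
            new_d = d + length
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                heapq.heappush(heap, (new_d, neighbor))
    return dist


# Toy skeleton: 0 (soma) - 1 - 2, with a side branch 1 - 3.
toy = {
    0: [(1, 100.0)],
    1: [(0, 100.0), (2, 50.0), (3, 30.0)],
    2: [(1, 50.0)],
    3: [(1, 30.0)],
}
dist2soma = distance_to_soma(toy, soma_nodes=[0])
```

In the real function the result is written to node.data[‘dist2soma’] rather than returned.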
-
syconn.processing.axoness.
predict_axoness_from_node_comments
(anno)[source]¶ Extracts axoness prediction from node comment for given contact site annotation.
Parameters: anno (SkeletonAnnotation) – Contact site Returns: skeleton IDs, skeleton axoness Return type: numpy.array, numpy.array
-
syconn.processing.axoness.
predict_axoness_from_nodes
(anno)[source]¶ Extracts axoness prediction from nodes for given contact site annotation.
Parameters: anno (SkeletonAnnotation) – Contact site Returns: skeleton IDs, skeleton axoness Return type: numpy.array, numpy.array
syconn.processing.cell_types¶
-
syconn.processing.cell_types.
calc_neuron_feat
(path, wd)[source]¶ Calculate neuron features using neuron class
Parameters: path (str) – path to mapped annotation kzip Returns: cell type features, skeleton ID Return type: numpy.array, numpy.array
-
syconn.processing.cell_types.
draw_feat_hist
(wd, k=15, classes=(0, 1, 2, 3), nb_bars=20)[source]¶ Draws the histogram(s) of the k most important feature(s)
Parameters: - wd (str) – Path to working directory
- k (int) – Number of features to be plotted
- classes (tuple) – Class labels to evaluate
- nb_bars (int) – Number of bars in histogram
-
syconn.processing.cell_types.
find_cell_types_from_dict
(wd, cell_type)[source]¶ Parameters: - wd (str) – Path to working directory
- cell_type (int) – label (0 = EA, 1 = MSN, 2 = GP, 3 = INT)
Returns: paths of cells of type cell_type.
Return type: list of str
-
syconn.processing.cell_types.
get_cell_type_classes_dict
()[source]¶ Returns: dictionary from integer label to full cell name as string Return type: dict
-
syconn.processing.cell_types.
get_cell_type_labels
()[source]¶ Cell type labels for HVC(0), LMAN(0), STN(0), MSN(1), GP(2), FS(3)
Returns: convention dictionary mapping cell type labels (str) to integers Return type: dict
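The label convention listed above can be written down as a plain dictionary. This is a minimal sketch: only the name-to-integer pairs come from the docstring, while the dictionary name and the helper `names_for_label` are hypothetical.

```python
# Label convention taken from the docstring above; the names below are hypothetical.
CELL_TYPE_LABELS = {"HVC": 0, "LMAN": 0, "STN": 0, "MSN": 1, "GP": 2, "FS": 3}


def names_for_label(label):
    """Return all cell type names that share an integer label, sorted."""
    return sorted(name for name, value in CELL_TYPE_LABELS.items() if value == label)


excitatory = names_for_label(0)
```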
-
syconn.processing.cell_types.
get_id_dict_from_skel_ids
(skel_ids)[source]¶ Calculates dictionaries to get the new label (from 0 to len(skel_paths)) from the skeleton ID and vice versa.
Parameters: skel_ids (list of int) – Returns: Dictionary to get new ID with old ID, dictionary to get old ID with new ID Return type: dict, dict
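A pair of forward and reverse lookup dictionaries of this kind might be built as shown below. The sketch assumes that new labels follow the input order and that duplicate IDs keep their first label; neither detail is confirmed by the docstring, and the function name is hypothetical.

```python
def id_dicts_from_skel_ids(skel_ids):
    """Map arbitrary skeleton IDs to consecutive labels 0..n-1 and back."""
    old_to_new = {}
    for skel_id in skel_ids:
        if skel_id not in old_to_new:  # keep the first occurrence, skip duplicates
            old_to_new[skel_id] = len(old_to_new)
    new_to_old = {new: old for old, new in old_to_new.items()}
    return old_to_new, new_to_old


old_to_new, new_to_old = id_dicts_from_skel_ids([227, 15, 1042])
```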
-
syconn.processing.cell_types.
load_cell_gt
(skel_ids, wd)[source]¶ Load cell types of skel ids
Parameters: - skel_ids (list of int) –
- wd (str) –
Returns: cell type labels Return type: np.array
-
syconn.processing.cell_types.
load_celltype_feats
(wd)[source]¶ Loads cell type feature and corresponding ids from dictionaries
Parameters: wd (str) – Path to working directory Returns: cell type features Return type: np.array
-
syconn.processing.cell_types.
load_celltype_gt
(wd, load_data=True, return_ids=False)[source]¶ Load ground truth of cell types
(HVC, LMAN, STN) => excitatory axons (0) (MSN) => medium spiny neuron (1) (GP) => pallidal-like neurons (2) (FS) => inhibitory interneuron (3)
Parameters: wd (str) – Path to working directory Returns: cell type features, cell type labels Return type: numpy.array, numpy.array
-
syconn.processing.cell_types.
load_celltype_preds
(wd)[source]¶ Loads cell type predictions and corresponding ids from dictionaries
Parameters: wd (str) – Path to working directory Returns: cell ids, cell labels Return type: np.array, np.array
-
syconn.processing.cell_types.
load_celltype_probas
(wd)[source]¶ Loads cell type probabilities and corresponding ids from dictionaries
Parameters: wd (str) – Path to working directory Returns: cell ids, cell label probabilities Return type: np.array, np.array
-
syconn.processing.cell_types.
predict_celltype_label
(wd)[source]¶ Predict cell type labels in working directory with pre-trained classifier in subfolder models/rf_celltypes/rf.pkl
Parameters: wd (str) – path to working directory
-
syconn.processing.cell_types.
save_cell_type_clf
(gt_path, clf_used='rf', load_data=True)[source]¶ Save cell type clf specified by clf_used to gt_directory.
Parameters: - gt_path (str) – path to cell type gt
- clf_used (str) –
- load_data (bool) –
syconn.processing.features¶
-
syconn.processing.features.
assign_property2node
(node, pred, property)[source]¶ Assign prediction of property to node
Parameters: - node (NewSkeletonNode) –
- pred (prediction appropriate to property) –
- property (property to change) –
-
syconn.processing.features.
calc_prop_feat_dict
(source, dist=6000)[source]¶ Calculates property feature
Parameters: - source (SkeletonAnnotation) –
- dist (int) –
Returns: Dictionary of property features, list of feature names, bool if spiness features are given
Return type: dict, list of str, bool
-
syconn.processing.features.
celltype_axoness_feature
(anno)[source]¶ Calculates axoness features of mapped skeleton for cell type prediction. These include proportion of axon, dendrite and soma pathlengths and maximum degree of soma nodes.
Returns: axoness features Return type: np.array (n x 4)
-
syconn.processing.features.
get_obj_density
(source, property='axoness_pred', value=1, obj='mito', return_abs_density=True)[source]¶ Calculate pathlength of nodes using edges
Parameters: - anno (list of SkeletonAnnotation) –
- property (str) – e.g. ‘axoness_pred’
- value (int) – value of property to check
- obj (str) – mito/vc/sj
- return_abs_density (bool) –
Returns: length in um
-
syconn.processing.features.
majority_vote
(anno, property='axoness', max_dist=6000)[source]¶ Smooths (average using sliding window of 2 times max_dist and majority vote) the property prediction in the annotation; for axoness, somata are left untouched.
Parameters: - anno (SkeletonAnnotation) –
- property (str) – which property to average
- max_dist (int) – maximum distance (in nm) for sliding window used in majority voting
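The sliding-window vote can be illustrated on an unbranched path. The following is a simplified sketch under stated assumptions, not the library routine: nodes are given by their position along the path in nm, the window spans max_dist on either side, and `keep_label` is a hypothetical stand-in for the soma exception.

```python
from collections import Counter


def smooth_labels(positions, labels, max_dist, keep_label=None):
    """Majority vote over all nodes within max_dist of each node.

    positions: path position of each node in nm (unbranched path assumed).
    labels: integer prediction per node.
    keep_label: nodes carrying this label are left untouched.
    """
    smoothed = []
    for pos, label in zip(positions, labels):
        if label == keep_label:
            smoothed.append(label)
            continue
        window = [l for p, l in zip(positions, labels) if abs(p - pos) <= max_dist]
        # most_common breaks ties in favor of the label seen first in the window
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed


positions = [0, 1000, 2000, 3000, 4000, 5000]
labels = [1, 1, 0, 1, 1, 2]
result = smooth_labels(positions, labels, max_dist=1000, keep_label=2)
```

The isolated 0 at position 2000 is outvoted by its neighbors, while the node labeled 2 keeps its value.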
-
syconn.processing.features.
morphology_feature
(source, max_nn_dist=6000)[source]¶ Calculates features for discrimination tasks of neurite identities, such as axon vs. dendrite or cell types classification. Estimated on interpolated skeleton nodes. Features are calculated with a sliding window approach for each node. Window is 2*max_nn_dist (nm).
Parameters: - source (str) – Path to anno or MappedSkeletonObject
- max_nn_dist (float) – Radius in which neighboring nodes are found and used for calculating features in nm.
Returns: two arrays of features for each node. number of nodes x 28 (22 radius feature and 6 object features), bool if spiness feature are given
Return type: numpy.array, numpy.array, list of int, bool
-
syconn.processing.features.
node_branch_end_distance
(nml, dist)[source]¶ Set distances to next branch resp. end point for each node (distance is capped by given parameter dist) in .data dictionary of each node and returns values with node ids
Parameters: - nml (SkeletonAnnotation) –
- dist (int) – maximum distance value to occur
Returns: distances to nearest end/branch point, node ids
Return type: np.array, np.array
-
syconn.processing.features.
nodes_in_pathlength
(anno, max_path_len)[source]¶ Find nodes reachable in max_path_len from source node, calculated for every node in anno.
Parameters: - anno (AnnotationObject) –
- max_path_len (float) – Maximum distance from source node
Returns: list of lists containing reachable nodes in max_path_len, where the outer list has length len(anno.getNodes())
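A distance-limited search of this kind can be sketched with a Dijkstra traversal that stops expanding once the path length exceeds the limit. The adjacency-dictionary layout and both function names are assumptions made for illustration; the library operates on annotation objects instead.

```python
import heapq


def reachable_within(adjacency, source, max_path_len):
    """Return the set of nodes whose path distance from source is <= max_path_len."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale queue entry
        for neighbor, length in adjacency[node]:
            new_d = d + length
            if new_d <= max_path_len and new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                heapq.heappush(heap, (new_d, neighbor))
    return set(dist)


def nodes_in_pathlength_sketch(adjacency, max_path_len):
    """One sorted list of reachable nodes per node, in sorted node order."""
    return [sorted(reachable_within(adjacency, n, max_path_len)) for n in sorted(adjacency)]


# Chain 0 - 1 - 2 with 100 nm edges.
chain = {0: [(1, 100.0)], 1: [(0, 100.0), (2, 100.0)], 2: [(1, 100.0)]}
groups = nodes_in_pathlength_sketch(chain, max_path_len=150.0)
```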
-
syconn.processing.features.
objfeat2skelnode
(node_coords, node_radii, node_ids, nearby_node_list, obj_dict, scaling)[source]¶ Calculate features of UltrastructuralDatasetObjects along Skeleton
Parameters: - node_coords (np.array) –
- node_radii (np.array) –
- node_ids (np.array) –
- nearby_node_list (list of list of SkeletonNodes) –
- obj_dict (UltrastructuralDataset) –
- scaling (tuple) –
Returns: The two features are absolute number of assigned objects and mean voxel size of the objects
Return type: np.array (dimension nb_skelnodes x 2)
-
syconn.processing.features.
pathlength_of_property
(anno, property, value)[source]¶ Calculate pathlength of nodes with certain property value
Parameters: - anno (SkeletonAnnotation) – mapped cell tracing
- property (str) – spiness / axoness
- value (int) – classification result, e.g. 0, 1, 2
Returns: length (in um)
Return type: int
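One plausible reading of this computation is sketched below. It assumes that an edge contributes to the total only when both of its end nodes carry the requested value, and that coordinates are given in nm; the source confirms neither. All names are hypothetical.

```python
import math


def pathlength_of_value(coords_nm, edges, prop, value):
    """Sum the lengths of edges whose two end nodes both have prop == value.

    coords_nm: {node_id: (x, y, z)} in nm; edges: list of (node_a, node_b);
    prop: {node_id: int}. Returns the length in um.
    """
    total_nm = sum(
        math.dist(coords_nm[a], coords_nm[b])
        for a, b in edges
        if prop[a] == value and prop[b] == value
    )
    return total_nm / 1000.0


coords = {0: (0, 0, 0), 1: (3000, 4000, 0), 2: (3000, 4000, 1000)}
edges = [(0, 1), (1, 2)]
axoness = {0: 1, 1: 1, 2: 0}
axon_length_um = pathlength_of_value(coords, edges, axoness, value=1)
```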
-
syconn.processing.features.
propertyfeat2skelnode
(node_list)[source]¶ Calculate nodewise radius feature
Parameters: node_list (list) – grouped nodes Returns: number of nodes times 22 features, containing mean radius, sigma of radii, 20 hist features Return type: np.array
-
syconn.processing.features.
radfeat2skelnode
(nearby_node_list)[source]¶ Calculate nodewise radius feature
Parameters: nearby_node_list (list of list of SkeletonNodes) – grouped tracing nodes Returns: array of number of nodes times 22 features, containing mean radius, sigma of radii, 20 hist features, spinehead features and whether spiness features are returned Return type: np.array, np.array, bool
-
syconn.processing.features.
radius_feats_from_nodes
(nodes, nb_bins=10, max_rad=5000)[source]¶ Calculates mean, std and histogram features
Parameters: - nodes (list of SkeletonNodes) –
- nb_bins (int) – Number of bins for histogram features
- max_rad (int) – maximum radius to plot on histogram x-axis
Returns: radius features with dim. nb_bins+2
Return type: np.array
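The feature layout (mean, standard deviation, then nb_bins histogram entries) can be sketched with the standard library. Whether the library uses the population or sample standard deviation, and raw or normalized histogram counts, is not stated above; this sketch assumes population standard deviation and raw counts, and the function name is hypothetical.

```python
import statistics


def radius_features(radii, nb_bins=10, max_rad=5000):
    """Return [mean, std, hist_0, ..., hist_{nb_bins-1}] for a list of radii."""
    mean = statistics.fmean(radii)
    std = statistics.pstdev(radii)
    bin_width = max_rad / nb_bins
    hist = [0] * nb_bins
    for r in radii:
        if 0 <= r <= max_rad:
            # radii equal to max_rad fall into the last bin
            idx = min(int(r / bin_width), nb_bins - 1)
            hist[idx] += 1
    return [mean, std] + hist


feats = radius_features([100, 300, 600, 4900], nb_bins=10, max_rad=5000)
```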
-
syconn.processing.features.
sj_per_spinehead
(anno)[source]¶ Calculate number of sj per spinehead. Iterate over all mapped sj objects and find nearest skeleton node. If skeleton node has spiness prediction == 1 (spinehead) then increment counter of this node by one. After the loop sum over all counter and divide by the number of nodes which have at least one sj assigned.
Parameters: anno (SkeletonAnnotation) –
Returns: Average number of sj per spinehead (assumes there is no spinehead without sj) Return type: float
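The counting procedure described above can be sketched as follows. Inputs are plain dictionaries and coordinate tuples rather than annotation objects, the nearest node is found by brute force, and all names are hypothetical.

```python
import math
from collections import Counter


def sj_per_spinehead_sketch(node_coords, spiness, sj_coords):
    """Average number of sj per spine head node that has at least one sj.

    node_coords: {node_id: (x, y, z)}; spiness: {node_id: int}, 1 = spine head;
    sj_coords: list of (x, y, z) object positions.
    """
    counts = Counter()
    for sj in sj_coords:
        nearest = min(node_coords, key=lambda n: math.dist(node_coords[n], sj))
        if spiness[nearest] == 1:
            counts[nearest] += 1
    if not counts:
        return 0.0  # no sj was assigned to any spine head
    return sum(counts.values()) / len(counts)


nodes = {0: (0, 0, 0), 1: (100, 0, 0), 2: (0, 100, 0)}
spiness = {0: 0, 1: 1, 2: 1}
sjs = [(95, 0, 0), (105, 5, 0), (0, 90, 0), (5, 0, 0)]
avg_sj = sj_per_spinehead_sketch(nodes, spiness, sjs)
```

Here two sj map to node 1, one to node 2, and one to the shaft node 0, which is ignored.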
-
syconn.processing.features.
spiness_feats_from_nodes
(nodes)[source]¶ Calculates spiness feats including abs. number of spineheads, mean and standard deviation (std) of spinehead size, mean spinehead probability and mean and std of spineneck lengths.
Parameters: nodes (list of SkeletonNodes) – Returns: spiness features, dim. of 6 Return type: np.array
syconn.processing.learning_rfc¶
-
syconn.processing.learning_rfc.
cell_classification
(node_pred)[source]¶ Perform majority vote
Parameters: node_pred (np.array of int) – arbitrary array of integer Returns: maximum occurring integer in array Return type: int
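A majority vote over integer predictions can be sketched with `collections.Counter`. The function name is hypothetical, and on ties this version returns the value that appears first in the input, which may differ from the library's behavior.

```python
from collections import Counter


def most_frequent_label(node_pred):
    """Return the integer that occurs most often in node_pred."""
    return Counter(node_pred).most_common(1)[0][0]


label = most_frequent_label([0, 1, 1, 2, 1, 0])
```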
-
syconn.processing.learning_rfc.
feature_importance
(rf, save_path=None)[source]¶ Plots feature importance of sklearn RandomForest
Parameters: - rf (RandomForestClassifier) –
- save_path (str) –
-
syconn.processing.learning_rfc.
fscore
(rec, prec, beta=1.0)[source]¶ Calculates f-score with beta value
Parameters: - rec (np.array) – recall
- prec (np.array) – precision
- beta (float) – weighting of precision
Returns: f-score
Return type: np.array
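For reference, the commonly used F-beta definition is F = (1 + beta^2) * P * R / (beta^2 * P + R). The sketch below assumes the function follows that definition, which the docstring does not confirm; note that in this standard form a larger beta gives more weight to recall. The function name is hypothetical.

```python
def f_beta(rec, prec, beta=1.0):
    """Element-wise F-beta score for paired recall and precision values."""
    scores = []
    for r, p in zip(rec, prec):
        denom = beta ** 2 * p + r
        # define the score as 0 when precision and recall are both 0
        scores.append((1 + beta ** 2) * p * r / denom if denom > 0 else 0.0)
    return scores


scores = f_beta([1.0, 0.5, 0.0], [0.5, 0.5, 0.0])
```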
-
syconn.processing.learning_rfc.
load_csv2feat
(fpath, prop='axoness')[source]¶ Load csvfile from kzip and return numpy array and list of header names. List line is supposed to be the probability prediction.
Parameters: - fpath (str) – Source file path
- prop (str) – property which should be loaded
Returns: features, feature names
Return type: np.array, list of str
-
syconn.processing.learning_rfc.
load_rfcs
(rf_axoness_p, rf_spiness_p)[source]¶ Loads pickled Random Forest Classifier for axoness and spiness. If path is not valid returns None
Parameters: - rf_axoness_p (str) – Path to pickled axoness rf directory
- rf_spiness_p (str) – Path to pickled spiness rf directory
Returns: RFC axoness, spiness or None
-
syconn.processing.learning_rfc.
loo_proba
(x, y, clf_used='rf', use_pca=False, params=None)[source]¶ Perform leave-one-out
Parameters: - x (np.array) – features
- y (np.array) – labels
- clf_used (str) – classifier
- use_pca (bool) – perform principal component analysis on features x in advance
- params (dict) – parameter for classifier
Returns: class probability, hard classification
Return type: np.array, np.array
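The leave-one-out scheme itself is independent of the classifier. The sketch below uses a small k-nearest-neighbor vote as a stand-in for the random forest or SVM selected by clf_used, so that it runs with the standard library alone; all names are hypothetical.

```python
import math
from collections import Counter


def knn_proba(train_x, train_y, sample, classes, k=3):
    """Class probabilities as vote fractions among the k nearest training samples."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], sample))
    nearest = order[:k]
    votes = Counter(train_y[i] for i in nearest)
    return [votes[c] / len(nearest) for c in classes]


def leave_one_out(x, y, k=3):
    """Hold out each sample once, train on the rest, and predict the held-out one."""
    classes = sorted(set(y))
    probas, preds = [], []
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        proba = knn_proba(train_x, train_y, x[i], classes, k)
        probas.append(proba)
        preds.append(classes[proba.index(max(proba))])
    return probas, preds


x = [(0.0,), (0.1,), (0.2,), (0.3,), (5.0,), (5.1,), (5.2,), (5.3,)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
probas, preds = leave_one_out(x, y)
```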
-
syconn.processing.learning_rfc.
plot_corr
(x, y, title='', xr=[-1, -1], yr=[-1, -1], save_path=None, nbins=5, xlabel='Size x', ylabel='Size y')[source]¶
-
syconn.processing.learning_rfc.
plot_pr
(precision, recall, title='', r=[0.67, 1.01], legend_labels=None, save_path=None, nbins=5, colorVals=None, xlabel='Recall', ylabel='Precision', l_pos='lower left', legend=True, r_x=[0.67, 1.01], ls=22)[source]¶
-
syconn.processing.learning_rfc.
save_train_clf
(X, y, clf_used, dir_path, use_pca=False, params=None)[source]¶ Train classifier specified by clf_used with features X and labels y and save it to dir_path.
Parameters: - X (np.array) – features
- y (np.array) – labels
- clf_used (str) – ‘rf’ or ‘svm’ for RandomForest or SupportVectorMachine, respectively
- dir_path (str) – directory where to save pkl files of clf
- use_pca (bool) – flag if pca should be performed
- params (dict) – parameters for classifier
syconn.processing.mapper¶
-
class
syconn.processing.mapper.
SkeletonMapper
(source, dh, ix=None, soma=None, context_range=6000)[source]¶ Bases:
object
Class to handle mapping of cell objects (mitochondria, vesicle clouds, synaptic clefts) to tracings. Mapping parameters are saved as attributes.
-
soma
¶ SkeletonAnnotation – Soma tracing
-
old_anno
¶ SkeletonAnnotation – original tracing where estimated cell radius is saved at each node
-
anno
¶ SkeletonAnnotation – interpolated tracing skeleton for hull calculation
-
mitos/vc/sj
segmentationDataset – Dictionaries in which mapped cell objects are saved
-
ix
¶ int – mapped skeleton id
-
write_obj_voxel
¶ bool – write object voxel to kzip as binary file
-
annotate_object
(objects, radius, method, objtype)[source]¶ Redirects mapping task to desired method-function
Parameters: - objects (UltrastructuralDataset) –
- radius (int) – Radius of kd-tree in units of nm.
- method (str) – either ‘hull’, ‘kd’ or ‘supervoxel’
- objtype (string) – characterising object type
Returns: mapped object ID’s
Return type: list
-
annotate_objects
(dh, radius=1200, method='hull', thresh=2.2, filter_size=(0, 0, 0), nb_neighbors=20, nb_hull_vox=500, neighbor_radius=220, detect_outlier=True, nb_rays=20, nb_voting_neighbors=100, max_dist_mult=1.4)[source]¶ Creates self.object with annotated objects as UltrastructuralDataset, where object is in {mitos, vc, sj}
Parameters: - dh (DataHandler) – object containing SegmentationDataObjects mitos, vc, sj
- radius (int) – Radius in nm. Single integer if integer radius is for all objects the same. If list of three integer stick to ordering [mitos, vc, sj].
- method (str) – Either ‘kd’ for fix radius or ‘hull’/’supervoxel’ if membrane is available.
- thresh (float) – Denotes the factor which is multiplied with the maximum membrane probability. The resulting value is used as threshold after which the membrane is assumed to be existent.
- filter_size (int) – List of integer for each object [mitos, vc, sj]
- nb_neighbors (int) – minimum number of neighbors needed during outlier detection for a single hull point to survive.
- nb_hull_vox (int) – Number of object hull voxels which are used to estimate spatial proximity to skeleton (inside or outside).
- neighbor_radius (int) – Radius (nm) of ball in which to look for supporting hull voxels. Used during outlier detection.
- detect_outlier (bool) – use outlier-detection if True.
- nb_rays (int) – Number of rays sent at each skeleton node (multiplied by a factor of 5). Defines the angle between two rays (=360 / nb_rays) in the orthogonal plane.
- nb_voting_neighbors (int) – Number votes of skeleton hull voxels (membrane representation) for object-mapping. Used for vc and mitos during geometrical position estimation of object nodes.
- max_dist_mult (float) – Multiplier for radius to estimate maximal distance of hull points to source node.
-
calc_myelinisation
()[source]¶ Calculates myelinisation at each node and writes it to node.data[“myelin_pred”]
-
cset
¶
-
get_plot_obj
()[source]¶ Extracts coordinates from annotated SegmentationObjects
Returns: object-voxels for each object Return type: np.array
-
hull_coords
¶ Scaled hull coordinates of skeleton membrane
Returns: Coordinate each hull point Return type: np.array
-
hull_normals
¶ Normal for each hull point pointing outwards
Returns: Normal vector of each hull point pointing outwards Return type: np.array
-
hull_sampling
(thresh=2.2, nb_rays=20, nb_neighbors=20, neighbor_radius=220, detect_outlier=True, max_dist_mult=1.4)[source]¶ Calculates hull of tracing
Parameters: - thresh (float) – factor of maximum occurring prediction value after which membrane is triggered active.
- nb_rays (int) – Number of rays sent at each skeleton node (multiplied by a factor of 5). Defines the angle between two rays (=360 / nb_rays) in the orthogonal plane.
- nb_neighbors (int) – minimum number of neighbors needed during outlier detection for a single hull point to survive.
- neighbor_radius (int) – Radius of ball in which to look for supporting hull voxels. Used during outlier detection.
- detect_outlier (bool) – use outlier-detection if True.
- max_dist_mult (float) – Multiplier for radius to generate maximal distance of hull points to source node.
Returns: Average radius per node in (9,9,20) corrected units estimated by rays propagated through Membrane prediction until threshold reached.
Return type: numpy.array
-
predict_property
(rf, prop, max_neck2endpoint_dist=3000, max_head2endpoint_dist=600)[source]¶ Predict property (axoness, spiness) of tracings
Parameters: - rf (RandomForestClassifier) –
- prop (str) – property name
- max_neck2endpoint_dist (int) –
- max_head2endpoint_dist (int) –
-
property_features
¶ Getter of property features, calculates axoness/spiness features if necessary
Returns: property features, if spiness feature are given Return type: np.array, bool
-
skel_radius
¶ Radius of membrane at each skeleton node
Returns: cell radius at self.nodes Return type: np.array
-
-
syconn.processing.mapper.
calc_syn_dict
(features, axoness_info, get_all=False)[source]¶ Creates dictionary of synapses. Keys are ids of pre cells and values are dictionaries of corresponding synapses with post cell ids.
Parameters: - features (np.array) – synapse feature
- axoness_info (np.array) – string containing axoness information of cells
- get_all (bool) – collect all contact sites
Returns: synapse features, axoness information, connectivity, post synaptic cell ids, synapse predictions, axoness
Return type: np.array, np.array, dict, np.array, np.array, dict
-
syconn.processing.mapper.
cs_btw_annos
(anno_a, anno_b, max_hull_dist, concom_dist)[source]¶ Computes contact sites between two annotation objects and returns hull points of both skeletons near contact site.
Parameters: - anno_a (SkeletonAnnotation) – Annotation object A
- anno_b (SkeletonAnnotation) – Annotation object B
- max_hull_dist (int) – Maximum distance between skeletons in nm
- concom_dist (int) – maximum distance of connected components (nm)
Returns: List of hull coordinates for each contact site
Return type: list
-
syconn.processing.mapper.
feature_valid_syns
(cs_dir, only_sj=True, only_syn=True, all_contacts=False)[source]¶ Returns the features of valid synapses predicted by synapse rfc
Parameters: - cs_dir (str) – Path to computed contact sites.
- only_sj (bool) – Return feature of all contact sites with mapped sj.
- only_syn (bool) – Returns feature only if synapse was predicted
- all_contacts (bool) – Use all contact sites for feature extraction
Returns: features, array of contact site IDS, boolean array of synapse prediction
Return type: np.array (n x f), np.array (n x 1), np.array (n x 1)
-
syconn.processing.mapper.
max_nodes_in_path
(anno, source_node, max_number)[source]¶ Find specified number of nodes along skeleton from source node (BFS).
Parameters: - anno (SkeletonAnnotation) – tracing on which to search
- source_node (SkeletonNode) – Starting node
- max_number (int) – Maximum number of nodes
Returns: Tracing nodes up to certain distance from source node
Return type: list of SkeletonNodes
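A breadth-first search that stops after a fixed number of nodes can be sketched as follows. The adjacency dictionary and the function name are assumptions for illustration; the library works on annotation and node objects.

```python
from collections import deque


def bfs_nodes(adjacency, source, max_number):
    """Collect up to max_number nodes in breadth-first order, starting at source."""
    if max_number <= 0:
        return []
    visited = [source]
    seen = {source}
    queue = deque([source])
    while queue and len(visited) < max_number:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
                if len(visited) == max_number:
                    break
    return visited


tree = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
first_three = bfs_nodes(tree, source=0, max_number=3)
```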
-
syconn.processing.mapper.
node_id2key
(segdataobject, node_ids, filter_size)[source]¶ Maps list indices in node_ids to keys of SegmentationObjects. Filters objects bigger than filter_size.
Parameters: - segdataobject (UltrastructuralDataset) – dataset of object type currently processed
- node_ids (list of list) – annotated object ids for each node
- filter_size (int) – minimum number of voxels of object
Returns: objects keys Return type: list
-
syconn.processing.mapper.
outlier_detection
(point_list, min_num_neigh, radius)[source]¶ Finds hull outlier using point density criterion
Parameters: - point_list (list) – List of coordinates
- min_num_neigh (int) – Minimum number of neighbors, s.t. hull-point survives.
- radius (int) – Radius in nm to look for neighbors
Returns: Cleaned point cloud
Return type: numpy.array
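A point density criterion of this kind can be sketched with a brute-force neighbor count. This version is quadratic in the number of points and only meant to show the rule; a spatial index such as a k-d tree would be the usual choice for large hull point clouds. The function name is hypothetical.

```python
import math


def remove_sparse_points(points, min_num_neigh, radius):
    """Keep points that have at least min_num_neigh other points within radius."""
    kept = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points) if i != j and math.dist(p, q) <= radius
        )
        if neighbors >= min_num_neigh:
            kept.append(p)
    return kept


cloud = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (1000, 1000, 1000)]
cleaned = remove_sparse_points(cloud, min_num_neigh=2, radius=20)
```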
-
syconn.processing.mapper.
prepare_syns_btw_annos
(pairwise_paths, dest_path, max_hull_dist=60, concom_dist=300)[source]¶ Checks pairwise for contact sites between annotation objects found at paths in nml_list. Adds sj, vc and nearest skeleton nodes to found contact sites. Writes ‘contact_sites.nml’ to nml-path containing contact sites of all nml’s.
Parameters: - pairwise_paths (list of str) – List of pairwise paths to nml’s
- dest_path (str) – Path to directory where to store result of synapse mapping
- max_hull_dist (float) – maximum distance between skeletons in nm
- concom_dist (float) – Maximum distance of connected components (nm)
-
syconn.processing.mapper.
read_pair_cs
(pair_path)[source]¶ Helper function to collect pairwise contact site information. Extracts axoness prediction.
Parameters: pair_path (str) – path to pairwise contact site kzip Returns: annotation object without contact site hull voxel Return type: SkeletonAnnotation
-
syconn.processing.mapper.
readout_cs_info
(args)[source]¶ Helper function of feature_valid_syns
Parameters: args (tuple) – path to file and queue Returns: synapse features, contact site ID Return type: np.array, str
-
syconn.processing.mapper.
similarity_check
(skel_a, skel_b)[source]¶ If the absolute number of identical nodes is bigger than a certain threshold, return similar.
Parameters: - skel_a (SkeletonAnnotation) – Skeleton a
- skel_b (SkeletonAnnotation) – Skeleton b
Returns: skel_a and skel_b are similar
Return type: bool
-
syconn.processing.mapper.
syn_btw_anno_pair
(params)[source]¶ Get synapse information between two mapped annotation objects. Details are written to pairwise nml (all contact sites between pairs contained) and to nml for each contact site.
Parameters: - params (list) – [path_a, path_b, max_hull_dist, concom_dist]
- path_a (str) – path to mapped annotation object
- path_b (str) – path to mapped annotation object
- max_hull_dist (float) – maximum distance between skeletons (nm)
- concom_dist (float) – maximum distance of connected components (nm)
syconn.processing.spiness¶
-
syconn.processing.spiness.
assign_neck
(anno, max_head2endpoint_dist=600, max_neck2endpoint_dist=3000)[source]¶ Assign nodes between spine head node and first node with degree 2 as spine necks in place. head (1) and shaft (0). Key for prediction is “spiness_pred”
Parameters: - anno (SkeletonAnnotation) – mapped cell tracing
- max_head2endpoint_dist (int) – maximum distance between spine head and endpoint on graph
- max_neck2endpoint_dist (int) – maximum distance between spine neck and endpoint on graph
syconn.processing.synapticity¶
-
syconn.processing.synapticity.
calc_syn_feature
(gt_samples, ignore_keys=['Barrier', 'Skel'], new_data=False, test_data=False, detailed_cs_dir='/lustre/pschuber/m_consensi_rr/nml_obj/contact_sites_new3/')[source]¶ Collects synapse features of all contact sites. Additionally collects ground truth values if test_data is True.
Parameters: - gt_samples – List of paths to contact sites
- ignore_keys – Which keys to ignore in string if collecting GT value
- new_data – outdated
- test_data – whether to collect GT value
- detailed_cs_dir – path to folder containing the contact sites
-
syconn.processing.synapticity.
pairwise_syn_feature_calc
(args)[source]¶ Helper function for calc_syn_feature. Collects feature of contact site.
Parameters: args – path to contact sites, list of ignore keys, path to contact_sites folder, q of multiprocess manager, bool new data(old), bool test_data (whether to collect gt_value) Returns: synapse feature, ground truth value
-
syconn.processing.synapticity.
parse_synfeature_from_node
(node)[source]¶ Parses values of features from string.
Parameters: node – node with values of feature_names Returns: array of float values for each feature
-
syconn.processing.synapticity.
parse_synfeature_from_txt
(txt)[source]¶ Parses values of features from string.
Parameters: txt – String with values of feature_names, like ‘area1.5_dist2.3’ Returns: array of float values for each feature
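Parsing a string such as ‘area1.5_dist2.3’ can be sketched with a regular expression. The sketch assumes that every feature name consists of letters only and is directly followed by its numeric value; the real feature_names list is not shown here, and the function name is hypothetical.

```python
import re


def parse_feature_values(txt):
    """Extract the numeric value that directly follows each alphabetic feature name."""
    return [float(v) for v in re.findall(r"[A-Za-z]+(-?\d+(?:\.\d+)?)", txt)]


values = parse_feature_values("area1.5_dist2.3")
```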
syconn.processing.initialization¶
-
syconn.processing.initialization.
initialize_cset
(kd, home_path, chunksize)[source]¶ Initializes a ChunkDataset
Parameters: - kd (KnossosDataset) – KnossosDataset instance of the corresponding raw data
- home_path (str) – path to main folder
- chunksize (np.array) – size of each chunk; typically in the order of ~ [1000, 1000, 500]
Returns: cset
Return type: ChunkDataset
syconn.processing.objectextraction¶
-
syconn.processing.objectextraction.
apply_merge_list
(cset, chunk_list, filename, hdf5names, merge_list_dict, debug, suffix='', qsub_pe=None, qsub_queue=None)[source]¶ Applies merge list to all chunks
Parameters: - cset (chunkdataset instance) –
- chunk_list (list of int) – Selective list of chunks for which this function should work on. If None all chunks are used.
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- merge_list_dict (dictionary) – mergedict for each hdf5name
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str or None) – qsub parallel environment
- qsub_queue (str or None) – qsub queue
-
syconn.processing.objectextraction.
calculate_chunk_numbers_for_box
(cset, offset, size)[source]¶ Calculates the chunk ids that are (partly) contained in the defined volume
Parameters: - cset (ChunkDataset) –
- offset (np.array) – offset of the volume to the origin
- size (np.array) – size of the volume
Returns: - chunk_list (list) – chunk ids
- dictionary (dict) – with reverse mapping
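The box-to-chunk lookup can be sketched with integer arithmetic. This is only an illustration under several assumptions that the source does not confirm: chunks form a regular grid anchored at the origin, ids are assigned in x-major order, and the reverse mapping is taken to mean chunk id to position in the returned list. All names are hypothetical.

```python
def chunks_for_box(chunk_size, grid_shape, offset, size):
    """Return ids of chunks that overlap the box [offset, offset + size).

    Assumed id layout: id = (ix * ny + iy) * nz + iz on a grid of
    grid_shape = (nx, ny, nz) chunks, each of chunk_size voxels.
    """
    ranges = []
    for axis in range(3):
        first = max(offset[axis] // chunk_size[axis], 0)
        last = min(
            (offset[axis] + size[axis] - 1) // chunk_size[axis],
            grid_shape[axis] - 1,
        )
        ranges.append(range(first, last + 1))
    _, ny, nz = grid_shape
    chunk_list = [
        (ix * ny + iy) * nz + iz
        for ix in ranges[0]
        for iy in ranges[1]
        for iz in ranges[2]
    ]
    reverse = {chunk_id: position for position, chunk_id in enumerate(chunk_list)}
    return chunk_list, reverse


chunk_list, reverse = chunks_for_box(
    chunk_size=(1000, 1000, 500),
    grid_shape=(4, 4, 4),
    offset=(500, 0, 0),
    size=(1000, 1000, 500),
)
```

The example box starts in the middle of the first chunk along x, so it touches two chunks.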
-
syconn.processing.objectextraction.
concatenate_mappings
(cset, filename, hdf5names, debug=False, suffix='')[source]¶ Combines all map dicts
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- suffix (str) – Suffix for the intermediate results
-
syconn.processing.objectextraction.
create_datasets_from_objects
(cset, filename, hdf5names, debug=False, suffix='', qsub_pe=None, qsub_queue=None)[source]¶ Create dataset instance from objects
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str or None) – qsub parallel environment
- qsub_queue (str or None) – qsub queue
-
syconn.processing.objectextraction.
create_objects_from_voxels
(cset, filename, hdf5names, granularity=15, debug=False, suffix='', qsub_pe=None, qsub_queue=None)[source]¶ Creates object instances from extracted voxels
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- granularity (int) – Defines granularity for partitioning data for multiprocessing
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str or None) – qsub parallel environment
- qsub_queue (str or None) – qsub queue
-
syconn.processing.objectextraction.
extract_voxels
(cset, filename, hdf5names, debug=False, chunk_list=None, suffix='', qsub_pe=None, qsub_queue=None)[source]¶ Extracts voxels for each component id
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- chunk_list (list of int) – Selective list of chunks for which this function should work on. If None all chunks are used.
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str or None) – qsub parallel environment
- qsub_queue (str or None) – qsub queue
-
syconn.processing.objectextraction.
from_ids_to_objects
(cset, filename, hdf5names, chunk_list=None, debug=False, offset=None, size=None, suffix='', qsub_pe=None, qsub_queue=None)[source]¶ Main function for the object extraction step; combines all needed steps
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- chunk_list (list of int) – Selective list of chunks for which this function should work on. If None all chunks are used.
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- offset (np.array) – offset of the volume to the origin
- size (np.array) – size of the volume
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str or None) – qsub parallel environment
- qsub_queue (str or None) – qsub queue
-
syconn.processing.objectextraction.
from_probabilities_to_objects
(cset, filename, hdf5names, overlap='auto', sigmas=None, thresholds=None, chunk_list=None, debug=False, swapdata=0, label_density=array([ 1., 1., 1.]), offset=None, size=None, membrane_filename=None, membrane_kd_path=None, hdf5_name_membrane=None, suffix='', qsub_pe=None, qsub_queue=None)[source]¶ Main function for the object extraction step; combines all needed steps
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- overlap (str or np.array) – Defines the overlap with neighbouring chunks that is left for later processing steps; if ‘auto’ the overlap is calculated from the sigma and the stitch_overlap (here: [1., 1., 1.])
- sigmas (list of lists or None) – Defines the sigmas of the gaussian filters applied to the probability maps. Has to be the same length as hdf5names. If None no gaussian filter is applied
- thresholds (list of float) – Threshold for cutting the probability map. Has to be the same length as hdf5names. If None zeros are used instead (not recommended!)
- chunk_list (list of int) – Selective list of chunks for which this function should work on. If None all chunks are used.
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- swapdata (boolean) – If true an x-z swap is applied to the data prior to processing
- label_density (np.array) – Defines the density of the data. If the data was downsampled prior to saving, it has to be interpolated before processing due to alignment issues with the coordinate system. Two-times downsampled data would have a label_density of [2, 2, 2]
- offset (np.array) – offset of the volume to the origin
- size (np.array) – size of the volume
- membrane_filename (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Filename of the prediction in the chunkdataset. The threshold is currently set at 0.4.
- membrane_kd_path (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Path to the knossosdataset containing a membrane segmentation. The threshold is currently set at 0.4.
- hdf5_name_membrane (str) – When using the membrane_filename this key has to be given to access the data in the saved chunk
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str or None) – qsub parallel environment
- qsub_queue (str or None) – qsub queue
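The parameter list requires sigmas and thresholds to have the same length as hdf5names. A minimal sketch of that consistency check follows; check_per_label_args is a hypothetical helper, not part of syconn, and the label names and values are invented for illustration.

```python
def check_per_label_args(hdf5names, sigmas=None, thresholds=None):
    """Hypothetical helper: verify per-label arguments match hdf5names."""
    for arg_name, values in (("sigmas", sigmas), ("thresholds", thresholds)):
        if values is None:
            # None is allowed by the documentation (no filter / zeros used).
            continue
        if len(values) != len(hdf5names):
            raise ValueError(
                "%s has %d entries but hdf5names has %d"
                % (arg_name, len(values), len(hdf5names)))
    return True


# Illustrative values only; real label names depend on the prediction file.
hdf5names = ["label_a", "label_b"]
sigmas = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
thresholds = [0.5, 0.4]
check_per_label_args(hdf5names, sigmas, thresholds)
```

Running such a check before starting the extraction catches mismatched argument lists early, before any chunk is processed.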
-
syconn.processing.objectextraction.
from_probabilities_to_objects_parameter_sweeping
(cset, filename, hdf5names, nb_thresholds, overlap='auto', sigmas=None, chunk_list=None, swapdata=0, label_density=array([ 1., 1., 1.]), offset=None, size=None, membrane_filename=None, membrane_kd_path=None, hdf5_name_membrane=None, qsub_pe=None, qsub_queue=None)[source]¶ Sweeps over different thresholds. The results of each object extraction are saved in a separate folder, and all intermediate steps are saved with a different suffix.
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- nb_thresholds (integer) – Number of thresholds, and therefore the number of object extraction runs; the actual thresholds are equally spaced
- overlap (str or np.array) – Defines the overlap with neighbouring chunks that is left for later processing steps; if ‘auto’ the overlap is calculated from the sigma and the stitch_overlap (here: [1., 1., 1.])
- sigmas (list of lists or None) – Defines the sigmas of the gaussian filters applied to the probability maps. Has to be the same length as hdf5names. If None no gaussian filter is applied
- chunk_list (list of int) – List of chunks on which this function should operate. If None all chunks are used.
- swapdata (boolean) – If true an x-z swap is applied to the data prior to processing
- label_density (np.array) – Defines the density of the data. If the data was downsampled prior to saving, it has to be interpolated before processing due to alignment issues with the coordinate system. Two-times downsampled data would have a label_density of [2, 2, 2]
- offset (np.array) – offset of the volume to the origin
- size (np.array) – size of the volume
- membrane_filename (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Filename of the prediction in the chunkdataset. The threshold is currently set at 0.4.
- membrane_kd_path (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Path to the knossosdataset containing a membrane segmentation. The threshold is currently set at 0.4.
- hdf5_name_membrane (str) – When using the membrane_filename this key has to be given to access the data in the saved chunk
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str) – qsub parallel environment name
- qsub_queue (str or None) – qsub queue name
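The documentation states only that the thresholds are equally spaced. The sketch below shows one way to generate such values with NumPy. The value range (low, high) and the exclusion of the end points are assumptions, and equally_spaced_thresholds is a hypothetical helper, not the library's implementation.

```python
import numpy as np


def equally_spaced_thresholds(nb_thresholds, low=0.0, high=1.0):
    """Return nb_thresholds values evenly spaced strictly between low and high.

    The range [low, high] is an assumption; the real range depends on how
    the probability maps are stored (e.g. floats in [0, 1] or uint8).
    """
    # nb_thresholds + 2 points include both end points, which are dropped.
    return np.linspace(low, high, nb_thresholds + 2)[1:-1]


thresholds = equally_spaced_thresholds(4)
print(thresholds)
```

Dropping the end points avoids the degenerate thresholds at which everything or nothing would be selected.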
-
syconn.processing.objectextraction.
gauss_threshold_connected_components
(cset, filename, hdf5names, overlap='auto', sigmas=None, thresholds=None, chunk_list=None, debug=False, swapdata=False, label_density=array([ 1., 1., 1.]), membrane_filename=None, membrane_kd_path=None, hdf5_name_membrane=None, fast_load=False, suffix='', qsub_pe=None, qsub_queue=None)[source]¶ Extracts connected components from probability maps in three steps: (1) Gaussian filter (defined by sigma), (2) thresholding (defined by threshold), (3) connected components analysis.
In case of vesicle clouds (hdf5_name in [“p4”, “vc”]) the membrane segmentation is used to split vesicle clouds that are connected across cells (only if a membrane segmentation is provided).
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- overlap (str or np.array) – Defines the overlap with neighbouring chunks that is left for later processing steps; if ‘auto’ the overlap is calculated from the sigma and the stitch_overlap (here: [1., 1., 1.])
- sigmas (list of lists or None) – Defines the sigmas of the gaussian filters applied to the probability maps. Has to be the same length as hdf5names. If None no gaussian filter is applied
- thresholds (list of float) – Threshold for cutting the probability map. Has to be the same length as hdf5names. If None zeros are used instead (not recommended!)
- chunk_list (list of int) – List of chunks on which this function should operate. If None all chunks are used.
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- swapdata (boolean) – If true an x-z swap is applied to the data prior to processing
- label_density (np.array) – Defines the density of the data. If the data was downsampled prior to saving, it has to be interpolated before processing due to alignment issues with the coordinate system. Two-times downsampled data would have a label_density of [2, 2, 2]
- membrane_filename (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Filename of the prediction in the chunkdataset. The threshold is currently set at 0.4.
- membrane_kd_path (str) – One way to allow access to a membrane segmentation when processing vesicle clouds. Path to the knossosdataset containing a membrane segmentation. The threshold is currently set at 0.4.
- hdf5_name_membrane (str) – When using the membrane_filename this key has to be given to access the data in the saved chunk
- fast_load (boolean) – If true the data of a chunk is loaded blindly, without checking whether there is enough offset to compute the overlap area. This is faster because no neighbouring chunk has to be accessed; in the default case the overlap area is loaded from the neighbouring chunks.
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str or None) – qsub parallel environment
- qsub_queue (str or None) – qsub queue
Returns: - results_as_list (list) – list containing information about the number of connected components in each chunk
- overlap (np.array)
- stitch overlap (np.array)
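The three steps named in the description (Gaussian filter, thresholding, connected components analysis) can be sketched on a single in-memory volume with scipy.ndimage. This is only an illustration under stated assumptions: gauss_threshold_cc is a hypothetical helper, the strict "greater than" comparison is an assumption, and chunking, overlaps and the membrane-based splitting are not covered.

```python
import numpy as np
from scipy import ndimage


def gauss_threshold_cc(prob_map, sigma=None, threshold=0.5):
    """Hypothetical single-volume sketch of the three documented steps."""
    data = np.asarray(prob_map, dtype=np.float32)
    # 1. Gaussian filter (skipped if sigma is None, as documented for sigmas)
    if sigma is not None:
        data = ndimage.gaussian_filter(data, sigma=sigma)
    # 2. Thresholding (strict comparison is an assumption)
    mask = data > threshold
    # 3. Connected components analysis; label 0 is background
    labels, nb_components = ndimage.label(mask)
    return labels, nb_components


# Synthetic probability map with two well separated blobs.
prob = np.zeros((20, 20, 20), dtype=np.float32)
prob[2:6, 2:6, 2:6] = 1.0
prob[12:17, 12:17, 12:17] = 1.0
labels, nb = gauss_threshold_cc(prob, sigma=[1.0, 1.0, 1.0], threshold=0.5)
print(nb)
```

By default ndimage.label uses face connectivity; a different structuring element can be passed through its structure argument if diagonal neighbours should count as connected.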
-
syconn.processing.objectextraction.
make_merge_list
(hdf5names, stitch_list, max_labels)[source]¶ Creates a merge list from a stitch list by mapping all connected ids to one id
Parameters: - hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- stitch_list (dictionary) – Contains pairs of overlapping component ids for each hdf5name
- max_labels (dictionary) – Contains the number of different component ids for each hdf5name
Returns: - merge_dict (dictionary) – mergelist for each hdf5name
- merge_list_dict (dictionary) – mergedict for each hdf5name
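Mapping all connected ids to one id is a classic union-find problem. The sketch below handles the pairs of a single hdf5name. It is an assumption about how such a mapping could be built, not the library's code, and make_merge_mapping and its arguments are hypothetical.

```python
def make_merge_mapping(stitch_pairs, max_label):
    """Map every id in 0..max_label to the smallest id it is connected to."""
    parent = list(range(max_label + 1))

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in stitch_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # The smaller root becomes the representative of the merged set.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return [find(i) for i in range(max_label + 1)]


# Ids 1, 4 and 6 touch each other across chunk borders, as do 2 and 3.
pairs = [(1, 4), (4, 6), (2, 3)]
mapping = make_merge_mapping(pairs, max_label=6)
print(mapping)  # [0, 1, 2, 2, 1, 5, 1]
```

The resulting list can be used as a lookup table: with NumPy, indexing an array of the mapping with a label volume relabels all voxels in one step.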
-
syconn.processing.objectextraction.
make_stitch_list
(cset, filename, hdf5names, chunk_list, stitch_overlap, overlap, debug, suffix='', qsub_pe=None, qsub_queue=None)[source]¶ Creates a stitch list for the overlap region between chunks
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- chunk_list (list of int) – List of chunks on which this function should operate. If None all chunks are used.
- overlap (np.array) – Defines the overlap with neighbouring chunks that is left for later processing steps
- stitch_overlap (np.array) – Defines the overlap with neighbouring chunks that is left for stitching
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str or None) – qsub parallel environment
- qsub_queue (str or None) – qsub queue
Returns: stitch_list (list) – list of overlapping component ids
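As an illustration of what a stitch list contains, the sketch below collects the pairs of component ids that coincide in a region shared by two neighbouring chunks. It assumes both label arrays cover exactly the same voxels and that 0 marks background; overlapping_label_pairs is a hypothetical helper, not part of syconn.

```python
import numpy as np


def overlapping_label_pairs(labels_a, labels_b):
    """Return sorted unique (id_a, id_b) pairs that share at least one voxel."""
    a = np.asarray(labels_a).ravel()
    b = np.asarray(labels_b).ravel()
    # Only voxels that are foreground in both chunks define an overlap.
    both = (a > 0) & (b > 0)
    if not both.any():
        return []
    pairs = np.unique(np.stack([a[both], b[both]], axis=1), axis=0)
    return [tuple(int(v) for v in pair) for pair in pairs]


# The same 2 x 3 overlap region as labelled by two neighbouring chunks.
region_a = np.array([[1, 1, 0], [0, 2, 2]])
region_b = np.array([[7, 7, 0], [0, 0, 9]])
pairs = overlapping_label_pairs(region_a, region_b)
print(pairs)  # [(1, 7), (2, 9)]
```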
-
syconn.processing.objectextraction.
make_unique_labels
(cset, filename, hdf5names, chunk_list, max_nb_dict, chunk_translator, debug, suffix='', qsub_pe=None, qsub_queue=None)[source]¶ Makes labels unique across chunks
Parameters: - cset (chunkdataset instance) –
- filename (str) – Filename of the prediction in the chunkdataset
- hdf5names (list of str) – List of names/ labels to be extracted and processed from the prediction file
- chunk_list (list of int) – List of chunks on which this function should operate. If None all chunks are used.
- max_nb_dict (dictionary) – Maps each chunk id to an integer that is added to all of its entries
- chunk_translator (dictionary) – Remapping from chunk ids to position in chunk_list
- debug (boolean) – If true multiprocessed steps only operate on one core using ‘map’ which allows for better error messages
- suffix (str) – Suffix for the intermediate results
- qsub_pe (str or None) – qsub parallel environment
- qsub_queue (str or None) – qsub queue
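Making labels unique across chunks amounts to adding a per-chunk offset to every label of that chunk. The sketch below derives the offsets from a running sum of the component counts and leaves background (0) untouched. Both helpers are hypothetical and work on plain in-memory arrays; whether the library computes its offsets this way is an assumption.

```python
import numpy as np


def label_offsets(nb_labels_per_chunk):
    """Offset for each chunk: total label count of all preceding chunks."""
    counts = np.asarray(nb_labels_per_chunk, dtype=np.int64)
    return np.concatenate(([0], np.cumsum(counts)[:-1]))


def make_unique(chunk_labels, offset):
    """Shift all foreground labels by offset; background stays 0."""
    labels = np.asarray(chunk_labels, dtype=np.int64)
    return np.where(labels > 0, labels + offset, 0)


# Two chunks labelled independently, both starting at 1.
chunks = [np.array([[0, 1], [2, 2]]), np.array([[1, 0], [1, 3]])]
counts = [2, 3]  # highest label id used in each chunk
offsets = label_offsets(counts)
unique_chunks = [make_unique(c, o) for c, o in zip(chunks, offsets)]
print(unique_chunks[1])
```

After this step no id occurs in more than one chunk, so pairs of ids found in overlap regions can be merged without ambiguity.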