This project encompasses a series of simulations and analytical derivations related to the geometric properties of spheres and the generation of 3D biological tissue structures. The primary goals are:
- To understand the distribution of apparent 2D cross-section radii (
$r$ ) obtained from slicing spheres of true 3D radii ($R$ ). - To infer the true sphere radii (
$R$ ), or parameters of their distribution, from observed cross-section radii$r$ . - To simulate realistic 3D structures of nodules and vascular networks using various stochastic models.
The repository includes Python code for sphere-related simulations and R code for tissue generation, along with detailed notes and visualization scripts.
This project simulates and analyzes the geometric properties of spheres, focusing on inferring true 3D sphere radii (
Consider a single sphere of a fixed, known radius
The key assumptions are:
- The distance
$d$ from the center of the sphere to the slicing plane is uniformly distributed over the interval$[0, R]$ . We consider only one hemisphere due to symmetry, so$d \sim U(0, R)$ . The probability density function (PDF) for$d$ is$f_D(d) = \frac{1}{R}$ for$0 \le d \le R$ . - The radius of the dissection
$r$ is related to$R$ and$d$ by the Pythagorean theorem:$r^2 + d^2 = R^2$ . Thus,$r = \sqrt{R^2 - d^2}$ . Correspondingly,$d = \sqrt{R^2 - r^2}$ .
To find the distribution of the dissection radius
Therefore, the analytical PDF for the dissection radius modules.py as analytical_pdf(r_set, big_r).
In a more complex scenario, we consider a population of spheres where the true radii
When each of these spheres is dissected, we observe a set of dissection radii modules.py as marginal_r(r, theta, k) (where theta corresponds to k to
A primary goal of the project is to estimate the true sphere radius
If we have a set of observed dissection radii const_case.py within the big_r_est function.
When the true radii
The project explores the relationship between the parameters of the Gamma distribution fitted to the observed gamma_case.py and visualized in viz.py (e.g., params_shift function).
Beyond spherical dissections, the project includes modules for simulating 3D biological tissue structures.
Nodule simulation aims to generate clusters of points in 3D space. This is achieved using a Poisson cluster process:
- Cluster Centroids: The locations of nodule centers are first generated from a 3D Poisson point process within a defined volume.
- Points around Centroids: For each centroid, a number of points (cells) are generated around it. The distribution of these points relative to the centroid can be controlled, for example, by drawing a radius for each cluster (e.g., from an Exponential distribution) and then distributing points within that sphere according to another Poisson process.
This is implemented in
code/nodule.r.
Vessel simulation aims to create branching, tree-like structures in 3D. Two main approaches are mentioned:
- Brownian Random Walk: A simpler model where vessel paths are generated using a 3D Brownian motion, with branching occurring probabilistically. This is explored in
code/vessel.py. - Kent Distribution Model: A more sophisticated approach for directional control of branching. The Kent distribution is a directional distribution on the unit sphere, allowing for more realistic branching patterns.
- An iterative process generates a tree of vessel segment centroids.
- At each step, a new segment direction can be drawn from a Kent distribution.
- Branching can occur with a certain probability at the tip of existing segments.
- Once the vessel skeleton (centroids) is formed, points representing vessel cells are distributed around these segments, often using a Poisson process within a certain radius of the segment.
This model is primarily implemented in
code/vessel.r, utilizing helper functions fromcode/module.r.
The project is organized into Python scripts, R scripts, a Jupyter Notebook for orchestrating Python simulations, and data/note files.
The Python part of the project focuses on the simulation of sphere dissections and inference of sphere radii.
const_case.py
-
Purpose: Simulates dissections for a single sphere with a constant radius
$R$ and provides functions to estimate$R$ . -
Key Functions:
-
sim(big_r, n_sim): Simulatesn_simdissection radii$r$ for a sphere of radiusbig_r($R$ ). Returns a NumPy array of$r$ values. -
big_r_est(big_r, n_sim, stats=False): Estimates the true radius$R$ using the mean estimator formula ($\hat{R} = \text{mean}(r) \cdot 4/\pi$ ) based onn_simsimulated dissections. Ifstats=True, prints the estimate and error.
-
-
Usage:
from const_case import sim, big_r_est true_R = 10.0 num_simulations = 1000 # Simulate dissection radii r_values = sim(true_R, num_simulations) # Estimate R estimated_R = big_r_est(true_R, num_simulations, stats=True) print(f"Simulated r values (first 5): {r_values[:5]}") print(f"Estimated R: {estimated_R}")
gamma_case.py
-
Purpose: Simulates dissections when true sphere radii
$R$ follow a Gamma distribution ($R \sim \text{Gamma}(\theta, k)$). Includes functions for fitting Gamma distributions to observed$r$ values and attempting to estimate the parameters of the original$R$ distribution. -
Key Functions:
-
sim_gamma(theta, k, n_fol): Simulatesn_foldissection radii$r$ , where each corresponding true radius$R$ is drawn from$\text{Gamma}(\text{theta}, \text{k})$ . Returns a NumPy array of$r$ values. -
gamma_shift(theta, k, n_fol): Simulates$r$ values usingsim_gamma, then fits a Gamma distribution to these$r$ values. Returns the fitted shape (a), scale (scale), and the raw$r$ values. -
big_r_gamma_est(theta, k, n_fol): Simulates$r$ values, applies the mean estimator ($R_{est} = r \cdot 4/\pi$ ) to each$r$ to get estimates of$R$ , then fits a Gamma distribution to these$R_{est}$ values. Prints estimated vs. true parameters.
-
-
Usage:
from gamma_case import sim_gamma, gamma_shift, big_r_gamma_est shape_R = 2.0 scale_R = 3.0 num_follicles = 1000 # Simulate r values where R ~ Gamma(shape_R, scale_R) r_values_gamma = sim_gamma(shape_R, scale_R, num_follicles) # Fit Gamma to r values fitted_shape_r, fitted_scale_r, _ = gamma_shift(shape_R, scale_R, num_follicles) print(f"Fitted Gamma params for r: shape={fitted_shape_r:.2f}, scale={fitted_scale_r:.2f}") # Estimate parameters of R distribution big_r_gamma_est(shape_R, scale_R, num_follicles)
modules.py
- Purpose: Contains core mathematical/analytical functions used by other Python modules.
-
Key Functions:
-
analytical_pdf(r_set, big_r): Calculates the analytical PDF$f(r|R) = \frac{1}{R} \cdot \frac{r}{\sqrt{R^2 - r^2}}$ for a given set of$r$ values (r_set) and a fixed true radiusbig_r($R$ ). -
marginal_r(r, theta, k): Calculates the marginal PDF$f(r)$ for a dissection radius$r$ , given that the true radii$R$ follow$\text{Gamma}(\text{theta}, \text{k})$ . This involves numerical integration.
-
-
Usage: These functions are typically called internally by
viz.pyor other simulation modules.
viz.py
-
Purpose: Handles visualization of simulation results from
const_case.pyandgamma_case.py. Generates and saves various plots. -
Key Functions:
-
pdf_const(big_r, n_sim): Plots the histogram of simulated$r$ values (fromconst_case.sim), overlays the analytical PDF and a fitted Beta distribution. Prints AIC values for comparison. Saves plot toimages/pdf_const.pdf. -
est_box_const(big_r, n_sim_set, n_rep): Generates boxplots comparing$R$ estimates for different numbers of simulations (n_sim_set), repeatedn_reptimes. Saves plot toimages/est_box_const.pdf. -
pdf_gamma(theta, k, n_fol): Plots the histogram of simulated$r$ values (fromgamma_case.sim_gamma) and overlays the analytical marginal PDF. Saves plot toimages/pdf_gamma.pdf. -
viz_gamma_shift(theta, k, n_fol, stats=False): Plots the PDF of the true$R \sim \text{Gamma}(\text{theta}, \text{k})$ against the PDF of the Gamma distribution fitted to the simulated$r$ values. Saves plot toimages/viz_gamma_shift.pdf. -
viz_gamma_grid(theta_set, k_set, n_fol, mode): Creates a grid of plots, calling eitherpdf_gammaorviz_gamma_shiftfor different combinations ofthetaandk. Saves plot toimages/viz_gamma_grid.pdf. -
params_shift(params_set, mode): Visualizes the relationship between true Gamma parameters ($\theta$ or$k$ ) and their estimated values derived from fitting the$r$ distribution. Saves plot toimages/params_shift.pdf.
-
-
Usage: Typically run via
main.ipynb, which calls these functions with appropriate parameters.
vessel.py
- Purpose: Implements a 3D branching Brownian motion model for simulating vessel-like structures. This seems to be one of the explored models for vessel generation.
- Key Functions:
walk(): Simulates and plots a single 1D Brownian motion path with smoothing.tree_walk(): Simulates a 1D branching Brownian motion, where paths can split. Plots multiple smoothed branches.space_walk(): Simulates a 3D branching Brownian motion. Plots smoothed 3D paths.
- Usage: Can be run as a script (
if __name__ == "__main__":).This will generate and display plots.python code/vessel.py
main.ipynb
- Purpose: A Jupyter Notebook that serves as the primary interface for running the Python-based sphere dissection simulations and visualizations.
- Structure:
- Imports necessary functions from
const_case.py,gamma_case.py, andviz.py. - Contains cells for setting parameters (e.g.,
big_r,n_sim,theta,k,n_fol, sets for grid plots). - Calls visualization functions from
viz.pyto generate and save plots.
- Imports necessary functions from
- Usage:
- Open
code/main.ipynbin a Jupyter Notebook environment. - Modify parameters in the code cells as needed.
- Run cells sequentially to execute simulations and generate visualizations. Plots will be saved to the
images/directory.
- Open
The R scripts focus on the simulation of 3D tissue structures (nodules and vessels).
code/nodule.r
- Purpose: Generates 3D "nodule" point patterns using a Poisson cluster process.
- Main Function:
nodule(int, ord, viz = FALSE, save = FALSE)int: Overall point intensity.ord: Simulation index (for labeling).viz: IfTRUE, renders a 3D WebGL HTML visualization (images/nodule.html).save: IfTRUE, writes centered coordinates todata/nodule_loc.csv.
- Usage:
source("code/nodule.r") # Example: Generate nodules, visualize, and save nodule_data <- nodule(int = 7e4, ord = 1, viz = TRUE, save = TRUE)
code/vessel.r
- Purpose: Simulates 3D "vessel" point patterns, likely using a branching model based on the Kent distribution for trajectory growth and Poisson-distributed points for vessel walls.
- Main Function:
vessel(int, ord, viz = FALSE, save = FALSE)int: Poisson intensity for wall-points.ord: Simulation identifier.viz: IfTRUE, renders a 3D WebGL HTML visualization (images/vessel_combined.html).save: IfTRUE, writes centered coordinates todata/vessel_loc.csv.
- Dependencies: Uses helper functions from
code/module.r(e.g., for Kent distribution points, spatial Poisson processes). - Usage:
source("code/vessel.r") # Example: Generate vessels, visualize vessel_data <- vessel(int = 5e-3, ord = 1, viz = TRUE, save = FALSE)
code/module.r
- Purpose: Contains shared helper functions used by other R scripts, particularly
vessel.r. - Key Functions (inferred from notes):
kent_point(): Generates points based on the Kent distribution for directional vessel growth.pois_in_space(): Simulates points in a 3D space (e.g., parallelepiped) according to a Poisson process, used for populating vessel walls.rotation(): Performs 3D rotations, likely used to align vessel segments.
- Usage: Sourced by other R scripts. Not typically run directly.
code/cultivate.r
- Purpose: Appears to be a master R script for running multiple tissue simulations (either "nodule" or "vessel" type), saving their outputs, and potentially generating summary plots (e.g., boxplot of cell counts).
- Functionality (inferred from
notes/simulations.md):- Sets parameters like intensity, number of simulations, and simulation type ("nodule" or "vessel").
- Loops through replicate simulations, calling
nodule.rorvessel.r. - Saves individual simulation outputs (e.g., coordinates to
data/sim_<i>/selected_cell_coordinates.csv). - May generate summary plots (e.g.,
images/<type>_cell_counts_boxplot.png).
- Usage:
Parameters like simulation type and number of runs are typically set inside the script.
Rscript code/cultivate.r
archive/test.r
- Purpose: Likely an older R script used for testing specific functionalities or experimental code. Its exact current relevance might be limited.
data/nodule_loc.csv: Stores X, Y, Z coordinates and cluster/cell labels for points generated bycode/nodule.rwhensave=TRUE.data/vessel_loc.csv: Stores X, Y, Z coordinates and cluster/cell labels for points generated bycode/vessel.rwhensave=TRUE.- The
code/cultivate.rscript might also generate simulation-specific CSV files in subdirectories likedata/sim_<i>/.
The Python scripts rely on several common scientific computing libraries. It's recommended to use a virtual environment (e.g., venv or conda) for managing dependencies.
Dependencies:
- Python (3.6+ recommended)
- NumPy: For numerical operations.
- SciPy: For scientific and technical computing (used for
gamma.fit,quadintegration,savgol_filter). - Matplotlib: For plotting.
- Seaborn: For enhanced statistical visualizations.
- Jupyter Notebook or JupyterLab: For running
main.ipynb.
Installation:
- Create and activate a virtual environment (optional but recommended):
python -m venv .venv source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install Python packages:
You can install the packages using pip:
(If you prefer Jupyter Notebook over JupyterLab, replace
pip install numpy scipy matplotlib seaborn jupyterlab
jupyterlabwithnotebook).
The R scripts require R to be installed, along with several packages from CRAN.
Dependencies:
- R (latest version recommended)
rgl: For 3D interactive visualizations (nodules, vessels).htmlwidgets: For savingrglvisualizations as HTML files.fifer: (Mentioned in notes forvessel.r, though its specific use isn't detailed in the provided snippets. May be for specific statistical functions or data manipulation).spatstat: (Mentioned in notes forvessel.r, likely for spatial statistics or point process generation).compositions: (While not explicitly insimulations.md, thearchive/test.ruseslibrary(compositions), so it might be relevant for some parts or older code).
Installation:
- Install R: Download and install R from CRAN (The Comprehensive R Archive Network).
- Install R packages:
Open an R console and run the following commands:
install.packages(c("rgl", "htmlwidgets", "fifer", "spatstat", "compositions"))
Parameters for simulations (e.g., sphere radius, number of simulations, Gamma distribution parameters, intensities for tissue models) are generally hard-coded within the respective scripts or the main.ipynb notebook. You will need to edit these files directly to change simulation conditions.
The primary way to run the Python-based sphere dissection simulations and generate visualizations is through the Jupyter Notebook:
- Start JupyterLab or Jupyter Notebook:
If you installed JupyterLab:
Or for Jupyter Notebook:
jupyter lab
This will open a new tab in your web browser.jupyter notebook
- Navigate and open
code/main.ipynb. - Modify Parameters: Inside the notebook, locate the cells where parameters are defined (e.g.,
big_r,n_simfor the constant case;theta,k,n_folfor the Gamma case;n_sim_set,theta_set,k_setfor batch visualizations). Adjust these values as needed. - Run Cells: Execute the notebook cells sequentially (e.g., using "Run" -> "Run All Cells" or by running cells individually with Shift+Enter).
- Simulations will be performed.
- Visualization functions from
viz.pywill be called. - Output plots will be saved to the
images/directory (e.g.,pdf_const.pdf,est_box_const.pdf,pdf_gamma.pdf, etc.). - Some functions like
big_r_est(whenstats=True) orbig_r_gamma_estwill print results directly in the notebook.
The code/vessel.py script (Brownian motion vessel model) can be run directly as a Python script:
python code/vessel.pyThis will typically display Matplotlib plots on screen.
The R scripts for simulating nodules and vessels can be run from the command line using Rscript or interactively within an R environment.
1. Individual Nodule/Vessel Simulations:
You can source nodule.r or vessel.r in an R console and call their main functions:
# In an R console
# For Nodules:
source("code/nodule.r")
nodule_data <- nodule(int = 7e4, ord = 1, viz = TRUE, save = TRUE)
# This will generate images/nodule.html and data/nodule_loc.csv
# For Vessels:
source("code/vessel.r") # This will also source module.r if needed
vessel_data <- vessel(int = 5e-3, ord = 1, viz = TRUE, save = FALSE)
# This will generate images/vessel_combined.htmlModify the parameters (int, ord, viz, save) as needed.
2. Batch Tissue Simulations (cultivate.r):
The code/cultivate.r script is designed to run multiple simulations and save their outputs.
- Modify Parameters within
cultivate.r: Opencode/cultivate.rand adjust variables liketype("nodule" or "vessel"),int_nodorint_vess(intensities), andnum_sim(number of replicates). - Run from Command Line:
This will:
Rscript code/cultivate.r
- Create output folders (e.g.,
data/sim_<i>/). - Run the chosen simulation type
num_simtimes. - Save coordinate data for each simulation.
- Potentially generate a summary boxplot (e.g.,
images/<type>_cell_counts_boxplot.png).
- Create output folders (e.g.,
Important Notes for R Scripts:
- Ensure that the working directory is set to the root of the project so that relative paths like
code/module.ror../data/resolve correctly. If running from an R console, you might need to usesetwd()to the project root. When usingRscriptfrom the project root, paths should generally work as written in the scripts. - For visualizations (
viz = TRUE), R will attempt to open anrglwindow and save an HTML widget. This might require a graphical environment if not running on a local machine.
Here are a couple of quick examples to get started.
This example uses Python to simulate dissections from a single sphere and estimate its radius. It mirrors parts of what main.ipynb does.
- Ensure your Python environment is set up and you are in the project's root directory.
- Create a Python script, e.g.,
run_single_sphere_example.py:from code.const_case import sim, big_r_est from code.viz import pdf_const, est_box_const # For visualization import matplotlib.pyplot as plt # Required by viz module for showing plots if not in notebook # Parameters true_R = 7.5 num_simulations_for_pdf = 2000 # 1. Simulate and visualize PDF of apparent radii print(f"Running PDF visualization for R = {true_R} with {num_simulations_for_pdf} simulations...") plt.figure() # Create a new figure for the plot pdf_const(big_r=true_R, n_sim=num_simulations_for_pdf) plt.suptitle(f"PDF for R={true_R}, n_sim={num_simulations_for_pdf}") # Add a main title plt.show() # Display the plot print(f"Plot saved to images/pdf_const.pdf (or similar name from viz.py)") # 2. Estimate R and check convergence with boxplots print("\nRunning R estimation boxplots...") n_sim_set_for_boxplot = [50, 200, 1000] # Number of simulations per estimate num_repetitions_for_boxplot = 50 # Number of times R is estimated for each n_sim in set plt.figure() # Create a new figure for the plot est_box_const(big_r=true_R, n_sim_set=n_sim_set_for_boxplot, n_rep=num_repetitions_for_boxplot) plt.suptitle(f"R Estimation Boxplots for True R={true_R}") # Add a main title plt.show() # Display the plot print(f"Plot saved to images/est_box_const.pdf (or similar name from viz.py)") # 3. Single R estimation num_simulations_for_est = 500 estimated_R = big_r_est(true_R, num_simulations_for_est, stats=True) print(f"\nFor a single run with {num_simulations_for_est} simulations:") print(f"True R: {true_R}, Estimated R: {estimated_R:.3f}")
- Run the script from your terminal:
This will:
python run_single_sphere_example.py
- Print AIC values and save
images/pdf_const.pdf. - Save
images/est_box_const.pdf. - Print the single R estimation results.
- Display the plots on screen.
- Print AIC values and save
This example uses R to generate a 3D nodule structure, visualize it, and save the coordinates.
- Ensure your R environment is set up and you are in the project's root directory.
- Run the following in an R console or save it as an R script (e.g.,
run_nodule_example.R) and runRscript run_nodule_example.R:This will:# Set working directory to project root if not already there # setwd("/path/to/your/SimulationOfSpheres/") source("code/nodule.r") # Parameters intensity <- 5e4 # Lower intensity for a quicker example simulation_id <- "example1" cat("Simulating nodules...\n") nodule_output <- nodule( int = intensity, ord = simulation_id, viz = TRUE, # Generate images/nodule_example1.html (name might vary based on actual nodule.r implementation) save = TRUE # Save data/nodule_loc_example1.csv (name might vary) ) cat("Nodule simulation complete.\n") if (TRUE) { # Assuming viz = TRUE cat("Visualization saved to an HTML file in the 'images' directory (e.g., images/nodule.html or similar).\n") } if (TRUE) { # Assuming save = TRUE cat("Nodule coordinates saved to a CSV file in the 'data' directory (e.g., data/nodule_loc.csv or similar).\n") } # To inspect the returned data (if not saving, or to see it anyway) # print(head(nodule_output))
- Simulate nodule points.
- Create an interactive 3D HTML visualization in the
imagesdirectory. - Save the nodule coordinates to a CSV file in the
datadirectory. - Print messages to the console.
SimulationOfSpheres/
├── .gitignore
├── LICENSE
├── README.md # This file
├── archive/
│ └── test.r # Archived R test script
├── code/
│ ├── __pycache__/ # Python cache files
│ ├── const_case.py # Python: Single sphere (constant R) simulation
│ ├── cultivate.r # R: Master script for batch tissue simulations
│ ├── gamma_case.py # Python: Sphere set (Gamma R) simulation
│ ├── main.ipynb # Jupyter Notebook: Main runner for Python simulations
│ ├── module.r # R: Helper modules for R scripts (e.g., Kent distribution)
│ ├── modules.py # Python: Core mathematical/analytical functions
│ ├── nodule.r # R: Nodule simulation
│ ├── vessel.py # Python: Vessel simulation (Brownian motion model)
│ ├── vessel.r # R: Vessel simulation (Kent distribution model)
│ └── viz.py # Python: Visualization scripts
├── data/
│ ├── nodule_loc.csv # Example output: Nodule coordinates
│ └── vessel_loc.csv # Example output: Vessel coordinates
│ └── sim_<i>/ # Potential subdirectories from cultivate.r
│ └── selected_cell_coordinates.csv
├── images/ # Output directory for plots and visualizations
│ ├── pdf_const.pdf # Output from viz.py: PDF for constant R case
│ ├── est_box_const.pdf # Output from viz.py: Boxplot of R estimates
│ ├── pdf_gamma.pdf # Output from viz.py: PDF for Gamma R case
│ ├── viz_gamma_shift.pdf # Output from viz.py: Gamma distribution shift plot
│ ├── viz_gamma_grid.pdf # Output from viz.py: Grid of Gamma visualizations
│ ├── params_shift.pdf # Output from viz.py: Gamma parameter shift plot (Note: existing images/params_shift.png might be an older version)
│ ├── nodule.html # Output from nodule.r: 3D Nodule visualization
│ ├── vessel_combined.html # Output from vessel.r: 3D Vessel visualization
│ ├── *.png # Other existing images (e.g., est_box_const.png) or R-generated images
├── logs/
│ └── logs_slice_2025-03-19.txt # Example log file
├── notes/
│ ├── geometry.md # Detailed notes on sphere geometry simulations
│ └── simulations.md # Detailed notes on tissue simulations (R scripts)
└── run_single_sphere_example.py # Example script created for README
└── run_nodule_example.R # Example script created for README