Project Information and Reproducibility Guide
View the Project on GitHub PennLINC/thalamocortical_development
Valerie J. Sydnor
Theodore D. Satterthwaite
Matthew Cieslak
Frank Yeh, Bart Larsen, Deanna Barch, Michael Arcaro, Sydney Covitz, Raquel E. Gur, Ruben C., Gur, Russell T. Shinohara, Allyson P. Mackey
December 2022
Manuscript in preparation
RBC-PNC and RBC-HCPD
https://github.com/PennLINC/thalamocortical_development
https://github.com/PennLINC/thalamocortical_development/tree/main/thalamocortical_autotrack_template (see below for use instructions with dsi-studio’s autotrack)
The project directory on cubic is: /cbica/projects/thalamocortical_development
The directory structure within the project directory is as follows:
code: directory with the thalamocortical_development github repo
cortical_anatomy: PNC and HCPD freesurfer tabulate anatomical statistics
Maps: surface parcellation files (parcellations/), S-A axis github repo (S-A_ArchetypalAxis/), fluctuation amplitude development maps (boldamplitude_development/), myelin development maps (myelin_development/), E:I ratio development maps (EI_development/), thalamic Cpt gradient (thalamusgradient_CPt_muller/)
sample_info: sample demographics, environment data, and final project participant lists
software: project software
Templates: MNI template and HCP-1065 YA FIB templates
thalamocortical_results: GAM outputs for developmental and environmental effects
thalamocortical_structuralconnectivity/template: thalamocortical template tractography
thalamocortical_structuralconnectivity/individual: PNC and HCPD autotrack outputs
qsirecon_0.16.0RC3: PNC and HCPD qsirecon clones with dsi-studio gqi and fib outputs
The analytic and statistical workflow implemented in this research is described below and links to all corresponding code on github are provided. This workflow begins with creation of an atlas of human thalamocortical connections, preprocessing and reconstruction of PNC and HCPD diffusion MRI data, generation of individual-specific thalamocortical connections, quantification and harmonization of thalamocortical connectivity metrics, and examination of group-level and individual-level thalamocortical anatomy characteristics. The workflow continues with the fitting of generalized additive models to study relationships between thalamocortical connectivity, age, and the environment and analyses aimed at characterizing thalamocortical structural connectivity development and its influence on hierarchical cortical development and organization along the sensorimotor-association axis.
A novel thalamocortical structural connectivity tractography atlas was generated using a high quality diffusion template derived from HCPYA data (N = 1,065, multi-shell acquisition parameters: b-values = 1000, 2000, 3000, 90 directions per shell, 1.25mm isotropic voxels). This population-average template, downloaded from here, is a 1.25 mm isotropic diffusion template in ICBM152 space generated with q-space diffeomorphic reconstruction (MNI-space version of generalized q-sampling imaging).
The thalamocortical structural connectivity tractography atlas was generated in the following steps:
--threshold_index=qa #use qa for thresholding fiber tracking
--fa_threshold=0 #select a random qa value threshold as termination criterion for each streamline
--turning_angle=0 #select a random turning angle threshold between 15-90 degrees as termination criterion for each streamline
--step_size=0 #select a random step size from 0.5 to 1.5 voxels for each streamline
--min_length=10 #minimum required length of streamlines
--max_length=300 #maximum allowed length of streamlines
--method=0 #streamline tracking
--otsu_threshold=0.45 #Otsu's threshold
--smoothing=1 #select a random smoothing amount between 0% to 95% for each streamline; smoothing uses previous propagation vector directional information
--tip_iteration=0 #turn off topology-informed pruning
--random_seed=1 #set the random seed for fiber tracking
To use this atlas with dsi-studio’s autotrack to generate thalamocortical connections in individual participant data, both the “ICBM152_adult.tt.gz” (autotrack tracts) and “ICBM152_adult.tt.gz.txt” (tract name list) files are required and must be located in the location expected by the dsi-studio software.
On a Mac, this location is
dsi_studio/dsi_studio.app/Contents/MacOs/atlas/ICBM152_adult
In a container, this location is
/opt/dsi-studio/atlas/ICBM152_adult
. To use these files with a dsi-studio container, bind a local directory containing the contents of atlas/ICBM152_adult with these thalamus-specific .tt.gz and .tt.gz.txt files to the container directory (e.g., -B /cbica/projects/thalamocortical_development/software/thalamocortical_autotrack_template/dsi-studio/atlas/ICBM152_adult/:/opt/dsi-studio/atlas/ICBM152_adult). Or, bind the individual thalamus-specific .tt.gz and .tt.gz.txt files to their corresponding original files in /opt/dsi-studio/atlas/ICBM152_adult.** It is highly recommended that you use this thalamocortical tractography atlas with the same autotrack parameters validated here for participant-specific data (see section “Delineation of Individual-Specific Thalamocortical Connections” below). These parameters include –otsu_threshold=0.5, –smoothing=1, –tolerance=10, –tip_iteration=0, –track_voxel_ratio=4, –check_ending=0, and (for command line usage) –yield_rate=0.0000001. Modifying the tolerance parameter may be reasonable to delineate even stricter trajectory-based pathways (lower the tolerance) or to allow for potential greater individual-specific differences in anatomy to emerge (increase the threshold). **
Diffusion MRI data were preprocessed with qsiprep (0.14.2 for PNC; 0.16.1 for HCPD) as follows:
$ singularity run --cleanenv -B ${PWD} pennlinc-containers/.datalad/environments/qsiprep-${version}/image inputs/data prep participant --stop-on-first-crash --fs-license-file code/license.txt --skip-bids-validation --participant-label "$subid" --unringing-method mrdegibbs --output-resolution ${res} #res = 1.8 in PNC, 1.5 in HCPD
Diffusion MRI data were reconstructed using the dsi_studio_gqi reconstruction workflow with qsirecon (0.16.0RC3 for PNC and HCPD) as follows:
$ singularity run --cleanenv -B ${PWD} pennlinc-containers/.datalad/environments/qsiprep-0-16-0RC3/image inputs/data/qsiprep/qsiprep qsirecon participant --participant_label $subid -recon-input inputs/data/qsiprep/qsiprep --fs-license-file code/license.txt --stop-on-first-crash --recon-only --skip-odf-reports --freesurfer-input inputs/data/fmriprep/freesurfer --recon-spec ${PWD}/code/gqi_hsvs.json
Preprocessing and reconstruction workflows were executed with datalad using the template scripts in /qsiprep, including /PNC/qsiprep_call_PNC.sh, /HCPD/qsiprep_call_HCPD.sh, /PNC/qsirecon_call_PNC.sh, /HCPD/qsirecon_call_HCPD. Datalad outputs were cloned for use in this project using the scripts in /datalad.
Person-specific thalamocortical structural connections were delineated for PNC and HCPD participants using the thalamic tractography atlas and dsi-studio’s autotrack. A relatively stringent Hausdorff distance threhold was used; the selected threshold balanced the recovery of person-specific anatomy with mitigation of false positive streamlines and regionally non-specific streamlines. The script /thalamocortical_structuralconnectivity/individual/thalamocortical_autotrack.sh was run twice for every participant, first to generate participant -> template registration files (gqi.fib.gz.icbm152_adult.map.gz) for use with autotrack and then to reconstruct all thalamocortical connections.
For PNC, registration was accomplished with /thalamocortical_structuralconnectivity/individual/PNC/autotrack_registration_PNC.sh and autotrack tract generation was executed with /thalamocortical_structuralconnectivity/individual/PNC /run_autotrack_PNC.sh.
For HCPD, registration was accomplished with /thalamocortical_structuralconnectivity/individual/HCPD/autotrack_registration_HCPD.sh and autotrack tract generation was executed with /thalamocortical_structuralconnectivity/individual/HCPD/run_autotrack_HCPD.sh.
--otsu_threshold=0.5 #Otsu's threshold for tracking
--smoothing=1 #select a random smoothing amount between 0% to 95% for each streamline; smoothing uses previous propagation vector directional information
--tolerance=10 #Hausdorff distance threhold
--tip_iteration=0 #no topology-informed pruning
--track_voxel_ratio=4 #the track-voxel ratio for the total number of streamline count. Increased from the default of 2 to facilitate better mapping (at the expense of greater computation time)
--check_ending=0 #don't remove tracts that terminate in high anisotropy areas
--export_stat=1 #write out connection statistics file
--export_trk=1 #write out the reconstructed connection as a tractography file
--yield_rate=0.0000001 #yield rate that must be met before fiber tracking is terminated and no output is generated
--export_template_trk=1 #write out reconstructed connection in dsi-studio template space
Note, thalamocortical_autotrack.sh was run twice (first to generate registration files, second to run autotrack on all atlas connections) due to how the threading parameter interacted with the registration process in dsi-studio. In recent versions of dsi-studio, greedy resource usage during registration was fixed, thus this procedure could now be run all in one step.
Diffusion MRI-derived connectivity metrics (FA, MD, streamline count) and gene expression-derived thalamic gradient values (thalamus calbindin-parvalbumin CPt gradient) were extracted for every participant’s autotrack-generated thalamocortical connections, using the following code:
$ python thalamocortical_dwimeasures.py PNC
$ bash thalamocortical_CPtvalues_jobs.sh HCPD
$ python thalamocortical_CPtvalues.py HCPD
1358 PNC participants and 640 (Lifespan 2.0 release) HCPD participants had dominant group diffusion MRI acquisitions (i.e., non-variant CuBIDS acquisitions) and were considered for inclusion in this research. The following exclusion criterion were then applied to generate the final samples of 1145 PNC participants and 572 HCPD participants:
- health history exclusions, for example history of cancer, MS, seizures, or incidentally-encountered brain structure abnormalities
- T1 quality exclusion, based on visual QC
- diffusion acquisition exclusion for missing runs (HCPD only)
- diffusion quality exclusion, based on the neighborhood correlation (nc) of the preprocessed, T1w-aligned diffusion data. Note, nc values differ by sampling scheme, thus different thresholds were used in PNC and HCPD. Thresholds were chosen based on nc histograms by selecting a value that cut off the low-quality (left-skewed) data tail.
- diffusion scan head motion exclusion, based on mean framewise displacement (threshold = 1). Note, ~6% of the presently retained sample was excluded for both PNC and HCPD following diffusion quality and head motion exclusions
- An age exclusion (< 8 years old) was also applied to HCPD in order to match ages across samples and directly assess reproducibility. This excluded only 2.4% of the final HCPD sample participants, and thus has the additional benefit of not biasing gam smooth fits to very few data points at the lower end of the age range
Final study samples were constructed following the criterion outlined above in /sample_construction/PNC/finalsample_PNC.Rmd and /sample_construction/HCPD/finalsample_HCPD.Rmd. This sample construction procedure utilized diffusion scan acqusition, quality, and head motion information provided in qsiprep ImageQC_dwi.csvs, which were collated into dataset-specific diffusion QC metric .csvs with /sample_construction/PNC/diffusion_qcmetrics_PNC.py and /sample_construction/HCPD/diffusion_qcmetrics_HCPD.py
To facilitate analysis of participant-level thalamocortical connectivity data, dataset-specific demographics + diffusion dataframes and dataset-specific tract lists were generated.
Thalamocortical connection anatomy was examined at the atlas and the group level to understand characteristics of reconstructed thalamic connections and to uncover cortical and thalamic anatomical gradients. This exploration included quantifying cortical surface coverage for the thalamocortical atlas (atlas-based); comparing the surface area and sulcal depth of cortical parcels with versus without thalamic connections represented in the atlas (atlas-based); surveying variation in thalamocortical connectivity strength across Mesulam cortical types and the S-A axis (group-based in PNC and HCPD); and testing whether thalamic endpoint core-matrix gradients values differed across Mesulam types and the S-A axis (group-based in PNC and HCPD). These analyses were conducted in /results/tract_anatomy/thalamocortical_connection_anatomy.Rmd; a knit version of thalamocortical_connection_anatomy.html can be viewed here.
Analyses were undertaken to characterize thalamocortical connection maturational patterns and to investigate whether thalamocortical connectivity development aligns with hierarchical axes of cortical developmental heterochronicity and plasticity marker maturation. Generalized additive models were used to delineate developmental trajectories, quantify age-related change, and identify the age of thalamocortical connectivity maturation. GAM analyses utilized the functions in /gam_functions/GAM_functions_thalamocortical.R, including:
These functions were implemented with dataset-specific thalamocortical connectivity metrics and demographics data for GAM-based analyses in /gam_functions/fit_ageGams.R. In general, this code applies the above functions to fit age GAMs for each thalamocortical connection of interest and saves out the results.
After fitting developmental models, the potential contribution of thalamocortical connectivity maturation to hierarchical development along the cortex’s S-A axis was investigated in /results/developmental_effects/thalamocortical_connectivity_development.Rmd, which can be viewed here. This investigation included quantifying the similarity of developmental effects across datasets (PNC and HCPD), visualizing cortex-wide and region-specific thalamic connectivity developmental profiles and derivative heterochronicity, ascribing functions to cortical regions with early and late maturing thalamic connections, testing how maturational timing varied across the S-A axis (and anatomical axes), and examining how thalamocortical connection maturation relates to the development of non-invasive readouts of plasticity, including development of fMRI-derived E:I ratio, T1/T2 ratio-indexed myelin growth, and fMRI-proxied intrinsic fluctuation amplitude.
GAMs were employed to explore associations between thalamocortical connection properties and youths’ household-level and neighborhood-level socioeconomic conditions (while controlling for age). Models were fit in /gam_functions/fit_envGams.R. Household-level SES was proxied by parental education; neighborhood-level SES was determined based on a factor analysis of geocoded neighborhood environment indicators (e.g., median family income, percent employed, population density, percent in poverty). Environment GAMs used additional functions from [/gam_functions/GAM_functions_thalamocortical.R], including:
Significant associations between environmental conditions and thalamocortical connectivity properties were identified and spatial variability in their cortical embedding along S-A and anatomical axes was classified in /results/environment_effects /thalamocortical_connectivity_environment.Rmd, which can be viewed here.
The following external software was used in this project: