Project Information and Reproducibility Guide
View the Project on GitHub PennLINC/thalamocortical_development
Valerie J. Sydnor
Theodore D. Satterthwaite
Joelle Bagautdinova and Matthew Cieslak
Bart Larsen, Aaron F. Alexander-Bloch, Michael J. Arcaro, Deanna M. Barch, Dani S. Bassett, Philip A. Cook, Sydney Covitz, Alexandre R. Franco, Raquel E. Gur, Ruben C. Gur, Allyson P. Mackey, Steven L. Meisler, Kahini Mehta, Michael P. Milham, Tyler M. Moore, Armin Raznahan, David R. Roalf, Taylor Salo, Gabriel Schubiner, Jakob Seidlitz, Russell T. Shinohara, James M. Shine, Fang-Cheng Yeh
December 2022
Manuscript in preparation
PNC, HCPD, HBN
https://github.com/PennLINC/thalamocortical_development
The thalamocortical tractometry atlas is available here. See below for implementation instructions
The project directory on cubic is: /cbica/projects/thalamocortical_development
The directory structure within the project directory is as follows:
code: directory with the thalamocortical_development github repo
cortical_anatomy: PNC and HCPD freesurfer tabulate anatomical statistics
figures: manuscript plots and images, compiled into final Figures
Maps: surface parcellation files (parcellations/), S-A axis github repo (S-A_ArchetypalAxis/), fluctuation amplitude development maps (boldamplitude_development/), myelin development maps (myelin_development/), E:I ratio development maps (EI_development/), thalamic Cpt gradient (thalamusgradient_CPt_muller/)
sample_info: sample demographics, environment data, and final project participant lists
software: project software
Templates: MNI template and HCP-1065 YA FIB templates
thalamocortical_results: GAM outputs for developmental and environmental effects
thalamocortical_structuralconnectivity/template: thalamocortical template tractography
thalamocortical_structuralconnectivity/individual: PNC, HCPD, and HBN autotrack outputs
qsirecon_0.16.0RC3: PNC and HCPD qsirecon clones with dsi-studio gqi and fib outputs
The analytic and statistical workflow implemented in this research is described below; links to corresponding code on github are provided. This workflow begins with creation of an atlas of human thalamocortical connections. It continues with preprocessing and reconstruction of PNC, HCPD, and HBN diffusion MRI data, generation of individual-specific thalamocortical connections in youth datasets, and quantification and harmonization of thalamocortical connectivity metrics. It then transitions to fitting of generalized additive models to study relationships between thalamocortical connectivity, age, and the environment and describes analyses aimed at characterizing thalamocortical structural connectivity development and its influence on hierarchical cortical development along the sensorimotor-association axis.
A novel thalamocortical structural connectivity tractography atlas was generated using a high quality diffusion template derived from HCPYA data (N = 1,065, multi-shell acquisition parameters: b-values = 1000, 2000, 3000, 90 directions per shell, 1.25mm isotropic voxels). This population-average template, downloaded from here, is a 1.25 mm isotropic diffusion template in ICBM152 space generated with q-space diffeomorphic reconstruction (MNI-space version of generalized q-sampling imaging).
The thalamocortical structural connectivity tractography atlas was generated in the following steps:
--threshold_index=qa #use qa for thresholding fiber tracking
--fa_threshold=0 #select a random qa value threshold as termination criterion for each streamline
--turning_angle=0 #select a random turning angle threshold between 15-90 degrees as termination criterion for each streamline
--step_size=0 #select a random step size from 0.5 to 1.5 voxels for each streamline
--min_length=10 #minimum required length of streamlines
--max_length=300 #maximum allowed length of streamlines
--method=0 #streamline tracking
--otsu_threshold=0.45 #Otsu's threshold
--smoothing=1 #select a random smoothing amount between 0% to 95% for each streamline; smoothing uses previous propagation vector directional information
--tip_iteration=0 #turn off topology-informed pruning
--random_seed=1 #set the random seed for fiber tracking
To use this atlas with dsi-studio’s autotrack to generate thalamocortical connections in individual participant data, both the “ICBM152_adult.tt.gz” (autotrack tracts) and “ICBM152_adult.tt.gz.txt” (tract name list) files are required and must be located in the location expected by the dsi-studio software.
On a Mac, this location is
dsi_studio/dsi_studio.app/Contents/MacOs/atlas/ICBM152_adult
In a container, this location is
/opt/dsi-studio/atlas/ICBM152_adult
. To use these files with a dsi-studio container, bind a local directory containing the contents of atlas/ICBM152_adult with these thalamus-specific .tt.gz and .tt.gz.txt files to the container directory (e.g., -B /cbica/projects/thalamocortical_development/software/thalamocortical_autotrack_template/dsi-studio/atlas/ICBM152_adult/:/opt/dsi-studio/atlas/ICBM152_adult). Or, bind the individual thalamus-specific .tt.gz and .tt.gz.txt files to their corresponding original files in /opt/dsi-studio/atlas/ICBM152_adult.** It is highly recommended that you use this thalamocortical tractography atlas with the same autotrack parameters validated here for participant-specific data (see section “Delineation of Individual-Specific Thalamocortical Connections” below). These parameters include –otsu_threshold=0.5, –smoothing=1, –tolerance=10, –tip_iteration=0, –track_voxel_ratio=4, –check_ending=0, and (for command line usage) –yield_rate=0.0000001. Modifying the tolerance parameter may be reasonable to delineate even stricter trajectory-based pathways (lower the tolerance) or to allow for potential greater individual-specific differences in anatomy to emerge (increase the threshold). **
Diffusion MRI data were preprocessed with qsiprep (0.14.2 for PNC and HBN; 0.16.1 for HCPD) as follows:
$ singularity run --cleanenv -B ${PWD} pennlinc-containers/.datalad/environments/qsiprep-${version}/image inputs/data prep participant --stop-on-first-crash --fs-license-file code/license.txt --skip-bids-validation --participant-label "$subid" --unringing-method mrdegibbs --output-resolution ${res} #res = 1.8 in PNC and HBN, 1.5 in HCPD
Diffusion MRI data were reconstructed using the dsi_studio_gqi reconstruction workflow with qsirecon (0.16.0RC3 for PNC, HCPD, HBN) as follows:
$ singularity run --cleanenv -B ${PWD} pennlinc-containers/.datalad/environments/qsiprep-0-16-0RC3/image inputs/data/qsiprep/qsiprep qsirecon participant --participant_label $subid -recon-input inputs/data/qsiprep/qsiprep --fs-license-file code/license.txt --stop-on-first-crash --recon-only --skip-odf-reports --freesurfer-input inputs/data/fmriprep/freesurfer --recon-spec ${PWD}/code/gqi_hsvs.json
Preprocessing and reconstruction workflows were executed with datalad using the template scripts in /qsiprep, including /PNC/qsiprep_call_PNC.sh, /HCPD/qsiprep_call_HCPD.sh,/HBN/qsiprep_call_HBN.sh,/PNC/qsirecon_call_PNC.sh, /HCPD/qsirecon_call_HCPD, and /HBN/qsirecon_call_HBN.sh. Datalad outputs were cloned for use in this project using the scripts in /datalad.
Person-specific thalamocortical structural connections were delineated for PNC, HCPD, and HBN participants using the thalamic tractography atlas and dsi-studio’s autotrack. A relatively stringent Hausdorff distance threhold was used; the selected threshold balanced the recovery of person-specific anatomy with mitigation of false positive streamlines and regionally non-specific streamlines. The script /thalamocortical_structuralconnectivity/individual/thalamocortical_autotrack.sh was run twice for every participant, first to generate participant -> template registration files (gqi.fib.gz.icbm152_adult.map.gz) for use with autotrack and then to reconstruct all thalamocortical connections.
For PNC, registration was accomplished with /thalamocortical_structuralconnectivity/individual/PNC/autotrack_registration_PNC.sh and autotrack tract generation was executed with /thalamocortical_structuralconnectivity/individual/PNC /run_autotrack_PNC.sh.
For HCPD, registration was accomplished with /thalamocortical_structuralconnectivity/individual/HCPD/autotrack_registration_HCPD.sh and autotrack tract generation was executed with /thalamocortical_structuralconnectivity/individual/HCPD/run_autotrack_HCPD.sh.
For HBN, registration was accomplished with /thalamocortical_structuralconnectivity/individual/HBN/autotrack_registration_HBN.sh and autotrack tract generation was executed with /thalamocortical_structuralconnectivity/individual/HBN/run_autotrack_HBN.sh.
--otsu_threshold=0.5 #Otsu's threshold for tracking
--smoothing=1 #select a random smoothing amount between 0% to 95% for each streamline; smoothing uses previous propagation vector directional information
--tolerance=10 #Hausdorff distance threhold
--tip_iteration=0 #no topology-informed pruning
--track_voxel_ratio=4 #the track-voxel ratio for the total number of streamline count. Increased from the default of 2 to facilitate better mapping (at the expense of greater computation time)
--check_ending=0 #don't remove tracts that terminate in high anisotropy areas
--export_stat=1 #write out connection statistics file
--export_trk=1 #write out the reconstructed connection as a tractography file
--yield_rate=0.0000001 #yield rate that must be met before fiber tracking is terminated and no output is generated
--export_template_trk=1 #write out reconstructed connection in dsi-studio template space
Note, thalamocortical_autotrack.sh was run twice (first to generate registration files, second to run autotrack on all atlas connections) due to how the threading parameter interacted with the registration process in dsi-studio. In recent versions of dsi-studio, greedy resource usage during registration was fixed, thus this procedure could now be run all in one step.
Diffusion MRI-derived connectivity metrics (FA, MD, streamline count) and gene expression-derived thalamic gradient values (thalamus calbindin-parvalbumin CPt gradient) were extracted for every participant’s autotrack-generated thalamocortical connections, using the following code:
$ python thalamocortical_dwimeasures.py PNC
$ bash thalamocortical_CPtvalues_jobs.sh HCPD
$ python thalamocortical_CPtvalues.py HCPD
1358 PNC participants, 640 (Lifespan 2.0 release) HCPD participants, and 1530 (Data Releases 1-9) HBN participants had dominant group diffusion MRI acquisitions (i.e., non-variant CuBIDS acquisitions) and were considered for inclusion in this research. The following exclusion criterion were then applied to generate the final samples of 1145 PNC participants, 572 HCPD participants, and 959 HBN participants:
- health history exclusions, for example history of cancer, MS, seizures, or incidentally-encountered brain structure abnormalities
- T1 quality exclusion, based on visual QC
- diffusion acquisition exclusion for missing runs (HCPD only)
- diffusion quality exclusion, based on the neighborhood correlation (nc) of the preprocessed, T1w-aligned diffusion data. Note, nc values differ by sampling scheme, thus different thresholds were used in PNC and HCPD. Thresholds were chosen based on nc histograms by selecting a value that cut off the low-quality (left-skewed) data tail.
- diffusion scan head motion exclusion, based on mean framewise displacement (threshold = 1). Note, ~6% of the presently retained sample was excluded for both PNC and HCPD following diffusion quality and head motion exclusions
- An age exclusion (< 8 years old) was also applied to HCPD and HBN in order to match ages across samples and more appositely assess reproducibility of developmental results.
Final study samples were constructed following the criterion outlined above in /sample_construction/PNC/finalsample_PNC.Rmd, /sample_construction/HCPD/finalsample_HCPD.Rmd, and /sample_construction/HBN/finalsample_HBN.Rmd. This sample construction procedure utilized diffusion scan acqusition, quality, and head motion information provided in qsiprep ImageQC_dwi.csvs, which were collated into dataset-specific diffusion QC metric .csvs with /sample_construction/PNC/diffusion_qcmetrics_PNC.py, /sample_construction/HCPD/diffusion_qcmetrics_HCPD.py, and /sample_construction/HBN/diffusion_qcmetrics_HBN.py.
To facilitate analysis of participant-level thalamocortical connectivity data, dataset-specific demographics + diffusion dataframes and dataset-specific tract lists were generated.
Thalamocortical connection anatomy was examined at the atlas and the group level (in PNC and HCPD) to understand characteristics of reconstructed thalamic connections and to uncover cortical and thalamic anatomical gradients. This exploration included quantifying cortical surface coverage for the thalamocortical atlas (atlas-based); comparing the surface area and sulcal depth of cortical parcels with versus without thalamic connections represented in the atlas (atlas-based); and testing whether thalamic connection core-matrix gradient values and FA values systematically varied across the S-A axis (group-based in PNC and HCPD). These analyses were conducted in /results/tract_anatomy/thalamocortical_connection_anatomy.Rmd; a knit version of thalamocortical_connection_anatomy.html can be viewed here.
Analyses were undertaken in PNC and HCPD to characterize normative thalamocortical connection maturational patterns and to investigate whether thalamocortical connectivity development aligns with hierarchical axes of cortical developmental plasticity. Generalized additive models were used to delineate developmental trajectories, quantify age-related change, and identify the age of thalamocortical connectivity maturation. GAM analyses utilized the functions in /gam_functions/GAM_functions_thalamocortical.R, including:
These functions were implemented with dataset-specific thalamocortical connectivity metrics and demographics data for GAM-based analyses in /gam_functions/fit_ageGams.R. In general, this code applies the above functions to fit age GAMs for each thalamocortical connection of interest and saves out the results.
After fitting developmental models, the potential contribution of thalamocortical connectivity maturation to hierarchical development along the cortex’s S-A axis was investigated in /results/developmental_effects/thalamocortical_connectivity_development.Rmd, which can be viewed here. This investigation included quantifying the similarity of developmental effects across datasets (PNC and HCPD), visualizing cortex-wide and region-specific thalamic connectivity developmental profiles and derivative heterochronicity, ascribing psychological functions to cortical regions with early and late maturing thalamic connections, testing how maturational timing varied across the S-A axis (and anatomical axes), and examining how thalamocortical connection maturation relates to the development of non-invasive readouts of plasticity, including development of fMRI-derived E:I ratio, T1/T2 ratio-indexed myelin growth, and fMRI-proxied intrinsic fluctuation amplitude.
GAMs were employed to explore associations between thalamocortical connection properties and youths’ household-level and neighborhood-level socioeconomic conditions (while controlling for developmental effects). Models were fit in /gam_functions/fit_envGams.R. Household-level SES was proxied by caregiver education and income-to-needs ratios. Neighborhood-level SES was determined based on a factor analysis of geocoded neighborhood environment indicators (e.g., median family income, percent employed, population density, percent in poverty). Environment GAMs used additional functions from [/gam_functions/GAM_functions_thalamocortical.R], including:
Significant associations between environmental conditions and thalamocortical connectivity properties were identified and spatial variability in their cortical embedding along S-A and anatomical axes was classified in /results/environment_effects /thalamocortical_connectivity_environment.Rmd, which can be viewed here.
After obtaining the core set of findings in PNC and HCPD, we evaluated the generalizability of results to the HBN, a sample of youth with clinically-significant youth psychopathology. Developmental GAMs for the HBN were fit in /gam_functions/fit_ageGams_HBN.R and environmental gams were run in /gam_functions/fit_envGams_HBN.R.
Code investigating alignment of developmental and environmental effects to the S-A axis in the HBN is in /results/HBN_replication/thalamocortical_connectivity_HBNreplication.Rmd, which can be viewed as an html here!
The following external software was used in this project:
And Fin :)