Project Title
Penn LEAD: An open, fully-processed, longitudinal data resource to study transdiagnostic executive function
Brief Project Description
Executive function (EF) is a crucial aspect of human development. Deficits in EF that emerge in adolescence represent a transdiagnostic symptom associated with many forms of psychopathology, including attention deficit hyperactivity disorder (ADHD) and psychosis-spectrum (PS) disorders. Relatively few open data resources are specifically tailored to evaluating the development of EF across clinical diagnoses. Here, we introduce a new data resource that combines longitudinal multi-modal imaging data (sMRI, dMRI, fMRI, ASL, and MEGRE for QSM) with rich clinical and cognitive phenotyping data.
Project Lead
Brooke L. Sevchik
Faculty Lead
Theodore D. Satterthwaite
Collaborators
Golia Shafiei, Kristin Murtha, Sophia Linguiti, Lia Brodrick, Juliette B.H. Brook, Matt Cieslak, Elizabeth Flook, Kahini Mehta, Steven L. Meisler, Kosha Ruparel, Sage Rush, Taylor Salo, S. Parker Singleton, Tien T. Tong, Mrugank Salunke, Dani S. Bassett, Monica E. Calkins, Mark A. Elliott, Raquel E. Gur, Ruben C. Gur, Tyler M. Moore, J. Cobb Scott, Russell T. Shinohara, M. Dylan Tisdall, Daniel H. Wolf, David R. Roalf, Theodore D. Satterthwaite
Project Start Date
September 2024
Current Project Status
In preparation for submission
Github repo
https://github.com/PennLINC/transdiagnostic_executive_function
Path to data on filesystem
/cbica/projects/executive_function
Slack Channel
#efr01_grmpyr01_opendata
Conference presentations
- Flux Congress, September 2025
Code documentation
A general overview of the project and data organization steps is given below, including the scripts needed for each step in the workflow and the folders in which they can be found in the corresponding GitHub repository. Specific details about individual scripts and steps can be found in the README.md file in each folder of that repository.
Imaging Data:
- Download the data from Flywheel project using fw sync
/curation/01_call_fw_sync.sh
- Create a heuristic file and use HeuDiConv to convert DICOM files to NIfTI files and organize the data in BIDS format.
/curation/02_heudiconv_conversion/01_heuristic.py and 02_heudiconv_conversion/02_heuristic_reconvert.py contain the necessary heuristic files
/curation/02_heudiconv_conversion/02_convert_all_heudiconv.sh and 02_heudiconv_conversion/02_convert_all_heudiconv_reconvert.sh contain the bash scripts that use the heuristic files to convert DICOMs to NIfTIs in BIDS format
- Use CuBIDS software to fix incorrect metadata, add missing metadata, clean metadata, delete unnecessary repeated runs of scans, ensure correct BIDS format, summarize the heterogeneity in the dataset, and organize scans into acquisition groups based on their metadata. Also, score the n-back task data.
/curation/03_cubids_curation/ contains the Python scripts used to edit metadata in the dataset, as well as the configuration config.yml for CuBIDS software
/curation/03_cubids_curation/final_cubids_docs contains the final output files from running CuBIDS
/curation/05_nback_scoring contains the code and instructions for generating events.tsv and events.json files for n-back task scoring
- Anonymize scans (reface T1 scans & deface T2 scans) using AFNI’s refacer and pydeface.
/curation/04_reface_anatomicals.sh
- Preprocess the imaging data using BABS software.
/preprocessing/babs_yaml_files contains the YAML files for each BIDS App we ran
/preprocessing/make_container_babs.sh is an optional helper script to make a container for BIDS Apps
Note that MEGRE sequences for QSM were not preprocessed; we did not run any BIDS Apps on them through BABS
- Perform quality control (QC) on the preprocessed scans: use Python scripts to concatenate data from individual scans into summary csv files, visualize the distribution of QC metrics for each modality, and note which images are of poor or high quality.
/analysis/unzip folder contains the scripts needed to unzip the files that hold the QC metrics for each modality; these should be run before the scripts in QC/qc_scripts
/QC/qc_scripts contains the Python scripts that generate concatenated csv files with QC metrics for each modality and visualize the distributions of those metrics, as well as a script to create and visualize the slices needed for manual T1 QC ratings
/QC/qc_csvs and /QC/qc_distribution_figs contain the csv files and plots that are the output of the scripts in /QC/qc_scripts
/QC/qc_csvs/final_QC_csvs contains our final QC recommendations
/QC/exclusions_csvs folder contains csv files listing the exclusions that result from QC decisions; the Python scripts that create group average figures later use these lists to exclude those scans or regions from the group average
- Create final group average figures for publication using Python scripts. Scans that were rated as poor quality (did not pass QC) are not included in the final group average figures.
/analysis/01_unzip contains the scripts needed to unzip the files for individual scans that are used to create the group average plots; these should be run before the plotting scripts
/analysis/02_plot contains the scripts used to create group average plots and maps, as well as a script to create the reconstructed tracts for a subset of subjects
/neuroimaging_figures contains the figures that are the output of the scripts in /analysis/02_plot, organized by imaging modality
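The QC and group-average steps above follow a common pattern: concatenate per-scan QC tables into one summary, read a list of exclusions, and average only the scans that remain. The following is a minimal sketch of that pattern using only the Python standard library. It is not the code in the repository: the column names (`subject_id`, `session_id`, `mean_fd`), the function names, and the example values are all hypothetical, chosen only for illustration.

```python
import csv
import io
from statistics import mean


def concat_qc(per_scan_csvs):
    """Concatenate per-scan QC tables (open file-like objects) into one list of row dicts."""
    rows = []
    for handle in per_scan_csvs:
        rows.extend(csv.DictReader(handle))
    return rows


def load_exclusions(handle):
    """Read an exclusions csv and return the set of (subject, session) pairs to drop."""
    return {(row["subject_id"], row["session_id"]) for row in csv.DictReader(handle)}


def group_average(rows, exclusions, metric):
    """Average one metric over the scans that are not in the exclusion set.

    Returns (average, number_of_scans_kept).
    """
    kept = [
        float(row[metric])
        for row in rows
        if (row["subject_id"], row["session_id"]) not in exclusions
    ]
    return mean(kept), len(kept)


# Hypothetical per-scan QC tables; real inputs would be files on disk.
scan_tables = [
    io.StringIO("subject_id,session_id,mean_fd\nsub-01,ses-1,0.25\n"),
    io.StringIO("subject_id,session_id,mean_fd\nsub-02,ses-1,0.75\n"),
    io.StringIO("subject_id,session_id,mean_fd\nsub-03,ses-1,0.5\n"),
]
# Hypothetical exclusion list: one scan that did not pass QC.
exclusion_table = io.StringIO("subject_id,session_id\nsub-02,ses-1\n")

summary_rows = concat_qc(scan_tables)
excluded = load_exclusions(exclusion_table)
average, n_kept = group_average(summary_rows, excluded, "mean_fd")
print(f"{n_kept} scans kept, group mean_fd = {average}")
```

Keying exclusions on the (subject, session) pair rather than on subject alone matters for a longitudinal dataset, since one participant may have a poor-quality scan at one timepoint and usable scans at others.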
Clinical and Demographic Data:
- Clean and organize clinical data into a usable format. Note that other members of the team were consulted to correct any mistakes in the original clinical data. Clinical diagnoses were visualized using a Sankey plot.
/clinical/clinical_diagnostic_distribution.Rmd summarizes clinical diagnostic information and produces the Sankey plot visualization stored in /clinical/clinical_figures
- Clean and organize demographic data into a usable format. Note that other members of the team were consulted to correct any mistakes in the original demographic data. Demographic information was visualized with bar plots and histograms.
/demographics/demographics_org.Rmd organizes, summarizes, and plots the demographic data, and corrects any mistakes in the original demographic data. The resulting plots are stored in /demographics/demographic_figures
- Clean and organize cognitive phenotypic data into a usable format.
/cognitive folder contains the relevant scripts for this purpose.