👋 If you need to download some data from the NIMH Data Archive but do not have access to it, you’ve come to the right place!
Note: This tutorial uses the PING dataset as an example for illustration purposes.
You will need a valid PennKey in order to create a new data use agreement.
You can access the most recent version here. Make sure that you select Data Use Certification (DUC) to download the form.
Next, you need to fill out the form (don’t forget to add yourself as a recipient), and have Ted sign it.
The Research Inventory System is the web-based application for requesting, routing, and issuing various research-related agreements.
Start by clicking on the system and providing your PennKey login. You should then see the following:
Go to DATA: If you want to receive/send data for research from/to an outside party, click and click .
Are you receiving or are you sending data?
Since we are requesting data access, we select receiving.
Type of data
In the case of PING, the type of data is “de-identified health information”.
Please provide brief description of data
Provide a basic description of your requested dataset.
Can the data be obtained from an alternative source?
In the case of PING, the answer is No.
Are you aware of any export restrictions regarding the data?
Just select No.
In our case, the faculty member is Theodore Satterthwaite, and the department is Neuropsychiatry.
Consult the team to find out who the business administrator is.
Consult the team to come up with a research plan. This should be a short paragraph. You can either enter that plan in the blank field or upload it as an attachment.
This request is associated with
- Federally sponsored/federal flow through research
- Corporate sponsored clinical trial or human subjects research
- Foundations/other non-profit sponsored research
- Other corporate sponsored research
PING is funded by the National Institute of Mental Health (NIMH), which is a federal agency. Hence we select Federally sponsored/federal flow through research.
Are your requests (materials, data, equipment, confidential information, services and/or collaboration) related or connected to a sponsored research agreement (SRA), a clinical trial agreement (CTA), a government or grant funded project, or will it be charged to a fund?
In the case of PING, the data request is for the purpose of scientific research. The research is neither related nor connected to a sponsored research agreement (SRA), a clinical trial agreement (CTA), or a government or grant funded project. Hence we select No.
What is the likelihood of an invention resulting from the research?
- May possibly result in an invention
- Is likely to result in an invention
- Is unlikely to result in an invention
In the case of PING, the data request is for the purpose of scientific research that would not lead to any invention. Hence we select Is unlikely to result in an invention.
Is the research related to any of your previous or pending disclosures of inventions?
In the case of PING, the data request is for the purpose of scientific research that is not related to any disclosures of inventions. Hence we select No.
Modify as necessary for your specific requested dataset.
Below is how you would fill out the data information for PING.
Modify as necessary for your specific requested dataset.
Upload the signed and dated data use agreement form that you prepared at the beginning by clicking Upload outside party agreement.
Select No for the question, then click Submit.
Just relax and wait for the final approval email to arrive in your inbox. The subject line will read something like:
MTA/NMA (58028/00): Request fully executed.
Per the new NDA policy, the only way to properly submit a DUC is for the Lead Recipient to upload the document to the appropriate DAR on their Data Permissions dashboard (https://nda.nih.gov/user/dashboard/data_permissions.html).
Now contact the Lead Recipient for the DUC and ask them to carry out the following steps:
- Visit the Data Permissions dashboard (https://nda.nih.gov/user/dashboard/data_permissions.html).
- Click ‘Reapply for Access’ under the ‘Actions’ dropdown for the Data Access Request.
- Locate the updated DUC file.
- Upload the DUC to the Data Access Request.
Once you receive the Permission Approved Email from NIMH, pat yourself on the back and enjoy a moment of relief. However, you are not done yet, so please read on.
After you’ve been granted access, you need to log in to the NDA website to review and accept the terms of data use. The prompt window should look like the one below:
Make sure you click on
Note 1: For security purposes, the AWS access key ID, AWS secret access key and AWS session token referenced below are all FAKE values. Please replace them with your real ID/key.
Note 2: This tutorial is partially adapted from the NDA’s tutorial How to Access S3 Objects.
The command-line download manager is written in Java and allows for the automatic download of files. We can use wget to retrieve the program from the NIH web server.
Since the downloaded program is a zipped file, we need to unzip it before we can use the download manager.
We specify the name of the output file, awskeys.txt, with the -g flag:
java -jar downloadmanager.jar -g awskeys.txt
You will be prompted for your NDAR username and password, after which you will be able to find your keys in the output file. Its contents will look something like this:
accessKey=xxxxx
secretKey=xxxxx
sessionToken=xxxxx
expirationDate=xxxxx
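If you would rather read the keys programmatically than copy them by hand, here is a minimal Python sketch. It assumes that awskeys.txt holds one name=value pair per line; the layout written by your version of the download manager may differ. The function name `parse_keys_file` is made up for this illustration and is not part of the NDA tools.

```python
from pathlib import Path


def parse_keys_file(path):
    """Read name=value lines from a file such as awskeys.txt into a dict.

    Assumption: one name=value pair per line. This helper is hypothetical
    and is not part of the NDA download manager.
    """
    keys = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        # Split on the first "=" only, because a value may itself contain "=".
        name, value = line.split("=", 1)
        keys[name.strip()] = value.strip()
    return keys
```

Calling `parse_keys_file("awskeys.txt")` should then give you a dictionary with the entries accessKey, secretKey, sessionToken and expirationDate.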
For instance, open ~/.aws/credentials and paste in the values from awskeys.txt according to the format below:
[NDAR]
aws_access_key_id = xxxxx[copy and paste the value of accessKey in `awskeys.txt`]
aws_secret_access_key = xxxxx[copy and paste the value of secretKey in `awskeys.txt`]
aws_session_token = xxxxx[copy and paste the value of sessionToken in `awskeys.txt`]
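Because the session token expires, you may find yourself repeating this copy-and-paste step. Here is a rough Python sketch that writes the [NDAR] profile for you using only the standard library. The function name `write_profile` is made up for this illustration, and it expects a dictionary with the entries accessKey, secretKey and sessionToken taken from awskeys.txt. Note that rewriting the file this way keeps any other profiles but drops comments it may contain.

```python
import configparser
from pathlib import Path


def write_profile(keys, credentials_path, profile="NDAR"):
    """Add or update one profile in an AWS-style credentials file.

    `keys` is a dict with "accessKey", "secretKey" and "sessionToken".
    This helper is hypothetical and is not part of the NDA or AWS tools.
    """
    path = Path(credentials_path).expanduser()
    # interpolation=None so that a "%" inside a value is stored literally.
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)  # a missing file is skipped, leaving the config empty
    config[profile] = {
        "aws_access_key_id": keys["accessKey"],
        "aws_secret_access_key": keys["secretKey"],
        "aws_session_token": keys["sessionToken"],
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as handle:
        config.write(handle)
    path.chmod(0o600)  # keep the credentials readable by the owner only
```

For example, `write_profile({"accessKey": "xxxxx", "secretKey": "xxxxx", "sessionToken": "xxxxx"}, "~/.aws/credentials")` would create or refresh the [NDAR] section; replace the placeholder values with your real ones.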
Now you should be good to go! Shoot us an email if you still run into issues.