Readiness Script Download
Welcome to the download page for the scripts that assess your readiness to participate in OHDSI oncology studies.
What are these scripts for?
They collect statistics on concepts related to cancer by counting the unique pairs of source and standard concepts in the CONDITION_OCCURRENCE, PROCEDURE_OCCURRENCE, DRUG_EXPOSURE, OBSERVATION and MEASUREMENT tables of your OMOP CDM instance (any version 5 will do). They also retrieve the number of patients and the time span your database covers. Each script produces a table with four columns: (i) the abbreviation of the table (or domain) in which the source/standard concept pair was found, (ii) the source concept_id, (iii) the standard concept_id and (iv) the count.
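As a rough illustration of the four-column output, the following Python sketch counts source/standard concept pairs in a toy in-memory SQLite table. It is not the actual readiness script: the rows, the concept IDs, the `co` abbreviation and the output column names are made up, the count is assumed to be a record count, and no restriction to cancer-related concepts is applied.

```python
import sqlite3

# Toy stand-in for one CDM table; rows and concept IDs are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE condition_occurrence ("
    "person_id INTEGER, "
    "condition_source_concept_id INTEGER, "
    "condition_concept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
    [(1, 1001, 2001), (2, 1001, 2001), (3, 1002, 2001), (3, 1002, 2001)],
)

# One output row per unique source/standard concept pair.
# 'co' is a hypothetical abbreviation for CONDITION_OCCURRENCE,
# and COUNT(*) (a record count) is an assumption about what is counted.
query = """
SELECT 'co' AS domain,
       condition_source_concept_id AS source_concept_id,
       condition_concept_id AS standard_concept_id,
       COUNT(*) AS cnt
FROM condition_occurrence
GROUP BY condition_source_concept_id, condition_concept_id
ORDER BY source_concept_id, standard_concept_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The result contains only concept identifiers and aggregate counts; no person_id appears in the output.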
Which one should be run?
You should definitely run general. If you have genomic data, even only a small amount, you should also run genomic. If you have populated the EPISODE table, you should run episodes. If you are not sure, run all three.
How should they be run and what happens after it is done?
Run the scripts through the SQL client or command-line SQL tool you normally use (DBeaver, SQL Workbench, etc.). Log in to the database with read-only privileges. Before executing, replace “@cdm_schema” with the schema name of your OMOP CDM instance. Each script produces the four-column table described above. Download it, or copy and paste it into a file. You can save it as a CSV, tab-delimited or Excel file, with or without a header. Upload it through the upload page or email it to oncology@ohdsi.org. You will then receive a detailed report about your data, listing all issues that need to be addressed.
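If you prefer to script the placeholder replacement and the export rather than doing them by hand, a minimal Python sketch could look like this. The SQL fragment, the schema name `cdm`, the header names and the helper functions are hypothetical and only illustrate the two steps; executing the query against your database is left to your own client.

```python
import csv
import io


def fill_schema(sql_text: str, schema: str) -> str:
    """Replace the @cdm_schema placeholder with the actual schema name."""
    return sql_text.replace("@cdm_schema", schema)


def rows_to_csv(
    rows,
    header=("domain", "source_concept_id", "standard_concept_id", "count"),
) -> str:
    """Serialize result rows as CSV text; the header names are assumptions."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical fragment standing in for a downloaded script.
template = "SELECT COUNT(*) FROM @cdm_schema.person;"
print(fill_schema(template, "cdm"))

# Made-up result row in the four-column layout.
print(rows_to_csv([("co", 1001, 2001, 2)]))
```

Write the returned CSV text to a file of your choice before uploading or emailing it.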
Is this safe? What about data protection?
The script does not collect any patient-related information, only a summary of the content of the entire database. It is not possible to re-identify a patient since no patient identifiers are obtained. It also does not count cases in your institution. For the same reason, you need not worry about minimum cell sizes. However, if you would feel more comfortable limiting the cell size, you can do so without any impact on the result of the readiness assessment.
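One possible way to limit the cell size before sharing the result is sketched below in Python. It assumes that "limiting" means dropping rows whose count falls below a threshold; the function name, the threshold of 5 and the sample rows are illustrative choices, not part of the scripts.

```python
def suppress_small_cells(rows, min_cell_size=5):
    """Keep only rows whose count (fourth column) meets the minimum cell size.

    The default of 5 is an arbitrary example; use your institution's rule.
    """
    return [row for row in rows if row[3] >= min_cell_size]


# Made-up rows in the four-column layout: domain, source, standard, count.
sample = [("co", 1001, 2001, 10), ("co", 1002, 2001, 2)]
print(suppress_small_cells(sample))
```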
| general | genomic | episodes |
|---|---|---|
| This script queries for cancer concepts as well as the distribution of laboratory test values required for chemotherapy. | This script queries for genomic concepts (biomarkers, NGS data, proteomics data, IHC data) if you have any. | This script queries for concepts of disease or treatment episodes. It will only run if you have an EPISODE table. |
| Download Postgres version | Download Postgres version | Download Postgres version |
| Download SQL Server version | Download SQL Server version | Download SQL Server version |
| Download dialect-agnostic version | | |
Different SQL dialects and R wrapper
Currently, only the Postgres and Microsoft SQL Server dialects are supported out of the box. The dialect-specific scripts above can be run standalone. The dialect-agnostic general query should only be run with the R wrapper; so far it has been tested only with SQL Server and Postgres. The SQL Server versions of genomic and episodes should also run with the R wrapper, regardless of your SQL dialect. If you run into trouble, have any other questions, or need help executing the queries, please contact us at oncology@ohdsi.org.