We hope you'll join us for our 4/23 webinar on using data tables to apply reference ranges and AE codes in OC4. For more information and to register, visit https://register.gotowebinar.com/register/2882170018956684555

how are the subject, event and crf statuses mapped from the database?

Hi all,

Does anyone have a guide to how the OpenClinica subject, event and CRF statuses shown in the ODM 1.3 xml full extension extracts and in the OpenClinica interface map to/from the ‘status’ table in the OpenClinica database?

The ‘status’ table has the following records, which are linked by the status_id to most of the important tables [1]

status_id;name
1;"available"
2;"unavailable"
3;"private"
4;"pending"
5;"removed"
6;"locked"
7;"auto-removed"
8;"signed"
9;"frozen"
10;"source_data_verification"

The ODM mapping page in the reference guide [2] states that the subject, event and CRF statuses come from the above, but clearly it doesn’t - statuses like ‘scheduled’, ‘initial data entry’, ‘data entry started’, etc. are missing.

The promisingly named ‘completion_status’ table has only one record, and it makes me sad:

completion_status_id;status_id;name;description
1;1;"completion status";"place filler for completion status"

I have practically no experience with java but I dug through a sonar page [3] with a 3.1 snapshot from 2012 on it anyway, and from what I can tell the displayed statuses are calculated from the status table, possibly by submit.eventcrfbean or core.dataentrystage or core.status. Even from these suspects it is not clear exactly what the calculations are – and it seems that portions of the calculation are done / modified by different beans (?), so after looking at it for a couple of hours I’m none the wiser.

I could make an educated guess on what the calculations are, but I’d rather use the same algorithm as the OpenClinica code does – until if/when the statuses stored in the database are consistent with the application and ODM xml extracts.

I also had a look at the audit_log_event table – but the corresponding names don’t cover the missing statuses either:

audit_log_event_type_id;name
1;"Item data value updated"
2;"Study subject created"
3;"Study subject status changed"
4;"Study subject value changed"
5;"Subject created"
6;"Subject status changed"
7;"Subject global value changed"
8;"Event CRF marked complete"
9;"Event CRF properties changed"
10;"Event CRF Initial Data Entry complete"
11;"Event CRF Double Data Entry complete"
12;"Item data status changed"
13;"Item data deleted"
14;"Event CRF complete with password"
15;"Event CRF Initial Data Entry complete with password"
16;"Event CRF Double Data Entry complete with password"
17;"Study Event scheduled"
18;"Study Event data entry started"
19;"Study Event completed"
20;"Study Event stopped"
21;"Study Event skipped"
22;"Study Event locked"
23;"Study Event removed"
24;"Study Event start date changed"
25;"Study Event end date changed"
26;"Study Event location changed"
27;"Subject Site Assignment"
28;"Subject Group Assignment"
29;"Subject Group changed"
30;"Item data inserted for repeating row"
31;"Study Event signed"
32;"EventCRF SDV Status"
33;"Change CRF version"
35;"Study Event restored"

[1] http://trialdatasolutions.com/tds/howto/schemaspyoc312/tables/status.html
[2] https://docs.openclinica.com/3.1/technical-documents/openclinica-and-cdisc-odm-specifications/mapping-openclinica-elements-odm
[3] https://dev.openclinica.com/sonar/

Any clues at all would be greatly appreciated.

Best regards,

Lindsay Stevens | Clinical Data Coordinator
NHMRC Clinical Trials Centre, THE UNIVERSITY OF SYDNEY
Office: Level 2, 6-10 Mallett St | Camperdown | NSW | 2050
Mail: Locked Bag 77 | Camperdown | NSW | 1450
T +61 2 9562 5369 | F +61 2 9562 5094
E [email protected] | W www.ctc.usyd.edu.au

Scanned by MailMarshal - M86 Security's comprehensive email content security solution. Download a free evaluation of MailMarshal at www.m86security.com

Comments

  • tom.hickersontom.hickerson Posts: 24
    Hi Lindsay,
    You might take a look at the subject_event_status table in OpenClinica - I think that will cover the cases for events.
    As for CRFs within the Events, there is actually a Java class which is not accurately replicated in the database. If you have access to the source code, I would look for DataEntryStage.java, as this will hold all the CRF statuses (Initial Data Entry Started, Double Data Entry Started, etc)
    Searching on DataEntryStage will eventually take you to ODMExtractDAO.getCrfVersionStatus(), which apparently computes it on the fly.
    With regards to the Subject status, could you say a little more about what you are looking for specifically that is not there in the DB? I am looking through the ODMExtractDAO, and while I see a lot about subject extraction there, it's not clear what's being exported to XML (so far).
    Hope that helps, Tom
    On Tue, Apr 30, 2013 at 10:11 AM, Lindsay Stevens wrote:
    Hi all,

    Does anyone have a guide to how the OpenClinica subject, event and CRF statuses shown in the ODM 1.3 xml full extension extracts and in the OpenClinica interface map to/from the ‘status’ table in the OpenClinica database?

    The ‘status’ table has the following records, which are linked by the status_id to most of the important tables [1]

    status_id;name
    1;"available"
    2;"unavailable"
    3;"private"
    4;"pending"
    5;"removed"
    6;"locked"
    7;"auto-removed"
    8;"signed"
    9;"frozen"
    10;"source_data_verification"

    The ODM mapping page in the reference guide [2] states that the subject, event and CRF statuses come from the above, but clearly it doesn’t - statuses like ‘scheduled’, ‘initial data entry’, ‘data entry started’, etc. are missing.

    The promisingly named ‘completion_status’ table has only one record, and it makes me sad:

    completion_status_id;status_id;name;description
    1;1;"completion status";"place filler for completion status"

    I have practically no experience with java but I dug through a sonar page [3] with a 3.1 snapshot from 2012 on it anyway, and from what I can tell the displayed statuses are calculated from the status table, possibly by submit.eventcrfbean or core.dataentrystage or core.status. Even from these suspects it is not clear exactly what the calculations are – and it seems that portions of the calculation are done / modified by different beans (?), so after looking at it for a couple of hours I’m none the wiser.

    I could make an educated guess on what the calculations are, but I’d rather use the same algorithm as the OpenClinica code does – until if/when the statuses stored in the database are consistent with the application and ODM xml extracts.

    I also had a look at the audit_log_event table – but the corresponding names don’t cover the missing statuses either:

    audit_log_event_type_id;name
    1;"Item data value updated"
    2;"Study subject created"
    3;"Study subject status changed"
    4;"Study subject value changed"
    5;"Subject created"
    6;"Subject status changed"
    7;"Subject global value changed"
    8;"Event CRF marked complete"
    9;"Event CRF properties changed"
    10;"Event CRF Initial Data Entry complete"
    11;"Event CRF Double Data Entry complete"
    12;"Item data status changed"
    13;"Item data deleted"
    14;"Event CRF complete with password"
    15;"Event CRF Initial Data Entry complete with password"
    16;"Event CRF Double Data Entry complete with password"
    17;"Study Event scheduled"
    18;"Study Event data entry started"
    19;"Study Event completed"
    20;"Study Event stopped"
    21;"Study Event skipped"
    22;"Study Event locked"
    23;"Study Event removed"
    24;"Study Event start date changed"
    25;"Study Event end date changed"
    26;"Study Event location changed"
    27;"Subject Site Assignment"
    28;"Subject Group Assignment"
    29;"Subject Group changed"
    30;"Item data inserted for repeating row"
    31;"Study Event signed"
    32;"EventCRF SDV Status"
    33;"Change CRF version"
    35;"Study Event restored"

    [1] http://trialdatasolutions.com/tds/howto/schemaspyoc312/tables/status.html
    [2] https://docs.openclinica.com/3.1/technical-documents/openclinica-and-cdisc-odm-specifications/mapping-openclinica-elements-odm
    [3] https://dev.openclinica.com/sonar/

    Any clues at all would be greatly appreciated.

    Best regards,

    Lindsay Stevens | Clinical Data Coordinator
    NHMRC Clinical Trials Centre, THE UNIVERSITY OF SYDNEY
    Office: Level 2, 6-10 Mallett St | Camperdown | NSW | 2050
    Mail: Locked Bag 77 | Camperdown | NSW | 1450
    T +61 2 9562 5369 | F +61 2 9562 5094
    E [email protected] | W www.ctc.usyd.edu.au

    Scanned by MailMarshal - M86 Security's comprehensive email content security solution. Download a free evaluation of MailMarshal at www.m86security.com
  • zwetszwets Posts: 59
    Hi Lindsay,
    > Does anyone have a guide to how the OpenClinica subject, event and CRF statuses shown in the ODM 1.3 xml full extension extracts and in the OpenClinica interface map to/from the ‘status’ table in the OpenClinica database?
    I've looked into the same a while ago, and after some trawling through the code (I am an experienced Java hacker, but the logic was just too sprawled out to get a coherent picture), decided to go the the empirical way. So I looked at database diffs while modifying CRFs via the UI. Below is the SQL to determine a CRFs stage:
    Get EC status (ignoring deleted ECs and the likes, also omitting SDV & signature status):
    SELECT CASE
    WHEN date_validate_completed IS NOT NULL THEN 'COMPLETED'
    WHEN date_completed IS NOT NULL THEN 'DOUBLE ENTRY'
    WHEN status_id = 1 THEN 'FIRST ENTRY'
    ELSE 'UNHANDLED' END AS "stage", ec.*
    FROM event_crf AS ec;
    In other words, the state transitions are implicit in the de-nulling of the date_completed (initial entry) and date_validate_completed (double entry) columns. Note that the value in the status_id column also changes (from 1=available to 4=pending to 2=unavailable), but only these timestamps trigger a status change in the UI. Just bumping the status_id will not work.
    > I could make an educated guess on what the calculations are, but I’d rather use the same algorithm as the OpenClinica code does – until if/when the statuses stored in the database are consistent with the application and ODM xml extracts.
    I would also prefer to use OC's algorithm, provided it were explicit instead of a somewhat organically grown situation (this is not a value judgment, it's the way most code goes), as attested by the subtleties in the code below.
    Below is simplified code for "Start Data Entry" and "Complete Data Entry". This shows the database changes I observed. Note: for clarity I have omitted the user- and time-stamping of the owner_id, update_id, date_created and date_updated columns. These you should do on any insert or update.
    Start Data Entry
    # Preconditions:
    # - Study_event se_id has been scheduled for study_subject ss_id
    # - se_id's study_event_definition allows for a crf_version cv_id
    # - Note: pseudo SQL to show the column assignments:
    INSERT event_crf
    study_event_id =
    crf_version_id =
    interviewer_name = '' -- overrides default NULL
    completion_status_id = 1 -- red herring, always 1
    status_id = 1 -- available
    annotations = '' -- overrides default NULL
    validator_annotations = '' -- overrides default NULL
    validate_string = '' -- overrides default NULL
    study_subject_id =
    Complete an EC
    if status is 'FIRST ENTRY' (see "Get EC status" above):
    if ec is subject to double data entry (value of ec.se.sed.edc.double_entry):
    # Mark initial data entry done but DDE not started
    # - status_id goes from available to pending and date_completed is stamped
    UPDATE event_crf
    SET (old_status_id, status_id, date_completed) = (status_id, 4, )
    WHERE event_crf_id =
    # Mark DDE is taking / has taken place but EC not completed (= validated)
    # - no status_id change, it's the validator_id that marks this
    UPDATE event_crf
    SET validator_id =
    WHERE event_crf_id =
    # Mark all item_data on the ec as unavailable in preparation for completion
    UPDATE item_data
    SET (old_status_id, status_id) = (status_id, 2)
    WHERE event_crf_id =
    # Set local variable to trigger completion code below
    status = 'DOUBLE ENTRY'
    else (no DDE):
    # No double entry, mark ec completed (done) and validated (completed) in one go
    # - status_id goes straight to 2 (unavailable), don't pass by 4 (pending)
    UPDATE event_crf
    SET (old_status_id, status_id, date_completed, date_validate_completed)
    = (status_id, 2, , )
    WHERE event_crf_id =
    if status = 'DOUBLE ENTRY':
    # Mark ec validated to complete DDE
    # - status_id goes from 4 (pending) to 2 (unavailable) and date_validate_completed is stamped
    UPDATE event_crf
    SET (status_id, date_validate_completed) = (2, )
    WHERE event_crf_id =
    # Mark the study_event completed
    # *** ONLY IF this was only or last EC for SE to be completed (check omitted) ***
    UPDATE study_event SET subject_event_status_id = 4 WHERE study_event_id =
    HTH
    Marco
  • Hi Tom and Marco

    Thanks very much for your help; at least I’m not alone in being confused by this.

    Tom –

    I think you’re right, subject_event_status covers the event statuses, except for unavailable, removed and auto-removed; but as Marco suggested these statuses are related but not integrated with the displayed statuses. Looking at it again I think you’re correct about the subject status – it seems only available, removed and auto-removed are possible for subjects, and these are all in the status table. So it’s mostly now a matter of working out the CRFs statuses.

    I would also like to know what precisely is exported to xml, and what transformations are done along the way, but so far I haven’t managed put the right search terms in google.

    Marco –

    I think your approach of using the dates as the main indicator of the CRF status is both perfect and excellent. I’m working on a query expands on the idea in your case...when snippet, and I’ll share it with the list when I’m done.

    Thanks again,

    Lindsay Stevens | Clinical Data Coordinator
    NHMRC Clinical Trials Centre, THE UNIVERSITY OF SYDNEY
    Office: Level 2, 6-10 Mallett St | Camperdown | NSW | 2050
    Mail: Locked Bag 77 | Camperdown | NSW | 1450
    T +61 2 9562 5369 | F +61 2 9562 5094
    E [email protected] | W www.ctc.usyd.edu.au

    From: Marco van Zwetselaar [mailto:[email protected]] On Behalf Of Marco van Zwetselaar
    Sent: Tuesday, 30 April 2013 8:07 PM
    To: Lindsay Stevens
    Cc: '[email protected]'
    Subject: Re: [Developers] how are the subject, event and crf statuses mapped from the database?

    Hi Lindsay,

    Does anyone have a guide to how the OpenClinica subject, event and CRF statuses shown in the ODM 1.3 xml full extension extracts and in the OpenClinica interface map to/from the ‘status’ table in the OpenClinica database?

    I've looked into the same a while ago, and after some trawling through the code (I am an experienced Java hacker, but the logic was just too sprawled out to get a coherent picture), decided to go the the empirical way. So I looked at database diffs while modifying CRFs via the UI. Below is the SQL to determine a CRFs stage:

    Get EC status (ignoring deleted ECs and the likes, also omitting SDV & signature status):

    SELECT CASE
    WHEN date_validate_completed IS NOT NULL THEN 'COMPLETED'
    WHEN date_completed IS NOT NULL THEN 'DOUBLE ENTRY'
    WHEN status_id = 1 THEN 'FIRST ENTRY'
    ELSE 'UNHANDLED' END AS "stage", ec.*
    FROM event_crf AS ec;

    In other words, the state transitions are implicit in the de-nulling of the date_completed (initial entry) and date_validate_completed (double entry) columns. Note that the value in the status_id column also changes (from 1=available to 4=pending to 2=unavailable), but only these timestamps trigger a status change in the UI. Just bumping the status_id will not work.

    I could make an educated guess on what the calculations are, but I’d rather use the same algorithm as the OpenClinica code does – until if/when the statuses stored in the database are consistent with the application and ODM xml extracts.

    I would also prefer to use OC's algorithm, provided it were explicit instead of a somewhat organically grown situation (this is not a value judgment, it's the way most code goes), as attested by the subtleties in the code below.

    Below is simplified code for "Start Data Entry" and "Complete Data Entry". This shows the database changes I observed. Note: for clarity I have omitted the user- and time-stamping of the owner_id, update_id, date_created and date_updated columns. These you should do on any insert or update.

    Start Data Entry

    # Preconditions:
    # - Study_event se_id has been scheduled for study_subject ss_id
    # - se_id's study_event_definition allows for a crf_version cv_id
    # - Note: pseudo SQL to show the column assignments:

    INSERT event_crf
    study_event_id =
    crf_version_id =
    interviewer_name = '' -- overrides default NULL
    completion_status_id = 1 -- red herring, always 1
    status_id = 1 -- available
    annotations = '' -- overrides default NULL
    validator_annotations = '' -- overrides default NULL
    validate_string = '' -- overrides default NULL
    study_subject_id =

    Complete an EC

    if status is 'FIRST ENTRY' (see "Get EC status" above):

    if ec is subject to double data entry (value of ec.se.sed.edc.double_entry):

    # Mark initial data entry done but DDE not started
    # - status_id goes from available to pending and date_completed is stamped
    UPDATE event_crf
    SET (old_status_id, status_id, date_completed) = (status_id, 4, )
    WHERE event_crf_id =

    # Mark DDE is taking / has taken place but EC not completed (= validated)
    # - no status_id change, it's the validator_id that marks this
    UPDATE event_crf
    SET validator_id =
    WHERE event_crf_id =

    # Mark all item_data on the ec as unavailable in preparation for completion
    UPDATE item_data
    SET (old_status_id, status_id) = (status_id, 2)
    WHERE event_crf_id =

    # Set local variable to trigger completion code below
    status = 'DOUBLE ENTRY'

    else (no DDE):

    # No double entry, mark ec completed (done) and validated (completed) in one go
    # - status_id goes straight to 2 (unavailable), don't pass by 4 (pending)
    UPDATE event_crf
    SET (old_status_id, status_id, date_completed, date_validate_completed)
    = (status_id, 2, , )
    WHERE event_crf_id =

    if status = 'DOUBLE ENTRY':

    # Mark ec validated to complete DDE
    # - status_id goes from 4 (pending) to 2 (unavailable) and date_validate_completed is stamped
    UPDATE event_crf
    SET (status_id, date_validate_completed) = (2, )
    WHERE event_crf_id =
    # Mark the study_event completed
    # *** ONLY IF this was only or last EC for SE to be completed (check omitted) ***
    UPDATE study_event SET subject_event_status_id = 4 WHERE study_event_id =

    HTH
    Marco


    Scanned by MailMarshal - M86 Security's comprehensive email content security solution. Download a free evaluation of MailMarshal at www.m86security.com
  • tom.hickersontom.hickerson Posts: 24
    Hi Lindsay,
    As far as your search to find out what is precisely exported to an OpenClinica ODM file - I did a little hunting, starting with the method 'createODMFile()' located within the java file org.akaza.openclinica.service.extract.GenerateExtractFileService. This method calls a number of Collector classes, and then proceeds to generate an ODMBean, which pulls in things like schema location, etc. and will eventually call a MetaDataReportBean.
    Within the MetaDataReportBean we have the methods which will create the actual XML, node by node (i.e. Study by Study). If you look for the method name createChunkedOdmXml, in fact, this will take you to two other objects which generate other ODM XML, one for the full report and another for the administrative data report.
    Clear as mud? Anyway, let me know if you have more questions about the createChunkedOdmXml method - this sounds like the place that you need to explore.
    Best, Tom
    On Wed, May 1, 2013 at 9:08 AM, Lindsay Stevens wrote:
    Hi Tom and Marco

    Thanks very much for your help; at least I’m not alone in being confused by this.

    Tom –

    I think you’re right, subject_event_status covers the event statuses, except for unavailable, removed and auto-removed; but as Marco suggested these statuses are related but not integrated with the displayed statuses. Looking at it again I think you’re correct about the subject status – it seems only available, removed and auto-removed are possible for subjects, and these are all in the status table. So it’s mostly now a matter of working out the CRFs statuses.

    I would also like to know what precisely is exported to xml, and what transformations are done along the way, but so far I haven’t managed put the right search terms in google.

    Marco –

    I think your approach of using the dates as the main indicator of the CRF status is both perfect and excellent. I’m working on a query expands on the idea in your case...when snippet, and I’ll share it with the list when I’m done.

    Thanks again,

    Lindsay Stevens | Clinical Data Coordinator
    NHMRC Clinical Trials Centre, THE UNIVERSITY OF SYDNEY
    Office: Level 2, 6-10 Mallett St | Camperdown | NSW | 2050
    Mail: Locked Bag 77 | Camperdown | NSW | 1450
    T +61 2 9562 5369 | F +61 2 9562 5094
    E [email protected] | W www.ctc.usyd.edu.au

    From: Marco van Zwetselaar [mailto:[email protected]] On Behalf Of Marco van Zwetselaar
    Sent: Tuesday, 30 April 2013 8:07 PM
    To: Lindsay Stevens
    Cc: '[email protected]'
    Subject: Re: [Developers] how are the subject, event and crf statuses mapped from the database?

    Hi Lindsay,

    Does anyone have a guide to how the OpenClinica subject, event and CRF statuses shown in the ODM 1.3 xml full extension extracts and in the OpenClinica interface map to/from the ‘status’ table in the OpenClinica database?

    I've looked into the same a while ago, and after some trawling through the code (I am an experienced Java hacker, but the logic was just too sprawled out to get a coherent picture), decided to go the the empirical way. So I looked at database diffs while modifying CRFs via the UI. Below is the SQL to determine a CRFs stage:

    Get EC status (ignoring deleted ECs and the likes, also omitting SDV & signature status):

    SELECT CASE
    WHEN date_validate_completed IS NOT NULL THEN 'COMPLETED'
    WHEN date_completed IS NOT NULL THEN 'DOUBLE ENTRY'
    WHEN status_id = 1 THEN 'FIRST ENTRY'
    ELSE 'UNHANDLED' END AS "stage", ec.*
    FROM event_crf AS ec;

    In other words, the state transitions are implicit in the de-nulling of the date_completed (initial entry) and date_validate_completed (double entry) columns. Note that the value in the status_id column also changes (from 1=available to 4=pending to 2=unavailable), but only these timestamps trigger a status change in the UI. Just bumping the status_id will not work.

    I could make an educated guess on what the calculations are, but I’d rather use the same algorithm as the OpenClinica code does – until if/when the statuses stored in the database are consistent with the application and ODM xml extracts.

    I would also prefer to use OC's algorithm, provided it were explicit instead of a somewhat organically grown situation (this is not a value judgment, it's the way most code goes), as attested by the subtleties in the code below.

    Below is simplified code for "Start Data Entry" and "Complete Data Entry". This shows the database changes I observed. Note: for clarity I have omitted the user- and time-stamping of the owner_id, update_id, date_created and date_updated columns. These you should do on any insert or update.

    Start Data Entry

    # Preconditions:
    # - Study_event se_id has been scheduled for study_subject ss_id
    # - se_id's study_event_definition allows for a crf_version cv_id
    # - Note: pseudo SQL to show the column assignments:

    INSERT event_crf
    study_event_id =
    crf_version_id =
    interviewer_name = '' -- overrides default NULL
    completion_status_id = 1 -- red herring, always 1
    status_id = 1 -- available
    annotations = '' -- overrides default NULL
    validator_annotations = '' -- overrides default NULL
    validate_string = '' -- overrides default NULL
    study_subject_id =

    Complete an EC

    if status is 'FIRST ENTRY' (see "Get EC status" above):

    if ec is subject to double data entry (value of ec.se.sed.edc.double_entry):

    # Mark initial data entry done but DDE not started
    # - status_id goes from available to pending and date_completed is stamped
    UPDATE event_crf
    SET (old_status_id, status_id, date_completed) = (status_id, 4, )
    WHERE event_crf_id =

    # Mark DDE is taking / has taken place but EC not completed (= validated)
    # - no status_id change, it's the validator_id that marks this
    UPDATE event_crf
    SET validator_id =
    WHERE event_crf_id =

    # Mark all item_data on the ec as unavailable in preparation for completion
    UPDATE item_data
    SET (old_status_id, status_id) = (status_id, 2)
    WHERE event_crf_id =

    # Set local variable to trigger completion code below
    status = 'DOUBLE ENTRY'

    else (no DDE):

    # No double entry, mark ec completed (done) and validated (completed) in one go
    # - status_id goes straight to 2 (unavailable), don't pass by 4 (pending)
    UPDATE event_crf
    SET (old_status_id, status_id, date_completed, date_validate_completed)
    = (status_id, 2, , )
    WHERE event_crf_id =

    if status = 'DOUBLE ENTRY':

    # Mark ec validated to complete DDE
    # - status_id goes from 4 (pending) to 2 (unavailable) and date_validate_completed is stamped
    UPDATE event_crf
    SET (status_id, date_validate_completed) = (2, )
    WHERE event_crf_id =
    # Mark the study_event completed
    # *** ONLY IF this was only or last EC for SE to be completed (check omitted) ***
    UPDATE study_event SET subject_event_status_id = 4 WHERE study_event_id =

    HTH
    Marco


    Scanned by MailMarshal - M86 Security's comprehensive email content security solution. Download a free evaluation of MailMarshal at
    www.m86security.com
This discussion has been closed.