
Data missing in export


My institution is conducting a multicenter study in OpenClinica, and we intermittently find data missing from our exports. We previously associated this with non-ASCII characters in the data (e.g. ä, ö), but these have since been removed from the data set.

Our setup is OpenClinica running on Red Hat Enterprise Linux 6.5 with PostgreSQL 8.4.

Any hints would be greatly appreciated!


  • lindsay.stevens
    via Email
    Can you give us an example of the data (values, types, scope, etc.) that is
    missing? Also, what export format(s) are you using?
  • jowake
    Yes. The scope of the data is about 150 participants so far, and 6 CRFs for 7 events (five time-based events, two of which use the same non-repeating CRF, and two treatment-based events with repeating CRFs). There are about a thousand variables in total, and a portion of these are mutually exclusive.

    The data types are mainly integers, with a few strings and dates, and a handful of reals.

    The data export formats are Excel and SPSS.
  • haensel
    via Email

    There was a bug in the past that had to do with variable names (_E
    or something like it caused the trouble), but this has been fixed. Can
    you explain further what data is missing? Are full variables
    missing, values for a subject, or a whole subject? Is this regular
    (always the same variable) or arbitrary (different variables for
    different subjects)?

  • jowake
    Thanks for your reply. Whole subjects are missing. They were missing previously, and then "reappeared" when the special characters in the data were replaced with English alphabet characters. Now they have disappeared again. They all belong to the same site.

    None of the variable names contain special characters, only numbers and letters (upper and lower case), but the CRF names all contain underscores, and some contain hyphens.

    Best regards, Jo
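Since special characters were the earlier suspect, one way to confirm they are really gone is to scan the stored values for anything outside plain ASCII. The following is a minimal sketch only; the function names and the sample values are hypothetical and are not taken from the study.

```python
import unicodedata


def find_non_ascii(values):
    """Return (index, value, offending characters) for each value with non-ASCII text."""
    hits = []
    for index, value in enumerate(values):
        # Collect every character outside the 7-bit ASCII range.
        odd = [ch for ch in value if ord(ch) > 127]
        if odd:
            hits.append((index, value, odd))
    return hits


def describe(ch):
    """Give a readable Unicode name for a character, for use in a report."""
    return unicodedata.name(ch, "UNKNOWN")


# Hypothetical sample values, not real study data.
sample = ["Oslo", "Malmö", "site_01", "Jyväskylä"]
for index, value, odd in find_non_ascii(sample):
    print(index, value, [describe(ch) for ch in odd])
```

The same check could be pointed at subject IDs, site names, or free-text item values copied out of the web interface.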
  • haensel
    edited April 2014
    @jowake Hi Jo

    Can you try exporting the data in ODM format and check by hand whether the missing subjects are absent from the ODM file too? They should be easy to find by their OID.

    Regards, Christian
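The check by hand suggested above can also be scripted. This is a rough sketch, assuming the export follows the CDISC ODM convention of `SubjectData` elements carrying a `SubjectKey` attribute; the helper names, the miniature sample document, and the OIDs in it are made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def subject_keys(odm_xml):
    """Return the set of SubjectKey values found on SubjectData elements."""
    root = ET.fromstring(odm_xml)
    return {
        elem.attrib["SubjectKey"]
        for elem in root.iter()
        if local_name(elem.tag) == "SubjectData" and "SubjectKey" in elem.attrib
    }


def missing_subjects(odm_xml, expected_oids):
    """Return the expected OIDs that do not appear in the export, sorted."""
    return sorted(set(expected_oids) - subject_keys(odm_xml))


# Hypothetical miniature export; the OIDs are invented for this example.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="S_DEMO" MetaDataVersionOID="v1.0.0">
    <SubjectData SubjectKey="SS_001"/>
    <SubjectData SubjectKey="SS_002"/>
  </ClinicalData>
</ODM>"""

print(missing_subjects(SAMPLE, ["SS_001", "SS_002", "SS_003"]))
```

For an export saved to disk, `ET.parse(path).getroot()` would take the place of `ET.fromstring`. Matching on the local tag name avoids hard-coding the ODM namespace, which may differ between export variants.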
  • jowake
    Hi Christian,

    Yes, I exported to the "CDISC ODM XML 1.3 Full with OpenClinica extensions", and the data still do not appear. I also created a data export for just the site with the missing subjects with no luck.

    Best regards, Jo
  • haensel
    via Email
    Hi Jo

    So this is the point where it gets hard to explore the error via the
    forum. Can you provide an example data dump, such as a dump of the
    empty study setup?
    Just to be sure: do the missing subjects show up in the web interface?

  • lindsay.stevens
    via Email
    So far it's clear that for one site, data is missing for some
    subject(s) in the ODM 1.3 XML, Excel, and SPSS exports.

    I think there are many more details that would be useful, such as:

    Are the subjects removed?

    Do any of the objects have special characters in the name? E.g. study
    subject ID, event, CRF, item group, item, etc.

    Is all data for those subjects missing, or only a subset (e.g.
    particular events, CRFs, item groups, or items)?

    Is it always the same subjects and data that are missing, or does it
    change over time or between export formats?

    Is the data definitely there in both the web app and the database?
  • jowake
    Yes, the three missing subjects show up in the web interface.
  • jowake
    Hi Lindsay,

    The subjects are not removed, and their event status is "Completed" for all scheduled events.

    None of the objects have special characters in the name, only letters and numbers, and no spaces. The CRF names contain underscores (_) and hyphens (-).

    All data for the three missing subjects are missing: the subjects do not appear in the exports at all. They all belong to the same site. They have been missing previously, but have since reappeared when the special characters were removed from the data.

    Can you advise on how to verify that the data are present in the database? They appear when I search by the SSID in the OpenClinica web interface.
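One possible way to approach the database question is to count the rows stored for a given subject label. The sketch below is not a confirmed recipe: the table and column names (`study_subject`, `event_crf`, `item_data` and their keys) are assumptions that should be checked against the actual OpenClinica schema, and an in-memory SQLite database with invented rows stands in for PostgreSQL so the example can run on its own.

```python
import sqlite3

# Assumed table and column names; verify against the real schema before use.
QUERY = """
SELECT ss.label,
       COUNT(DISTINCT ec.event_crf_id) AS event_crfs,
       COUNT(d.item_data_id) AS item_values
FROM study_subject ss
LEFT JOIN event_crf ec ON ec.study_subject_id = ss.study_subject_id
LEFT JOIN item_data d ON d.event_crf_id = ec.event_crf_id
WHERE ss.label = ?
GROUP BY ss.label
"""


def subject_row_counts(conn, label):
    """Return (label, event_crfs, item_values), or None if the label is absent."""
    cursor = conn.cursor()
    cursor.execute(QUERY, (label,))
    return cursor.fetchone()


# Stand-in database with hypothetical rows, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE study_subject (study_subject_id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE event_crf (event_crf_id INTEGER PRIMARY KEY, study_subject_id INTEGER);
CREATE TABLE item_data (item_data_id INTEGER PRIMARY KEY, event_crf_id INTEGER, value TEXT);
INSERT INTO study_subject VALUES (1, 'SUBJ-001'), (2, 'SUBJ-002');
INSERT INTO event_crf VALUES (10, 1);
INSERT INTO item_data VALUES (100, 10, '42'), (101, 10, '7');
""")

for label in ["SUBJ-001", "SUBJ-002", "SUBJ-999"]:
    print(label, subject_row_counts(conn, label))
```

A subject that exists but has zero item values, or that is absent altogether, would stand out in this output. Against PostgreSQL, a driver's own placeholder style (often `%s` rather than `?`) would be needed in the query.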

This discussion has been closed.