SAS export (thx Linas) to be included in 3.10! Looking for community feedback

Comments

  • agoodwin Posts: 131 admin
    Hi @lindsay.stevens and @Linas ,

    We are working through some issues and would like to get your input / feedback and ideas.

    We have found that when a SASDataSetName is longer than 32 characters and is not unique, only the first SAS data set is created; subsequent ones are not.

    The same issue seems to apply to SAS field names.

    Lindsay's approach seems to be to use the ODM SASFieldName for column names, but we don't think this is a great solution for most users, because the output can become quite unreadable if forms weren't initially developed with unique 8-character names.

    Do either of you have any suggestions for how to better handle uniqueness in the XSLT?
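
The uniqueness problem described above can be illustrated with a minimal Python sketch. It assumes the clash arises because names are cut to SAS's 32-character limit; the function name and the sample data set names are hypothetical and not taken from the XSLT or from a real study.

```python
from collections import defaultdict

SAS_NAME_MAX = 32  # SAS limit on data set and variable names


def find_truncation_collisions(names, max_len=SAS_NAME_MAX):
    """Group names that become identical once cut to max_len characters."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:max_len]].append(name)
    # Keep only truncated names claimed by more than one original name.
    return {short: full for short, full in groups.items() if len(full) > 1}


# Hypothetical data set names, for illustration only.
names = [
    "ADVERSE_EVENTS_AND_CONCOMITANT_MEDS_V1",
    "ADVERSE_EVENTS_AND_CONCOMITANT_MEDS_V2",
    "VITALS",
]
collisions = find_truncation_collisions(names)
```

Here the two long names share their first 32 characters, so they end up in the same group, while `VITALS` is left out of the result.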



  • lindsay.stevens Posts: 404 ✭✭✭
    via Email
    I think the naming issue should be resolved upstream, the SAS attributes in
    ODM should be ready to use. It's also a good lowest common denominator for
    stats software naming requirements.

    As it currently stands, I don't think that users who have not designed
    their study in a SAS friendly way should expect a neat pathway to SAS.
    These users could maintain their own name mapping scripts to make the SAS
    export files work / look nicer.
  • agoodwin Posts: 131 admin
    via Email
    Hi Lindsay,

    Thanks for the feedback. One of my concerns is this:

    SAS is OK with 32 character names while ODM is not (the ODM spec still uses
    8 character names).

    I think it's more difficult to stick to 8 char unique names while 32 is
    much easier. That coupled with the ability to design your CRFs, etc with
    longer names, I wanted an approach that would be a bit more user-friendly.
    It is reasonable to assume that someone might build their forms using 32
    character naming conventions, not realizing that ODM (and therefore
    SASFieldNames) would be truncated.

    It would be great if ODM updated the spec to use 32 characters, but in the
    meantime do you think there could be a low-friction solution in the XSLT?

    Best Regards,
    Alicia



  • lindsay.stevens Posts: 404 ✭✭✭
    via Email
    There is a feature of variable labels in SAS, which is there to resolve
    this kind of variable name usability issue.

    If OIDs were to be used, even those are potentially too long: max 40 chars.
    So the same problem is there.

    How about using the OID, minus the type prefix (IG_ or I_)? If that is longer
    than 28 characters, truncate it to 28 and append the node position in the
    parent entity (FormDef or ItemGroupDef).

    Warnings about all this could be put in the CRF template.
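
The naming rule proposed above can be sketched in Python as follows. This is only an illustration of the rule as worded in the post, not the XSLT implementation; the function name, the sample OIDs and the node position are hypothetical.

```python
import re


def sas_name_from_oid(oid, position, max_len=28):
    """Derive a SAS name from an OID using the rule proposed in the thread."""
    # Drop the type prefix; IG_ is listed first so it is tried before I_.
    base = re.sub(r"^(IG_|I_)", "", oid)
    if len(base) > max_len:
        # Truncate and append the node position within the parent entity.
        base = base[:max_len] + str(position)
    return base


# Hypothetical OIDs, for illustration only.
long_name = sas_name_from_oid("I_DEMOG_DATE_OF_FIRST_STUDY_DRUG_ADMIN", 7)
short_name = sas_name_from_oid("IG_VITALS", 2)
```

With a 28-character stem, the result stays within 32 characters as long as the node position has at most four digits.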
  • lindsay.stevens Posts: 404 ✭✭✭
    @ccollins
    Thanks, I've got an export from the demo Juno to add to a copy of the old Docetaxel demo study. Is the Juno postgres database available?

    I noticed while setting up the OC source locally that the docs at [1] should mention the important step of uncommenting the filters on lines 545-548 in the top-level "pom.xml". Otherwise, the developer-specific Maven build profile instructions will not work.

    [1] https://docs.openclinica.com/3.1/technical-documents/developing-openclinica


    @tkhaja
    I have made a proof of concept of the XSLT code change in a new repository at [2]. In short, changing from FileOutputStream to File objects for the transform output makes the multiple result-documents work.

    I am yet to get an integration test working with the Spring app context loaded - if you have any guidance on running integration tests in that way it would be greatly appreciated. I tried to adapt the Clinovo techniques for this but the amount of mocks blew my mind. But maybe that is just how it is done?

    [2] https://github.com/lindsay-stevens-kirby/oc_xslt


    @tkhaja part 2
    About the error PDF you posted:
    - It seems that both you and @jamesbonica are using SAS Studio.
    - I installed it, and if you followed the instructions the same way I did, then you'll have a mapped folder in SAS Studio at "/folders/myfolders".
    - In which case you'd need to change the paths in load.sas as follows; normally the current path would be available with "%sysget('sas_execfilepath')" but this doesn't work in SAS Studio and I'm not sure what the alternative is.

    ```
    FILENAME R0112345 "/folders/myfolders/data.xml";
    FILENAME map "/folders/myfolders/map.xml";
    ```

    - I think asking users to update the file path at the top of the SAS script file is a pretty small ask but if you think a workaround is in order then let me know your idea.
    - Later in the PDF log, at log lines 73-78, there is the following syntax error:

    ```
    73 VALUE CL_809_
    74 (select response)="(select response)"
    ______
    22
    76
    ERROR 22-322: Syntax error, expecting one of the following: ), DEFAULT, FUZZ, MAX, MIN, MULTILABEL, NOTSORTED, ROUND.
    ERROR 76-322: Syntax error, statement will be ignored.
    75 1="childbearing potential without contraceptive protection"
    76 2="childbearing potential with contraceptive protection"
    77 3="surigcally sterilised"
    78 4="post-menopausal";
    NOTE: The previous statement has
    ```

    - The XSLT will output code lists in the format 1="Yes", unless the associated item data type is text, in which case it wraps the coded value in quotes, like "1"="Yes".
    - In this case there seems to be an integer code list where the first value is a string, "(select response)". I would have thought this was not valid in OpenClinica? What data type is the item that uses CL_809?
    - It's not possible to have mixed types in SAS, so the solutions are to remove the invalid code item, or change the code list to a string value list and cast the data to string as well (overriding the metadata, basically).
    - Again it would be useful to have access to your test input data.
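
The code list behaviour described above can be sketched in Python as follows. This is a hedged illustration, not the XSLT itself: the function names are hypothetical, the `$` prefix for text formats follows the generated scripts shown in this thread, and quote characters inside labels are not escaped.

```python
import re

NUMERIC_CODE = re.compile(r"-?\d+(\.\d+)?")


def render_value_statement(name, items, is_text):
    """Render one PROC FORMAT VALUE statement from (code, label) pairs."""
    parts = []
    for code, label in items:
        # Text code lists quote the coded value; numeric ones do not.
        left = f'"{code}"' if is_text else code
        parts.append(f'{left}="{label}"')
    fmt_name = f"${name}" if is_text else name
    return f"VALUE {fmt_name} " + " ".join(parts) + ";"


def non_numeric_codes(items):
    """Return coded values that would be invalid in a numeric code list."""
    return [code for code, _ in items if not NUMERIC_CODE.fullmatch(code)]


# Values taken from the log excerpt above.
cl_809 = [
    ("(select response)", "(select response)"),
    ("1", "childbearing potential without contraceptive protection"),
]
bad_codes = non_numeric_codes(cl_809)
```

A check like `non_numeric_codes` could flag the offending item before the script is generated, leaving the choice between removing the item and treating the whole list as text.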


    @jamesbonica
    I can add a check for whether the name starts with a number and, if so, prefix it with an underscore. Would that do? An interesting quirk of the SAS XML reading process is that the library with the study name is not actually usable / readable. Only the SAS native copies in the WORK library can actually be opened / worked with. This is a SAS limitation.
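
The leading-digit check mentioned above could look like the following Python sketch. The function name and sample names are hypothetical; the actual fix would live in the XSLT.

```python
def prefix_if_leading_digit(name):
    """Prefix names that start with a digit with an underscore."""
    if name and name[0] in "0123456789":
        return "_" + name
    return name


# Hypothetical names, for illustration only.
fixed = prefix_if_leading_digit("2016_PILOT")
unchanged = prefix_if_leading_digit("PILOT_2016")
```

Note that the prefix adds one character, so a name that was already 32 characters long would still need to be shortened.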

  • ccollins Posts: 379 admin
    via Email
    Hi Lindsay,
    I'll get you a copy of the database. I added your note to the build
    instructions, thanks!
    Cal
  • jmacminn1 Posts: 18
    Hi @lindsay_stevens and @Linas

    It doesn't seem like there is a way to include the CRF version in the SAS extract. Is this correct? I've been helping a community user set up the SAS extract, and we can't seem to get this variable included, even if we select it when creating the data set.

    If it is not currently possible to include CRF version, do you think it would be something you would consider adding? I think it would be useful to include this variable in extracts. For example...

    You have an Eligibility Criteria CRF that qualifies subjects based on certain criteria (for example, a certain score on an assessment must be < 25). However, let's say you make a protocol change and subjects must now have a score < 20. It would be useful to know which version of the form was filled out for the subject so that you know which criteria were used for determining eligibility.

    What are your thoughts?

    Thank you,
    Jessica
  • lindsay.stevens Posts: 404 ✭✭✭
    via Email
    Sure, it could. CRF version name, or CRF version OID, or both?
  • jmacminn1 Posts: 18
    Hi Lindsay, Thanks for the response. I was originally thinking the CRF version name, but I suppose including the OID wouldn't hurt either!
  • jamesbonica Posts: 7
    edited April 2016
    @lindsay_stevens @Linas
    I have a question about the SAS script from the SAS_FORMAT.sas file.

    Here is an example from a test extract:

    ```
    FILENAME S56_AUTO "~/SAS_DATA.xml";
    FILENAME map "~/SAS_MAP.xml";
    LIBNAME S56_AUTO xml xmlmap=map access=readonly;
    proc datasets library=S56_AUTO;
    copy out=work;
    run;
    proc format;
    value $CL_11_ "option_1"="Option 1" "option_2"="Option 2" "option_3"="Option 3" "option_4"="Option 4" "option_5"="Option 5" "option_6"="Option 6" ;
    value $CL_17_ "1"="Show Hidden Question 1" "2"="Show Hidden Question 2" ;
    value $CL_4_ "option_1"="Option 1" "option_2"="Option 2" "option_3"="Option 3" "option_4"="Option 4" "option_5"="Option 5" "option_6"="Option 6" ;

    run;

    data IG_XFORM_SAS_GROUP_TO_DETERMINE_IF_;
    set IG_XFORM_SAS_GROUP_TO_DETERMINE_IF_;
    format SAS_Question_4 $CL_11_.;
    run;

    data IG_XFORM_SAS_SHOWHIDE_GROUP1;
    set IG_XFORM_SAS_SHOWHIDE_GROUP1;
    format SAS_OC6865 $CL_17_.;
    run;
    ```

    When I run this, SAS Studio returns the following error:

    ERROR 307-185: The data set name cannot have more than 32 characters.

    When I change both occurrences of IG_XFORM_SAS_GROUP_TO_DETERMINE_IF_ to the SAS Table Name in the SAS_MAP.xml file, XFORM_SAS_GROUP_TO_DETERMINE_IF_, there is no error.

    However, when I delete everything after "run;" and execute the script, there is no error and the data is rendered as expected in each table. It makes it seem like this portion of the script is unnecessary, but I doubt it.

    What is the purpose of this part of the script and should the syntax be fixed?

    Let me know if this is unclear.

    Thanks for your help!

    Jim
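
The 32-character limit behind ERROR 307-185 can be checked before the script is run. The Python sketch below scans a generated script for `data` and `set` statements whose data set name is too long; the function name and the regular expression are assumptions made for illustration, and the sample text is copied from the extract above.

```python
import re

# Matches simple "data NAME;" and "set NAME;" statements at the start of a line.
DATA_OR_SET = re.compile(
    r"^\s*(?:data|set)\s+([A-Za-z_][A-Za-z0-9_]*)\s*;",
    re.IGNORECASE | re.MULTILINE,
)


def overlong_dataset_names(script, max_len=32):
    """Return data set names longer than max_len, in order of first use."""
    found = []
    for name in DATA_OR_SET.findall(script):
        if len(name) > max_len and name not in found:
            found.append(name)
    return found


script = """
data IG_XFORM_SAS_GROUP_TO_DETERMINE_IF_;
set IG_XFORM_SAS_GROUP_TO_DETERMINE_IF_;
format SAS_Question_4 $CL_11_.;
run;

data IG_XFORM_SAS_SHOWHIDE_GROUP1;
set IG_XFORM_SAS_SHOWHIDE_GROUP1;
format SAS_OC6865 $CL_17_.;
run;
"""
too_long = overlong_dataset_names(script)
```

The first data set name is 35 characters long with its `IG_` prefix and 32 without it, which is consistent with the error disappearing once the prefix is dropped.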