Tools to convert OpenClinica Extracts to CSV, R and SAS

Linas Posts: 5
Hi Developers,

Attached are XML transformation files (and PowerShell command scripts) for converting OpenClinica extracts into CSV, R (data frames) and SAS (XML library, i.e. a SAS XML data file with a SAS mapping file). Code list lookups are also included for R (the column is duplicated as a factor holding the lookup value) and SAS (using FORMAT commands after copying the XML library into the WORK library).

The transformations work by creating a "table" per itemgroup. Before performing the transformations, we dynamically create a lookup XSL template specific to the study, which is used to map itemgroup IDs to friendlier table names (the code for this is not attached; for an example of its output, see xml_convert_dynamic_lookup.xsl).

Regards,
Linas

This email (including any attachments or links) may contain confidential and/or legally privileged information and is intended only to be read or used by the addressee. If you are not the intended addressee, any use, distribution, disclosure or copying of this email is strictly prohibited. Confidentiality and legal privilege attached to this email (including any attachments) are not waived or lost by reason of its mistaken delivery to you. If you have received this email in error, please delete it and notify us immediately by telephone or email. Peter MacCallum Cancer Centre provides no guarantee that this transmission is free of virus or that it has not been intercepted or altered and will not be liable for any delay in its receipt.

Attachments: OC_XML_to_SAS_R_CSV.zip 9.6 KB
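For readers without the attachment to hand, the "one table per itemgroup" idea can be sketched roughly as follows. This is only an illustration in Python, not the attached XSL: the function name `tables_by_item_group` and every OID in the sample are made up, and a real OpenClinica extract also carries metadata and OpenClinica extension attributes that this sketch ignores. It assumes the extract uses the standard CDISC ODM 1.3 namespace and the usual SubjectData / ItemGroupData / ItemData nesting.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard CDISC ODM 1.3 namespace (assumed to be the one used by the extract).
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Tiny hand-written sample; all OIDs here are hypothetical.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="S_DEMO" MetaDataVersionOID="v1.0.0">
    <SubjectData SubjectKey="SS_001">
      <StudyEventData StudyEventOID="SE_VISIT1">
        <FormData FormOID="F_VISIT">
          <ItemGroupData ItemGroupOID="IG_VITALS">
            <ItemData ItemOID="I_WEIGHT" Value="70"/>
            <ItemData ItemOID="I_HEIGHT" Value="175"/>
          </ItemGroupData>
          <ItemGroupData ItemGroupOID="IG_LABS">
            <ItemData ItemOID="I_HB" Value="13.5"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
    <SubjectData SubjectKey="SS_002">
      <StudyEventData StudyEventOID="SE_VISIT1">
        <FormData FormOID="F_VISIT">
          <ItemGroupData ItemGroupOID="IG_VITALS">
            <ItemData ItemOID="I_WEIGHT" Value="82"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""


def tables_by_item_group(odm_xml: str) -> dict:
    """Return {ItemGroupOID: [row, ...]}, one row per ItemGroupData element."""
    root = ET.fromstring(odm_xml)
    tables = defaultdict(list)
    for subject in root.iterfind(".//odm:SubjectData", ODM_NS):
        subject_key = subject.get("SubjectKey")
        for group in subject.iterfind(".//odm:ItemGroupData", ODM_NS):
            # Each row keeps the subject key plus one column per item.
            row = {"SubjectKey": subject_key}
            for item in group.iterfind("odm:ItemData", ODM_NS):
                row[item.get("ItemOID")] = item.get("Value")
            tables[group.get("ItemGroupOID")].append(row)
    return dict(tables)


tables = tables_by_item_group(SAMPLE)
for oid in sorted(tables):
    print(oid, len(tables[oid]), "row(s)")
```

Each key of the returned dictionary plays the role of one output table; the study-specific lookup template mentioned above would then rename keys such as the hypothetical `IG_VITALS` to something friendlier.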

Comments

  • lindsay.stevens Posts: 391 ✭✭
    Hi Linas,
    That's amazing! Thanks for sharing these. I was able to run all 3 transformations.
    The hardest part was getting Windows to let me run the ps1 files :)
    Best regards,
    Lindsay
    On 22 January 2014 13:31, Silva Linas wrote:
  • haensel Posts: 530 ✭✭
    Hi Linas
    There is no license attached.
    Regards,
    Christian
    ------------------------------------------------------------------------
    Dipl.-Inf. Christian Hänsel

    IT / Software Developer
    Tel.: +49-(0)89-5526189-16
    Fax : +49-(0)89-5526189-55
    E-Mail: c.haensel@reliatec.de
    ReliaTec GmbH
    Schleissheimer Str. 37
    85748 Garching Germany
    HRB 150060 / AG München
    Gf Thomas Herbig
    http://www.reliatec.de
    =========================================================================
    On 22.01.2014 03:31, Silva Linas wrote:
  • lindsay.stevens Posts: 391 ✭✭
    Could I add the set to my scripts repo? I think that has a GPLv2 licence. I was then thinking I'd add a link to the repo in the wikibook.
    I tried to get them working as extract formats (since they really should be), but it fails, and the OC log says something about failing to compile the stylesheet. Has anyone had that kind of problem?
    Best regards,
    Lindsay
    On Jan 22, 2014 9:40 PM, "Christian Hänsel" wrote:
  • Linas Posts: 5
    Hi Lindsay,

    Sounds good to me!

    Regards,

    Linas
    Sent: Thursday, 23 January 2014 9:18 AM
    To: developers@openclinica.org
    Subject: Re: [Developers] Tools to convert OpenClinica Extracts to CSV, R and SAS

  • tkhaja Posts: 54
    Linas,
    Thank you so much for sharing it. It is a great tool and I can see it being helpful in a lot of ways. I was successful in getting the ODM transformed to SAS. It took only a few seconds for the PowerShell script to complete. The xsl files (sas_data, sas_format and sas_map) were generated successfully.
    Like Lindsay said, the only difficulty was getting Windows to execute the PowerShell script. In case anyone is interested, this is how I executed it in PowerShell v2:
    powershell -executionpolicy bypass -File .\powershell_perform_SAS_xsl_transforms.ps1
    Best,
    Thasbiha
    On Wed, Jan 22, 2014 at 5:18 PM, Lindsay Stevens wrote:
  • agoodwin Posts: 131 admin
    Hi Linas,
    This is very exciting. Thanks for the contribution! We'll make sure there is a story in the backlog (in Jira) and we'll see how these could potentially be included in a future release. We're really excited and we will keep you posted!
    Cheers,
    Alicia
    On Wednesday, January 22, 2014, Thasbiha Khaja wrote:
  • lindsay.stevens Posts: 391 ✭✭
    Hello all,
    The wikibook is now updated and the files have been added to my GPLv2 repo.
    I managed to get the R transformation working as an extract format; here is an example extract.properties file.
    The main issue I had was that the xsls refer to 'xml_convert_dynamic_lookup.xsl', which I hadn't included in the '/openclinica.data/xslt' folder.
    Rather than updating the lookup xsl, I edited the main R transform xsl (new xsl is here) so that it:
    1. Names the datasets like '[Crfname]_[Itemgroupname]', with all non-alphanumeric characters taken out of both names, e.g. a dataset for a CRF called 'My_CRF' and an Item Group called 'My_Item_Group' becomes 'Mycrf_Myitemgroup'. This seemed to be more or less what the lookup xsl was doing.
    2. Assigns dates as proper R dates rather than unquoted values, e.g. the unquoted string 2012-01-02 was ending up as the int 2012. Assigning dates with as.Date("2012-01-02") instead seemed to work, and I could then add and subtract days properly.
    3. Includes the labels in the same file as the dataframes, since I guess most of the time you'd want both anyway, and it saves concatenating the files later. Placing the label part after each dataframe part would have made the xsl a fair bit shorter, but I kept them separate so the whole labels part can easily be chopped off the end if it's not wanted.
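A rough Python sketch of the naming rule in point 1 and the date handling in point 2, for illustration only. The real logic lives in the edited XSL and the generated R code; the helper names `clean_name`, `dataset_name` and `parse_item_date` are hypothetical, and the capitalisation rule is inferred from the single 'Mycrf_Myitemgroup' example above.

```python
import re
from datetime import date, timedelta


def clean_name(name: str) -> str:
    """Drop non-alphanumeric characters, then upper-case only the first letter."""
    return re.sub(r"[^A-Za-z0-9]", "", name).capitalize()


def dataset_name(crf_name: str, item_group_name: str) -> str:
    """Build a '[Crfname]_[Itemgroupname]' style dataset name."""
    return f"{clean_name(crf_name)}_{clean_name(item_group_name)}"


def parse_item_date(value: str) -> date:
    """Parse an ISO 'YYYY-MM-DD' string into a real date (compare as.Date in R)."""
    return date.fromisoformat(value)


print(dataset_name("My_CRF", "My_Item_Group"))             # Mycrf_Myitemgroup
print(parse_item_date("2012-01-02") + timedelta(days=30))  # 2012-02-01
```

The point of the second helper is the same as in the R output: once the value is a date type rather than a bare number or string, day arithmetic behaves as expected.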
    Best regards,
    Lindsay
    On 23 January 2014 12:27, Alicia Goodwin wrote:
  • mvirtosu Posts: 272
    Hi guys,

    Which version of the ODM are you using to transform to SAS?

    I am getting something out, but not what I was expecting.

    Thanks,

    Mihai
    Sent: Thursday, January 23, 2014 12:18 AM
  • tkhaja Posts: 54
    Mihai,
    I used the CDISC ODM XML 1.3 Full with OpenClinica extensions format.
    -Thasbiha
    On Thu, Jan 23, 2014 at 3:47 PM, Mihai Virtosu wrote:
    Hi guys,

    Which version of the ODM are you using to transform to SAS?

    I am getting something out, but not what I was expecting.

    Thanks,

    Mihai
    Sent: Thursday, January 23, 2014 12:18 AM
    To: developers@openclinica.org
    Subject: Re: [Developers] Tools to convert OpenClinica Extracts to CSV, R and SAS

    Hello all,
    The wikibook is now updated and the files have been added to my GPLv2 repo.
    I managed to get the R transformation working as an extract format, here is an example extract.properties file.
    The main issue I had was that the xsls refer to the 'xml_convert_dynamic_lookup.xsl', which I hadn't included in the '/openclinica.data/xslt' folder.
    Rather than updating the lookup xsl, I edited the main R transform xsl (new xsl is here) so that it:
    1. Names the datasets like '[Crfname]_[Itemgroupname]' with all non-alphanumeric characters taken out of both names, e.g. a dataset for a CRF called 'My_CRF' and an Item Group called 'My_Item_Group' becomes 'Mycrf_Myitemgroup'. This seemed to be more or less what the lookup xsl was doing.
    2. Assigns dates as character-y dates, e.g. the unquoted string 2012-01-02 was ending up as the int 2012. Changing it such that dates are assigned with as.Date("2012-01-02") seemed to work, such that I could add and subtract days properly after doing that.
    3. Includes the labels in the same file as the dataframes, since I guess most of the time you'd want both anyway, and it would save concatenating the files later. It would have made the xsl a fair bit shorter to place the label part after each dataframe part but I kept them separate so the whole labels part can be easily chopped off the end if it's not wanted.

    Best regards,
    Lindsay

    On 23 January 2014 12:27, Alicia Goodwin wrote:
    Hi Linus,

    This is very exciting - Thanks for the contribution. We'll make sure there is a story in the back log (in jira) and we'll see how these could be potentially included in a future release. We're really excited and we will keep you posted!

    Cheers,
    Alicia
    On Wednesday, January 22, 2014, Thasbiha Khaja wrote:
    Linas,
    Thank you so much for sharing it. It is a great tool and can see it being helpful in a lot of ways. I was successful in getting the ODM transformed to SAS. It took only few seconds for the powershell script to complete. The xsl files - sas_data,sas_format and sas_map were generated successfully.
    Like Lindsay said, the only difficulty was to make windows execute the powershell script. In case if any one is interested, this is how I executed it in powershell v2:
    powershell -executionpolicy bypass -File .\powershell_perform_SAS_xsl_transforms.ps1

    Best,
    Thasbiha

    On Wed, Jan 22, 2014 at 5:18 PM, Lindsay Stevens wrote:
    Could I add the set to my scripts repo? That has a GPLv2 licence, I think. I was then thinking I'd add a link to the repo in the wikibook.
    I tried to get them working as extract formats (since they really should be) but it fails and the OC log says something about failing to compile the stylesheet. Has anyone had that kind of problem?
    Best regards,
    Lindsay
    On Jan 22, 2014 9:40 PM, "Christian Hänsel" wrote:
    Hi Linas
    There is no license attached.
    Regards,
    Christian
    On 22.01.2014 03:31, Silva Linas wrote:
    Hi Developers,

    Attached are XML transformation files (and powershell command scripts) for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML library; i.e. a SAS xml data file with a SAS mapping file). Code list lookups are also included for R (column duplicated as a factor with the lookup value) and SAS (using FORMAT commands after copying the XML library into the WORK library).

    The transformations work by creating a "table" per itemgroup. Prior to performing the transformations we dynamically create a lookup xsl template specific to the study, which is used to map itemgroup IDs to friendlier table names (the code for this is not attached; for an example of the output, see xml_convert_dynamic_lookup.xsl).

    Regards,

    Linas


  • mvirtosumvirtosu Posts: 272
    Linas,

    I have run the ODM to SAS transformation and sent it over to our statisticians, but they are wondering how to get SAS datasets based on the 3 output files.

    Your advice would be highly appreciated.

    Thanks,

    Mihai Virtosu
    University of Utah
    Sent: Tuesday, January 21, 2014 7:31 PM
    To: developers@openclinica.org
    Subject: [Developers] Tools to convert OpenClinica Extracts to CSV, R and SAS

    Hi Developers,

    Attached are XML transformation files (and powershell command scripts) for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML library; i.e. a SAS xml data file with a SAS mapping file). Code list lookups are also included for R (column duplicated as a factor with the lookup value) and SAS (using FORMAT commands after copying the XML library into the WORK library).

    The transformations work by creating a "table" per itemgroup. Prior to performing the transformations we dynamically create a lookup xsl template specific to the study, which is used to map itemgroup IDs to friendlier table names (the code for this is not attached; for an example of the output, see xml_convert_dynamic_lookup.xsl).

    Regards,

    Linas