Tools to convert OpenClinica Extracts to CSV, R and SAS

24

Comments

  • lindsay.stevenslindsay.stevens Posts: 391 ✭✭
    Hi Mihai,
    There are 3 SAS transforms. One gives as .txt file with the data, another gives a formats file, and the last gives a SAS script.
    The SAS script pretty much just says, "load data from this text file and use this other file for formats". So when you run the SAS script the data sets should be in the SAS 'work' directory. You can then save to whereever.
    Best regards,
    Lindsay
    On Jan 25, 2014 3:20 AM, "Mihai Virtosu" wrote:
    Linas,

    I have run the ODM to SAS transformation and sent it over to our statisticians, but they are wondering how to get SAS datasets based on the 3 output files.

    Your advice would be highly appreciated.

    Thanks,

    Mihai Virtosu
    University of Utah
    Sent: Tuesday, January 21, 2014 7:31 PM
    To: developers@openclinica.org
    Subject: [Developers] Tools to convert OpenClinica Extracts to CSV, R and SAS

    Hi Developers,

    Attached are XML transformation files (and powershell command scripts) for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML library; i.e. a SAS xml data file with a SAS mapping file). Code list lookups are also included for R (column duplicated as a factor with the lookup value) and SAS (using FORMAT commands after copying the XML library into the WORK library).

    The transformations work by creating a “table” per itemgroup. Prior to performing the transformations we dynamically create a lookup xsl template specific for the study that is used to map itemgroup IDs into more friendly table names (the code for this is not attached; an example of the output is, see xml_convert_dynamic_lookup.xsl).

    Regards,

    Linas
    This email (including any attachments or links) may contain
    confidential and/or legally privileged information and is
    intended only to be read or used by the addressee. If you
    are not the intended addressee, any use, distribution,
    disclosure or copying of this email is strictly
    prohibited.
    Confidentiality and legal privilege attached to this email
    (including any attachments) are not waived or lost by
    reason of its mistaken delivery to you.
    If you have received this email in error, please delete it
    and notify us immediately by telephone or email. Peter
    MacCallum Cancer Centre provides no guarantee that this
    transmission is free of virus or that it has not been
    intercepted or altered and will not be liable for any delay
    in its receipt.
  • hI PLEASE REMOVE MY EMAIL ID FROM THIS LIST
    On Fri, Jan 24, 2014 at 7:52 PM, chand kakarala wrote:
    hI PLEASE REMOVE MY EMAIL ID FROM THIS LIST
    On Friday, 24 January 2014 11:47 AM, Mihai Virtosu wrote:
    Linas,

    I have run the ODM to SAS transformation and sent it over to our statisticians, but they are wondering how to get SAS datasets based on the 3 output files.

    Your advice would be highly appreciated.

    Thanks,

    Mihai Virtosu
    University of Utah
    Sent: Tuesday, January 21, 2014 7:31 PM
    To: developers@openclinica.org
    Subject: [Developers] Tools to convert OpenClinica Extracts to CSV, R and SAS

    Hi Developers,

    Attached are XML transformation files (and powershell command scripts) for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML library; i.e. a SAS xml data file with a SAS mapping file). Code list lookups are also included for R (column duplicated as a factor with the lookup value) and SAS (using FORMAT commands after copying the XML library into the WORK library).

    The transformations work by creating a “table” per itemgroup. Prior to performing the transformations we dynamically create a lookup xsl template specific for the study that is used to map itemgroup IDs into more friendly table names (the code for this is not attached; an example of the output is, see xml_convert_dynamic_lookup.xsl).

    Regards,

    Linas
    This email (including any attachments or links) may contain
    confidential and/or legally privileged information and is
    intended only to be read or used by the addressee. If you
    are not the intended addressee, any use, distribution,
    disclosure or copying of this email is strictly
    prohibited.
    Confidentiality and legal privilege attached to this email
    (including any attachments) are not waived or lost by
    reason of its mistaken delivery to you.
    If you have received this email in error, please delete it
    and notify us immediately by telephone or email. Peter
    MacCallum Cancer Centre provides no guarantee that this
    transmission is free of virus or that it has not been
    intercepted or altered and will not be liable for any delay
    in its receipt.
  • LinasLinas Posts: 5
    Hi Mihai,

    Along with the 3 .xls files for SAS, there was a file called SAS_FPM_ImportCommands.SAS. This file is example of the SAS command required to load the first 2 transformed files (“data” and “mapping” files) into SAS as an XML library. Once the XML library has been successfully loaded in SAS, the last transformed file (“format”) can be run. This file is a SAS script that first copies the XML library into the Work library and then runs FORMAT commands for any codelists that were defined in the OpenClinica Study.

    Another factor may be the version of SAS. We run 9.3. I’m not sure when SAS added the functionality to allow XML files to be imported…..

    Regards,

    Linas Silva
    Analyst Programmer
    Peter MacCallum Cancer Centre
    www.petermac.org/BaCT
    Sent: Saturday, 25 January 2014 3:19 AM
    To: developers@openclinica.org
    Cc: Angie Webster
    Subject: Re: [Developers] Tools to convert OpenClinica Extracts to CSV, R and SAS

    Linas,

    I have run the ODM to SAS transformation and sent it over to our statisticians, but they are wondering how to get SAS datasets based on the 3 output files.

    Your advice would be highly appreciated.

    Thanks,

    Mihai Virtosu
    University of Utah
    Sent: Tuesday, January 21, 2014 7:31 PM
    To: developers@openclinica.org
    Subject: [Developers] Tools to convert OpenClinica Extracts to CSV, R and SAS

    Hi Developers,

    Attached are XML transformation files (and powershell command scripts) for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML library; i.e. a SAS xml data file with a SAS mapping file). Code list lookups are also included for R (column duplicated as a factor with the lookup value) and SAS (using FORMAT commands after copying the XML library into the WORK library).

    The transformations work by creating a “table” per itemgroup. Prior to performing the transformations we dynamically create a lookup xsl template specific for the study that is used to map itemgroup IDs into more friendly table names (the code for this is not attached; an example of the output is, see xml_convert_dynamic_lookup.xsl).

    Regards,

    Linas
    This email (including any attachments or links) may contain
    confidential and/or legally privileged information and is
    intended only to be read or used by the addressee. If you
    are not the intended addressee, any use, distribution,
    disclosure or copying of this email is strictly
    prohibited.
    Confidentiality and legal privilege attached to this email
    (including any attachments) are not waived or lost by
    reason of its mistaken delivery to you.
    If you have received this email in error, please delete it
    and notify us immediately by telephone or email. Peter
    MacCallum Cancer Centre provides no guarantee that this
    transmission is free of virus or that it has not been
    intercepted or altered and will not be liable for any delay
    in its receipt.
    This email (including any attachments or links) may contain
    confidential and/or legally privileged information and is
    intended only to be read or used by the addressee. If you
    are not the intended addressee, any use, distribution,
    disclosure or copying of this email is strictly
    prohibited.
    Confidentiality and legal privilege attached to this email
    (including any attachments) are not waived or lost by
    reason of its mistaken delivery to you.
    If you have received this email in error, please delete it
    and notify us immediately by telephone or email. Peter
    MacCallum Cancer Centre provides no guarantee that this
    transmission is free of virus or that it has not been
    intercepted or altered and will not be liable for any delay
    in its receipt.
  • agoodwinagoodwin Posts: 131 admin
    Hello all,
    This is exciting stuff. I'd love to see any feedback about these transforms and how useful people find them. I've created issues in Jira (I've separated them into 3 tickets - one for each output format). If everyone really likes these and if they seem polished enough, hopefully we can include them in a future release of OpenClinica (of course after it's been through our sdlc). Any level of detail and feedback you can provide on the tickets would be extremely helpful.
    Here are the Jira issues:
    Transform to SAS:
    http://jira.openclinica.com/browse/OC-4182
    Transform to R (dataframes):
    http://jira.openclinica.com/browse/OC-4188
    Transform to CSV:
    http://jira.openclinica.com/browse/OC-4194
    On Thu, Jan 23, 2014 at 2:17 AM, Lindsay Stevens wrote:
    Hello all,
    The wikibook is now updated and the files have been added to my GPLv2 repo.
    I managed to get the R transformation working as an extract format, here is an example extract.properties file.
    The main issue I had was that the xsls refer to the 'xml_convert_dynamic_lookup.xsl', which I hadn't included in the '/openclinica.data/xslt' folder.
    Rather than updating the lookup xsl, I edited the main R transform xsl (new xsl is here) so that it:
    1. Names the datasets like '[Crfname]_[Itemgroupname]' with all non-alphanumeric characters taken out of both names, e.g. a dataset for a CRF called 'My_CRF' and an Item Group called 'My_Item_Group' becomes 'Mycrf_Myitemgroup'. This seemed to be more or less what the lookup xsl was doing.
    2. Assigns dates as character-y dates, e.g. the unquoted string 2012-01-02 was ending up as the int 2012. Changing it such that dates are assigned with as.Date("2012-01-02") seemed to work, such that I could add and subtract days properly after doing that.
    3. Includes the labels in the same file as the dataframes, since I guess most of the time you'd want both anyway, and it would save concatenating the files later. It would have made the xsl a fair bit shorter to place the label part after each dataframe part but I kept them separate so the whole labels part can be easily chopped off the end if it's not wanted.
    Best regards,
    Lindsay
    On 23 January 2014 12:27, Alicia Goodwin wrote:
    Hi Linus,
    This is very exciting - Thanks for the contribution. We'll make sure there is a story in the back log (in jira) and we'll see how these could be potentially included in a future release. We're really excited and we will keep you posted!
    Cheers,
    Alicia
    On Wednesday, January 22, 2014, Thasbiha Khaja wrote:
    Linas,
    Thank you so much for sharing it. It is a great tool and can see it being helpful in a lot of ways. I was successful in getting the ODM transformed to SAS. It took only few seconds for the powershell script to complete. The xsl files - sas_data,sas_format and sas_map were generated successfully.
    Like Lindsay said, the only difficulty was to make windows execute the powershell script. In case if any one is interested, this is how I executed it in powershell v2:
    powershell -executionpolicy bypass -File .\powershell_perform_SAS_xsl_transforms.ps1
    Best,
    Thasbiha
    On Wed, Jan 22, 2014 at 5:18 PM, Lindsay Stevens wrote:
    Could I add the set to my scripts repo? That has a GPLv2 licence I think. I was then thinking I'd add a link to wikibook for the repo.
    I tried to get them working as extract formats (since they really should be) but it fails and the OC log says something about failing to compile the stylesheet. Has anyone had that kind of problem?
    Best regards,
    Lindsay
    On Jan 22, 2014 9:40 PM, "Christian Hänsel" wrote:
    Hi Linas
    There is no license attached.
    Regards,
    Christian
    ------------------------------------------------------------------------
    Dipl.-Inf. Christian Hänsel

    IT / Software Developer
    Tel.: +49-(0)89-5526189-16
    Fax : +49-(0)89-5526189-55
    E-Mail: c.haensel@reliatec.de
    ReliaTec GmbH
    Schleissheimer Str. 37
    85748 Garching Germany
    HRB 150060 / AG München
    Gf Thomas Herbig
    http://www.reliatec.de
    =========================================================================
    Am 22.01.2014 03:31, schrieb Silva Linas:
    > Hi Developers,
    >
    > Attached are XML transformation files (and powershell command scripts) for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML library; i.e. a SAS xml data file with a SAS mapping file). Code list lookups are also included for R (column duplicated as a factor with the lookup value) and SAS (using FORMAT commands after copying the XML library into the WORK library).
    >
    > The transformations work by creating a "table" per itemgroup. Prior to performing the transformations we dynamically create a lookup xsl template specific for the study that is used to map itemgroup IDs into more friendly table names (the code for this is not attached; an example of the output is, see xml_convert_dynamic_lookup.xsl).
    >
    > Regards,
    >
    > Linas
    >
    >
    > This email (including any attachments or links) may contain
    > confidential and/or legally privileged information and is
    > intended only to be read or used by the addressee. If you
    > are not the intended addressee, any use, distribution,
    > disclosure or copying of this email is strictly
    > prohibited.
    > Confidentiality and legal privilege attached to this email
    > (including any attachments) are not waived or lost by
    > reason of its mistaken delivery to you.
    > If you have received this email in error, please delete it
    > and notify us immediately by telephone or email. Peter
    > MacCallum Cancer Centre provides no guarantee that this
    > transmission is free of virus or that it has not been
    > intercepted or altered and will not be liable for any delay
    > in its receipt.
    >
    >
    >
    >
  • ccollinsccollins Posts: 361 admin
    Hi Linas and Lindsay,
    Would you be able to release these under a LGPL 2.1 license? It
    appears the scripts in Lindsay's repo are GPL 2, which unfortunately
    can't be directly incorporated into a LGPL work (which is what
    OpenClinica is).
    The differences are minor, let me know if this is doable.
    Thanks,
    Cal
    On Fri, Jan 31, 2014 at 9:42 AM, Alicia Goodwin
    wrote:
    > > Hello all,
    > >
    > > This is exciting stuff. I'd love to see any feedback about these transforms
    > > and how useful people find them. I've created issues in Jira (I've separated
    > > them into 3 tickets - one for each output format). If everyone really likes
    > > these and if they seem polished enough, hopefully we can include them in a
    > > future release of OpenClinica (of course after it's been through our sdlc).
    > > Any level of detail and feedback you can provide on the tickets would be
    > > extremely helpful.
    > >
    > > Here are the Jira issues:
    > > Transform to SAS:
    > > http://jira.openclinica.com/browse/OC-4182
    > >
    > > Transform to R (dataframes):
    > > http://jira.openclinica.com/browse/OC-4188
    > >
    > > Transform to CSV:
    > > http://jira.openclinica.com/browse/OC-4194
    > >
    > >
    > >
    > >
    > >
    > > On Thu, Jan 23, 2014 at 2:17 AM, Lindsay Stevens
    > > wrote:
    >> >>
    >> >> Hello all,
    >> >>
    >> >> The wikibook is now updated and the files have been added to my GPLv2
    >> >> repo.
    >> >>
    >> >> I managed to get the R transformation working as an extract format, here
    >> >> is an example extract.properties file.
    >> >>
    >> >> The main issue I had was that the xsls refer to the
    >> >> 'xml_convert_dynamic_lookup.xsl', which I hadn't included in the
    >> >> '/openclinica.data/xslt' folder.
    >> >>
    >> >> Rather than updating the lookup xsl, I edited the main R transform xsl
    >> >> (new xsl is here) so that it:
    >> >>
    >> >> 1. Names the datasets like '[Crfname]_[Itemgroupname]' with all
    >> >> non-alphanumeric characters taken out of both names, e.g. a dataset for a
    >> >> CRF called 'My_CRF' and an Item Group called 'My_Item_Group' becomes
    >> >> 'Mycrf_Myitemgroup'. This seemed to be more or less what the lookup xsl was
    >> >> doing.
    >> >>
    >> >> 2. Assigns dates as character-y dates, e.g. the unquoted string 2012-01-02
    >> >> was ending up as the int 2012. Changing it such that dates are assigned with
    >> >> as.Date("2012-01-02") seemed to work, such that I could add and subtract
    >> >> days properly after doing that.
    >> >>
    >> >> 3. Includes the labels in the same file as the dataframes, since I guess
    >> >> most of the time you'd want both anyway, and it would save concatenating the
    >> >> files later. It would have made the xsl a fair bit shorter to place the
    >> >> label part after each dataframe part but I kept them separate so the whole
    >> >> labels part can be easily chopped off the end if it's not wanted.
    >> >>
    >> >> Best regards,
    >> >> Lindsay
    >> >>
    >> >>
    >> >>
    >> >> On 23 January 2014 12:27, Alicia Goodwin wrote:
    >>> >>>
    >>> >>> Hi Linus,
    >>> >>>
    >>> >>> This is very exciting - Thanks for the contribution. We'll make sure
    >>> >>> there is a story in the back log (in jira) and we'll see how these could be
    >>> >>> potentially included in a future release. We're really excited and we will
    >>> >>> keep you posted!
    >>> >>>
    >>> >>> Cheers,
    >>> >>> Alicia
    >>> >>>
    >>> >>> On Wednesday, January 22, 2014, Thasbiha Khaja
    >>> >>> wrote:
    >>>> >>>>
    >>>> >>>> Linas,
    >>>> >>>>
    >>>> >>>> Thank you so much for sharing it. It is a great tool and can see it
    >>>> >>>> being helpful in a lot of ways. I was successful in getting the ODM
    >>>> >>>> transformed to SAS. It took only few seconds for the powershell script to
    >>>> >>>> complete. The xsl files - sas_data,sas_format and sas_map were generated
    >>>> >>>> successfully.
    >>>> >>>>
    >>>> >>>> Like Lindsay said, the only difficulty was to make windows execute the
    >>>> >>>> powershell script. In case if any one is interested, this is how I executed
    >>>> >>>> it in powershell v2:
    >>>> >>>>
    >>>> >>>> powershell -executionpolicy bypass -File
    >>>> >>>> .\powershell_perform_SAS_xsl_transforms.ps1
    >>>> >>>>
    >>>> >>>> Best,
    >>>> >>>> Thasbiha
    >>>> >>>>
    >>>> >>>>
    >>>> >>>> On Wed, Jan 22, 2014 at 5:18 PM, Lindsay Stevens
    >>>> >>>> wrote:
    >>>> >>>>
    >>>> >>>> Could I add the set to my scripts repo? That has a GPLv2 licence I
    >>>> >>>> think. I was then thinking I'd add a link to wikibook for the repo.
    >>>> >>>>
    >>>> >>>> I tried to get them working as extract formats (since they really should
    >>>> >>>> be) but it fails and the OC log says something about failing to compile the
    >>>> >>>> stylesheet. Has anyone had that kind of problem?
    >>>> >>>>
    >>>> >>>> Best regards,
    >>>> >>>> Lindsay
    >>>> >>>>
    >>>> >>>> On Jan 22, 2014 9:40 PM, "Christian Hänsel"
    >>>> >>>> wrote:
    >>>> >>>>
    >>>> >>>> Hi Linas
    >>>> >>>>
    >>>> >>>> There is no license attached.
    >>>> >>>>
    >>>> >>>> Regards,
    >>>> >>>> Christian
    >>>> >>>>
    >>>> >>>> ------------------------------------------------------------------------
    >>>> >>>>
    >>>> >>>> Dipl.-Inf. Christian Hänsel
    >>>> >>>>
    >>>> >>>> IT / Software Developer
    >>>> >>>>
    >>>> >>>> Tel.: +49-(0)89-5526189-16
    >>>> >>>> Fax : +49-(0)89-5526189-55
    >>>> >>>> E-Mail: c.haensel@reliatec.de
    >>>> >>>>
    >>>> >>>> ReliaTec GmbH
    >>>> >>>> Schleissheimer Str. 37
    >>>> >>>> 85748 Garching Germany
    >>>> >>>> HRB 150060 / AG München
    >>>> >>>> Gf Thomas Herbig
    >>>> >>>> http://www.reliatec.de
    >>>> >>>>
    >>>> >>>>
    >>>> >>>> =========================================================================
    >>>> >>>>
    >>>> >>>> Am 22.01.2014 03:31, schrieb Silva Linas:
    >>>> >>>>
    >>>> >>>> Hi Developers,
    >>>> >>>>
    >>>> >>>> Attached are XML transformation files (and powershell command scripts)
    >>>> >>>> for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML
    >>>> >>>> library; i.e. a SAS xml data file with a SAS mapping file). Code list
    >>>> >>>> lookups are also included for R (column duplicated as a factor with the
    >>>> >>>> lookup value) and SAS (using FORMAT commands after copying the XML library
    >>>> >>>> into the WORK library).
    >>>> >>>>
    >>>> >>>> The transformations work by creating a "table" per itemgroup. Prior to
    >>>> >>>> performing the transformations we dynamically create a lookup xsl template
    >>>> >>>> specific for the study that is used to map itemgroup IDs into more friendly
    >>>> >>>> table names (the code for this is not attached; an example of the output is,
    >>>> >>>> see xml_convert_dynamic_lookup.xsl).
    >>>> >>>>
    >>>> >>>> Regards,
    >>>> >>>>
    >>>> >>>> Linas
    >>>> >>>>
    >>>> >>>>
    >>>> >>>> This email (including any attachments or links) may contain
    >>>> >>>> confidential and/or legally privileged information and is
    >>>> >>>> intended only to be read or used by the addressee. If you
    >>>> >>>> are not the intended addressee, any use, distribution,
    >>>> >>>> disclosure or copying of this email is strictly
    >>>> >>>> prohibited.
    >>>> >>>> Confidentiality and legal privilege attached to this email
    >>>> >>>> (including any attachments) are not waived or lost by
    >>>> >>>> reason of its mistaken delivery to you.
    >>>> >>>> If you have received this email in error, please delete it
    >>>> >>>> and notify us immediately by telephone or email. Peter
    >>>> >>>> MacCallum Cancer Centre provides no guarantee that this
    >>>> >>>> transmission is free of virus or that it has not been
    >>>> >>>> intercepted or altered and will not be liable for any delay
    >>>> >>>> in its receipt.
    >>>> >>>>
    >>>> >>>>
    >>>> >>>>
    >>>> >>>>
  • lindsay.stevenslindsay.stevens Posts: 391 ✭✭
    Hi Cal,
    Sorry I didn't realise GPL and LGPL were incompatible. I have changed it to MIT so it can be used for anything, since that was the intention (Linas I hope this is OK with you).
    Best regards,
    Lindsay
    On 1 February 2014 01:53, Cal Collins wrote:
    Hi Linas and Lindsay,
    Would you be able to release these under a LGPL 2.1 license? It
    appears the scripts in Lindsay's repo are GPL 2, which unfortunately
    can't be directly incorporated into a LGPL work (which is what
    OpenClinica is).
    The differences are minor, let me know if this is doable.
    Thanks,
    Cal
    On Fri, Jan 31, 2014 at 9:42 AM, Alicia Goodwin
    wrote:
    > Hello all,
    >
    > This is exciting stuff. I'd love to see any feedback about these transforms
    > and how useful people find them. I've created issues in Jira (I've separated
    > them into 3 tickets - one for each output format). If everyone really likes
    > these and if they seem polished enough, hopefully we can include them in a
    > future release of OpenClinica (of course after it's been through our sdlc).
    > Any level of detail and feedback you can provide on the tickets would be
    > extremely helpful.
    >
    > Here are the Jira issues:
    > Transform to SAS:
    > http://jira.openclinica.com/browse/OC-4182
    >
    > Transform to R (dataframes):
    > http://jira.openclinica.com/browse/OC-4188
    >
    > Transform to CSV:
    > http://jira.openclinica.com/browse/OC-4194
    >
    >
    >
    >
    >
    > On Thu, Jan 23, 2014 at 2:17 AM, Lindsay Stevens
    > wrote:
    >>
    >> Hello all,
    >>
    >> The wikibook is now updated and the files have been added to my GPLv2
    >> repo.
    >>
    >> I managed to get the R transformation working as an extract format, here
    >> is an example extract.properties file.
    >>
    >> The main issue I had was that the xsls refer to the
    >> 'xml_convert_dynamic_lookup.xsl', which I hadn't included in the
    >> '/openclinica.data/xslt' folder.
    >>
    >> Rather than updating the lookup xsl, I edited the main R transform xsl
    >> (new xsl is here) so that it:
    >>
    >> 1. Names the datasets like '[Crfname]_[Itemgroupname]' with all
    >> non-alphanumeric characters taken out of both names, e.g. a dataset for a
    >> CRF called 'My_CRF' and an Item Group called 'My_Item_Group' becomes
    >> 'Mycrf_Myitemgroup'. This seemed to be more or less what the lookup xsl was
    >> doing.
    >>
    >> 2. Assigns dates as character-y dates, e.g. the unquoted string 2012-01-02
    >> was ending up as the int 2012. Changing it such that dates are assigned with
    >> as.Date("2012-01-02") seemed to work, such that I could add and subtract
    >> days properly after doing that.
    >>
    >> 3. Includes the labels in the same file as the dataframes, since I guess
    >> most of the time you'd want both anyway, and it would save concatenating the
    >> files later. It would have made the xsl a fair bit shorter to place the
    >> label part after each dataframe part but I kept them separate so the whole
    >> labels part can be easily chopped off the end if it's not wanted.
    >>
    >> Best regards,
    >> Lindsay
    >>
    >>
    >>
    >> On 23 January 2014 12:27, Alicia Goodwin wrote:
    >>>
    >>> Hi Linus,
    >>>
    >>> This is very exciting - Thanks for the contribution. We'll make sure
    >>> there is a story in the back log (in jira) and we'll see how these could be
    >>> potentially included in a future release. We're really excited and we will
    >>> keep you posted!
    >>>
    >>> Cheers,
    >>> Alicia
    >>>
    >>> On Wednesday, January 22, 2014, Thasbiha Khaja
    >>> wrote:
    >>>>
    >>>> Linas,
    >>>>
    >>>> Thank you so much for sharing it. It is a great tool and can see it
    >>>> being helpful in a lot of ways. I was successful in getting the ODM
    >>>> transformed to SAS. It took only few seconds for the powershell script to
    >>>> complete. The xsl files - sas_data,sas_format and sas_map were generated
    >>>> successfully.
    >>>>
    >>>> Like Lindsay said, the only difficulty was to make windows execute the
    >>>> powershell script. In case if any one is interested, this is how I executed
    >>>> it in powershell v2:
    >>>>
    >>>> powershell -executionpolicy bypass -File
    >>>> .\powershell_perform_SAS_xsl_transforms.ps1
    >>>>
    >>>> Best,
    >>>> Thasbiha
    >>>>
    >>>>
    >>>> On Wed, Jan 22, 2014 at 5:18 PM, Lindsay Stevens
    >>>> wrote:
    >>>>
    >>>> Could I add the set to my scripts repo? That has a GPLv2 licence I
    >>>> think. I was then thinking I'd add a link to wikibook for the repo.
    >>>>
    >>>> I tried to get them working as extract formats (since they really should
    >>>> be) but it fails and the OC log says something about failing to compile the
    >>>> stylesheet. Has anyone had that kind of problem?
    >>>>
    >>>> Best regards,
    >>>> Lindsay
    >>>>
    >>>> On Jan 22, 2014 9:40 PM, "Christian Hänsel"
    >>>> wrote:
    >>>>
    >>>> Hi Linas
    >>>>
    >>>> There is no license attached.
    >>>>
    >>>> Regards,
    >>>> Christian
    >>>>
    >>>> ------------------------------------------------------------------------
    >>>>
    >>>> Dipl.-Inf. Christian Hänsel
    >>>>
    >>>> IT / Software Developer
    >>>>
    >>>> Tel.: +49-(0)89-5526189-16
    >>>> Fax : +49-(0)89-5526189-55
    >>>> E-Mail: c.haensel@reliatec.de
    >>>>
    >>>> ReliaTec GmbH
    >>>> Schleissheimer Str. 37
    >>>> 85748 Garching Germany
    >>>> HRB 150060 / AG München
    >>>> Gf Thomas Herbig
    >>>> http://www.reliatec.de
    >>>>
    >>>>
    >>>> =========================================================================
    >>>>
    >>>> Am 22.01.2014 03:31, schrieb Silva Linas:
    >>>>
    >>>> Hi Developers,
    >>>>
    >>>> Attached are XML transformation files (and powershell command scripts)
    >>>> for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML
    >>>> library; i.e. a SAS xml data file with a SAS mapping file). Code list
    >>>> lookups are also included for R (column duplicated as a factor with the
    >>>> lookup value) and SAS (using FORMAT commands after copying the XML library
    >>>> into the WORK library).
    >>>>
    >>>> The transformations work by creating a "table" per itemgroup. Prior to
    >>>> performing the transformations we dynamically create a lookup xsl template
    >>>> specific for the study that is used to map itemgroup IDs into more friendly
    >>>> table names (the code for this is not attached; an example of the output is,
    >>>> see xml_convert_dynamic_lookup.xsl).
    >>>>
    >>>> Regards,
    >>>>
    >>>> Linas
    >>>>
    >>>>
    >>>> This email (including any attachments or links) may contain
    >>>> confidential and/or legally privileged information and is
    >>>> intended only to be read or used by the addressee. If you
    >>>> are not the intended addressee, any use, distribution,
    >>>> disclosure or copying of this email is strictly
    >>>> prohibited.
    >>>> Confidentiality and legal privilege attached to this email
    >>>> (including any attachments) are not waived or lost by
    >>>> reason of its mistaken delivery to you.
    >>>> If you have received this email in error, please delete it
    >>>> and notify us immediately by telephone or email. Peter
    >>>> MacCallum Cancer Centre provides no guarantee that this
    >>>> transmission is free of virus or that it has not been
    >>>> intercepted or altered and will not be liable for any delay
    >>>> in its receipt.
    >>>>
    >>>>
    >>>>
    >>>>
  • LinasLinas Posts: 5
    Hi Lindsay,

    Yes that’s fine. The code was intended to be freely used by anyone.

    Regards,

    Linas
    Sent: Monday, 3 February 2014 11:46 AM
    To: developers@openclinica.org
    Subject: Re: [Developers] Tools to convert OpenClinica Extracts to CSV, R and SAS

    Hi Cal,
    Sorry I didn't realise GPL and LGPL were incompatible. I have changed it to MIT so it can be used for anything, since that was the intention (Linas I hope this is OK with you).
    Best regards,
    Lindsay

    On 1 February 2014 01:53, Cal Collins wrote:
    Hi Linas and Lindsay,
    Would you be able to release these under a LGPL 2.1 license? It
    appears the scripts in Lindsay's repo are GPL 2, which unfortunately
    can't be directly incorporated into a LGPL work (which is what
    OpenClinica is).
    The differences are minor, let me know if this is doable.
    Thanks,
    Cal
    On Fri, Jan 31, 2014 at 9:42 AM, Alicia Goodwin
    wrote:
    > Hello all,
    >
    > This is exciting stuff. I'd love to see any feedback about these transforms
    > and how useful people find them. I've created issues in Jira (I've separated
    > them into 3 tickets - one for each output format). If everyone really likes
    > these and if they seem polished enough, hopefully we can include them in a
    > future release of OpenClinica (of course after it's been through our sdlc).
    > Any level of detail and feedback you can provide on the tickets would be
    > extremely helpful.
    >
    > Here are the Jira issues:
    > Transform to SAS:
    > http://jira.openclinica.com/browse/OC-4182
    >
    > Transform to R (dataframes):
    > http://jira.openclinica.com/browse/OC-4188
    >
    > Transform to CSV:
    > http://jira.openclinica.com/browse/OC-4194
    >
    >
    >
    >
    >
    > On Thu, Jan 23, 2014 at 2:17 AM, Lindsay Stevens
    > wrote:
    >>
    >> Hello all,
    >>
    >> The wikibook is now updated and the files have been added to my GPLv2
    >> repo.
    >>
    >> I managed to get the R transformation working as an extract format, here
    >> is an example extract.properties file.
    >>
    >> The main issue I had was that the xsls refer to the
    >> 'xml_convert_dynamic_lookup.xsl', which I hadn't included in the
    >> '/openclinica.data/xslt' folder.
    >>
    >> Rather than updating the lookup xsl, I edited the main R transform xsl
    >> (new xsl is here) so that it:
    >>
    >> 1. Names the datasets like '[Crfname]_[Itemgroupname]' with all
    >> non-alphanumeric characters taken out of both names, e.g. a dataset for a
    >> CRF called 'My_CRF' and an Item Group called 'My_Item_Group' becomes
    >> 'Mycrf_Myitemgroup'. This seemed to be more or less what the lookup xsl was
    >> doing.
    >>
    >> 2. Assigns dates as character-y dates, e.g. the unquoted string 2012-01-02
    >> was ending up as the int 2012. Changing it such that dates are assigned with
    >> as.Date("2012-01-02") seemed to work, such that I could add and subtract
    >> days properly after doing that.
    >>
    >> 3. Includes the labels in the same file as the dataframes, since I guess
    >> most of the time you'd want both anyway, and it would save concatenating the
    >> files later. It would have made the xsl a fair bit shorter to place the
    >> label part after each dataframe part but I kept them separate so the whole
    >> labels part can be easily chopped off the end if it's not wanted.
    >>
    >> Best regards,
    >> Lindsay
    >>
    >>
    >>
    >> On 23 January 2014 12:27, Alicia Goodwin wrote:
    >>>
    >>> Hi Linus,
    >>>
    >>> This is very exciting - Thanks for the contribution. We'll make sure
    >>> there is a story in the back log (in jira) and we'll see how these could be
    >>> potentially included in a future release. We're really excited and we will
    >>> keep you posted!
    >>>
    >>> Cheers,
    >>> Alicia
    >>>
    >>> On Wednesday, January 22, 2014, Thasbiha Khaja
    >>> wrote:
    >>>>
    >>>> Linas,
    >>>>
    >>>> Thank you so much for sharing it. It is a great tool and can see it
    >>>> being helpful in a lot of ways. I was successful in getting the ODM
    >>>> transformed to SAS. It took only few seconds for the powershell script to
    >>>> complete. The xsl files - sas_data,sas_format and sas_map were generated
    >>>> successfully.
    >>>>
    >>>> Like Lindsay said, the only difficulty was to make windows execute the
    >>>> powershell script. In case if any one is interested, this is how I executed
    >>>> it in powershell v2:
    >>>>
    >>>> powershell -executionpolicy bypass -File
    >>>> .\powershell_perform_SAS_xsl_transforms.ps1
    >>>>
    >>>> Best,
    >>>> Thasbiha
    >>>>
    >>>>
    >>>> On Wed, Jan 22, 2014 at 5:18 PM, Lindsay Stevens
    >>>> wrote:
    >>>>
    >>>> Could I add the set to my scripts repo? That has a GPLv2 licence I
    >>>> think. I was then thinking I'd add a link to wikibook for the repo.
    >>>>
    >>>> I tried to get them working as extract formats (since they really should
    >>>> be) but it fails and the OC log says something about failing to compile the
    >>>> stylesheet. Has anyone had that kind of problem?
    >>>>
    >>>> Best regards,
    >>>> Lindsay
    >>>>
    >>>> On Jan 22, 2014 9:40 PM, "Christian Hänsel"
    >>>> wrote:
    >>>>
    >>>> Hi Linas
    >>>>
    >>>> There is no license attached.
    >>>>
    >>>> Regards,
    >>>> Christian
    >>>>
    >>>> ------------------------------------------------------------------------
    >>>>
    >>>> Dipl.-Inf. Christian Hänsel
    >>>>
    >>>> IT / Software Developer
    >>>>
    >>>> Tel.: +49-(0)89-5526189-16
    >>>> Fax : +49-(0)89-5526189-55
    >>>> E-Mail: c.haensel@reliatec.de
    >>>>
    >>>> ReliaTec GmbH
    >>>> Schleissheimer Str. 37
    >>>> 85748 Garching Germany
    >>>> HRB 150060 / AG München
    >>>> Gf Thomas Herbig
    >>>> http://www.reliatec.de
    >>>>
    >>>>
    >>>> =========================================================================
    >>>>
    >>>> Am 22.01.2014 03:31, schrieb Silva Linas:
    >>>>
    >>>> Hi Developers,
    >>>>
    >>>> Attached are XML transformation files (and powershell command scripts)
    >>>> for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML
    >>>> library; i.e. a SAS xml data file with a SAS mapping file). Code list
    >>>> lookups are also included for R (column duplicated as a factor with the
    >>>> lookup value) and SAS (using FORMAT commands after copying the XML library
    >>>> into the WORK library).
    >>>>
    >>>> The transformations work by creating a "table" per itemgroup. Prior to
    >>>> performing the transformations we dynamically create a lookup xsl template
    >>>> specific for the study that is used to map itemgroup IDs into more friendly
    >>>> table names (the code for this is not attached; an example of the output is,
    >>>> see xml_convert_dynamic_lookup.xsl).
    >>>>
    >>>> Regards,
    >>>>
    >>>> Linas
    >>>>
    >>>>
    >>>> This email (including any attachments or links) may contain
    >>>> confidential and/or legally privileged information and is
    >>>> intended only to be read or used by the addressee. If you
    >>>> are not the intended addressee, any use, distribution,
    >>>> disclosure or copying of this email is strictly
    >>>> prohibited.
    >>>> Confidentiality and legal privilege attached to this email
    >>>> (including any attachments) are not waived or lost by
    >>>> reason of its mistaken delivery to you.
    >>>> If you have received this email in error, please delete it
    >>>> and notify us immediately by telephone or email. Peter
    >>>> MacCallum Cancer Centre provides no guarantee that this
    >>>> transmission is free of virus or that it has not been
    >>>> intercepted or altered and will not be liable for any delay
    >>>> in its receipt.
    >>>>
    >>>>
    >>>>
    >>>>
  • ccollinsccollins Posts: 361 admin
    Fantastic, thanks!
    On Feb 2, 2014 8:06 PM, "Silva Linas" wrote:
    Hi Lindsay,

    Yes that’s fine. The code was intended to be freely used by anyone.

    Regards,

    Linas
    Sent: Monday, 3 February 2014 11:46 AM
    To: developers@openclinica.org
    Subject: Re: [Developers] Tools to convert OpenClinica Extracts to CSV, R and SAS

    Hi Cal,
    Sorry I didn't realise GPL and LGPL were incompatible. I have changed it to MIT so it can be used for anything, since that was the intention (Linas I hope this is OK with you).
    Best regards,
    Lindsay

    On 1 February 2014 01:53, Cal Collins wrote:
    Hi Linas and Lindsay,
    Would you be able to release these under a LGPL 2.1 license? It
    appears the scripts in Lindsay's repo are GPL 2, which unfortunately
    can't be directly incorporated into a LGPL work (which is what
    OpenClinica is).
    The differences are minor, let me know if this is doable.
    Thanks,
    Cal
    On Fri, Jan 31, 2014 at 9:42 AM, Alicia Goodwin
    wrote:
    > Hello all,
    >
    > This is exciting stuff. I'd love to see any feedback about these transforms
    > and how useful people find them. I've created issues in Jira (I've separated
    > them into 3 tickets - one for each output format). If everyone really likes
    > these and if they seem polished enough, hopefully we can include them in a
    > future release of OpenClinica (of course after it's been through our sdlc).
    > Any level of detail and feedback you can provide on the tickets would be
    > extremely helpful.
    >
    > Here are the Jira issues:
    > Transform to SAS:
    > http://jira.openclinica.com/browse/OC-4182
    >
    > Transform to R (dataframes):
    > http://jira.openclinica.com/browse/OC-4188
    >
    > Transform to CSV:
    > http://jira.openclinica.com/browse/OC-4194
    >
    >
    >
    >
    >
    > On Thu, Jan 23, 2014 at 2:17 AM, Lindsay Stevens
    > wrote:
    >>
    >> Hello all,
    >>
    >> The wikibook is now updated and the files have been added to my GPLv2
    >> repo.
    >>
    >> I managed to get the R transformation working as an extract format, here
    >> is an example extract.properties file.
    >>
    >> The main issue I had was that the xsls refer to the
    >> 'xml_convert_dynamic_lookup.xsl', which I hadn't included in the
    >> '/openclinica.data/xslt' folder.
    >>
    >> Rather than updating the lookup xsl, I edited the main R transform xsl
    >> (new xsl is here) so that it:
    >>
    >> 1. Names the datasets like '[Crfname]_[Itemgroupname]' with all
    >> non-alphanumeric characters taken out of both names, e.g. a dataset for a
    >> CRF called 'My_CRF' and an Item Group called 'My_Item_Group' becomes
    >> 'Mycrf_Myitemgroup'. This seemed to be more or less what the lookup xsl was
    >> doing.
    >>
    >> 2. Assigns dates as character-y dates, e.g. the unquoted string 2012-01-02
    >> was ending up as the int 2012. Changing it such that dates are assigned with
    >> as.Date("2012-01-02") seemed to work, such that I could add and subtract
    >> days properly after doing that.
    >>
    >> 3. Includes the labels in the same file as the dataframes, since I guess
    >> most of the time you'd want both anyway, and it would save concatenating the
    >> files later. It would have made the xsl a fair bit shorter to place the
    >> label part after each dataframe part but I kept them separate so the whole
    >> labels part can be easily chopped off the end if it's not wanted.
    >>
    >> Best regards,
    >> Lindsay
    >>
    >>
    >>
    >> On 23 January 2014 12:27, Alicia Goodwin wrote:
    >>>
    >>> Hi Linus,
    >>>
    >>> This is very exciting - Thanks for the contribution. We'll make sure
    >>> there is a story in the back log (in jira) and we'll see how these could be
    >>> potentially included in a future release. We're really excited and we will
    >>> keep you posted!
    >>>
    >>> Cheers,
    >>> Alicia
    >>>
    >>> On Wednesday, January 22, 2014, Thasbiha Khaja
    >>> wrote:
    >>>>
    >>>> Linas,
    >>>>
    >>>> Thank you so much for sharing it. It is a great tool and can see it
    >>>> being helpful in a lot of ways. I was successful in getting the ODM
    >>>> transformed to SAS. It took only few seconds for the powershell script to
    >>>> complete. The xsl files - sas_data,sas_format and sas_map were generated
    >>>> successfully.
    >>>>
    >>>> Like Lindsay said, the only difficulty was to make windows execute the
    >>>> powershell script. In case if any one is interested, this is how I executed
    >>>> it in powershell v2:
    >>>>
    >>>> powershell -executionpolicy bypass -File
    >>>> .\powershell_perform_SAS_xsl_transforms.ps1
    >>>>
    >>>> Best,
    >>>> Thasbiha
    >>>>
    >>>>
    >>>> On Wed, Jan 22, 2014 at 5:18 PM, Lindsay Stevens
    >>>> wrote:
    >>>>
    >>>> Could I add the set to my scripts repo? That has a GPLv2 licence I
    >>>> think. I was then thinking I'd add a link to wikibook for the repo.
    >>>>
    >>>> I tried to get them working as extract formats (since they really should
    >>>> be) but it fails and the OC log says something about failing to compile the
    >>>> stylesheet. Has anyone had that kind of problem?
    >>>>
    >>>> Best regards,
    >>>> Lindsay
    >>>>
    >>>> On Jan 22, 2014 9:40 PM, "Christian Hänsel"
    >>>> wrote:
    >>>>
    >>>> Hi Linas
    >>>>
    >>>> There is no license attached.
    >>>>
    >>>> Regards,
    >>>> Christian
    >>>>
    >>>> ------------------------------------------------------------------------
    >>>>
    >>>> Dipl.-Inf. Christian Hänsel
    >>>>
    >>>> IT / Software Developer
    >>>>
    >>>> Tel.: +49-(0)89-5526189-16
    >>>> Fax : +49-(0)89-5526189-55
    >>>> E-Mail: c.haensel@reliatec.de
    >>>>
    >>>> ReliaTec GmbH
    >>>> Schleissheimer Str. 37
    >>>> 85748 Garching Germany
    >>>> HRB 150060 / AG München
    >>>> Gf Thomas Herbig
    >>>> http://www.reliatec.de
    >>>>
    >>>>
    >>>> =========================================================================
    >>>>
    >>>> Am 22.01.2014 03:31, schrieb Silva Linas:
    >>>>
    >>>> Hi Developers,
    >>>>
    >>>> Attached are XML transformation files (and powershell command scripts)
    >>>> for converting OpenClinica extracts into CSV, R (dataframes) and SAS (XML
    >>>> library; i.e. a SAS xml data file with a SAS mapping file). Code list
    >>>> lookups are also included for R (column duplicated as a factor with the
    >>>> lookup value) and SAS (using FORMAT commands after copying the XML library
    >>>> into the WORK library).
    >>>>
    >>>> The transformations work by creating a "table" per itemgroup. Prior to
    >>>> performing the transformations we dynamically create a lookup xsl template
    >>>> specific for the study that is used to map itemgroup IDs into more friendly
    >>>> table names (the code for this is not attached; an example of the output is,
    >>>> see xml_convert_dynamic_lookup.xsl).
    >>>>
    >>>> Regards,
    >>>>
    >>>> Linas
    >>>>
    >>>>
    >>>> This email (including any attachments or links) may contain
    >>>> confidential and/or legally privileged information and is
    >>>> intended only to be read or used by the addressee. If you
    >>>> are not the intended addressee, any use, distribution,
    >>>> disclosure or copying of this email is strictly
    >>>> prohibited.
    >>>> Confidentiality and legal privilege attached to this email
    >>>> (including any attachments) are not waived or lost by
    >>>> reason of its mistaken delivery to you.
    >>>> If you have received this email in error, please delete it
    >>>> and notify us immediately by telephone or email. Peter
    >>>> MacCallum Cancer Centre provides no guarantee that this
    >>>> transmission is free of virus or that it has not been
    >>>> intercepted or altered and will not be liable for any delay
    >>>> in its receipt.
    >>>>
    >>>>
    >>>>
    >>>>
  • bbaumannbbaumann Posts: 104 admin
    I just wanted to let folks know that Linas will be presenting this work at OC14 in June:


    Exciting!

    - Ben
  • kristiakkristiak Posts: 1,201 ✭✭✭
    Hi Lindsay,
    I have tried to sort out what I have to do at a minimum to get myi OC data converted into your ACCESS format with all the data converted with all the variables listed in columns by patient. What I understand from github.com/lindsay-stevens-kirby/openclinica_sqldatamart is that that is what is done in this GitHub reference. To me it a little "overload" with so many options and links. Is there any way you could point me to a simple instruction where I could access the PGSQL 9.3 data restored from a OC backup file and just point me to what I need to do to access this data in MS ACCESS. I'm very familiar with ACCESS ODBC from my time at Merck & Co Inc (>10 years ago) where we stored all our data in a long and skinny Oracle table and read it with ODBC into MS ACCESS.

    Best regards

    Krister
Sign In or Register to comment.