Simultaneous data entry and data export issues


Comments

  • kristiak
    Hi Encarnita,

    From your description, it seems that collecting details about diseases the patients did not suffer from is unnecessary, so you could probably reduce your database size by editing your data and removing that meaningless information. I think, though, that you can achieve most of the performance gains by following the suggestions from Lindsay and Tomas. Then limit your extractions to data that logically belongs together, by visit, disease, etc.

    Best
    Krister
  • ebs
    The VisualVM tool included in the Java JDK has been useful to identify memory issues in the past.

    You will need to set a few Java options in Tomcat:

    -Dcom.sun.management.jmxremote=true
    -Dcom.sun.management.jmxremote.port=
    -Dcom.sun.management.jmxremote.ssl=false
    -Dcom.sun.management.jmxremote.authenticate=false
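
    On Linux these typically go in Tomcat's bin/setenv.sh, something like the following sketch (9010 is an arbitrary example port, pick any free one):

    # JMX access for VisualVM; 9010 is an example port, not a default
    CATALINA_OPTS="$CATALINA_OPTS \
        -Dcom.sun.management.jmxremote=true \
        -Dcom.sun.management.jmxremote.port=9010 \
        -Dcom.sun.management.jmxremote.ssl=false \
        -Dcom.sun.management.jmxremote.authenticate=false"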

    Once set up in the tool, you can monitor the performance in real time whilst you try an export. It should be fairly obvious when it goes wrong!

    Cheers
    Eric
  • encarnita
    Hi everyone,

    Sorry for taking so long to reply! @toskrip, your recommendations saved us!!!! Thanks a lot, a lot!
    We did upgrade Tomcat's heap memory, as our main issue was clearly tied to the growing amount of data and the high number of items, and we have no other applications using Tomcat. The results are quite good (at least for us):
    - we can launch an extraction of 1200 items * 124 subjects in xls format and retrieve the results in 3 h (which I think is reasonable, at least for us for now). I should specify that we did not make any logical selection of the items (date, etc.) as initially suggested by @krister, but, as also suggested by @krister, we did run the extraction for only this set of items at a time. Therefore, for our 5000 items, we expect to need 3-4 extractions, two of which may take 3-4 hours each to produce an xls file.
    - apparently, two people can be connected at the same time without slowing down data entry, so we are progressively adding more people.

    I don't know if it could be even better. Is there a maximum amount of heap memory that can be allocated to Tomcat?

    Again, thanks a lot @toskrip !

    best!
  • encarnita
    Hi @krister,

    Well, in fact, yes, it is important, since this is exactly the purpose of our study design: we want to compare different diseases, so we need to have a value (even if it is 0 or "no") for all patients, regardless of the disease.
    Nevertheless, we went through all the CRFs with our clinicians and discarded several items that were not necessary (such as the individual answers used in the calculation of a disease activity score...).

    Best

  • encarnita
    Hi Eric @ebs!
    Thanks for the suggestion. I'll ask our IT angel to help us with that, as it will definitely help us track our tests and identify the limits!
    Best
  • RCHENU
    May I ask what your Java parameters are now?

    To the IT people: is there a time limit for an extraction, or is the available memory the main factor?

    Thank you.

    Romain.
  • ebs
    Hi Romain,

    There is a Tomcat timeout setting which could have an effect - https://forums.openclinica.com/discussion/8621/default-timeout-period-for-openclinica
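
    For reference, one of the relevant knobs is the connector timeout in Tomcat's conf/server.xml; in the stock configuration it looks roughly like this (connectionTimeout is in milliseconds, and 20000 is just the value shipped by default):

    <!-- HTTP connector in conf/server.xml; values here are the stock defaults -->
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" />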

    In theory, having more memory should speed up the extract process and help avoid out-of-memory errors, which you would otherwise see in the logs.

    I'd recommend setting up VisualVM on a test instance and trying to get the optimum JVM settings.

    Allocating too much memory can also actually slow your application down, but that's getting into garbage collection territory!

    Cheers
    Eric
  • RCHENU
    Ok.

    I'm also on a VM, and my extraction currently takes 1h40 (SPSS). ODM only takes 2 min...

    I don't have any warnings or errors for the moment, but I wanted to ask, as the extraction is getting bigger and bigger.
    For the moment, my Java parameters are these:
    JAVA_OPTS="-Xmx1280m -XX:MaxPermSize=512m"

    Romain.
  • toskrip
    Any extraction format other than ODM will take longer, by design. The reason is the data extraction concept in OC, and it is good to keep it in mind: OC always performs an ODM extraction first, and if another format was requested, an XML transformation job then transforms the ODM XML into the chosen export format. This is actually very wise, as it makes it possible to extend the list of supported extraction formats without any changes to the Java source code, just by providing a proper transformation stylesheet (XSLT).
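
    To illustrate the concept, a stylesheet that flattens ODM into CSV can be as small as the sketch below. It assumes a heavily simplified, namespace-free ODM document; real ODM uses namespaces and OC ships its own, much richer stylesheets, so treat this only as an illustration:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- Emit plain text (CSV) rather than XML -->
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:text>SubjectKey,ItemOID,Value&#10;</xsl:text>
        <!-- One CSV row per collected item, qualified by its subject -->
        <xsl:for-each select="//ItemData">
          <xsl:value-of select="ancestor::SubjectData/@SubjectKey"/>
          <xsl:text>,</xsl:text>
          <xsl:value-of select="@ItemOID"/>
          <xsl:text>,</xsl:text>
          <xsl:value-of select="@Value"/>
          <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>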

    If you are using a 64-bit environment, the theoretical maximum amount of memory we are talking about is around 256 TB. But as mentioned here, the JVM memory configuration should be tweaked per specific deployment scenario. More memory assigned to the JVM does not automatically lead to better performance (depending on the algorithm used, garbage collection can suffer from having too much memory to manage). I would only recommend increasing the limits when you are running into memory-related problems (such as the one described here). And as stated above, these are almost always visible in the log files.
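
    If you want to verify whether garbage collection is the bottleneck before changing any limits, you can make it visible in the logs. A sketch for Java 7/8 (Java 9+ replaced these flags with -Xlog:gc, and the log path here is only an example):

    # GC logging; the path is an example, adjust to your environment
    JAVA_OPTS="$JAVA_OPTS -verbose:gc -XX:+PrintGCDetails \
               -XX:+PrintGCDateStamps -Xloggc:/var/log/tomcat/gc.log"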

    I also recommend that people monitor their Java applications in QA and production environments. There are a number of tools that can do this job (e.g. javamelody). Active monitoring, when done correctly, has only a very small performance impact and will help you prevent critical issues that could affect your users in the long term.
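
    For javamelody specifically, older releases are wired in by dropping the jar into the webapp's WEB-INF/lib and declaring the monitoring filter in web.xml, roughly as below (newer releases register the filter automatically, so treat this as a sketch):

    <!-- javamelody monitoring filter, declared in WEB-INF/web.xml -->
    <filter>
        <filter-name>javamelody</filter-name>
        <filter-class>net.bull.javamelody.MonitoringFilter</filter-class>
    </filter>
    <filter-mapping>
        <filter-name>javamelody</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>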

    Tomas
  • RCHENU
    Thank you, Tomas, for your exhaustive answer.
    I will look at the monitoring tool you suggested.

    Romain.