using oc with https

Comments

  • Dear Marco,

    Looking at performance improvements in the last couple of versions, there were some solid improvements, and there are still some in the pipeline.

    From a performance point of view, some issues were fixed in 3.1.2:
    https://issuetracker.openclinica.com/view.php?id=12129 (SQL query caching)

    Issues fixed in 3.1.3:
    https://issuetracker.openclinica.com/view.php?id=14508 (advice on Tomcat tuning)
    https://issuetracker.openclinica.com/view.php?id=12979 (query optimisation)
    https://issuetracker.openclinica.com/view.php?id=13347 (Discrepancy Note fixes)

    The blog’s (http://blog.openclinica.com/2012/12/03/openclinica-3-1-3-now-available/) claim of ‘Improved performance and stability, including 80% increase in max user load and 40% faster page turn times’, appears to be based on the analysis in this issue https://issuetracker.openclinica.com/view.php?id=15605

    Issues for future versions:
    https://issuetracker.openclinica.com/view.php?id=12128 (study metadata caching)
    https://issuetracker.openclinica.com/view.php?id=12197 (slow Event updates, not sure of the status as it hasn’t been updated for a while)
    https://issuetracker.openclinica.com/view.php?id=14467 (refactoring data entry saving)

    As the performance of the server appears to be limited by Postgres, have you seen an OpenClinica instance open more than 2 connections under load (and use more than 2 cores on the db server)? I’ve just asked this here: https://issuetracker.openclinica.com/view.php?id=16472

    Yours,

    Michael

    From: Marco van Zwetselaar [mailto:[email protected]]
    Sent: 19 December 2012 20:45
    To: Michael Bluett
    Cc: [email protected]
    Subject: Re: [Users] using oc with https

    Michael,

    From the little evidence I have, upgrading to 3.1.3 indeed gives a performance gain that gets noticed by the users, so upgrading might be a good bet. The upgrade was smooth but for one glitch (which was a problem in a CRF, not in OC): we had a CRF with the same repeating group in three sections. This worked fine on 3.1.2, but not on 3.1.3 where it shows a very wide (but broken) table on each tab.

    I do notice an awful drop in performance when setting logging to debug (which is why I asked about individual logger level configuration on the developers list). Also, I notice a remarkable number of individual database queries ('select from item where item_id = ...') in situations where I wouldn't expect any - such as when scheduling an event or saving even a one-variable CRF. Finally, the time for the subject matrix to appear has been reduced, but is still quite dramatic (worsened by the fact that one tends to land on that page often ...).

    The fact that you don't see a lot of effect when switching to faster disks may be attributable to having plenty of memory (and allocating it to PostgreSQL), which would then mostly operate from its cache. That's why I said OC is data-bound, rather than I/O bound per se.

    Regarding compression and https: encrypted data is highly random (by definition), so it can't be compressed a lot. Compression must happen before encryption. If you terminate https on Tomcat, then there is no use in fronting it with an httpd (since it will neither cache nor compress). OTOH if you terminate https on your httpd, then clearly it can do the compression and encryption, which will offload a few cycles from the Tomcat server.
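
The point about compressing before encrypting can be checked with a small sketch. It assumes random bytes from `os.urandom` are a fair stand-in for ciphertext, and the repetitive HTML-like `page` is made-up content, not taken from OpenClinica.

```python
import os
import zlib


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(payload)) / len(payload)


# Hypothetical page content: repetitive markup, as is typical for HTML tables.
page = b"<tr><td>subject</td><td>event</td><td>status</td></tr>\n" * 500

# Assumption: random bytes behave like encrypted data for compression purposes.
ciphertext_like = os.urandom(len(page))

print(f"plain text ratio:      {compression_ratio(page):.3f}")
print(f"ciphertext-like ratio: {compression_ratio(ciphertext_like):.3f}")
```

The plain text shrinks to a small fraction of its size, while the random payload does not shrink at all, which is why whichever component terminates https also has to be the one that compresses.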

    As for pipelining, I actually meant keepalive (which became the default in http/1.1), and which means most requests don't result in a new connection (= latency). Pipelining would be "keepalive on steroids" :-) BTW Apache httpd has a very short default keepalive setting; you may want to increase it considerably, in particular if you don't have loads of concurrent users.
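
To illustrate what keepalive saves, here is a minimal sketch using only Python's standard library. It starts a throwaway local server (not Tomcat or Apache httpd) and counts how many TCP connections are opened for the same number of requests; the function name `count_connections` and the request count are illustrative assumptions.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def count_connections(reuse: bool, n_requests: int = 5) -> int:
    """Send n_requests GETs to a local server; return the number of TCP connections opened."""
    opened = []  # one entry per accepted connection

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # allows the server to keep connections open

        def setup(self):
            super().setup()
            opened.append(self.client_address)  # setup() runs once per connection

        def do_GET(self):
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the demo quiet

    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        conn = None
        for _ in range(n_requests):
            if conn is None:
                conn = http.client.HTTPConnection("127.0.0.1", port)
            conn.request("GET", "/")
            conn.getresponse().read()
            if not reuse:
                conn.close()  # without keepalive, every request pays for a new connection
                conn = None
        if conn is not None:
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return len(opened)


print("connections with keepalive:   ", count_connections(reuse=True))
print("connections without keepalive:", count_connections(reuse=False))
```

Each avoided connection saves at least one network round trip (more with https), which is the latency being referred to above.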

    OC could really use some thorough performance analysis and tuning ...

    Regards,
    Marco

    --
    Marco van Zwetselaar
    KCRI - Kilimanjaro Clinical Research Institute
    Moshi
    Tanzania
    e [email protected]
    m +255 782 334124


    On 19 Dec 2012, at 18:16, Michael Bluett wrote:
    Dear Marco,

    I think the reason I went for Apache HTTP server was configurability in terms of what is compressed. I think I read about some problems with IE 6 and compression (some of our clients still use it). It would be good to document/point to a Tomcat-only solution.

    A couple of points:
    Encrypted data may be incompressible, but I don’t believe that compression is enabled over https by default.
    Pipelining is apparently turned off by default in most browsers: http://en.wikipedia.org/wiki/HTTP_pipelining
    Local caching in or near the browser will still have an effect to reduce the number of requests.

    Recently I have been looking at how to improve the performance of our 3.1.2 system. I tried benchmarking a solid state disk (SSD) on my Tomcat- and Postgres-optimised local machine, but it didn’t seem to give any benefit. There didn’t seem to be enough time spent moving data to/from the disk for an SSD to be beneficial. Looking at our server I find that the main consumer of CPU cycles is Postgres (as you suggested), and again not that much time is spent moving data to/from the disk. I know that 3.1.3 promises better performance still (http://blog.openclinica.com/2012/12/03/openclinica-3-1-3-now-available/ ‘Improved performance and stability, including 80% increase in max user load and 40% faster page turn times’) but we’ve only just moved to 3.1.2.

    Yours,

    Michael

    From: Marco van Zwetselaar [mailto:[email protected]]
    Sent: 19 December 2012 13:55
    To: Michael Bluett
    Cc: [email protected]
    Subject: Re: [Users] using oc with https

    Michael,

    The only performance gain the graphs demonstrate is that of compression. The caching effect of the reverse proxy is, contrary to what the accompanying text says, negligible. Note that all except two requests in the upper right diagram result in a 0.0K transfer, the net result being identical to the lower right diagram. The difference is exclusively in compression.[1]

    For compression you don't need a reverse proxy, just switch it on in Tomcat! What's more, if your OC instance is exposed over the internet, then odds are that you'll be using https, not http. Https is compressed by nature, so an RP adds nothing at all. The single situation in which interposing an RP would be useful, is if it is already in your DMZ (because it reverse-proxies a number of different internal hosts on a single external port).

    Sure, compression costs CPU cycles on your Tomcat server, but OC is not compute-bound, it is data-bound (IOW buying faster hard discs will offset the lost CPU cycles at a lower cost). Also, it is Postgres which eats most of your CPU cycles, not Tomcat. So, instead of buying extra hardware to rig up an RP which adds close to nothing, put that money in an upgrade of your OC server. This will give you value for money, and it keeps things simple.

    Regards,
    Marco

    [1] Yes, the upper diagram has extra round trips (effectively the server responding "304 Not Modified") but these are negligible, both in size (<0.0K) and latency (http/1.1 pipelines them). What's more, in a real world situation most of these requests won't even hit the Tomcat server. They'll have been handled by caching proxies, e.g. at the user's ISP or place of work. There's no way you can beat those caching proxies with an RP: they are faster (because they're closer to the browser), and they come for free with the internet ...




    On 19 Dec 2012, at 13:37, Michael Bluett wrote:

    Dear Marco,

    There are performance gains that can be achieved through using a separate reverse proxy in terms of caching and compression (see http://en.wikibooks.org/wiki/OpenClinica_User_Manual/UsingAReverseProxy#See_the_difference), but it depends on the user whether it’s worth setting up and configuring a separate program for this.

    Yours,

    Michael
    Sent: 19 December 2012 09:54
    To: [email protected]
    Subject: Re: [Users] using oc with https

    Thanks all.
    Sent: Wednesday, 19 December 2012 09:39
    To: [email protected]
    Subject: Re: [Users] using oc with https

    Ricardo,

    On 18 Dec 2012, at 18:05, Ricardo Simões wrote:

    Thanks, I'm using Windows Server 2008; I thought I only needed to create the SSL certificate and change a config file.

    Yes, that should suffice. If you fire up an out-of-box Tomcat, you will find this (and much more) documented pretty well at http://localhost:8080/docs. Note that "documented well" does not imply "quick and easy".

    Don't bother about the reverse proxy unless you have a specific situation requiring it (like an RP already being there). There is no point in reverse proxying Tomcat in general, and it is entirely useless in the case of OpenClinica. That is, unless bandwidth inside your datacentre is smaller than outside - but if that's the case, you have bigger fish to fry :-I

    Best regards,
    Marco

    PS: do note Michael's advice to block outside access to Tomcat's other ports.

    Sent: Tuesday, 18 December 2012 14:58
    To: [email protected]
    Subject: Re: [Users] using oc with https

    Dear Ricardo,

    The "Using a reverse proxy" page has a section on setting up Nginx as a secure proxy on Linux:
    http://en.wikibooks.org/wiki/OpenClinica_User_Manual/UsingAReverseProxy

    If you are using Windows you could follow the instructions to set up Apache HTTP server (on the same page) as a proxy and then search for how to add SSL.

    In both cases, ensure you block external access to any insecure ports Tomcat is listening on.

    Yours,

    Michael
    Sent: 18 December 2012 14:35
    To: [email protected]
    Subject: [Users] using oc with https

    Hi again,
    How can I configure tomcat to use oc with https?
    Thanks
    Ricardo Simões
  • zwets-kcri
    Michael,
    To add to your mail below: the latter two unsolved performance issues you mention (12197 slow event updates, and 14467 refactoring data entry saving) seem to correspond closely to the point I mentioned:
    > [...] Also, I notice a remarkable number of individual database queries ('select from item where item_id = ...') in situations where I wouldn't expect any - such as when scheduling an event or saving even a one-variable CRF.
    I'd have to check, but it is my impression that both of these actions perform a 'SELECT FROM item WHERE item_id = ...' for every item in the study. (Try this with logging set to debug and you will see the queries in openclinica-db.log.) This raises two questions: why is OC doing this when all I do is schedule an event (which has nothing to do with items), and why does it do hundreds of individual queries instead of a single 'list-query'?
    This also ties in with the issue of the performance of PostgreSQL: it may be that it isn't PostgreSQL as such which is having the hard time (due to complex queries, suboptimal access paths or bad execution plans), but rather the application which is hitting it with an avalanche of little queries thus causing PostgreSQL to be the bottleneck. In fact, most of the queries I see complete in 0 msec, there just are very many.
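
The difference between many small queries and a single 'list-query' can be sketched as follows. This is an illustration only: it uses an in-memory SQLite database rather than PostgreSQL, and the two-column `item` table and the function names are hypothetical, not OpenClinica's actual schema or code.

```python
import sqlite3


def make_db(n_items: int) -> sqlite3.Connection:
    """Create an in-memory database with a simplified, hypothetical item table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (item_id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO item VALUES (?, ?)",
        [(i, f"item_{i}") for i in range(1, n_items + 1)],
    )
    return conn


def fetch_one_by_one(conn, item_ids):
    """One round trip per item: the pattern seen in the debug log."""
    rows = []
    queries = 0
    for item_id in item_ids:
        cur = conn.execute(
            "SELECT item_id, name FROM item WHERE item_id = ?", (item_id,)
        )
        rows.append(cur.fetchone())
        queries += 1
    return rows, queries


def fetch_as_list(conn, item_ids):
    """A single query that returns all requested items at once."""
    placeholders = ",".join("?" * len(item_ids))
    cur = conn.execute(
        f"SELECT item_id, name FROM item WHERE item_id IN ({placeholders}) "
        "ORDER BY item_id",
        list(item_ids),
    )
    return cur.fetchall(), 1


conn = make_db(300)
ids = list(range(1, 301))
rows_a, queries_a = fetch_one_by_one(conn, ids)
rows_b, queries_b = fetch_as_list(conn, ids)
print(f"one by one: {queries_a} queries, list query: {queries_b} query")
```

Both approaches return the same rows. Against a real database server each individual query also costs a network round trip plus parsing and planning, so even queries that complete in 0 msec add up when there are hundreds of them.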
    I'll be off for holidays on the coast now, will do some gin-tonic fueled research into this from under a palm tree. :-)
    Hope to see you all again in 2013!
    Best regards,
    Marco
    On 21 Dec 2012, at 15:37, Michael Bluett wrote:
    > [...]
This discussion has been closed.