
Serving Concurrent Requests for a LibreOffice Service

source link: https://jdhao.github.io/2021/06/11/libreoffice_concurrent_requests/


2021-06-11 · 310 words · 2 mins read

We have set up a server that converts pptx files to pdf files using LibreOffice (version 6.0.7.3). LibreOffice is started from Python with subprocess.run(). The external command I use is something like the following (see footnote 1):

soffice --headless --convert-to pdf test.pptx

If clients request this service concurrently, some of the requests fail and produce no result. The weird part is that subprocess.run() does not report any error; the pptx file is simply never converted to pdf. If the client sends one pptx file after another, every request succeeds. It seems that LibreOffice cannot handle multiple concurrent requests gracefully.

I captured the stdout and stderr from the external command:

subprocess.run(command_list, timeout=5, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

I found that although the command exits without error (r.check_returncode() does not raise), both r.stdout and r.stderr are empty when no pdf is generated.
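Since the exit code alone does not reveal the failure, one option is to also check that the output file really exists. The following is a minimal sketch under that assumption; the helper name run_and_verify and the expected_pdf parameter are made up for illustration and are not part of the original service.

```python
import subprocess
from pathlib import Path


def run_and_verify(command_list, expected_pdf, timeout=5):
    """Run the conversion command and report whether the PDF really exists.

    `expected_pdf` is a hypothetical parameter: the path where the caller
    expects the converted file to appear.
    """
    r = subprocess.run(
        command_list,
        timeout=timeout,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # A zero exit code is not enough: in the failure case described above,
    # the command exits cleanly but writes nothing.
    pdf = Path(expected_pdf)
    succeeded = r.returncode == 0 and pdf.is_file() and pdf.stat().st_size > 0
    stdout = r.stdout.decode(errors="replace")
    stderr = r.stderr.decode(errors="replace")
    return succeeded, stdout, stderr
```

Returning the decoded stdout and stderr alongside the flag keeps them available for logging when a conversion silently produces nothing.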

So I added a retry strategy: if r.stdout is empty, we re-run the subprocess command, up to three times. This reduces the number of failures under concurrent requests, but quite a lot of requests still fail.
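A retry loop of this kind could look roughly like the sketch below. It assumes three attempts in total and treats a timeout as one more failed attempt; the function name convert_with_retry and both of those choices are illustrative, not taken from the original code.

```python
import subprocess


def convert_with_retry(command_list, max_attempts=3, timeout=5):
    """Re-run the command until it prints something to stdout.

    Returns (CompletedProcess, attempt_number) on success, or
    (None, max_attempts) if every attempt produced empty output.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            r = subprocess.run(
                command_list,
                timeout=timeout,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
            )
        except subprocess.TimeoutExpired:
            # Assumption: a hung conversion counts as a failed attempt.
            continue
        # Empty stdout is the failure signal described above.
        if r.returncode == 0 and r.stdout.strip():
            return r, attempt
    return None, max_attempts
```

Retrying only hides the contention between concurrent conversions rather than removing it, which fits the observation that many requests still failed.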

Since a single LibreOffice instance is limited in how many concurrent requests it can handle, why not deploy multiple instances? So we isolated the pptx-to-pdf conversion code as a separate service and deployed it in multiple docker containers. When new requests come in, they are distributed evenly across the instances of this service. To handle more concurrent requests, we just deploy more docker containers. After this step, the failure rate dropped to a negligible level.
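In practice a load balancer in front of the containers usually takes care of the even distribution. Purely to illustrate the idea, here is a small round-robin picker; the class name and the backend addresses are hypothetical and do not describe the actual deployment.

```python
import itertools
import threading


class RoundRobin:
    """Hand out backends in a fixed rotating order, safely across threads."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)
        self._lock = threading.Lock()

    def next_backend(self):
        # The lock keeps the rotation consistent when several request
        # handler threads ask for a backend at the same time.
        with self._lock:
            return next(self._cycle)


# Hypothetical container addresses, one conversion service per container.
picker = RoundRobin(["http://converter-1:8000", "http://converter-2:8000"])
first = picker.next_backend()
second = picker.next_backend()
```

With this scheme each container receives one request in turn, so adding a container to the list spreads the same load over more LibreOffice instances.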

It seems that there is another way to solve this issue: spawning multiple LibreOffice instances on the same server, as documented here.
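The linked page is not reproduced here, so the following may or may not match what it describes. One commonly used way to run several LibreOffice processes side by side is to give each one its own user profile directory through the -env:UserInstallation option. The sketch below assumes that approach; the helper names and the 60-second timeout are illustrative only.

```python
import subprocess
import tempfile
from pathlib import Path


def build_isolated_command(pptx_path, out_dir, profile_dir):
    """Build a soffice command that uses its own profile directory."""
    # -env:UserInstallation expects a file:// URI, not a plain path.
    profile_uri = Path(profile_dir).resolve().as_uri()
    return [
        "soffice",
        f"-env:UserInstallation={profile_uri}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(pptx_path),
    ]


def convert_isolated(pptx_path, out_dir, timeout=60):
    """Run one conversion with a throwaway profile (assumed approach)."""
    with tempfile.TemporaryDirectory(prefix="lo_profile_") as profile_dir:
        cmd = build_isolated_command(pptx_path, out_dir, profile_dir)
        return subprocess.run(
            cmd,
            timeout=timeout,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
```

Creating a fresh profile per conversion adds startup cost, so a fixed pool of profile directories reused across requests may be a better fit for a busy service.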


  1. There are more discussions on how to convert pptx to pdf here.

Author jdhao

LastMod 2021-06-11

License CC BY-NC-ND 4.0
