Export large data from DB to MS Excel using a DSP page

Hi,
I have a requirement to fetch data from the DB and move it to a native Excel file using a DSP page. The DSP page will have a simple download button and the end user is allowed to download the file. The data size will be about 20 MB. I tried the steps below:

  1. Read the data from the DB using an adapter.
  2. Push the data to the IS cache (for faster fetching).
  3. Call the data-fetch service in the DSP and render the results as a table.
  4. Call JS to move the table data to an Excel file.
  5. Call another JS to hide the table content and show only a download button to download the Excel file.

The code is working fine and I am able to download the file, but the issue is that it takes more than 4 minutes to perform all of the above steps.

Can someone suggest more suitable options here to reduce the download time?

Your service seems rather complicated. Why not:

  1. Build a service that reads data from the DB using an adapter.
  2. The service sets the response to CSV-formatted text.
  3. Set the response headers so that the response gets treated as a downloadable file.
  4. Enable the service cache.
  5. The user clicks on a link with the service as the target.

Use the service ‘pub.flow:setResponseHeaders’ to ensure the browser will propose a file download, and set the headers as follows:
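The original screenshot with the exact header values isn’t reproduced here, but for a CSV download the typical values would be along these lines (the filename is only an example):

  • Content-Type: text/csv
  • Content-Disposition: attachment; filename="export.csv"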

This way you don’t need any custom js code.
regards,
John.


Thank you for the quick reply, John.
One quick question here.
“2. service sets response to CSV formatted text.” - Here I need the file in native Excel format,
i.e. with a proper header row and the values under it.
Does this work with the above steps?

Just make sure that you add a first row giving the column names in the CSV file, then format it after importing.
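For example, a CSV with the column names as the first row might look like this (the column names and values are purely illustrative):

```
EmployeeId,Name,Department
1001,Jane Doe,Finance
1002,Raj Patel,Sales
```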
I wouldn’t impose a specific product on the user in a web scenario; not everyone will want to use Excel. If you use a CSV format, a Mac user can import it into Numbers, a Linux user can use LibreOffice, etc.

regards
John.

My 2 cents:

  • As @John_Carter4 notes, use CSV as the response payload. Returning a native Excel file will involve getting a library of some sort to create that and you’ll be constantly dealing with compatibility and interoperability issues. Keep it simple.

  • Most likely the service cache is not needed. More often than not, the service cache gets in the way because you’re constantly needing to tweak the cache expiration and chasing “why did it not get the latest data” issues.

  • Take care to NOT LOAD all of the data from the DB into memory. Use streaming techniques to read a bunch, write a bunch.
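As an illustration of the streaming idea, here is a minimal Java sketch that reads rows in batches via the JDBC fetch size and writes each row out as it arrives, so the full result set never sits in memory. The SQL statement, table, and column names are assumptions for the example, not taken from this thread.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class CsvStreamer {

    // Stream query results to the given Writer as CSV, batch by batch.
    public static void export(Connection con, Writer out) throws SQLException, IOException {
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT emp_id, emp_name, department FROM employees")) { // example query
            ps.setFetchSize(500); // ask the driver to fetch rows in chunks of 500
            try (ResultSet rs = ps.executeQuery()) {
                out.write("EmployeeId,Name,Department\r\n"); // header row first
                while (rs.next()) {
                    // real code should escape commas/quotes and handle nulls
                    out.write(rs.getString(1) + "," + rs.getString(2) + ","
                            + rs.getString(3) + "\r\n");
                }
            }
        }
        out.flush();
    }
}
```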


Additional Note:

It might be possible to create the CSV via FlatFile techniques, as this can ease the mapping from the database to the FlatFile CSV structure or document.

Regards,
Holger

Thank you for the suggestions.
Could you please give more insight into the streaming techniques for reading a bunch and writing a bunch?

Use the service ‘pub.flow:setResponse2’, rather than relying on the implicit service output.
Then map the input stream from your original file/etc to the “responseStream” input.

The hard part is getting your input stream, because if you are reading records from a DB, you will need to convert each record into a JSON or XML string fragment and then append it to a temporary file. After that you would read the file as an input stream, which you would then map to setResponse2 as above.

Ideally you would want to avoid having to create a file; however, I don’t think you can do that in Flow. You would have to write a Java service to avoid that.
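For the “read the file as an input stream” part, a small Java service can put the stream into the pipeline so it can be mapped to setResponse2. A minimal sketch, assuming the temp file path arrives in a pipeline variable named csvFile (the variable and service names are illustrative):

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import com.wm.app.b2b.server.ServiceException;
import com.wm.data.IData;
import com.wm.data.IDataCursor;
import com.wm.data.IDataUtil;

public class CsvResponseService {

    public static final void openCsvAsStream(IData pipeline) throws ServiceException {
        IDataCursor cursor = pipeline.getCursor();
        try {
            // path to the temp file that the earlier steps appended the CSV fragments to
            String csvFile = IDataUtil.getString(cursor, "csvFile");
            // expose the file as a stream; map this to setResponse2's responseStream input
            IDataUtil.put(cursor, "responseStream",
                    new BufferedInputStream(new FileInputStream(csvFile)));
        } catch (IOException e) {
            throw new ServiceException(e);
        } finally {
            cursor.destroy();
        }
    }
}
```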
regards,
John.


Good info from @John_Carter4 for the approach.

Indeed. The use of a temp file is likely the way to go.

Here is an alternative to consider, which would forgo the temp file but have its own set of complexities. This thread has an example of the basic tasks to be addressed to have data from the DB be the source of the InputStream more directly (a rough sketch follows the bullets below).

Aspects you’ll need to consider:

  • The example returns the raw results of the query. You’ll want to convert each row to the CSV you need as @John_Carter4 noted.
  • The example does everything in Java. Okay for a Java environment but less so for wM Integration Server. You’ll want to determine if doing this all in Java is okay for your environment. Or figure out a way, if any, to leverage JDBC adapter service to get the result set. Or perhaps a ref cursor.
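To make the idea concrete, here is a minimal sketch of that kind of “more direct” approach: an InputStream backed by a JDBC ResultSet that produces one CSV line per row as it is read. The class name is made up for the example, there is no CSV escaping, and error handling is minimal, so treat it as a starting point rather than the code from the linked thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

public class ResultSetCsvInputStream extends InputStream {

    private final ResultSet rs;
    private final int columnCount;
    private byte[] current = new byte[0]; // bytes of the line currently being served
    private int pos = 0;
    private boolean headerWritten = false;

    public ResultSetCsvInputStream(ResultSet rs) throws SQLException {
        this.rs = rs;
        this.columnCount = rs.getMetaData().getColumnCount();
    }

    @Override
    public int read() throws IOException {
        try {
            while (pos >= current.length) { // current line exhausted, build the next one
                StringBuilder line = new StringBuilder();
                if (!headerWritten) {
                    ResultSetMetaData md = rs.getMetaData();
                    for (int i = 1; i <= columnCount; i++) {
                        if (i > 1) line.append(',');
                        line.append(md.getColumnLabel(i)); // column names as header row
                    }
                    headerWritten = true;
                } else if (rs.next()) {
                    for (int i = 1; i <= columnCount; i++) {
                        if (i > 1) line.append(',');
                        String v = rs.getString(i);
                        line.append(v == null ? "" : v); // no escaping in this sketch
                    }
                } else {
                    return -1; // no more rows: end of stream
                }
                line.append("\r\n");
                current = line.toString().getBytes(StandardCharsets.UTF_8);
                pos = 0;
            }
            return current[pos++] & 0xFF;
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }

    @Override
    public void close() throws IOException {
        try {
            rs.close(); // caller should also close the statement/connection
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }
}
```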

A custom OutputStream is the only way I can think of to ensure that you don’t have to do everything in java.

  1. Implement a custom class that provides both an InputStream and an OutputStream (a Java class cannot extend both, so it wraps a connected pair).
  2. Create three Java services: one to create an instance of the above class, another that writes to the object, and the last to close the stream.
  3. You would have to call setResponse with the output of the create service (from step 2) as the responseStream input before calling your JDBC logic. This might sound counterintuitive, but trust me.
  4. Loop over your JDBC calls and use the write service (from step 2) to write each fragment to the output stream.

Your custom class, via the Java service, would then pipe the output to the input stream assigned to setResponse.

After the loop has finished, make sure you call the close service, otherwise the caller of your service will not finish and eventually trigger a timeout error.
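One way to sketch the pairing John describes is with java.io’s piped streams, wrapped in a single helper object that the three Java services would create, write to, and close. The class and method names are illustrative, and note the caveat in the next reply: piped streams block when the buffer fills, so the threading and ordering have to be right.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

public class ResponsePipe {

    private final PipedInputStream in;   // reader side: hand this to setResponse/setResponse2
    private final PipedOutputStream out; // writer side: used inside the JDBC loop

    public ResponsePipe(int bufferSize) throws IOException {
        this.in = new PipedInputStream(bufferSize);
        this.out = new PipedOutputStream(this.in); // connect the two ends of the pipe
    }

    public InputStream getInputStream() {
        return in;
    }

    // called by the "write" Java service for each CSV/record fragment
    public void write(String fragment) throws IOException {
        out.write(fragment.getBytes(StandardCharsets.UTF_8));
    }

    // called by the "close" Java service after the loop; signals end-of-stream
    // to the reader, otherwise the consumer of the response never finishes
    public void close() throws IOException {
        out.close();
    }
}
```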

I think I wrote something similar a long time ago, but I don’t know if I still have the code. I would have to go on a rummage.
John.

Having done this before (tying an output stream and an input stream together, but not with JDBC-obtained data), I can tell you this can be a challenge. The key is that the reading from the InputStream drives the writing to the OutputStream. For anyone not confident/strong in Java, proceed with caution.

Edit: If @John_Carter4 finds the code in a trunk in the attic, it would be cool to see it!

