May 07
Occasionally one needs to pick up and process a large number of files, on the order of hundreds or thousands. With the Batch Inbound eWay/JCA Adapter it is not possible to pick up more than one file per poll. The Batch Local File, if triggered by some event other than the appearance of a file in a directory (perhaps a Scheduler trigger or a manual trigger) and given correctly designed logic, can process many files in a single invocation.
The document, ProcessingHundredsOfFileWithBatchAdapter.pdf, discusses how a Batch Local File-based solution can be constructed to effectively process hundreds of files in a single pass.
This article references a ZIP archive, “ProcessingHundredsOfFileWithBatchAdapter.zip”.
Hello Michael,
Thank you very much for your solution!
It addresses exactly what we were talking about…
Best regards
Thomas
I am glad you found this solution useful.
Regards
Michael
Hello Michael,
I noticed that the speed of our solution for processing `millions` of files depends on the location of the files.
If the files are local to the Batch eWay, the speed is adequate.
If the files are located on a file share, the speed is very poor (I tested your solution on a local disk last time, which is why I didn't notice the problem).
The problem still remains: the eWay takes a lot of time to scan all the files, and it rescans all of them each time we issue the .get() or the .getifexists() command.
Is there a way to get the list of files?
(As can be done with BatchFTP; I am not sure, but I think so.)
Thanks and regards,
Thomas
Hello, Thomas.
A solution that comes to mind is as follows:
1. Construct a scheduler-triggered JCD which uses standard Java IO classes to get a list of files in a directory. Make the schedule interval rather long (minutes to hours) to allow all files to be processed. Perhaps one of the examples returned by http://www.google.com.au/search?q=java+io+listfiles will assist.
2. Loop over the list and, for each complete file path, create and send a JMS message to a designated queue, say qProcessFile.
3. Construct a JMS-triggered JCD which receives a JMS message, gets the file path from the message and uses it to get and process a single file.
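As a rough illustration of steps 1 and 2, here is a minimal plain-Java sketch, not tested inside a JCD. It lists the directory once with java.io.File.listFiles and hands each absolute path to a callback. The class name FileListDispatcher, the method names and the Consumer<String> callback are hypothetical; the callback merely stands in for whatever code sends a JMS message to qProcessFile in the real collaboration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper; in a real project this logic would sit inside the
// scheduler-triggered JCD.
public class FileListDispatcher {

    // Step 1: scan the directory once and return the absolute paths of
    // the regular files in it (subdirectories are skipped), sorted by path.
    public static List<String> listFilePaths(File directory) {
        File[] entries = directory.listFiles(File::isFile);
        if (entries == null) {
            // listFiles returns null if the path is not a directory
            // or an I/O error occurs.
            throw new IllegalArgumentException("Cannot list " + directory);
        }
        Arrays.sort(entries);
        List<String> paths = new ArrayList<>();
        for (File entry : entries) {
            paths.add(entry.getAbsolutePath());
        }
        return paths;
    }

    // Step 2: pass every path to the sender. The sender is an assumption:
    // a placeholder for "create and send a JMS message to qProcessFile".
    public static int dispatch(File directory, Consumer<String> sender) {
        List<String> paths = listFilePaths(directory);
        for (String path : paths) {
            sender.accept(path);
        }
        return paths.size();
    }

    public static void main(String[] args) {
        File directory = new File(args.length > 0 ? args[0] : ".");
        int count = dispatch(directory,
                path -> System.out.println("would send to qProcessFile: " + path));
        System.out.println(count + " path(s) dispatched");
    }
}
```

The point of the sketch is that the directory is scanned only once per pass, so the cost of listing a slow file share is paid once rather than once per file.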
If you change step 1 so that the scheduler triggers a JCD which puts a trigger message into a JMS queue, say qTriggerFileList, and the actual listing of files is done by another JCD triggered by that message in qTriggerFileList, then step 2 can be modified to submit a trigger message to qTriggerFileList once it has processed all files in the list, so as to trigger another file list pass before the next scheduled trigger arrives.
I have not actually built this logic so there may be false assumptions in it.
Regards
Michael