May 08

I recently had occasion to work with an integration solution intended to process lots of records. By lots I mean over 1 million smallish records. My customary platform to experiment on is Windows XP. There are lots of reasons for that, most of them historical: I have tools I know and like, and so on. While trying to work with such a volume of data I noticed a number of “interesting” things, which I thought I should share. These things relate to the platforms (Windows vs. Linux), the tools and the architectural decisions.

I needed lots of data to test the solution I was contemplating, which involved XML processing, to see how constructing and parsing XML affects solution performance. To make timing differences easier to compare, I thought I should use lots of records.

The discoveries are discussed in Right tools for the job.pdf.

May 07

Occasionally one needs to pick up and process a large number of files, on the order of hundreds or thousands. With the Batch Inbound eWay/JCA Adapter it is not possible to pick up more than one file per poll. The Batch Local File, if triggered by some event other than the appearance of a file in a directory, perhaps a Scheduler trigger or a manual trigger, and given correctly designed logic, can process many files in a single invocation.
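The "many files in a single invocation" idea can be sketched in plain Java. This is only an illustration using the standard `java.nio.file` API, not the Batch Local File API itself; the class `MultiFilePass`, the `FileHandler` callback and the "done" directory are hypothetical names introduced for the example. One triggered invocation lists every matching file, handles each in turn, and moves it aside so that the next invocation does not pick it up again.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: not part of Java CAPS or the Batch eWay.
class MultiFilePass {

    /** Hypothetical callback standing in for the per-file business logic. */
    interface FileHandler {
        void handle(Path file) throws IOException;
    }

    /**
     * Handles every regular file in inputDir whose name matches the glob,
     * then moves it to doneDir. Returns the files handled, in sorted order.
     */
    static List<Path> processAll(Path inputDir, String glob, Path doneDir,
                                 FileHandler handler) throws IOException {
        // Snapshot the listing first, so moving files does not disturb iteration.
        List<Path> pending = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir, glob)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    pending.add(p);
                }
            }
        }
        Collections.sort(pending); // deterministic processing order

        Files.createDirectories(doneDir);
        List<Path> processed = new ArrayList<>();
        for (Path p : pending) {
            handler.handle(p);
            // Move the file aside so a later invocation will not see it again.
            Files.move(p, doneDir.resolve(p.getFileName()),
                       StandardCopyOption.REPLACE_EXISTING);
            processed.add(p);
        }
        return processed;
    }
}
```

A scheduler or manual trigger would call `processAll` once per firing. If the handler throws, the files already handled stay in the "done" directory and the rest remain in place for the next pass.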

The document, ProcessingHundredsOfFileWithBatchAdapter.pdf, discusses how a Batch Local File-based solution can be constructed to process hundreds of files in a single pass.

This article references a ZIP archive, “ProcessingHundredsOfFileWithBatchAdapter.zip”.

Oct 22

Handling very large messages in a messaging solution may require memory resources many times greater than the size of the largest message to be handled. Frequently the architect has no choice but to consume or produce a very large message: a file containing a batch of related transactions, for example, or a large and complex XML message generated by, or intended for, an external application. Handling such messages poses special challenges.

Java CAPS can assist with Batch eWay support for data streaming when large messages are manifested as files in a file system.
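To illustrate why streaming helps, here is a minimal sketch in plain `java.io`/`java.nio`. It is not the Batch eWay streaming API, and the class `StreamCopy` and its method name are assumptions made for the example. The point is that memory use is bounded by the buffer size rather than by the size of the message file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: illustrates streaming in general, not the Batch eWay API.
class StreamCopy {

    /**
     * Copies source to target through a fixed-size buffer, so at most
     * bufferSize bytes of the message are held in memory at any time.
     * Returns the number of bytes copied.
     */
    static long copyStreaming(Path source, Path target, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            int n;
            // read() returns -1 at end of stream
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

By contrast, reading the whole file into a byte array or a String before processing requires at least as much memory as the message itself, and often several times more once it is parsed.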

If it is possible to break large messages up into components and process the components individually, or to collect components and assemble them into a large message, eTL, another of the products in the Java CAPS Suite, can assist in processing large volumes of data. Whilst ETL (Extract, Transform and Load) is typically associated with one-off batch extraction and load of data, Java CAPS’ eTL can be used both standalone and in-stream as part of a larger Java CAPS solution. In this in-stream mode it will be discussed as a possible means of streaming data between a flat file and a database table, or between database tables/views.
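The idea of breaking a large message into components can be sketched as follows. This is a generic illustration, not eTL: the class `RecordBatcher`, the one-record-per-line flat file layout and the batch sink are all assumptions made for the example. The file is read one line at a time and handed on in small batches, which a real solution might, for instance, insert into a database table.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: assumes a flat file with one record per line.
class RecordBatcher {

    /**
     * Reads the file line by line and passes records to the sink in
     * batches of at most batchSize. Returns the total number of records.
     */
    static long forEachBatch(Path flatFile, int batchSize,
                             Consumer<List<String>> sink) throws IOException {
        long count = 0;
        List<String> batch = new ArrayList<>(batchSize);
        try (BufferedReader reader =
                 Files.newBufferedReader(flatFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                count++;
                if (batch.size() == batchSize) {
                    sink.accept(batch);
                    batch = new ArrayList<>(batchSize); // start a fresh batch
                }
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch); // final partial batch
        }
        return count;
    }
}
```

Only one batch is held in memory at a time, so the batch size, not the file size, determines the memory needed.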

This extract from the soon-to-be-released book, “Java CAPS Basics – Implementing Common EAI Patterns”, discusses the Java CAPS 5.1 Batch eWay streaming facilities, and presents and compares a number of data streaming implementations, including an eTL implementation.

Link to Java CAPS 5.1.3 Data Streaming section of the manuscript.
