I am working on a project where i need to extract data from disparate datasources and load into an oracle dataware house.
I am considering webMethods as an option.
I request the users of this group to let me know of the pros and cons of using webMethods for extraction and loading into a dataware house.
Do i get any benefits over webMethods enterprise server if i use any ETL tool like Informatica for the same?
What are the advantages/disadvantages the ETL tools enjoy over webMethods ?
Its a batch process based when you have huge files you need to send accross then you use informatica, the reason is it supports multiple instances across the platforms, it has a temporary space where you hold huge data and do the transformtions and send the data across, it supports the views, sql loader and other stuffs.
The Cons of Informatica,
It does not have the transaction ability let say if any failure occured during transactions you really need to start all the bactch process once again.
If you want to send huge files through message queues it will bomb out because most of the queues will support only up to 8mb to 10 mb thats why you dont want to use webMethods for this purpose.
ETL tools are best if you’re moving large amounts of data on a periodic basis (e.g. nightly). They have some powerful transformation/lookup/etc. facilities built-in that would require additional work in webMethods. DBAs will be comfortable with this tool. An ETL tool will have a definite speed advantage.
webMethods (or any integration tool) would be appropriate if you need to capture data for the warehouse on the fly.
wrt, “huge files,” one shouldn’t try to push large files through any integration tool as a single entity–large files should be busted up into records (or other atomic entity) which then would pass through the integration infrastructure. Trying to shove a “batch” of data through an integration tool is contrary to what the integration tools were built to address–the near real-time movement/transformation of data. Batching of data is to be avoided when possible. (ready for debate on this point if anyone is interested).
webMethods vs Informatica or ETL is really a question
of data source? and process. If you have many DBs that
you extract and trasform data from and they all
consolidate in some datawarehouse like in your case.
You have control of all these sources and access and
if you use batch type process then ETL like
Informatica is better solution than WM!
However if your datasources are iteractive and
realtime message driven business that may get
transformed and consolidated in same datawarehouse
than webMethods and TradingNetworks is better
solution.
The volume load is not a concern although WM does have
some artificial limitations there is an easy way to
make it handle any load of data.
The question you need is do you dump huge data files
and DBs into warehouse or you have trading partners
thet send you messages in real time.
I have a similar question. We want to integrate data from more than 10 subsidiaries, who all have a lot of different systems. So we are looking at a lot of integration. We already have webMethods for other purposes installed, but would we use this for creating an (Enterprise-wide) Data Warehouse? Can I do transformation and clensing and so on with webMethods, or does it not support this kind of tasks like Informatica does?
I work for a project where we use webMethods IS + TN + Enterprise along the lines pointed by Chris.
We get data from more than 60 source systems each supplying data in its own format.
A transformation module with webMethods EDI and TN with help of lookup tables converts it into a common data model.
The transformed data is cleansed with Trillum and loaded on to the datawarehouse on Oracle DB.
As in this case the data feed is pseudo realtime depending on the business process it supports and works fine for incremental data although the initial data load for history data requires a considerable effort and time.
I do feel using webMethods would be a right approach when you try to extract data on the fly and integrate.