How to convert a non-XML file into an XML format that can be sent to wm.tn.receive

Hi,
I need to write a service that will take a non-XML flat file and convert it into an XML form that can be passed into wm.tn.receive.

This is for an integration project with a large number of clients who will have widely different file formats.

Presuming that I can convert a flat file into an object, using my own Java service, is there an easy way to transmit this information to wm.tn.receive as a java.io.File or String with a content type set to text/xml?

Thanks in advance for any help. This is my first real webMethods project - I received my training around two months ago, so I would appreciate advice as to the most effective solution.

Nick_F

Nick, you did a good job explaining your goal. I have some questions about the methods, though.

You said that you are receiving a flat file. Is the flat file EDI? Is it a comma-separated list? Is it some other format?

It sounds like you are figuring this to be more work than it really is. That should be good news to you, right? :slight_smile:

Nick,
You can do two things to convert your flat file into XML and send it to TN.

  1. Write your own Java service (conventional string handling) to read the flat file and convert it to IData objects.
  2. Create flat file templates to read and parse the flat file into IData objects using some of the EDI services (this holds for both EDI and non-EDI flat files).

Finally, build an XML document from these objects and route it to Trading Networks.
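As a rough illustration of option 1, here is a minimal sketch in plain Java. It assumes a comma-separated line and uses an ordinary `Map` as a stand-in for the IData object; the class name, the column names and the root element name are all made up for the example, and the JDK's own DOM and Transformer classes do the XML building rather than anything webMethods-specific.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: a Map stands in for the IData object.
class FlatFileToXml {

    // Pair each column name with the value at the same position in the line.
    static Map<String, String> parseLine(String[] columns, String line) {
        String[] values = line.split(",", -1);
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < columns.length && i < values.length; i++) {
            fields.put(columns[i], values[i].trim());
        }
        return fields;
    }

    // Build <rootName><column>value</column>...</rootName> and serialize it.
    static String toXml(String rootName, Map<String, String> fields) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement(rootName);
            doc.appendChild(root);
            for (Map.Entry<String, String> field : fields.entrySet()) {
                Element child = doc.createElement(field.getKey());
                child.setTextContent(field.getValue()); // escaped on output
                root.appendChild(child);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build XML for " + rootName, e);
        }
    }
}
```

The resulting string is what you would then hand to TN with a content type of text/xml. Column names must be valid XML element names for this to work.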

Nick,

Another option is to create a simple XML document with one ROOT TAG that wraps around your flat-file content.

This can be submitted to TN as usual.
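A minimal sketch of that wrapping step, assuming the flat-file content is already available as a string. The class name and the root tag used in the usage note are hypothetical; the only real requirement is that the characters XML treats as markup are escaped before wrapping.

```java
// Hypothetical helper: wraps raw flat-file content in a single root element.
class RootTagWrapper {

    // Escape the characters that would otherwise be read as XML markup.
    // The ampersand must be replaced first so later entities are not re-escaped.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Produce <rootTag>escaped content</rootTag>.
    static String wrap(String rootTag, String flatFileContent) {
        return "<" + rootTag + ">" + escape(flatFileContent) + "</" + rootTag + ">";
    }
}
```

For example, `RootTagWrapper.wrap("FlatFileOrder", "PO1,5,A&B")` yields `<FlatFileOrder>PO1,5,A&amp;B</FlatFileOrder>`. Content containing control characters that are not legal in XML would need extra handling, which this sketch leaves out.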

You can also create your own service to populate the SENDER and RECEIVER IDs within the BIZDOC before SUBMIT is called (if you want your Processing Rules to be SENDER/RECEIVER dependent).

Hope this helps.

Halsalam-

Do you have any sample Java code that maps a flat file to the IData structure of an EDI template? Could you please share it if you do?

Thanks,
Bals

Bals-

If I understand correctly - you want to convert IData to EDI (or vice versa)…

There are sample templates that ship with the EDI Adapter. Also, you should be able to download EDI templates directly from the webMethods site (which I have not tried, but that’s what their docs say).

I’m impressed and thankful for such a swift reply - there is obviously a strong community here!

Dan - for the long-term idea of this project we anticipate having over 200 clients who may have different file formats for purchase orders, etc. Our high-end clients will be submitting XML, in all probability, but we need to cater for older systems that will deliver in CSV and a host of proprietary formats.

So, I need to plan an architecture that can handle this with T.N. in a way that will produce a system that won’t need a different mapping service for each integration partner. Hence, a Java service converting each order into a uniform XML format and then passing it into T.N.

PU - If I convert the flat file into an IData object, I cannot invoke receive? We want to establish Document Types, use pre-processing rules, etc. for our clients.

  I really want to know if there are any webMethods mechanisms for building an XML file from the Java objects holding client information. Is the best option simply working with java.io.* to create an XML file within a Java service? 

Halsalam - your suggestion is very tidy. I will think it over.

Once again, thanks guys.

If you’re going to support multiple formats from multiple partners, you’re going to have a bunch of different mapping services. The key will be to keep these to the bare minimum. Here are strategies:

  • Define an XML schema for each doc type that meets your needs. Ideally, this would be an industry standard but quite often those don’t work out. Get as many partners as you can to support your format. This works great if your company is an 800 pound gorilla and can dictate terms to some extent.

  • As you pointed out, differing formats are a fact of life. Fortunately, dealing with these is a strength of Integration Server. Translate incoming docs to what’s commonly referred to as a canonical format–a format that supports all the data you need for a given document type. From this canonical, you can translate to various target formats needed by backroom systems. Do the same for outbound documents as well–legacy format–>canonical format–>partner format. This approach keeps the number of mapping services down (the ol’ m+n vs. m*n rationale).

  • Although Java services are very useful (and sometimes indispensable) don’t be too quick to drop into writing Java code. Normally, most of what you want done can be done with built-in services or with custom FLOW services. Mapping services in particular should be done in FLOW, not Java services.

  • Use TN for all interaction with everybody. This provides document tracking, restarts, etc.

  • It’s important to have a strong doc type and IS record naming structure. You’re going to have a lot of definitions, so you should structure things to aid the management of the definitions.

  • Search in the forums in Advantage and in ITToolbox (http://eai.ittoolbox.com/) for discussions regarding content handlers, flat file parsing, etc. with tips on handling custom formats. If you want to run everything through TN (which is a good idea) you may need to “pre-process” some docs with IS services before submitting to TN.
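To make the m+n vs. m*n point concrete, here is a small sketch. It is written in Java only to show the counting argument; as noted above, on Integration Server the mappings themselves would normally be FLOW services. The `CanonicalHub` and `CanonicalOrder` names, and the two fields on the canonical order, are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical hub: every translation goes through one canonical order type.
class CanonicalHub {

    // Illustrative canonical document; a real one would carry far more fields.
    static final class CanonicalOrder {
        final String poNumber;
        final int quantity;

        CanonicalOrder(String poNumber, int quantity) {
            this.poNumber = poNumber;
            this.quantity = quantity;
        }
    }

    private final Map<String, Function<String, CanonicalOrder>> inbound = new LinkedHashMap<>();
    private final Map<String, Function<CanonicalOrder, String>> outbound = new LinkedHashMap<>();

    // One mapping per partner format (m of these).
    void registerInbound(String partnerFormat, Function<String, CanonicalOrder> mapping) {
        inbound.put(partnerFormat, mapping);
    }

    // One mapping per target format (n of these).
    void registerOutbound(String targetFormat, Function<CanonicalOrder, String> mapping) {
        outbound.put(targetFormat, mapping);
    }

    // Any source can reach any target by composing two mappings.
    String translate(String partnerFormat, String targetFormat, String payload) {
        Function<String, CanonicalOrder> in = inbound.get(partnerFormat);
        Function<CanonicalOrder, String> out = outbound.get(targetFormat);
        if (in == null || out == null) {
            throw new IllegalArgumentException(
                    "No mapping for " + partnerFormat + " -> " + targetFormat);
        }
        return out.apply(in.apply(payload));
    }

    // Mappings actually written with a canonical format: m + n.
    int mappingCount() {
        return inbound.size() + outbound.size();
    }

    // Mappings needed if every source mapped directly to every target: m * n.
    int pointToPointCount() {
        return inbound.size() * outbound.size();
    }
}
```

With 20 partner formats and 5 backend formats, that is 25 mappings through the canonical format instead of 100 direct ones.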

There are definitely mechanisms for generating XML docs within IS. Getting the data into IS can be tricky sometimes (though usually it’s trivial) but once there, there is tremendous capability to create virtually any XML doc layout. Using java.io.* to create the XML file is NOT the way to go. Use the built-in services and FLOW mechanisms.

I’ve got a somewhat different suggestion, which you might want to consider. Especially if you decided to go the Java service route. Please keep in mind that though I understand B2B well enough to have taught it in the past I don’t work with it on a daily basis (hey, it wasn’t my idea).

The problem you describe seems to be ideally suited for the Strategy Design Pattern (“Design Patterns,” Gamma et al, 1994). The idea would be to define an interface with a method that takes a standard input, such as a string, and returns a standard output, such as a canonical document. For each different format that can be passed as a string you will define a different java class implementing the interface. Each java class will contain the specific strategy for converting a specific format to a canonical document.

For each Trading Partner you store the class name for the java class that performs the correct conversion strategy as an optional attribute in the Partner Profile.

Create a single Java flow service that takes the string with all the information and the Java class name attribute as arguments. The Java class name is used to dynamically create an instance of that class. The beauty is that since all the classes implement the same Strategy Interface, the flow service only needs a reference to the Interface, which has a defined method that returns the canonical document.

The beauty of this solution is that you can very easily change the conversion strategy for a client by just creating a new java class (if one doesn’t already exist) and modify the partner profile attribute.
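A minimal sketch of the idea, under a few assumptions: the interface, class and method names are invented for illustration, the canonical document is represented as an XML string, and the class name would in practice come from the partner profile attribute rather than being passed in by hand.

```java
// The common interface every conversion strategy implements (hypothetical name).
interface ConversionStrategy {
    String toCanonical(String rawContent);
}

// One concrete strategy: a made-up CSV layout whose first column is the PO number.
class CsvOrderStrategy implements ConversionStrategy {
    @Override
    public String toCanonical(String rawContent) {
        String[] parts = rawContent.split(",", -1);
        return "<Order><PoNumber>" + parts[0].trim() + "</PoNumber></Order>";
    }
}

// The single entry point: it only knows the interface, never the concrete class.
class StrategyLoader {

    // Instantiate the strategy named by the (assumed) partner profile attribute.
    static ConversionStrategy load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return (ConversionStrategy) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                    "Cannot load conversion strategy: " + className, e);
        }
    }

    static String convert(String className, String rawContent) {
        return load(className).toCanonical(rawContent);
    }
}
```

Calling `StrategyLoader.convert(CsvOrderStrategy.class.getName(), "PO1,5")` returns `<Order><PoNumber>PO1</PoNumber></Order>`. Supporting a new format means adding one class and changing one attribute value, with no branching in the caller.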

Another benefit of this approach if you deal with many (10+) different formats is that you don’t have to create a 10+ nested if then else flow to call the correct conversion flow. Now it might be possible to dynamically select a flow service in a similar fashion using a partner profile attribute, but this is where my knowledge is lacking. Someone else can feel free to chime in here.

The consideration would be how to extract the string with information of the specific format. I would assume you would use different types of entry services, ftp, socket, file, http, etc., etc… Each one would extract the string and call the same java flow service with the string and the attribute from the partner profile, without any nested if then else structures.

A second side effect is that you easily can add new partners with new conversion strategies by just creating a new class, adding it to the classpath, and setting the attribute of the partner profile.

People, I would like some feedback for this idea. It is based on the wonderful world of design patterns which are very useful in an OO setting. I would like to know if it wouldn’t work for some reason or if a similar solution could be devised within the flow service world.

Rgs,
Andreas Amundin
www.amundin.com

Nick - you’ve received excellent advice in this thread. I’d like to emphasize how important Rob’s point about using canonical formats is.

Someone famous once said most computer science problems can be solved by adding another level of indirection. That process is at work here. Translating ‘n’ partner formats → one canonical format → backend format (and vice versa) has two main advantages:
(1) As familiarity with the canonical format grows with each implementation, your time-to-implement speeds up.
(2) If your backend changes, you change just one translation (canonical format–>backend format) instead of ‘n’ translations.

In my opinion, the best way to implement a canonical format is to spend time cooking up your own ‘inhouse’ DTD for each document type. Let’s say you work at a company called Data Dimensions (“DD Inc.”) that receives POs and sends out Invoices in many formats. Start off by defining your standard (called, say, “ddXML”). The fastest way to do this is to emulate the structure of another standard, say xCBL 2.0. Of course, you can use xCBL as your inhouse standard too, but that may be overkill, especially if it doesn’t address some special requirements you have. If you do create your own inhouse XML standard, here are some tips: keep your XML standard “loose”. Avoid doing much business validation and leave that to your backend. Validate only basic requirements, and try to keep fields optional as much as possible. Plan on creating two separate fields for much of your data: your backend’s version of that datum and your partner’s version. For example, for PO numbers, plan on having separate tags for the backend’s PO number and the partner’s PO number.

Let’s say you’re done creating DTDs for two documents:
ddXML
Order…
and
ddXML
Invoice…
Now import these two DTDs into B2B and create the records in a separate, versioned package (called say, ‘ddXML_1_Records’). You may find you have to do some manual cut and paste creation of records, especially if you have a single source DTD.

Once you’ve got your canonical documents as records in B2B, you’re ready to start using them in your mappings.

Thanks guys.

 We already have the canonical format idea working at the level of data structures and XML.  

I’m currently working through your ideas and communicating them to my colleagues.

Hearing the design alternatives pursued at this level of conversation is enlightening. 
  
Your advice and responses have been excellent. I hope to be able to contribute in some way to this community.

Will keep you informed …

Nick

I am having fun with this thread.

[Switching from DesignFreak to BusinessFreak while stepping up on soap box. General warning issued]

Speaking of Canonicals, a term borrowed from the Enterprise server world (my stomping grounds), I would like to add some general comments.

Actually there is only one point I want to make. Who should define your canonicals?

Let me give you a hint, by proposing to use the term Business Model rather than the term canonicals.

IDEALLY a company, more specifically the business people running the company, should be aware of its Business Model and the Business Processes operating on the Business Model data. IDEALLY this Business Model should already be well defined by the business people.

I keep on using the word IDEALLY because many times the business is too complex or the exercise of defining a business model is considered to be too great of an effort. The problem is that if the business people won’t define it, then the developers implementing the systems or business integration ends up defining it. This is wrong, the technical people should not define the business model that is supposed to support a company.

Don’t get me wrong, I am not trying to put down the capabilities of the technical people working on integration projects. We all know us integrators are the bomb, but our approach tends to be to choose the quickest implementation (especially under time constraints). A common mistake I have seen in the past is to adopt the business model as defined by the most important third-party system, be it Oracle Financials, xCBL, SAP, or whatever. As more systems are added, more and more integrations have to conform to a third-party business model while the entire system is actually supposed to support the undefined business model of the company. Over time the entire system will become more and more difficult to maintain.

The key is to define the “true” business model for your company. Make sure it is created and maintained by the business people (find someone who actually knows how to create a business model). Us techies then just have to make sure the system supports the true business model, no mean feat in and of itself.

IN REALITY, I have not seen one project where a “true” business model has been defined before a business or systems integration project has started. There are many reasons for this. Lack of modeling experience and failure to recognize and act on the need for a defined business model are two reasons, but there are many more. For now, we techies have to do our best with what we are dealt, but at least we should know what we should strive for, and maybe we can help the business people realize they need a business model.

The move to create Business Models comes and goes. There was a big movement years ago for Business Process Reengineering, which amounted to defining the “true” business model and thereby streamlining processes, mostly through increased understanding of the business.

[Stepping off soap box]

Rgs,
Andreas Amundin
www.amundin.com

I mostly agree with Rob. We use an internal format for a particular document and have developed generic flow services to store the data. While the design pattern approach is interesting, Trading Networks has already nicely set up Processing Rules: it will invoke a company-specific service (e.g. a transformation) based on who’s sending us the document and what document type is received. This transforms CSV, TSV, and other XML into our XML version (while we wait for external standards such as PIDX to evolve).

The wrinkle with incoming non-XML files is that you can’t have a partner send them directly to TN, so we provide different entry points (URLs) depending on the file type the partner wants to send (only for non-XML). It’s not as easy to detect the document type of, say, a CSV file as it is for an XML file, where you can use the root tag. The other thing is that you can’t have one entry point for all CSVs because each company will most likely have its own format. So you will only have as many different URLs as you have doc types; you don’t have to have company-specific URLs. E.g. you wrap all files coming from one URL with a single wrapper tag and send the result to TN. We stick the sender (from the current user invoking the service) into a bizdoc and send it to TN. TN invokes a service based on the current partner and doc type (a special type with one tag).
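Here is a rough sketch of that entry-point idea, with everything named in it being an assumption: the URL paths, the wrapper tag names and the `Envelope` class are made up, and the authenticated user is simply passed in as a string. It shows one entry point per document type, with the sender taken from whoever invoked it.

```java
import java.util.Map;

// Hypothetical pre-processing step in front of TN for non-XML files.
class DocTypeEntryPoints {

    // One entry-point path per non-XML document type, not per partner.
    static final Map<String, String> WRAPPER_TAG_BY_PATH = Map.of(
            "/receive/csvOrder", "CsvOrder",
            "/receive/tsvInvoice", "TsvInvoice");

    // What gets handed on: the sender plus the wrapped content.
    static final class Envelope {
        final String sender;
        final String xml;

        Envelope(String sender, String xml) {
            this.sender = sender;
            this.xml = xml;
        }
    }

    // Wrap the raw file in the tag for this entry point and record who sent it.
    static Envelope receive(String path, String authenticatedUser, String content) {
        String tag = WRAPPER_TAG_BY_PATH.get(path);
        if (tag == null) {
            throw new IllegalArgumentException("No document type for entry point: " + path);
        }
        String escaped = content.replace("&", "&amp;")
                                .replace("<", "&lt;")
                                .replace(">", "&gt;");
        return new Envelope(authenticatedUser, "<" + tag + ">" + escaped + "</" + tag + ">");
    }
}
```

The wrapper tag then acts as the root tag that identifies the document type, and the sender plus document type is enough to pick the partner-specific processing.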

We see webMethods’ value as mostly a transformation service using TN - ie. company specific processing. Audit trails for doc, conversations, etc.
Hope this helps!

Will, the Strategy design pattern should be applicable to the problem of determining the doc type as well. Thereby avoiding the need to add distinct URLs for each client.

Andreas

Ahh, the error of my thoughts. I just realized that without a doctype, there is no way to identify the partner. This would make it pretty difficult to pick up an attribute from the partner profile to use when creating a Strategy instance whose responsibility it would be to determine the doctype.

Andreas

Andreas–Will’s approach didn’t have distinct URLs for each client. It had a distinct URL for each doc type.

While the strategy pattern is elegant, and conceptually can be applied to the wM environments, the specific thought of having a Java class as a parameter suffers from the need for the partner to specify the Java class. This doesn’t seem like a good idea. And as Will pointed out, the whole purpose of TN is to invoke services based on content type, sender, receiver, and other attributes. TN more or less implements the strategy design pattern. Duplicating this in Java classes is…duplicative :slight_smile:

Regarding your “BusinessFreak” post (great name by the way!) I’m with you that it is rare that there is a business-oriented process model created before the integration project is started. What I’ve seen however, is that it’s the IT folks that tend to ask and push for this and it’s the business folks that demand the quick and dirty. “We just want to connect system A to B–why do we have to go through all this other jazz? I don’t have time for that nor can I pay for it.” IT people usually see the bigger picture because they are exposed to more processes and systems and see the overlaps. Business folks tend to be a bit more myopic.

I could go off on a rant on how this situation is caused and perpetuated by how companies organize their business units, how they try to optimize the profitability of the company by optimizing the profitability of each business unit, how most companies continue to view IT as a cost center that provides “free” services to the business units, etc. but that would be way off topic. :wink:

Rob, good point about distinct URLs per doctype. As to partners specifying the Java class, that was not my intention. I was under the impression that you could define optional attributes in a partner profile that would NOT be entered by the partner.

I have to say I am enjoying this thread. I couldn’t agree more with your sentiment that IT folks are the ones driving for a business-oriented process. After all it is self-defense, we are the ones that will get blamed when it doesn’t work. Most times we will assign blame to the inadequate technology and soon the company is buying a new million dollar silver bullet. So far we have gone through Mainframes, client/server systems, ERPs, Corba, EAI, and now it’s Web services. I almost forgot the intranet portal. What will be next? Hmm, am I really this cynical?

Andreas

Ah, I see. You can indeed specify additional attributes (called Extended Fields) within profiles. A project I worked on recently used extended fields to identify the name of the mapping service (conceptually equivalent to specifying a Java class) that a general purpose document handling service would invoke. Here are the high-level steps:

  1. TN would do its thing, identifying the doc type, the sender, the receiver, etc., select the appropriate processing rule and then invoke the document handler.

  2. The doc handler is generic–it knows nothing about the specifics of the document type. Using the sender and/or receiver data, it would look up from extended fields which mapping service to use to change the record to a target format. It knows what extended field to retrieve from parameters passed to it from the processing rule.

  3. The mapping service is document specific. It changes a source record to a target record and returns the target record along with the target record name.

  4. The doc handler converts the target record to a bizdoc.

  5. The target bizdoc is submitted to TN for delivery, additional transformation, or whatever other processing that needs to be done.
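The lookup in steps 2 and 3 can be sketched roughly as follows. This is not TN code: plain maps stand in for partner profiles and for the service registry, documents are plain strings, and every name (`GenericDocHandler`, the extended field name, the service name in the usage note) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical generic handler: it knows nothing about any document type.
class GenericDocHandler {

    // partner id -> (extended field name -> value); stands in for partner profiles
    private final Map<String, Map<String, String>> extendedFields = new HashMap<>();

    // mapping service name -> the document-specific mapping itself
    private final Map<String, UnaryOperator<String>> mappingServices = new HashMap<>();

    void setExtendedField(String partnerId, String fieldName, String value) {
        extendedFields.computeIfAbsent(partnerId, id -> new HashMap<>()).put(fieldName, value);
    }

    void registerMappingService(String serviceName, UnaryOperator<String> service) {
        mappingServices.put(serviceName, service);
    }

    // The processing rule tells the handler which extended field to read;
    // the field's value names the mapping service to invoke for this sender.
    String handle(String senderId, String extendedFieldName, String sourceDoc) {
        Map<String, String> fields = extendedFields.get(senderId);
        String serviceName = (fields == null) ? null : fields.get(extendedFieldName);
        if (serviceName == null) {
            throw new IllegalStateException(
                    "Partner " + senderId + " has no extended field " + extendedFieldName);
        }
        UnaryOperator<String> service = mappingServices.get(serviceName);
        if (service == null) {
            throw new IllegalStateException("Unknown mapping service: " + serviceName);
        }
        return service.apply(sourceDoc);
    }
}
```

For instance, after `setExtendedField("acme", "orderMapping", "acme.map:csvToCanonical")` and registering a service under that name, `handle("acme", "orderMapping", doc)` runs Acme's mapping without the handler containing any Acme-specific logic.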

We did a custom delivery service to move documents to the Enterprise Server, when that was desired.

Thus, we had an architecture where every document was submitted to TN, using TN as a sort of broker. Docs are never submitted directly to B2B services by outside processes (except when necessary to “preprocess” in order to get to TN) and docs are never directly sent from B2B services to anywhere other than TN.

Regarding the silver bullet, I agree it can be discouraging. You’ll note that all those things are still with us! I think it’s a matter of viewing things as evolutionary and as being the natural progression of things–client/server led to more powerful/flexible ERPs which led to the need to tie things together more quickly which led to EAI which led to the desire for standards which led to web services and so on.

IMO, web services won’t supplant EAI any more than any other RPC mechanism would. Direct RPC connections are often the right way to go but the facilities provided by EAI tools will not be/cannot be replaced by web services. The “aggregate application” model, in which some umbrella logic exists that does not/cannot be hosted in one of the participating applications, assures this.

Of course, that’s just my opinion. I’m probably wrong. :wink:

I am still trying to digest all the above… I am comparatively new to TN…

in this scenario…
partner formats → one canonical format → backend format

why isn’t it a good idea to convert from partner format to canonical format in B2B? once it’s converted then pass it on to TN to manage… (sorry for such a dumb question :expressionless: )

can somebody explain, please.

Thanks.

Ultimately, that’s exactly what you should do, but via TN.

The reason you want to pass the partner format directly to TN is so you can record it (and restart, log errors, etc).

Partner doc → TN → recorded → passed to service for conversion to canonical format

Canonical format → TN → recorded → passed to service for conversion to backend format

Backend format → TN → recorded → delivered using ftp, http or custom delivery service (written to DB, passed to wM Enterprise, written to MQSeries queue, etc.)

TN does some important work, but none of it pertains to transformation directly. It simply figures out what services to invoke or where to deliver a doc based on doc contents and processing rules. The B2B services do all the transformation work.

To sum up: use TN for all routing, use B2B services for transformation

This is similar in concept to wM Enterprise–all events are routed by the broker. All transformations are done not by the broker but by the agents and adapters.
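The split between routing and transformation can be sketched as below. This is only a toy model under stated assumptions: `RoutingHub` and `Rule` are invented names, documents are plain strings, a rule is keyed by document type alone (real processing rules also look at sender, receiver and other attributes), and "recording" is just a list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical hub: it records and routes, but never transforms anything itself.
class RoutingHub {

    // A rule names the service to invoke and the doc type of what it returns.
    static final class Rule {
        final String targetDocType;
        final UnaryOperator<String> service;

        Rule(String targetDocType, UnaryOperator<String> service) {
            this.targetDocType = targetDocType;
            this.service = service;
        }
    }

    private final Map<String, Rule> rulesByDocType = new HashMap<>();
    private final List<String> recorded = new ArrayList<>();

    void addRule(String docType, String targetDocType, UnaryOperator<String> service) {
        rulesByDocType.put(docType, new Rule(targetDocType, service));
    }

    // Record the document, then hand it to the matching service and resubmit
    // the result. A doc type with no rule is treated as ready for delivery.
    // Note: rules that form a cycle would recurse forever in this sketch.
    String submit(String docType, String content) {
        recorded.add(docType);
        Rule rule = rulesByDocType.get(docType);
        if (rule == null) {
            return content;
        }
        return submit(rule.targetDocType, rule.service.apply(content));
    }

    List<String> recorded() {
        return recorded;
    }
}
```

With a rule from the partner format to the canonical format and another from the canonical format to the backend format, one `submit` call records all three versions of the document, and all the actual conversion work lives in the services.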