Parsing Excel File (.xls)

Hi All,

Can anyone help me in parsing an Excel File(.xls) which consists of multiple records.

I have created a schema but when I tried to run with the file, it is showing “No Valid records Found”. I have defined the Flat File Definition with record delimiter as “carriage return line feed” and field delimiter as “Tab” but could not solve my problem.

I was told to write a generic java service which parses the Xcel file with any number of columns. It would be great and helpful if anyone can provide me a solution.

Many Thanks in Advance.

Thanks & Regards,
Ravikiran.

I believe using FlatFile Adapter ie FFSchema editor you can parse only (.csv) files

record delimiter as “new line” and field delimiter as “,” …If possible ask source team if they can send .csv files (flat file) instead.

Use the search functionality of this forum keyword on “excel files”

HTH,
RMg

There is a significant difference between an “Excel” file and a CSV file. Parsing an .xls/.xlsx file will need the help of code libraries to read them (there a several available). A CSV file, while editable by Excel is not really an “Excel” file. It is a text file following a particular format. It is CSV files that be parsed by the flat file services within IS, not Excel files.

1 Like

Yes agreed Rob…just putting .csv option also here…

Also there are serveral discussions on this Excel file processing topic,a search on this forum could guide in right direction.

HTH,
RMG

Right. I know you know the difference. Was just making sure that Ravikiran was also aware.

Hi All,

Thanks for the solutions given for this problem but the partners were not ready to send .csv files instead of .xls file as .csv file does not support UTF8 characters. They are sending some chinese characters which .csv doesnt support.

Please let me know more on this.

Thanks & Regards,
Ravikiran.

A .csv file can use any character encoding desired. The issue here may be in getting Excel to save the file such that a proper character encoding can be specified/used.

With Excel 2007 I was able to save as “Unicode text”, which saved the sheet in tab-delimited form and the multi-byte characters were properly preserved (saves using UTF-16). Perhaps someone can figure out how to have it save using commas instead of tabs for the delimiters.