Parsing Excel-2007-Format (xlsx)

Hi,
i am trying to parse the new xlsx-Excel-Format by using a slightly changed version of the MSExcelDocumentToRecord-Java-Service and relying on the new org.apache.poi-3.5.beta3 package. Unfortunately it doesn’t work. The code looks like this:

// Get file name from pipeline

IDataCursor pipelineCursor = pipeline.getCursor();
String file_name = “”;
file_name = IDataUtil.getString( pipelineCursor, “file_name” );
String date_format = null;
date_format = IDataUtil.getString( pipelineCursor, “date_format” );
String time_format = null;
time_format = IDataUtil.getString( pipelineCursor, “time_format” );
String number_format = null;
//number_format = IDataUtil.getString( pipelineCursor, “number_format” );

DecimalFormat df = (DecimalFormat)DecimalFormat.getInstance(Locale.GERMAN);
df.applyPattern(".########");

if (date_format == null)
{
date_format = “yyyy-MM-dd”;
}
SimpleDateFormat dateFormat = new SimpleDateFormat(date_format);

if (time_format == null)
{
time_format = “HH:mm:ss”;
}
SimpleDateFormat timeFormat = new SimpleDateFormat(time_format);

pipelineCursor.destroy();

IData dataRows[] = null;
IData excelWorksheet = IDataFactory.create();
int row_cnt = 0;

try
{
XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(file_name));
XSSFSheet sheet = (XSSFSheet) wb.getSheetAt(0);
XSSFRow row = null;
XSSFCell cell = null;



// Read Excel data and create dynamic record based on fileds

row_cnt = sheet.getLastRowNum();
dataRows = new IData[row_cnt+1];


// read and process rows
for (int i = 0; i <= row_cnt; i++)
{
// process row i
row = (XSSFRow) sheet.getRow(i);

// create instance of data row representing this row
dataRows[i] = IDataFactory.create();
IDataCursor dataRowCursor = dataRows[i].getCursor();
if ((row!=null) && (row.getLastCellNum()>0))
{
// create and fill array for values
String[] cellValues = new String[row.getLastCellNum()];
int isRowEmpty = 0;
for (int j = 0; j < row.getLastCellNum(); j++)
{
cell = (XSSFCell) row.getCell((short)j);
if (cell != null)
{
int type = cell.getCellType();
switch(type)
{
case XSSFCell.CELL_TYPE_STRING:
cellValues[j]=cell.getRichStringCellValue().getString();
if (cellValues[j].length() > 0) {
isRowEmpty = 1;
}
break;
case XSSFCell.CELL_TYPE_NUMERIC:
if (HSSFDateUtil.isCellDateFormatted(cell))
{
java.util.Date dateValue = cell.getDateCellValue();

Integer year = dateValue.getYear(); // or getYear
if (year != -1)
{
cellValues[j]=dateFormat.format(dateValue);
}
else
{
cellValues[j]=timeFormat.format(dateValue);
}
isRowEmpty = 1;
}
else
{
//cellValues[j]=Double.toString(cell.getNumericCellValue());
cellValues[j]=df.format(cell.getNumericCellValue());
isRowEmpty = 1;
}
break;
case XSSFCell.CELL_TYPE_BLANK:
cellValues[j]="";
}
}
}
// convert array to data row
IDataUtil.put(dataRowCursor, “cells”, cellValues);
if (isRowEmpty == 0)
{
IDataUtil.put(dataRowCursor, “rowEmpty”, “true”);
}
else
{
IDataUtil.put(dataRowCursor, “rowEmpty”, “false”);
}

}
dataRowCursor.destroy();
}


}
catch (Exception e)
{
throw new ServiceException(e);
}

// Assemble generated rows in excelWorksheet object
IDataCursor excelWorksheetCursor = excelWorksheet.getCursor();
IDataUtil.put(excelWorksheetCursor, “dataRows”, dataRows);
excelWorksheetCursor.destroy();

// put data into pipeline
IDataCursor pipelineCursorOut = pipeline.getCursor();
IDataUtil.put(pipelineCursorOut,“excelWorksheet”, excelWorksheet );
IDataUtil.put(pipelineCursorOut, “row_cnt”, row_cnt);
pipelineCursorOut.destroy();

Does anybody have an idea how to solve this problem. Many Thanks in advance!

With best Regards,

Benedikt

Thanks for the code sample. However, you’ll probably get better assistance if you describe the issue. “It doesn’t work” doesn’t help too much.

Mark

Thanks for the quick answer. I allready found the solution for the problem (using at the parts where eclipse was complaining about incompatible conversion from hssf to xssf a cast for xssf) and it work. By the way, the next time I will formulate my request more comprehensible for other users (typical greenhorn mistake I guess :))

Hi,

I am a newbie to webmethods( 2008 graduate) and I am given a task to parse the xlsx(Excel-2007) format files. I see that here at our company webMethods is used as the integration layer. I have checked the code given in the below link…
http://www.wmusers.com/forum/showthread.php?t=16439&highlight=Excel+parsing

However I couldn’t really build a service using that. Can anyone provide some help in building a service that can parse the 2007 excel files.

Thanks in advance

Hi,

we used the code (posted before) in a JavaService. In the beginning we make a differentiation between the excel-format “xls” and the excel-2007-format (“xlsx”). For the xls-format the code with the class-hssf should work. For the excel-2007-format you can use the same code. Only replace the hssf-class with the xssf-class. This works. Hope, this short description can help you.

with best regards,

benedikt

Thank a lot for the reply…

I am trying to use the same code what you have used in your Java API. Please check below for all the info on what I am doing

Inputs #
file_name(declared as string)
date_format(declared as string)
time_format(declared as string)
number_format(declared as string)

Outputs#
dataRows(declared as object)
excelWorksheet (declared as stringList)
row_cnt(declared as string)

Imports #
java.io.*
org.apache.poi.poifs.filesystem.*
org.apache.poi.hssf.usermodel.*
java.util.*
java.text.*

Code # (Same code what you have posted in your thread and no change was made)

I would be really thankful to you if you can guide me in developing this Java API.

Thanks again

Hi,

unfortunately I am not allowed to paste here any more code from our company. So as hint: You also have to include the xssf-java-class (you probably have to download that class at apache-framework - project “poi”). Also include the parent-class of hssf and xssf: the ss-class. Then you can use the pasted code and make before that a if-else-statement to differentiate between xls- and xlsx-format. Use the code in both cases, just use xssf instead of hssf in the case of the xlsx-format. For the method isCellDateFormated which doesn’t exist yet fot the xssf-class use the parent-class of both classes the ss-class. This worked for us. I’m not a good java-programmer, so there might be more elegant ways to solve the problem. Good luck!

Benedikt

Hi,

Thanks for your help!!!..I will try to write the API using the tips you have given me. I really appreciate you for taking time and providing me the information what I have requested for…

Thanks again

Ram

Hi Benedikt,

Thanks for sharing your knowledge on this matter. You’ve done a great service to the community. I do have one silly question. Where did you place the poi-3.2-FINAL*.jar, poi-contrib-3.2-FINAL*.jar, and poi-scratchpad*.jar files? Did you put it in /IntegrationServer/lib/jar path? Did you have to restart the IS in order for the java service to work?

Hi,
yes I put the java classes into this path and then I restarded the IS.

With Best Regards,

Benedikt

Hi ,

Can anyone provide the final list of files (Jar files) and entire package if possible which got successfully executed without any errors we have a similar requirement.

Thanks in Advance.

Thanks,
Madhuri

Hi Madhuri,
the following java classes need to be included:

org.apache.poi.ss.usermodel*
org.apache.poi.xssf.usermodel*
java.util.*
java.text*
java.lang*
java.math*
java.io*
java.util*

With best Regards,

Benedikt

Hi Benedikt,

Thanks alot for your sharing the Jar files list. If you dont mind can you please share the code that contains the reading of Excel and then converting it to the app records structure again .

I have searched in the forums and found this link.Could you please let me knwo if you have used the same code

http://www.wmusers.com/software/browse.shtml?keys=excel&SUBMIT.x=7&SUBMIT.y=12

Please let me know if you need some more details to explain my problem.

Thanks & Regards,
Ramani