Jakarta-poi

Alfonso59 · September 26, 2003, 6:41pm

Hello guys,
this is my question:
can I extract document content directly using the Jakarta POI API? Have you done it?

Alfonso.

Stuart_Fyffe-Collins · September 26, 2003, 9:59pm

Hi,

The non-XML indexer internally uses Jakarta POI to extract meta-data (and not content) from MS documents. The meta-data is stored as an XML document with the same internal id (ino:id) as the non-XML document.

Hope this helps.

Stuart Fyffe-Collins
Software AG (UK) Ltd.

Alfonso59 · September 29, 2003, 9:26pm

hi,
I’m trying the Tamino non-XML indexer, but I’m really only interested in the “generated” by the indexer.

Alfonso.

Guest · September 30, 2003, 11:49am

Hi Alfonso,

The nonXMLIndexer generates . For Excel and MS Word content, POI is used internally. Of course all formatting disappears, but you can run text queries, for example "Find all Word documents containing ‘Tamino’".

Regards,
Martin

Alfonso59 · October 2, 2003, 3:38pm

Hi Martin,
I only want the content, if possible, without using the indexer.

IndexedDocument.getContent();

Regards,
Alfonso

Guest · October 2, 2003, 3:57pm

Hi Alfonso,

Then you have to write your own indexer (or content extractor). What do you want to do with that content?

regards,
Martin
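
For anyone who does want to bypass the indexer, here is a minimal sketch of what "your own content extractor" plus a simple text query could look like. The names `ContentExtractor`, `PLAIN_TEXT` and `findContaining` are hypothetical and are not part of Tamino or POI. The stand-in extractor only handles plain-text bytes; for Word or Excel files you would have to put a POI-based implementation behind the same interface.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ContentSearchSketch {

    /** Hypothetical interface: turns the raw bytes of a document into plain text. */
    interface ContentExtractor {
        String extract(byte[] raw);
    }

    /** Stand-in extractor for plain-text input (assumption: UTF-8 encoded). */
    static final ContentExtractor PLAIN_TEXT =
            raw -> new String(raw, StandardCharsets.UTF_8);

    /**
     * Returns the names of all documents whose extracted text contains the
     * given term, ignoring case. This is a naive linear scan, not an index.
     */
    static List<String> findContaining(Map<String, byte[]> docs,
                                       ContentExtractor extractor,
                                       String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, byte[]> doc : docs.entrySet()) {
            String text = extractor.extract(doc.getValue());
            if (text.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample documents, keyed by name; insertion order is preserved.
        Map<String, byte[]> docs = new LinkedHashMap<>();
        docs.put("notes-1", "Tamino stores XML documents".getBytes(StandardCharsets.UTF_8));
        docs.put("notes-2", "Nothing relevant here".getBytes(StandardCharsets.UTF_8));

        System.out.println(findContaining(docs, PLAIN_TEXT, "tamino"));
    }
}
```

This scans every document on each query, so it only illustrates the idea; it gives you none of the indexing that the nonXMLIndexer provides.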

Alfonso59 · October 2, 2003, 4:01pm

Full text search

Guest · October 2, 2003, 5:22pm

That's what the nonXMLIndexer is designed for.

Regards,
Martin

Alfonso59 · October 2, 2003, 8:16pm

My problem is the metadata (XML) that gets added for the content; I only want to use one collection.

Guest · October 2, 2003, 8:37pm

The nonXML indexer writes the metadata (XML for properties AND content) to the same collection (even the same schema) as the document itself.

Regards,
Martin
