Hey Julian,
Thanks for your detailed answer! I would like to use this library because it sounds like a comprehensive one, with parsers for many file formats. I also tried the jar file downloaded from that link and placed it in the lib folder of the Lucee WEB-INF. Then I used this code, converted from an example, to run a clean test (this is the small test; the bigger one is at the bottom):
handler = createObject("java", "org.apache.tika.sax.BodyContentHandler");
metadata = createObject("java", "org.apache.tika.metadata.Metadata");
inputstream = createObject("java", "java.io.FileInputStream").init(createObject("java", "java.io.File").init('C:\lucee\tomcat\webapps\ROOT\test\dummy.pdf'));
pcontext = createObject("java", "org.apache.tika.parser.ParseContext");
pdfparser = createObject("java", "org.apache.tika.parser.AutoDetectParser");
pdfparser.parse(inputstream, handler, metadata, pcontext);
writeDump(handler.toString());
Did you use the single jar file, though? I tried both the server jar and the app jar. When I writeOutput the result, I get null. Could that be caused by some hidden error? I tried it on the server and on my own clean local Lucee install, with the same result.
EDIT
I’ve found this link to the JavaLoader you were talking about: link. So should I add this to the components folder and use this example? link. It would be very helpful if you could explain that; we’ve been having some trouble with jar files lately.
(This is the bigger test. It is sloppy, but I think it should work.)
try {
    httpGet = createObject("java", "org.apache.http.client.methods.HttpGet").init('[local pdf]');
    response = createObject("java", "org.apache.http.impl.client.DefaultHttpClient").execute(httpGet);
    entity = response.getEntity();
    input = createObject("java", "java.io.InputStream");
    if (entity !== null)
    {
        try {
            input = entity.getContent();
            handler = createObject("java", "org.apache.tika.sax.BodyContentHandler");
            metadata = createObject("java", "org.apache.tika.metadata.Metadata");
            parser = createObject("java", "org.apache.tika.parser.AutoDetectParser");
            parseContext = createObject("java", "org.apache.tika.parser.ParseContext");
            parser.parse(input, handler, metadata, parseContext);
            map.put("text", handler.toString().replaceAll("\n|\r|\t", " "));
            map.put("title", metadata.get(createObject("java", "org.apache.tika.metadata.TikaCoreProperties").TITLE));
            map.put("pageCount", metadata.get("xmpTPg:NPages"));
            // map.put("status_code", response.getStatusLine().getStatusCode() + "");
            f_response = metadata.get(createObject("java", "org.apache.tika.metadata.TikaCoreProperties").SOURCE);
            writeOutput(parser);
        }
        catch (any e) {
            f_response = "#e.message# 1";
        }
        finally {
            if (input !== null) {
                try {
                    input.close();
                } catch (any e) {
                    f_response = "#e.message# 2";
                }
            }
        }
    }
    else
    {
        f_response = "Object is null";
    }
}
catch (any e) {
    f_response = e.message;
}