In this tutorial, you will learn how to set up a Java project with OpenNLP in Eclipse. The process should be similar in other IDEs as well.

Apache OpenNLP is a Java library used for common Natural Language Processing tasks such as sentence detection, tokenization, part-of-speech tagging, named entity recognition, chunking, parsing, and language detection. To use it in Eclipse, you need a Java project, the OpenNLP library jars or a Maven dependency, and the required model files for the NLP task you want to run.

Prerequisites for OpenNLP Java project setup in Eclipse

Before creating the OpenNLP Java project in Eclipse, make sure the following items are available on your system.

  • Java Development Kit installed and configured.
  • Eclipse IDE installed with Java development support.
  • Apache OpenNLP binary package or Maven dependency.
  • OpenNLP model files such as en-sent.bin for the example used in this tutorial.
  • Basic understanding of creating and running a Java class in Eclipse.

Setup OpenNLP Java Project

Follow these steps.

1. Create OpenNLP Java project in Eclipse.

Create a Java project in Eclipse (open Eclipse -> File menu -> New -> Project -> Java -> Java Project).

Provide a project name (for example, OpenNLPJavaTutorial) and click “Finish”.

After the project is created, Eclipse should show the project name in the Package Explorer. If the project does not show a src folder, create one before adding Java classes.

2. Download Apache OpenNLP jar files for Eclipse

Download the OpenNLP jar files from http://redrockdigimark.com/apachemirror/opennlp/.

At the time of writing this tutorial, opennlp-1.7.1 is the latest version, and the list of versions looks like the picture below.

Figure: OpenNLP version links

Click on opennlp-1.7.1/. We need the bin package, because it contains the library (.jar) files.

Figure: OpenNLP bin package

Click on apache-opennlp-1.7.1-bin.zip to download it.

Once the zip file is downloaded, extract its contents, then copy the lib folder and paste it into the project as shown in the picture below.

Figure: lib folder in the OpenNLP Java project

The lib folder should contain the following jar files:
aopalliance-repackaged-2.5.0-b30.jar
grizzly-framework-2.3.28.jar
grizzly-http-2.3.28.jar
grizzly-http-server-2.3.28.jar
hk2-api-2.5.0-b30.jar
hk2-locator-2.5.0-b30.jar
hk2-utils-2.5.0-b30.jar
hppc-0.7.1.jar
jackson-annotations-2.8.4.jar
jackson-core-2.8.4.jar
jackson-databind-2.8.4.jar
jackson-jaxrs-base-2.8.4.jar
jackson-jaxrs-json-provider-2.8.4.jar
jackson-module-jaxb-annotations-2.8.4.jar
javassist-3.20.0-GA.jar
javax.annotation-api-1.2.jar
javax.inject-2.5.0-b30.jar
javax.ws.rs-api-2.0.1.jar
jcommander-1.48.jar
jersey-client-2.25.jar
jersey-common-2.25.jar
jersey-container-grizzly2-http-2.25.jar
jersey-entity-filtering-2.25.jar
jersey-guava-2.25.jar
jersey-media-jaxb-2.25.jar
jersey-media-json-jackson-2.25.jar
jersey-server-2.25.jar
morfologik-fsa-2.1.0.jar
morfologik-fsa-builders-2.1.0.jar
morfologik-stemming-2.1.0.jar
morfologik-tools-2.1.0.jar
opennlp-brat-annotator-1.7.1.jar
opennlp-morfologik-addon-1.7.1.jar
opennlp-tools-1.7.1.jar
opennlp-uima-1.7.1.jar
osgi-resource-locator-1.0.1.jar
validation-api-1.1.0.Final.jar

For newer OpenNLP releases, the exact jar names and versions can be different. The important jar for most basic examples is opennlp-tools, along with any dependency jars required by the downloaded release.

3. Add OpenNLP jar files to Eclipse Java build path

Add these jars to the build path: Project -> Properties -> Java Build Path -> Libraries -> Add JARs -> select all the jars in the lib folder -> click “Apply” -> click “OK”.

After adding the jars, expand the project in Eclipse and check the build path entries. If the jars are added correctly, import statements such as opennlp.tools.sentdetect.SentenceDetectorME should resolve without errors.
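The build path can also be checked at run time. The following is a minimal sketch, not part of OpenNLP: the class name ClasspathCheck and the method isOnClasspath are hypothetical names chosen for illustration. It only asks the class loader whether a class can be located, using the SentenceDetectorME class name mentioned above.

```java
// Hypothetical helper; ClasspathCheck is an illustrative name, not an OpenNLP class.
final class ClasspathCheck {

    // Returns true if the named class can be located on the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "opennlp.tools.sentdetect.SentenceDetectorME";
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found on the classpath"));
    }
}
```

If the class is reported as not found, the jars are most likely missing from the Java Build Path of the project that is being run.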

The Apache OpenNLP project provides pre-trained models for several Natural Language Processing tasks at http://opennlp.sourceforge.net/models-1.5/. The subsequent tutorials refer to model files available at this location, so bookmark the link for quick access.

The OpenNLP libraries are now set up in the Java project. Let's try sentence detection using SentenceDetectExample.java, which also needs a model file.

4. Place OpenNLP model file in the Java project

Download the “en-sent.bin” model file and place it in the project. The final project structure should match the structure shown in the picture below.

Figure: OpenNLP Java project structure

In this example, en-sent.bin is loaded from the project working directory. If you place the model inside another folder, update the file path in the Java code. For example, if the model is inside a folder named models, the file path should be changed accordingly.

InputStream is = new FileInputStream("models/en-sent.bin");
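When the model lives in a subfolder, a wrong path is easier to diagnose if the error shows the absolute location that was checked. The sketch below assumes a hypothetical helper named ModelFiles with a method open; neither is part of OpenNLP, and it uses only the standard java.nio.file API.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; ModelFiles and open() are illustrative names, not OpenNLP API.
final class ModelFiles {

    // Opens a model file from the given folder. If the file is missing, the
    // exception message contains the absolute path that was actually checked.
    static InputStream open(String folder, String fileName) throws IOException {
        Path path = Paths.get(folder, fileName).toAbsolutePath();
        if (!Files.isRegularFile(path)) {
            throw new FileNotFoundException("Model file not found: " + path);
        }
        return Files.newInputStream(path);
    }
}
```

With this helper, the stream could be obtained with ModelFiles.open("models", "en-sent.bin") and passed to new SentenceModel(is) in the same way as the FileInputStream above.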

Alternative OpenNLP Eclipse setup using Maven

If you prefer Maven, you do not have to manually copy every jar file into the Eclipse project. Create a Maven project in Eclipse and add the OpenNLP tools dependency in pom.xml. Use the OpenNLP version that matches your project requirements.

<dependency>
    <groupId>org.apache.opennlp</groupId>
    <artifactId>opennlp-tools</artifactId>
    <version>2.5.0</version>
</dependency>

With Maven, Eclipse downloads the dependency and updates the classpath automatically. You still need to download and place the model file, because model files are separate from the Java library dependency.
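One possible place for the model in a Maven project is src/main/resources, which Maven puts on the classpath. This is an assumption about the project layout and not something this tutorial's example relies on. Under that assumption, the model could be opened as a classpath resource instead of a file. ClasspathModels and open are hypothetical names used only for this sketch.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; assumes the model was copied to src/main/resources.
final class ClasspathModels {

    // Opens a resource from the classpath, or fails with a clear message.
    static InputStream open(String resourceName) throws IOException {
        InputStream in = ClasspathModels.class.getClassLoader().getResourceAsStream(resourceName);
        if (in == null) {
            // getResourceAsStream returns null when the resource cannot be found
            throw new FileNotFoundException("Resource not on classpath: " + resourceName);
        }
        return in;
    }
}
```

A call such as ClasspathModels.open("en-sent.bin") would then replace the FileInputStream in the example, and the program no longer depends on the working directory.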

Run configuration for OpenNLP Java class in Eclipse

To run an OpenNLP Java class in Eclipse, right-click the class file that contains the main() method and select Run As -> Java Application. Eclipse creates a run configuration automatically for that class.

If the program cannot find en-sent.bin, open Run -> Run Configurations, select the Java application, and check the working directory under the Arguments tab. The model file path in the Java code is resolved relative to this working directory.
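To see which working directory a run configuration actually uses, a small diagnostic class can print it together with the location that a relative file name resolves to. This is a sketch using only the Java standard library. WorkingDirCheck and describe are hypothetical names.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic class; not part of OpenNLP or Eclipse.
final class WorkingDirCheck {

    // Describes where a relative file name resolves to under a base directory
    // and whether something exists at that location.
    static String describe(Path baseDir, String relativeName) {
        Path resolved = baseDir.resolve(relativeName).normalize();
        return resolved + (Files.exists(resolved) ? " (exists)" : " (missing)");
    }

    public static void main(String[] args) {
        // "user.dir" holds the working directory of the running JVM
        Path workingDir = Paths.get(System.getProperty("user.dir"));
        System.out.println("Working directory: " + workingDir);
        System.out.println("en-sent.bin resolves to: " + describe(workingDir, "en-sent.bin"));
    }
}
```

Running this class with the same run configuration as the OpenNLP example shows whether the model file is where the program looks for it.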

Example Java Project with OpenNLP

Let us run the example SentenceDetectExample.java to check that the setup works.

SentenceDetectExample.java

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;

/**
 * Detects sentences in a paragraph using the pre-trained en-sent.bin model.
 *
 * @author tutorialkart
 */
public class SentenceDetectExample {

	public static void main(String[] args) {
		try {
			new SentenceDetectExample().sentenceDetect();
		} catch (IOException e) {
			e.printStackTrace();
		}
	}

	public void sentenceDetect() throws IOException {
		String paragraph = "Apache openNLP supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning.";

		// refer to model file "en-sent.bin", available at http://opennlp.sourceforge.net/models-1.5/
		// try-with-resources closes the stream even if loading the model fails
		try (InputStream is = new FileInputStream("en-sent.bin")) {
			// load the model
			SentenceModel model = new SentenceModel(is);

			// create the sentence detector from the model
			SentenceDetectorME sdetector = new SentenceDetectorME(model);

			// detect sentences in the paragraph
			String[] sentences = sdetector.sentDetect(paragraph);

			// print the detected sentences to the console
			for (String sentence : sentences) {
				System.out.println(sentence);
			}
		}
	}
}

When SentenceDetectExample.java is run, the console output is:

Apache openNLP supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution.
These tasks are usually required to build more advanced text processing services.
OpenNLP also includes maximum entropy and perceptron based machine learning.

The OpenNLP Java project is now set up successfully in Eclipse.

Common Eclipse errors while setting up OpenNLP Java project

If the OpenNLP Java example does not run in Eclipse, check the following setup points before changing the program logic.

  • OpenNLP imports not resolved: confirm that opennlp-tools and its dependency jars are added to the Java Build Path.
  • Model file not found: confirm that en-sent.bin is in the working directory used by the Eclipse run configuration.
  • Class not found errors: add all jars from the extracted lib folder, not only one jar file.
  • Wrong Java project type: use a Java Project or Maven Project, not a generic project without Java build support.
  • Different OpenNLP version: check the imports and exception classes required by the version of OpenNLP used in your project.
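For the class-not-found and unresolved-import cases in the list above, it can help to print which jars the running program really has on its classpath. The sketch below reads the standard java.class.path system property. ClasspathJars and find are hypothetical names, and matching on the text "opennlp" in the file name is an assumption about how the jars are named.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic class; not part of OpenNLP.
final class ClasspathJars {

    // Returns the classpath entries whose file name contains the given text.
    static List<String> find(String classPath, String namePart) {
        List<String> matches = new ArrayList<>();
        // Entries are separated by ':' on Linux/macOS and ';' on Windows
        for (String entry : classPath.split(Pattern.quote(File.pathSeparator))) {
            if (new File(entry).getName().contains(namePart)) {
                matches.add(entry);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> jars = find(System.getProperty("java.class.path"), "opennlp");
        System.out.println(jars.isEmpty() ? "No OpenNLP jars on the classpath" : "OpenNLP jars: " + jars);
    }
}
```

An empty result when run from Eclipse suggests that the jars were copied into the project but never added to the Java Build Path.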

Recommended OpenNLP project structure in Eclipse

For small learning examples, the model file can be kept directly in the project root as shown above. For a cleaner project, keep source code, libraries, and model files in separate folders.

OpenNLPJavaTutorial
├── src
│   └── SentenceDetectExample.java
├── lib
│   └── opennlp-tools-1.7.1.jar
└── models
    └── en-sent.bin

If you use the structure above, remember to update the Java file path to models/en-sent.bin. Keeping model files in a separate folder becomes helpful when you later add tokenization, name finder, part-of-speech tagging, or parsing examples.
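Once several models share one folder, a quick listing shows which ones are available before any of them is loaded. This is a minimal sketch with the standard java.nio.file API. ModelFolder and listModels are hypothetical names, and the *.bin pattern assumes the models keep the .bin extension used by en-sent.bin.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of OpenNLP.
final class ModelFolder {

    // Returns the sorted names of all *.bin files directly inside the folder.
    static List<String> listModels(Path folder) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder, "*.bin")) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

For the structure above, ModelFolder.listModels(Paths.get("models")) would be expected to return a list containing en-sent.bin, provided the program runs with the project root as its working directory.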

QA checklist for OpenNLP setup tutorial in Eclipse

  • Verify that the Eclipse project is created as a Java Project or Maven Project.
  • Confirm that OpenNLP jars are visible under Java Build Path libraries.
  • Check that en-sent.bin is placed where the Java code expects it.
  • Run SentenceDetectExample.java and compare the console output with the expected three sentences.

OpenNLP Java project setup FAQs for Eclipse

How do I add OpenNLP jars to a Java project in Eclipse?

Copy the extracted OpenNLP lib folder into the Eclipse project. Then open Project Properties, go to Java Build Path, choose Libraries, click Add Jars, select the jar files from the project, and apply the changes.

Can I use Maven instead of manually adding OpenNLP jars in Eclipse?

Yes. Create a Maven project and add the org.apache.opennlp:opennlp-tools dependency in pom.xml. Maven handles the Java library dependency, but you still need to provide the OpenNLP model file used by your program.

Where should I keep OpenNLP model files in an Eclipse project?

For a beginner example, you can keep the model file in the project root. For a more organized project, keep model files in a folder such as models and update the file path in the Java code.

Why does Eclipse show errors for OpenNLP import statements?

Eclipse shows import errors when the OpenNLP jars are not on the project build path or when the project has not refreshed after adding libraries. Add the jars to the Java Build Path, apply the changes, and refresh the project.

How do I run an OpenNLP Java class in Eclipse?

Open the Java class that contains the main() method, right-click inside the editor, and select Run As -> Java Application. If model loading fails, check the run configuration working directory and the model file path.

Conclusion

In this OpenNLP tutorial, we have seen how to set up an OpenNLP Java project in Eclipse. In the next OpenNLP tutorials, we shall look into the following topics.