Implementing java based text extractors as web APIs (currently only Boilerpipe & Goose)

tomazk/Java-Text-Extractor-API

Java Text Extractor API

Web API for Java-based text extractors, implemented using the Play framework.

Author

Tomaž Kovačič <tomaz.kovacic@gmail.com>

Extractors supported

  • Boilerpipe
  • Goose

API Documentation

Note: All parameters must be sent in the request body encoded as application/x-www-form-urlencoded.

Boilerpipe API

method: POST

endpoint: http://yourdomain/boilerpipe/extract/

params:

  • extractorType : (article|default|sentence)
  • rawHtml : html content

JSON response format:

{
        "result": RESULT_TEXT,
        "status": (OK|ERROR),
        "errorMsg": ERROR_MESSAGE (optional)
}
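
As an illustration of the call described above, here is a minimal client sketch using the JDK's built-in `java.net.http` client (Java 11+). The class name `BoilerpipeClient`, its helper methods, and the base URL passed on the command line are assumptions made for this example; they are not part of this project.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the Boilerpipe endpoint; not shipped with this repository.
public class BoilerpipeClient {

    // Percent-encode a single value for a form-urlencoded body.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Build the request body from the two documented parameters.
    static String buildBody(String extractorType, String rawHtml) {
        return "extractorType=" + encode(extractorType)
             + "&rawHtml=" + encode(rawHtml);
    }

    // Build a POST request against <baseUrl>/boilerpipe/extract/.
    static HttpRequest buildRequest(String baseUrl, String extractorType, String rawHtml) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/boilerpipe/extract/"))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(buildBody(extractorType, rawHtml)))
            .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No server given: nothing is sent over the network.
            System.out.println("usage: java BoilerpipeClient.java <baseUrl>");
            return;
        }
        String html = "<html><body><p>Hello world</p></body></html>";
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(buildRequest(args[0], "article", html),
                  HttpResponse.BodyHandlers.ofString());
        // The body is expected to be the JSON document shown above.
        System.out.println(response.body());
    }
}
```

For example, `java BoilerpipeClient.java http://localhost:9000` would post a small HTML document with `extractorType=article`, assuming the application is running locally on port 9000.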

Goose API

method: POST

endpoint: http://yourdomain/goose/extract/

params:

  • rawHtml : html content

JSON response format:

{
        "result": RESULT_TEXT,
        "status": (OK|ERROR),
        "errorMsg": ERROR_MESSAGE (optional)
}
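
The sketch below shows one way a caller might post to the Goose endpoint and check the `status` field of the reply. It assumes `status` is returned as a JSON string, and it reads the field with a regular expression only to stay dependency-free; a real client would more likely use a JSON library. The class name `GooseClient` and its methods are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical client for the Goose endpoint; not shipped with this repository.
public class GooseClient {

    // Matches the "status" field, assuming it is serialized as a JSON string.
    private static final Pattern STATUS =
        Pattern.compile("\"status\"\\s*:\\s*\"(OK|ERROR)\"");

    // Returns "OK" or "ERROR", or null when no status field is found.
    static String readStatus(String json) {
        Matcher m = STATUS.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    // Build a POST request against <baseUrl>/goose/extract/ with the single rawHtml parameter.
    static HttpRequest buildRequest(String baseUrl, String rawHtml) {
        String body = "rawHtml=" + URLEncoder.encode(rawHtml, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/goose/extract/"))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No server given: nothing is sent over the network.
            System.out.println("usage: java GooseClient.java <baseUrl>");
            return;
        }
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(buildRequest(args[0], "<html><body><p>Hello</p></body></html>"),
                  HttpResponse.BodyHandlers.ofString());
        if ("OK".equals(readStatus(response.body()))) {
            System.out.println(response.body());
        } else {
            // On ERROR the optional errorMsg field should explain what went wrong.
            System.err.println("extraction failed: " + response.body());
        }
    }
}
```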

Dependencies

  • Play framework v1.1.1.

Licence

  • Everything that's not in the /lib/ directory is licenced under GPLv3

  • Jar packages in the /lib/ are licenced under their respective licence listed below:

Copyright (C) Tomaž Kovačič

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
