Alain Antone's Library
The boilerpipe library provides algorithms to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page.
The library already provides specific strategies for common tasks (for example: news article extraction) and may also be easily extended for individual problem settings.
Extraction is very fast (milliseconds), needs only the input document (no global or site-level information) and is usually quite accurate.
Boilerpipe is a Java library written by Christian Kohlschütter. It is released under the Apache License 2.0.
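To make the idea of stripping "clutter" from a single page more concrete, here is a minimal, self-contained sketch. It is not boilerpipe's algorithm or API: the class name `BoilerplateSketch`, its method names, and both thresholds are hypothetical and chosen only for illustration. It assumes a simple heuristic in which a block counts as main content when it has enough words and few of those words sit inside links.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative heuristic only; NOT boilerpipe's algorithm or API. */
final class BoilerplateSketch {
    // Hypothetical thresholds, picked for this example rather than tuned.
    private static final int MIN_WORDS = 10;
    private static final double MAX_LINK_DENSITY = 0.33;

    // Tags treated as block boundaries in this sketch.
    private static final Pattern BLOCK_BREAK = Pattern.compile(
        "(?i)</?(p|div|li|ul|ol|h[1-6]|td|tr|table|br|nav|header|footer|article|section)\\b[^>]*>");
    private static final Pattern ANCHOR = Pattern.compile("(?is)<a\\b[^>]*>(.*?)</a>");
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]+>");
    private static final Pattern SCRIPT_OR_STYLE =
        Pattern.compile("(?is)<(script|style)\\b.*?</\\1>");

    /** Counts whitespace-separated tokens; blank input yields 0. */
    static int countWords(String text) {
        String t = text.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    /** Removes all tags and collapses runs of whitespace. */
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll(" ").replaceAll("\\s+", " ").trim();
    }

    /** Fraction of a block's words that appear inside anchor elements. */
    static double linkDensity(String blockHtml) {
        int total = countWords(stripTags(blockHtml));
        if (total == 0) {
            return 1.0; // treat empty blocks as pure clutter
        }
        int linked = 0;
        Matcher m = ANCHOR.matcher(blockHtml);
        while (m.find()) {
            linked += countWords(stripTags(m.group(1)));
        }
        return (double) linked / total;
    }

    /** Returns the text of every block that passes both thresholds. */
    static List<String> extractBlocks(String html) {
        String cleaned = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        List<String> kept = new ArrayList<>();
        for (String block : BLOCK_BREAK.split(cleaned)) {
            String text = stripTags(block);
            if (countWords(text) >= MIN_WORDS && linkDensity(block) <= MAX_LINK_DENSITY) {
                kept.add(text);
            }
        }
        return kept;
    }

    /** Joins the kept blocks, one per line. */
    static String extractMainText(String html) {
        return String.join("\n", extractBlocks(html));
    }

    public static void main(String[] args) {
        String html = "<html><body>"
            + "<div><a href=\"/\">Home</a> <a href=\"/news\">News</a></div>"
            + "<p>The city council voted on Tuesday to approve a new budget "
            + "for road repairs and public transit.</p>"
            + "<p>Copyright 2011</p>"
            + "</body></html>";
        System.out.println(extractMainText(html));
    }
}
```

Like the library described above, this sketch looks only at the input document and needs no site-level information. A regular-expression pass is not a real HTML parser and does not decode entities, so it should be read as an illustration of the idea and not as a substitute for the library.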
Java's Cool (alias Javascool) is software designed for teaching the basics of programming in French high schools.
It was designed at the request of high school teachers. It lets students work with a programming macro-language based on the Java language.
For more details, see the available manuals and tutorials.
