Parses Microformats, RDFa, Microdata, RDF/XML, Turtle, N-Triples, JSON-LD and NQuads.
Download and install Any23: visit the Developers Site and the Documentation.
http://.../best/twitter.com/cygri
http://.../rdfxml/http://data.gov
http://.../ttl/http://www.w3.org/People/Berners-Lee/card
http://.../?uri=http://dbpedia.org/resource/Berlin
http://.../?format=nt&uri=http://dbpedia.org/resource/Berlin
HTTP GET requests can be made to IRIs of the shape
http://.../format/input-uri
The response is the input document converted to the desired output format.
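The URL shape above can be sketched in a few lines of Python. This is only an illustration: `SERVICE_BASE` is a placeholder rather than a real endpoint, and `compact_get_url` is a hypothetical helper name, not part of Any23.

```python
# Placeholder base address; substitute the address of an actual Any23 service.
SERVICE_BASE = "http://any23.example.org"


def compact_get_url(base: str, output_format: str, input_uri: str) -> str:
    """Build an IRI of the shape <base>/<format>/<input-uri>."""
    # Strip a trailing slash so the result never contains "//" after the base.
    return f"{base.rstrip('/')}/{output_format}/{input_uri}"


if __name__ == "__main__":
    print(compact_get_url(SERVICE_BASE, "ttl",
                          "http://www.w3.org/People/Berners-Lee/card"))
```

The resulting string can be passed to any HTTP client as the target of a GET request.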
HTTP GET requests can be made to the IRI http://.../ with the following query parameters:
Parameter | Description |
---|---|
uri | IRI of an input document. |
format | Desired output format; defaults to best. |
validation-mode | The validation level to be applied to the input. Possible values: none (no validation applied); validate (apply validation and produce a validation report if the annotate flag is enabled); validate+fix (apply validation, try to fix detected issues, and produce a validation report if the annotate flag is enabled). |
annotate | If specified, the output RDF will contain extractor-specific scope comments. Possible values: on/off. |
report | If specified, a full XML report is produced, containing extraction and validation issues in addition to the produced metadata. Possible values: on/off. |
openie | If specified, the Open Information Extraction (Open IE) system will be activated (default off). Possible values: on/off. |
The response is the input document converted to the desired output format.
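A query-parameter request of this kind could be assembled as follows. This is a minimal sketch: the base address is a placeholder, `form_get_url` is a hypothetical helper, and the parameter names are the ones listed in the table. Note that `urlencode` percent-encodes the input IRI and values such as `validate+fix`, which keeps the `+` from being read as a space.

```python
from urllib.parse import urlencode

# Placeholder base address; substitute the address of an actual Any23 service.
SERVICE_BASE = "http://any23.example.org"


def form_get_url(base, uri, output_format="best", options=None):
    """Build <base>/?format=...&uri=... plus any optional parameters."""
    params = {"format": output_format, "uri": uri}
    # Optional parameters, e.g. {"validation-mode": "validate+fix", "report": "on"}.
    params.update(options or {})
    return f"{base.rstrip('/')}/?{urlencode(params)}"


if __name__ == "__main__":
    print(form_get_url(SERVICE_BASE, "http://dbpedia.org/resource/Berlin", "nt"))
    print(form_get_url(SERVICE_BASE, "http://dbpedia.org/resource/Berlin",
                       options={"validation-mode": "validate+fix", "report": "on"}))
```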
HTTP POSTing a document body to http://.../format will convert the document to the specified output format. The media type of the input has to be specified in the Content-Type HTTP header. Depending on the servlet container, a Content-Length header specifying the length of the input document in bytes might also be required.
Typical media types for supported input formats are:
Input format | Media type |
---|---|
HTML | text/html |
RDF/XML | application/rdf+xml |
Turtle | text/turtle |
N-Triples | text/nt |
N-Quads | text/nq |
TriX | application/trix |
Example POST request:

    POST /rdfxml HTTP/1.0
    Host: example.com
    Content-Type: text/turtle
    Content-Length: 174

    @prefix foaf: <http://xmlns.com/foaf/0.1/> .

    [] a foaf:Person;
        foaf:name "John X. Foobar";
        foaf:mbox_sha1sum "cef817456278b70cee8e5a1611539ef9d928810e";
        .
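For readers who prefer a client-side view, a direct POST like this might be prepared with Python's standard library as sketched below. The base address is a placeholder and `build_direct_post` is a hypothetical helper. The sketch only constructs the request object and sets Content-Length explicitly, since some servlet containers may require it.

```python
from urllib.request import Request

# Placeholder base address; substitute the address of an actual Any23 service.
SERVICE_BASE = "http://any23.example.org"


def build_direct_post(base, output_format, document, media_type):
    """Prepare a POST of `document` to <base>/<format> without sending it."""
    data = document.encode("utf-8")
    return Request(
        f"{base.rstrip('/')}/{output_format}",
        data=data,
        headers={
            "Content-Type": media_type,
            # Length of the encoded body in bytes, not in characters.
            "Content-Length": str(len(data)),
        },
        method="POST",
    )


if __name__ == "__main__":
    turtle = '@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n[] a foaf:Person .\n'
    request = build_direct_post(SERVICE_BASE, "rdfxml", turtle, "text/turtle")
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send it to the service.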
A document body can also be converted by HTTP POSTing form data to http://.../ with the Content-Type HTTP header set to application/x-www-form-urlencoded. The following parameters are supported:
Parameter | Description |
---|---|
type | Media type of the input; see the table above. If not present, auto-detection will be attempted. |
body | Document body to be converted. |
format | Desired output format; defaults to best. |
validation-mode | The validation level to be applied to the input. Possible values: none (no validation applied); validate (apply validation and produce a validation report if the annotate flag is enabled); validate+fix (apply validation, try to fix detected issues, and produce a validation report if the annotate flag is enabled). |
annotate | If specified, the output RDF will contain extractor-specific scope comments. Possible values: on/off. |
report | If specified, a full XML report is produced, containing extraction and validation issues in addition to the produced metadata. Possible values: on/off. |
openie | If specified, the Open Information Extraction (Open IE) system will be activated (default off). Possible values: on/off. |
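The form-style POST could be prepared along these lines. Again this is a sketch under assumptions: the base address is a placeholder and `build_form_post` is a hypothetical helper. The `type` field is only sent when a media type is given, leaving auto-detection to the service otherwise.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder base address; substitute the address of an actual Any23 service.
SERVICE_BASE = "http://any23.example.org"


def build_form_post(base, body, media_type=None, output_format="best"):
    """Prepare a form-encoded POST to <base>/ without sending it."""
    fields = {"body": body, "format": output_format}
    if media_type is not None:
        # Omitting "type" leaves media type detection to the service.
        fields["type"] = media_type
    data = urlencode(fields).encode("ascii")
    return Request(
        base.rstrip("/") + "/",
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_form_post(SERVICE_BASE, "<p>Hello</p>", "text/html", "ttl")
    print(request.full_url, request.data)
```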
Supported output format identifiers are:

- best for content negotiation according to the client's Accept HTTP header
- turtle, ttl, n3 for Turtle/N3
- ntriples, nt for N-Triples
- nquads, nq for N-Quads
- trix for TriX
- rdfxml, rdf, xml for RDF/XML
- json for JSON
- jsonld for JSON-LD

Processing errors are indicated via HTTP status codes and brief text/plain error messages. The following status codes can be returned:
Code | Reason |
---|---|
200 OK | Success |
400 Bad Request | Missing or malformed input parameter |
404 Not Found | Malformed request IRI |
406 Not Acceptable | None of the media types specified in the Accept header are supported |
415 Unsupported Media Type | Document body with unsupported media type was POSTed |
501 Not Implemented | Extraction from input was successful, but yielded zero triples |
502 Bad Gateway | Input document from a remote server could not be fetched or parsed |
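A client might turn these codes into readable diagnostics with a small lookup, as in the sketch below. The helper names are hypothetical, and the reasons are copied from the table. How to treat 501 (a successful extraction with zero triples) is left to the caller; this sketch simply reports it like any other non-200 code.

```python
# Reasons copied from the status code table; keys are HTTP status codes.
STATUS_REASONS = {
    200: "Success",
    400: "Missing or malformed input parameter",
    404: "Malformed request IRI",
    406: "None of the media types specified in the Accept header are supported",
    415: "Document body with unsupported media type was POSTed",
    501: "Extraction from input was successful, but yielded zero triples",
    502: "Input document from a remote server could not be fetched or parsed",
}


def describe_status(code):
    """Return the documented reason for a status code, or a fallback text."""
    return STATUS_REASONS.get(code, "Status code not listed in the documentation")


def is_success(code):
    """Only 200 is documented as success."""
    return code == 200


if __name__ == "__main__":
    for code in (200, 501, 503):
        print(code, is_success(code), describe_status(code))
```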
The XML report format is subject to change. The current content is described in the section Any23 Service.
Apache Any23 v.2.3 (2021-02-02 16:55:59+0000)
Any23 project homepage | Hosted at Apache Software Foundation