API Documentation
Each team will have access to the API of a simplified version of the SUMMA platform. Teams must use at least one of the SUMMA technologies in their final prototype, and may use as many external libraries and tools as they like.
Each team will have their own SUMMA server for the duration of the hack. You will receive details of your team’s server at the beginning of day one.
The IP address of your SUMMA server should replace [base] in the examples of this document.
Demo Server
To try the examples on the demo SUMMA server, replace [base] with summa-hackathon.newslabs.tools.bbc.co.uk. Each team will have their own dedicated SUMMA instance for the duration of the hack.
Text surrounded by angle brackets (<>) is a placeholder and requires a real value.
Getting started
To test that your SUMMA server is running, send an HTTP GET request to http://[base]:8026/v1/api/mediaItems/
You should receive a JSON array as a response. Don’t worry if the array is empty; this means there are no articles yet.
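The check above could be scripted as follows. This is a minimal sketch using only the Python standard library; the function names `media_items_url` and `server_is_up` are illustrative and not part of SUMMA, and `base` stands for whatever value replaces [base].

```python
import json
import urllib.request


def media_items_url(base: str) -> str:
    """Build the mediaItems URL for the host that replaces [base]."""
    return f"http://{base}:8026/v1/api/mediaItems/"


def server_is_up(base: str, timeout: float = 10.0) -> bool:
    """Return True if the server answers the GET with a JSON array (possibly empty)."""
    try:
        with urllib.request.urlopen(media_items_url(base), timeout=timeout) as resp:
            return isinstance(json.load(resp), list)
    except (OSError, ValueError):
        # Connection problems, HTTP errors and invalid JSON all count as "not up".
        return False
```

For example, `server_is_up("summa-hackathon.newslabs.tools.bbc.co.uk")` would run the check against the demo server.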
Notes on Security
Please be aware that the provided SUMMA servers have no authentication mechanism. All requests are http requests, not https. Participants should not upload sensitive materials.
Basic Flow of Data
Data is processed as follows:
- Upload data to an API endpoint via HTTP POST.
- The SUMMA platform returns a response containing a transaction ID.
- Use the transaction ID to request the result at an API endpoint.
- When the result is ready, SUMMA responds with the processed result.
The uploaded data will be subject to translation and all other processing operations. Participants cannot submit text for processing by a single technology module (Named Entity Recognition, Summarisation, etc.).
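The request-then-poll flow above can be sketched in Python as shown here. This document does not say how a client can tell that processing has finished, so `looks_processed` is an assumption (it treats a non-empty `summary` as "ready"); the helper names, the number of attempts and the delay are likewise illustrative. The transaction ID is taken to be the `id` field of the upload response.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_article(base: str, article_id: str) -> dict:
    """GET one article from the mediaItems endpoint using its id."""
    url = f"http://{base}:8026/v1/api/mediaItems/{article_id}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def looks_processed(article: dict) -> bool:
    """ASSUMPTION: an article counts as ready once it has a non-empty summary."""
    return bool(article.get("summary"))


def wait_for_result(
    fetch: Callable[[], dict],
    is_ready: Callable[[dict], bool] = looks_processed,
    attempts: int = 30,
    delay_seconds: float = 2.0,
) -> Optional[dict]:
    """Call fetch() until is_ready() accepts the result or attempts run out."""
    for attempt in range(attempts):
        article = fetch()
        if is_ready(article):
            return article
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # pause before asking again
    return None
```

A caller might use `wait_for_result(lambda: fetch_article(base, article_id))`. Passing the fetch step in as a function keeps the polling loop easy to try out without a server.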
API
The API is key to interacting with the SUMMA platform. It allows teams to upload text data into the SUMMA server and retrieve results.
Available Endpoints
The following endpoints are available:
Endpoint | HTTP Method | Location | Use | Link |
---|---|---|---|---|
Upload Article | POST | http://[base]:8026/v1/api/newsItems/ | Uploads a text article to SUMMA | Upload Article |
Retrieve Article | GET | http://[base]:8026/v1/api/mediaItems/<id> | Gets a processed article from SUMMA | Retrieve Article |
Retrieve Cluster | GET | http://[base]:8026/v1/api/storylines/<id> | Gets similar articles from SUMMA | Retrieve Article Cluster |
Retrieve Latest Articles | GET | http://[base]:8026/v1/api/mediaItems/ | Gets a list of the latest articles processed. Can be more efficient than frequently polling the Retrieve Article endpoint | Retrieve Latest Articles |
Reset | DELETE | http://[base]:8026/v1/api/reset/ | Deletes all articles, storylines and feeds | Reset |
Upload Article
Request
Type | Headers | URL |
---|---|---|
POST | Content-Type: application/json | http://[base]:8026/v1/api/newsItems/ |
Body
The body must be a JSON document containing the following fields:
Field | Required? | Description |
---|---|---|
sourceItemTitle | Yes | Article title |
sourceItemMainText | Yes | Article body |
sourceItemLangeCodeGuess | Yes | Two-character language code: en, de, es, ru, ar, lv |
feedURL | Yes | A URL to use as an ID; it is OK if the URL does not actually exist. Each article is associated with a feed. |
sourceItemOriginFeedName | Yes | A name for the feed |
sourceItemIdAtOrigin | Yes | A unique identifier |
Example
{
"sourceItemTitle": "Colapso en Génova: las 3 diferencias del emblemático puente del lago Maracaibo y el Morandi que se derrumbó en Italia, diseñado por el mismo ingeniero",
"sourceItemMainText": "El desplome del puente en Génova llevó a muchos a preguntarse por la situación de su gemelo sobre el lago Maracaibo...",
"sourceItemLangeCodeGuess": "es",
"feedURL": "http://hack.summa-project.eu/team-x",
"sourceItemOriginFeedName": "Team X",
"sourceItemIdAtOrigin": "http://hackday.summa-project.eu/team-x/123"
}
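A request like the example above could be prepared and sent from Python as sketched here, using only the standard library. The function names are illustrative, and the client-side check for required fields is an added convenience rather than documented SUMMA behaviour.

```python
import json
import urllib.request

# Field names exactly as listed in the table above.
REQUIRED_FIELDS = [
    "sourceItemTitle",
    "sourceItemMainText",
    "sourceItemLangeCodeGuess",
    "feedURL",
    "sourceItemOriginFeedName",
    "sourceItemIdAtOrigin",
]


def build_upload_request(base: str, article: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for the newsItems endpoint."""
    missing = [name for name in REQUIRED_FIELDS if name not in article]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return urllib.request.Request(
        url=f"http://{base}:8026/v1/api/newsItems/",
        data=json.dumps(article).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upload_article(base: str, article: dict) -> str:
    """Send the article and return the id found in the JSON response."""
    request = build_upload_request(base, article)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)["id"]
```

Keeping the request construction separate from the network call makes it possible to inspect the URL, headers and body before anything is sent.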
Response
Body
The response is a JSON object. The main properties are:
Field | Example value | Description |
---|---|---|
id | 2110ec4a-5c6b-48b1-9914-6fe98e51f2dc | Identifier (string) to use to retrieve translated text, auto-summary, entities, topics and cluster information. See the Retrieve Article section below for details. |
Example
{
"customMetadata": {},
"feedId": "1c350e2f-f747-44fb-8a8d-132f6b1d3a8f",
"feedURL": "http://hack.summa-project.eu/team-x",
"id": "2110ec4a-5c6b-48b1-9914-6fe98e51f2dc",
"sourceItemIdAtOrigin": "http://hackday.summa-project.eu/team-x/123",
"sourceItemLangeCodeGuess": "es",
"sourceItemMainText": "El desplome del puente en Génova llevó a muchos a preguntarse por la situación de su gemelo sobre el lago Maracaibo...",
"sourceItemOriginFeedName": "Team X",
"sourceItemTitle": "Colapso en Génova: las 3 diferencias del emblemático puente del lago Maracaibo y el Morandi que se derrumbó en Italia, diseñado por el mismo ingeniero",
// Additional fields omitted
"timeAdded": "2018-08-22T15:24:32.663Z"
}
Retrieve Article
Request
Type | Headers | URL |
---|---|---|
GET | | http://[base]:8026/v1/api/mediaItems/<id> |
Placeholder | Example value | Description |
---|---|---|
<id> | 2110ec4a-5c6b-48b1-9914-6fe98e51f2dc | The identifier. Use Upload Article to get the identifier. |
Response
The response is a JSON object. The main properties are:
Name | Example value | Description |
---|---|---|
detectedTopics | [ ["videos", 0.09571], ["top stories", 0.0938] ] | Topics (array) |
mainText.english | The collapse of the bridge in Genoa led many to wonder about the situation... | English translation (string) |
namedEntities.entities | { "m.01ncqr": { "baseForm": "Lake Maracaibo", "currlangForm": "Lake Maracaibo", "id": "m.01ncqr", "type": "places" } } | Named entities (object) |
namedEntities.mentionsIn | "mentionsIn": { "mainText": { "m.01ncqr": [{ "endPosition": 109, "startPosition": 95, "text": "Lake Maracaibo" }] } } | Object |
storyId | 1 | Identifier (integer) of the story cluster. Each cluster is a set of similar stories. |
summary | [ "The collapse of the bridge in Genoa led many to..." ] | Array of strings. Each string is a “bullet point”. |
title.english | Collapse in Genoa: the three differences from the flagship Lake Maracaibo Lake and Morandi, which collapsed in Italy, designed by the engineer himself. | English translation (string) |
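The properties listed for the Retrieve Article response could be read out of a parsed response as sketched below. The sketch assumes that a dotted name such as mainText.english refers to a nested JSON object (`article["mainText"]["english"]`), which the mentionsIn example suggests but this document does not state outright. The helper names `get_path` and `describe_article` are illustrative.

```python
from typing import Any


def get_path(document: dict, dotted_path: str, default: Any = None) -> Any:
    """Follow a dotted path such as 'mainText.english' through nested dicts."""
    current: Any = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def describe_article(article: dict) -> dict:
    """Collect the main properties of a retrieved article into a flat dict."""
    entities = get_path(article, "namedEntities.entities") or {}
    topics = article.get("detectedTopics") or []
    # Each topic is a [label, score] pair; keep the label with the highest score.
    top_topic = max(topics, key=lambda pair: pair[1])[0] if topics else None
    return {
        "title": get_path(article, "title.english"),
        "text": get_path(article, "mainText.english"),
        "summary": article.get("summary") or [],
        "story_id": article.get("storyId"),
        "entity_names": sorted(entity["baseForm"] for entity in entities.values()),
        "top_topic": top_topic,
    }
```

Missing properties come back as `None` or empty collections instead of raising, which may be convenient while an article is still being processed.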
Retrieve Article Cluster
Get the storyId from the Retrieve Article response described above.
Request
Type | Headers | URL |
---|---|---|
GET | | http://[base]:8026/v1/api/storylines/<id> |
Placeholder | Example value | Description |
---|---|---|
<id> | 1 | Story identifier |
Response
The response is a JSON object. The main properties are:
Name | Example value | Description |
---|---|---|
highlightItems | [ { "highlight": "PM: Lebanon sees no reason for the Syrian refugees to...", "sentiment": null } ] | Array of highlighted objects |
label | Collapse in Genoa: the three differences from the flagship Lake Maracaibo Lake and Morandi, which collapsed in Italy, designed by the engineer himself. | Label (string) |
newsItems | { "00956029-0194-4498-a70c-a684172f49df": { "id": "00956029-0194-4498-a70c-a684172f49df", "title": "DW English Live Stream Chunk", ... } | Object containing articles in the cluster |
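A cluster could be fetched and summarised in Python as sketched below. The helper names are illustrative; the sketch relies only on the properties shown in the table above (newsItems keyed by article id, each with a title, and highlightItems with a highlight string).

```python
import json
import urllib.request


def storyline_url(base: str, story_id: int) -> str:
    """Build the storylines URL for a storyId taken from a retrieved article."""
    return f"http://{base}:8026/v1/api/storylines/{story_id}"


def fetch_storyline(base: str, story_id: int) -> dict:
    """GET the cluster that the given storyId identifies."""
    with urllib.request.urlopen(storyline_url(base, story_id), timeout=10) as resp:
        return json.load(resp)


def cluster_titles(storyline: dict) -> list:
    """Return the titles of the articles held in the newsItems object."""
    news_items = storyline.get("newsItems") or {}
    return [item.get("title") for item in news_items.values()]


def cluster_highlights(storyline: dict) -> list:
    """Return the highlight strings from highlightItems."""
    return [item.get("highlight") for item in storyline.get("highlightItems") or []]
```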
Retrieve Latest Articles
Request
Type | Headers | URL |
---|---|---|
GET | | http://[base]:8026/v1/api/mediaItems/ |
Query string parameters:
Name | Required? | Description | Example |
---|---|---|---|
limit | No | Maximum number of items to return. The default is 20. | 50 |
Response
The response is a JSON array. Each article in the array has the fields described in Retrieve Article.
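The optional limit parameter could be attached to the request URL as in this minimal sketch. The function names are illustrative, and rejecting values below 1 is a client-side assumption, not documented server behaviour.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode


def latest_articles_url(base: str, limit: Optional[int] = None) -> str:
    """Build the mediaItems URL, adding ?limit=N only when a limit is given."""
    url = f"http://{base}:8026/v1/api/mediaItems/"
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        url += "?" + urlencode({"limit": limit})
    return url


def fetch_latest_articles(base: str, limit: Optional[int] = None) -> list:
    """GET the latest processed articles as a list of dicts."""
    with urllib.request.urlopen(latest_articles_url(base, limit), timeout=10) as resp:
        return json.load(resp)
```

Leaving `limit` out produces the plain URL, so the server default of 20 items applies.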
Reset
Request
Type | Headers | URL |
---|---|---|
DELETE | | http://[base]:8026/v1/api/reset/ |
Response
The response is a JSON object. The properties are:
Name | Example value |
---|---|
mediaItems | {"deleted":22, "errors":0, "inserted":0, "replaced":0, "skipped":0, "unchanged":0} |
feeds | {"deleted":1, "errors":0, "inserted":0, "replaced":0, "skipped":0, "unchanged":0} |
storylines | {"deleted":5, "errors":0, "inserted":0, "replaced":0, "skipped":0, "unchanged":0} |
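A reset call and a quick tally of what it removed could look like the sketch below. The helper names are illustrative, and the tally assumes only the response shape shown in the table above. Because the servers have no authentication, anyone who can reach the server can issue this request, so it is worth keeping it out of code paths that run automatically.

```python
import json
import urllib.request


def build_reset_request(base: str) -> urllib.request.Request:
    """Prepare (but do not send) the DELETE request for the reset endpoint."""
    return urllib.request.Request(
        url=f"http://{base}:8026/v1/api/reset/",
        method="DELETE",
    )


def reset_server(base: str) -> dict:
    """Send the reset request and return the parsed JSON response."""
    with urllib.request.urlopen(build_reset_request(base), timeout=30) as resp:
        return json.load(resp)


def total_deleted(reset_response: dict) -> int:
    """Add up the 'deleted' counts across mediaItems, feeds and storylines."""
    return sum(
        section.get("deleted", 0)
        for section in reset_response.values()
        if isinstance(section, dict)
    )
```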
Developer Resources
Visualisation Resources
Collections
- Benjamin: https://github.com/benjbach/vishub/wiki/Visualization-Tools
- Medialab: http://tools.medialab.sciences-po.fr
- Voyant (many tools on text): http://docs.voyant-tools.org/tools
General purpose
- Rawgraphs.io: http://rawgraphs.io
- Tableau: https://public.tableau.com/en-us/s
Multivariate data
- PoleStar: https://vega.github.io/polestar
- iVisDesigner: https://donghaoren.org/ivisdesigner
Text visualization
- Textexture (text networks): http://textexture.com
Networks
- Gephi: http://gephi.org
- iVisDesigner: https://donghaoren.org/ivisdesigner
Feedback
Problems, suggestions, missing information? Contact me before the day at andrew.secker@bbc.co.uk or find me at the venue.