Turn Text Into HERE Maps with Python NLTK

Solution

To solve this problem we will stand up a Python Flask server that exposes a few APIs to:

  1. Extract locations from the text based on some clues with the Natural Language Toolkit (NLTK).
  2. Geocode the location to determine a latitude and longitude with the HERE Geocoder API.
  3. Place markers on a map to identify the recognized places with the HERE Map Image API.

Server Setup

For this section I assume you are running an environment like macOS or Linux. If you are running Windows, you will need to adjust some of the commands a bit.

Configuration

The Twelve-Factor App makes the case that storing config in the environment is a best practice. I agree, and I like to keep my API credentials in the variables APP_ID_HERE and APP_CODE_HERE, exported from a file called HERE.sh.

#!/bin/bash 
export APP_ID_HERE=your-app-id-here
export APP_CODE_HERE=your-app-code-here
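
Before starting the server it can help to confirm that both variables actually made it into the environment. The following is a minimal sketch of such a check; the helper name missing_credentials is hypothetical and not part of the application.

```python
import os

# The two variables exported by HERE.sh
REQUIRED_VARS = ('APP_ID_HERE', 'APP_CODE_HERE')


def missing_credentials(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_credentials()
if missing:
    print('Missing: ' + ', '.join(missing))
else:
    print('HERE credentials found')
```

Passing the environment in as an argument keeps the check easy to try with a plain dictionary.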

Structure

The web server component needs several files, summarized in the listing below. Start by running mkdir -p app/api_1_0.

├── app
│   ├── __init__.py
│   └── api_1_0
│       ├── __init__.py
│       ├── demo.py
│       └── health.py
├── HERE.sh
├── manage.py
├── config.py
└── requirements.txt

The requirements.txt file lists the dependencies:

Flask
Flask-Script
beautifulsoup4
gunicorn
nltk
requests

App

We will use manage.py as the main entrypoint to our application. It looks like the following listing:

import os
import app
from flask_script import Manager, Server

app = app.create_app('default')
manager = Manager(app)

if __name__ == '__main__':
    # os.environ.get returns a string when PORT is set, so convert to int
    port = int(os.environ.get('PORT', 8000))
    manager.add_command('runserver', Server(port=port))
    manager.run()

The settings live in config.py:

import os

class Config(object):
    # Values are read from the environment (see HERE.sh)
    SECRET_KEY = os.environ.get('FLASK_SECRET_KEY')
    APP_ID_HERE = os.environ.get('APP_ID_HERE')
    APP_CODE_HERE = os.environ.get('APP_CODE_HERE')

    @staticmethod
    def init_app(app):
        pass

config = {'default': Config}

The application factory goes in app/__init__.py:

from config import config
from flask import Flask

def create_app(config_name):
    app = Flask(__name__)
    app.config.from_object(config[config_name])
    config[config_name].init_app(app)

    # Register the API blueprint under /api/1.0
    from .api_1_0 import api as api_1_0_blueprint
    app.register_blueprint(api_1_0_blueprint, url_prefix='/api/1.0')

    return app
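
The blueprint is declared in app/api_1_0/__init__.py:

from flask import Blueprint

api = Blueprint('api', __name__)

# Imported after api is defined because these modules import it
from . import health
from . import demo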

Healthcheck

To make sure our server is running properly, we can add a quick healthcheck endpoint in the file app/api_1_0/health.py.

from flask import jsonify
from flask import current_app as app
from . import api

@api.route('/health', methods=['GET'])
def handle_health():
    return jsonify({
        'hello': 'world',
        'app_id_here': app.config['APP_ID_HERE'],
        'app_code_here': app.config['APP_CODE_HERE']
    })

Text

For the purposes of getting started I will use a simple text file with just the locations from before.

Venice
Mdina
Aswan
Soro
Gryfino
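
A file like this is easy to load into a list for testing. Here is a minimal sketch that strips whitespace, skips blank lines, and drops repeats while keeping the original order; the function name read_locations is hypothetical.

```python
def read_locations(lines):
    """Return stripped, non-empty lines with duplicates removed, order preserved."""
    seen = set()
    locations = []
    for line in lines:
        name = line.strip()
        if name and name not in seen:
            seen.add(name)
            locations.append(name)
    return locations


# Lines as they would come from iterating over the open text file
sample = ['Venice\n', 'Mdina\n', '\n', 'Aswan\n', 'Soro\n', 'Gryfino\n', 'Venice\n']
print(read_locations(sample))
```

Because the function accepts any iterable of strings, an open file object can be passed to it directly.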

Extract

We need to extract text from HTML and tokenize any words found that might be a location. We will define a method to handle requests for the resource /tokens so that we can look at each step independently.

import requests
from bs4 import BeautifulSoup
from flask import jsonify, request
from flask import current_app as app
from nltk import pos_tag
from nltk.corpus import stopwords
from nltk.tokenize import RegexpTokenizer
from . import api

# Assumption: one shared requests session for all outbound HTTP calls
session = requests.Session()

@api.route('/tokens', methods=['GET'])
def handle_tokenize():
    # Take URL as input and fetch the body
    url = request.args.get('url')
    response = session.get(url)

    # Parse HTML from the given URL
    body = BeautifulSoup(response.content, 'html.parser')

    # Remove JavaScript and CSS from our life
    for script in body(['script', 'style']):
        script.decompose()

    text = body.get_text()

    # Ignore punctuation
    tokenizer = RegexpTokenizer(r'\w+')

    # Ignore duplicates
    tokens = set(tokenizer.tokenize(text))

    # Remove any stop words
    stop_words_set = set(stopwords.words())
    tokens = [w for w in tokens if w not in stop_words_set]

    # Now just get proper nouns
    tagged = pos_tag(tokens)
    tokens = [w for w, pos in tagged if pos in ['NNP', 'NNPS']]

    return jsonify(tokens)
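
The stop word list and the tagger model have to be downloaded once before this endpoint will work:

$ python
...
>>> import nltk
>>> nltk.download('stopwords')
>>> nltk.download('averaged_perceptron_tagger')

A small script makes it easy to try the endpoint:

#!/bin/bash
curl "http://localhost:8000/api/1.0/tokens?url=$1"

The response is a JSON list of the proper nouns found on the page:

["Retreat","Industry","Boushnak","Frise","National","Mesa","Chicago","Washington","Forest","Angeles","Canyons","Colorado",...]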

Geocode

The HERE Geocoder API takes a human-readable location and turns it into geocoordinates. If you put in an address, you get back a latitude and longitude.

# Continues demo.py: app, request, session, and jsonify come from the top of the file
@api.route('/geocode', methods=['GET'])
def handle_geocode():
    uri = 'https://geocoder.api.here.com/6.2/geocode.json'
    headers = {}
    params = {
        'app_id': app.config['APP_ID_HERE'],
        'app_code': app.config['APP_CODE_HERE'],
        'searchtext': request.args.get('searchtext')
    }

    response = session.get(uri, headers=headers, params=params)
    return jsonify(response.json())
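
A similar script exercises the endpoint:

#!/bin/bash
curl "http://localhost:8000/api/1.0/geocode?searchtext=$1"

The part of the response we care about is the display position:

"DisplayPosition": { "Latitude": 53.25676, "Longitude": 14.48947 }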

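The DisplayPosition object is nested several levels deep in the geocoder response. The sketch below pulls out the first one it finds. It assumes the Response > View > Result > Location > DisplayPosition nesting, which is not shown in full above, so check it against a real response. The function name first_display_position is hypothetical, and the sample payload is trimmed to only the keys the function reads.

```python
def first_display_position(payload):
    """Return (latitude, longitude) of the first result, or None if there is none."""
    # Assumed nesting: Response -> View[] -> Result[] -> Location -> DisplayPosition
    views = payload.get('Response', {}).get('View', [])
    for view in views:
        for result in view.get('Result', []):
            position = result.get('Location', {}).get('DisplayPosition')
            if position:
                return position['Latitude'], position['Longitude']
    return None


# Trimmed, illustrative payload using the coordinates shown above
sample = {
    'Response': {
        'View': [{
            'Result': [{
                'Location': {
                    'DisplayPosition': {'Latitude': 53.25676, 'Longitude': 14.48947}
                }
            }]
        }]
    }
}
print(first_display_position(sample))
```

Returning None for an empty result makes it easy to skip tokens that turn out not to be places.
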
Map

Finally, we’re going to take the latitude and longitude we received from our geocode request and generate a simple render with the HERE Map Image API.

import os
import tempfile

@api.route('/mapview', methods=['GET'])
def handle_mapview():
    uri = 'https://image.maps.api.here.com/mia/1.6/mapview'
    headers = {}
    params = {
        'app_id': app.config['APP_ID_HERE'],
        'app_code': app.config['APP_CODE_HERE'],
        'poi': request.args.get('poi')
    }

    response = session.get(uri, headers=headers, params=params)

    # mkstemp creates the file securely, unlike the deprecated mktemp
    fd, image_path = tempfile.mkstemp()
    with os.fdopen(fd, 'wb') as image_file:
        image_file.write(response.content)

    return image_path
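
The poi value passed to this endpoint has to be assembled from the geocoded positions. As I understand the Map Image API, it expects a flat comma-separated list of latitude and longitude values, so treat that format as an assumption to verify. The helper name build_poi is hypothetical, and the second coordinate pair is a placeholder rather than a real geocoding result.

```python
def build_poi(positions):
    """Join (latitude, longitude) pairs into one comma-separated string."""
    return ','.join('{},{}'.format(lat, lon) for lat, lon in positions)


positions = [
    (53.25676, 14.48947),  # position from the geocode example
    (45.5, 12.25),         # placeholder pair for illustration only
]
print(build_poi(positions))
```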

Summary

The reason for making /tokens, /geocode, and /mapview separate endpoints is to illustrate how you might set up microservices with Python and Flask for each operation you want to perform. A deployment could then scale them independently.
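
To call the three endpoints from Python rather than curl, the query strings need to be escaped properly, since page URLs and place names often contain characters like ?, & and spaces. Here is a minimal sketch that builds the request URLs with the standard library. The base URL assumes the server is running locally on port 8000 as in the scripts above, and endpoint_url is a hypothetical helper, not part of the application.

```python
from urllib.parse import urlencode

# Assumption: the Flask server from this article running locally
BASE_URL = 'http://localhost:8000/api/1.0'


def endpoint_url(name, **params):
    """Build the URL for one endpoint with safely encoded query parameters."""
    return '{}/{}?{}'.format(BASE_URL, name, urlencode(params))


print(endpoint_url('tokens', url='https://example.com/a?b=1'))
print(endpoint_url('geocode', searchtext='Los Angeles'))
print(endpoint_url('mapview', poi='53.25676,14.48947'))
```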
