Azure Functions

Provision an Azure Function app via the Azure Portal and deploy an OntoPop event-driven data pipeline app to it.

Please note that the OntoPop backend open-source software project, which includes the event-driven data pipelines and APIs, is undergoing extensive redesign and refactoring as part of OntoPop Community 3.x in order to improve performance, security, extensibility and maintainability. As a result, the documentation on this page will be significantly updated. Please refer to the OntoPop Roadmap for further information.

Overview

Azure Functions is the native serverless, event-driven compute service offered by the Microsoft Azure cloud computing platform, enabling applications and backend services to be run without provisioning or managing any servers. This page provides instructions on how to provision Azure Function apps and then deploy the OntoPop event-driven data pipeline Spring Boot applications to them.

For further information regarding Azure Functions, please visit https://azure.microsoft.com/en-gb/services/functions.

It is recommended that you configure and integrate the steps described in this page into a CI/CD pipeline in order to automate the build, testing and deployment stages.

Data Pipeline

OntoPop provides Azure Function Spring Boot application deployments that wrap around each of the event-driven microservices described in the logical system architecture. These Azure Function applications are provided out-of-the-box to enable quick and easy deployment to Azure Function apps. Assuming that you have followed the instructions detailed in Build from Source, the Azure Function Spring Boot applications for each of the event-driven microservices that make up the OntoPop data pipeline may be found in the $ONTOPOP_BASE/ontopop-apps/ontopop-apps-azure Maven module, which itself contains the following child modules pertinent to the data pipeline:

  1. ontopop-azure-function-app-subscriber-github-webhook - Node.js application that subscribes to GitHub webhooks and forwards the webhook headers and payload to the ontology ingestion service, either via a HTTP POST request or by publishing a message to the shared messaging system (to which the ontology ingestion service is subscribed). This is a Node.js application rather than a Spring Boot application because GitHub webhook requests time out after 10 seconds, after which the HTTP connection is destroyed and the webhook payload is lost. Deploying a lightweight Node.js application that returns a promise (i.e. an immediate response back to GitHub) avoids the longer cold start-up times incurred by Java-based applications.
  2. ontopop-azure-function-app-data-ontology-ingestor - Azure Function Spring Boot application deployment wrapper around the ontology ingestion service, invoked by the ontopop-azure-function-app-subscriber-github-webhook application by either a HTTP POST request or via the shared messaging system to which the ontopop-azure-function-app-data-ontology-ingestor application is subscribed.
  3. ontopop-azure-function-app-data-ontology-validator - Azure Function Spring Boot application deployment wrapper around the ontology validation service, invoked via its subscription to the shared messaging system.
  4. ontopop-azure-function-app-data-ontology-loader-triplestore - Azure Function Spring Boot application deployment wrapper around the ontology triplestore loading service, invoked via its subscription to the shared messaging system.
  5. ontopop-azure-function-app-data-ontology-parser - Azure Function Spring Boot application deployment wrapper around the ontology parsing service, invoked via its subscription to the shared messaging system.
  6. ontopop-azure-function-app-data-ontology-modeller-graph - Azure Function Spring Boot application deployment wrapper around the property graph modelling service, invoked via its subscription to the shared messaging system.
  7. ontopop-azure-function-app-data-ontology-loader-graph - Azure Function Spring Boot application deployment wrapper around the property graph loading service, invoked via its subscription to the shared messaging system.
  8. ontopop-azure-function-app-data-ontology-indexer-graph - Azure Function Spring Boot application deployment wrapper around the property graph indexing service, invoked via its subscription to the shared messaging system.

Setup

Build from Source

In order to compile and build the OntoPop event-driven data pipeline Azure Function Spring Boot applications in preparation for deployment to Azure Function apps, please follow the instructions detailed in Build from Source.

Azure CLI

We shall use the Azure Command Line Interface (CLI) to deploy the OntoPop Java artifacts (i.e. OntoPop's data pipeline Azure Function Spring Boot applications packaged as JAR files) that were created in the Build from Source stage above to Azure Function apps. To install the Azure CLI, please follow the instructions below:

The instructions below are for Ubuntu 20.04. Installation instructions for other Linux distributions and other operating systems such as Windows may be found at https://docs.microsoft.com/en-us/cli/azure.

# Remove any old installations of the Azure CLI
$ sudo apt remove azure-cli -y && sudo apt autoremove -y

# Install the Azure CLI
$ curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

# Sign into your Azure account
$ az login

Azure Functions Core Tools

We shall use Azure Functions Core Tools to deploy a function app directly to an Azure subscription (note that Azure Functions Core Tools also enables developers to develop and test functions locally before remote deployment). To install Azure Functions Core Tools, please follow the instructions below:

The instructions below are for Ubuntu 20.04. Installation instructions for other Linux distributions and other operating systems such as Windows may be found at https://docs.microsoft.com/en-us/azure/azure-functions/functions-run-local.

We shall install version 4.x of Azure Functions Core Tools, which supports version 4.x of the Azure Functions runtime, the recommended version (for Java and indeed all languages) at the time of writing. For further information regarding Azure Functions runtime versions, please refer to https://docs.microsoft.com/en-us/azure/azure-functions/functions-versions.

# Install the Microsoft package repository GPG key
$ curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg
$ sudo mv microsoft.gpg /etc/apt/trusted.gpg.d/microsoft.gpg
$ sudo sh -c 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/microsoft-ubuntu-$(lsb_release -cs)-prod $(lsb_release -cs) main" > /etc/apt/sources.list.d/dotnetdev.list'

# Install Azure Functions Core Tools
$ sudo apt-get update
$ sudo apt-get install azure-functions-core-tools-4
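
Before continuing, it can help to confirm that the command-line tools used on this page (az and func, and later mvn and npm) are available on the PATH. The following is a minimal sketch only; the helper name missing_tools is illustrative and is not part of OntoPop or of any Azure tooling.

```shell
#!/bin/sh
# Sketch only: report which of the given command-line tools are not on the PATH.

# Usage: missing_tools <tool>...
# Prints the name of each tool that cannot be found; returns non-zero if any is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Check the tools that the setup and deployment steps on this page rely on
missing_tools az func mvn npm || echo "Install the tools listed above before continuing" >&2
```

A check of this kind is cheap to run as the first step of a CI/CD job, so that a missing tool fails early rather than midway through a deployment.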

Azure Function App

We shall use the Azure Portal to provision Azure Function apps. To do so, navigate to the Function App service via the Azure Portal, select "Create" and follow the instructions below:

  1. Subscription and Resource Group - select the Azure subscription and resource group to use for the new function app.
  2. Function App Name - enter a custom function name that describes the purpose of the function, for example ontology-ingestor-service.
  3. Publish - select "Code".
  4. Runtime Stack - with the exception of the GitHub Webhook Subscriber (which is a Node.js application), all other OntoPop event-driven data pipeline functions are Java Spring Boot applications. Thus if you are deploying ontopop-azure-function-app-subscriber-github-webhook, then please select "Node.js" and version "14 LTS". Otherwise please select "Java" and version "11.0" for all other deployments.
  5. Storage Account - select (or create) an Azure Storage account for this function app.
  6. Operating System - select "Linux".
  7. Plan Type - dependent on your deployment architecture, select a plan that will dictate how the function app will scale. For most use cases, the "Consumption (Serverless)" plan should suffice.

Once configured, select "Review + Create" to create the new Azure Function app.
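
The same configuration can also be scripted with the Azure CLI rather than entered in the Portal, which suits the CI/CD recommendation made in the Overview. The following is a minimal sketch under stated assumptions: the resource group, storage account and region defaults are hypothetical placeholders, the helper names run and create_function_app are illustrative, and the DRY_RUN switch (on by default here) only prints each command so that it can be reviewed before anything is created. The runtime version strings mirror those named above; the versions that the CLI currently accepts can be listed with az functionapp list-runtimes, and versions that Azure has since retired may be rejected.

```shell
#!/bin/sh
# Sketch only: provision a Linux, Consumption-plan Function app with the Azure CLI.
# All resource names below are hypothetical placeholders.

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: create_function_app <app-name> <runtime> <runtime-version>
create_function_app() {
    run az functionapp create \
        --name "$1" \
        --resource-group "${RESOURCE_GROUP:-my-resource-group}" \
        --storage-account "${STORAGE_ACCOUNT:-mystorageaccount}" \
        --consumption-plan-location "${REGION:-uksouth}" \
        --os-type Linux \
        --functions-version 4 \
        --runtime "$2" \
        --runtime-version "$3"
}

# A Java app for a data pipeline service and a Node.js app for the webhook subscriber
create_function_app ontology-ingestor-service java 11.0
create_function_app github-webhook-subscriber node 14
```

Setting DRY_RUN=0 in the environment executes the commands instead of printing them, which requires a prior az login.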

Azure Service Bus

All the OntoPop event-driven data pipeline Azure Function Spring Boot applications are invoked via their subscription to the shared messaging system. Assuming that you are integrating OntoPop with the Azure Service Bus managed enterprise broker, please ensure that you have provisioned and integrated an Azure Service Bus instance with OntoPop.

Deployment

GitHub Subscriber

In the following instructions we detail how the GitHub webhook subscriber Node.js application can be deployed to an Azure Function app.

  1. Create a new Azure Function app configured with the Node.js 14.x runtime via the Azure Portal as detailed above. We shall call this Azure Function app github-webhook-subscriber for the purposes of these instructions.
  2. To invoke the ontology ingestion service via a HTTP POST request whose request body will contain the GitHub webhook payload, set the ontology ingestion service URL as an environment variable named ONTOLOGY_INGESTOR_URL. To do this, navigate to the GitHub webhook subscriber Azure Function app via the Azure Portal, then select "Configuration" and "Application Settings". Select "New Application Setting" and create a new environment variable called ONTOLOGY_INGESTOR_URL with the relevant value (for example https://ontology-ingestor-service.azurewebsites.net/management/ontologies/ingest).
  3. Alternatively, to invoke the ontology ingestion service via the shared messaging system to which the ontology ingestion service is subscribed (recommended), set the Azure Service Bus connection string and topic name as environment variables. Specifically these environment variables are named AZURE_SERVICE_BUS_CONNECTION_STRING and AZURE_SERVICE_BUS_TOPIC respectively. To do this, navigate to the GitHub webhook subscriber Azure Function app via the Azure Portal, then select "Configuration" and "Application Settings". Select "New Application Setting" and create a new environment variable called AZURE_SERVICE_BUS_CONNECTION_STRING and another called AZURE_SERVICE_BUS_TOPIC, as illustrated in the following screenshot (remember to press save after creating these environment variables - this will restart the Azure Function app):

Ensure that the value of AZURE_SERVICE_BUS_TOPIC exactly matches the value of the destination property of the gitRepositoryUpdatedPublicationChannel binding in the spring.cloud.stream.bindings namespace in OntoPop's application context.

To get the connection string of the Azure Service Bus broker, please refer to the SAS Policy of the Azure Service Bus instance.

GitHub webhook subscriber environment variables

  4. We are now ready to deploy the GitHub webhook subscriber Node.js application code contained in the $ONTOPOP_BASE/ontopop-apps/ontopop-apps-azure/ontopop-azure-function-app-subscriber-github-webhook project to this Azure Function app. Assuming that you have followed the instructions detailed in the Setup section above, navigate to $ONTOPOP_BASE/ontopop-apps/ontopop-apps-azure/ontopop-azure-function-app-subscriber-github-webhook and execute the following commands via your command line:
# Navigate to the relevant project folder
$ cd $ONTOPOP_BASE/ontopop-apps/ontopop-apps-azure/ontopop-azure-function-app-subscriber-github-webhook

# Install any dependencies via NPM
$ npm install

# Deploy the Node.js function to the relevant Azure Function app
$ func azure functionapp publish github-webhook-subscriber --javascript

  5. Now that we have uploaded the application code to the GitHub webhook subscriber Azure Function app, we need to configure a webhook in the relevant GitHub repository and set its payload URL as the Azure Function publicly-accessible URL. To get the URL of the Azure Function, navigate to the GitHub webhook subscriber Azure Function app via the Azure Portal, select "Functions" and select "gitHubWebhookSubscriberFunction" (i.e. the name of the uploaded Azure Function). Then select "Get Function URL" and make a note of the resultant URL (for example https://github-webhook-subscriber.azurewebsites.net/gitHubWebhookSubscriberFunction/index.js).
  6. To configure a webhook in the relevant GitHub repository and set its payload URL as the Azure Function URL noted in step 5 above, navigate to the relevant GitHub repository in a web browser, select Settings > Webhooks > Add webhook and enter the following properties:

| Property | Description | Example |
| --- | --- | --- |
| Payload URL | The public URL of the GitHub webhook subscriber Azure Function (as noted above in step 5). Set a request parameter called protocol with the value of azure-amqp to invoke the ontology ingestion service via an Azure Service Bus topic to which the ontology ingestion service is subscribed (recommended). Alternatively set the protocol request parameter to http to invoke the ontology ingestion service via a HTTP POST request. | https://github-webhook-subscriber.azurewebsites.net/gitHubWebhookSubscriberFunction/index.js?protocol=azure-amqp |
| Content type | Webhook media type. This should be set to application/json. | application/json |
| Secret | A custom string that will be used by OntoPop to validate GitHub webhook payloads. Please visit Securing your webhooks for further information. Please make a note of the secret token that you create, as it will be required when creating a new ontology for OntoPop to monitor via the OntoPop Management API. | mysecret123 |
| SSL verification | Whether to verify SSL certificates when delivering payloads. This should be set to enabled. | Enable SSL verification |
| Which events would you like to trigger this webhook? | The event type that will trigger the GitHub webhook. This should be set to the push event. | Just the push event. |

The following screenshot provides an example GitHub webhook configuration integrated with a GitHub webhook subscriber Azure Function app:

GitHub webhook configuration

As described in the table above, a request parameter called protocol must be set with the payload URL. Set the protocol request parameter with the value of azure-amqp to invoke the ontology ingestion service via an Azure Service Bus topic to which the ontology ingestion service is subscribed (recommended). Alternatively set the protocol request parameter to http to invoke the ontology ingestion service via a HTTP POST request. For further details, please refer to the GitHub webhook subscriber Node.js application code for this Azure Function.

  7. Press "Add webhook". From now on, every time a push event occurs in the relevant GitHub repository, this webhook will be triggered and a HTTP POST request will be made to the public URL of the GitHub webhook subscriber Azure Function.
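
Because the protocol request parameter accepts only the two values described above, a small helper can build the payload URL and reject anything else before it is pasted into GitHub. The following is a minimal sketch only: the helper name webhook_payload_url is illustrative, and the handling of a URL that already carries a query string is an assumption (the example URL on this page has none).

```shell
#!/bin/sh
# Sketch only: build the webhook payload URL with a validated protocol parameter.

# Usage: webhook_payload_url <function-url> <protocol>
# Accepts only the two protocol values described on this page: azure-amqp or http.
webhook_payload_url() {
    # Assumption: if the function URL already has a query string, append with '&'
    case "$1" in
        *\?*) sep='&' ;;
        *)    sep='?' ;;
    esac
    case "$2" in
        azure-amqp|http)
            echo "$1${sep}protocol=$2"
            ;;
        *)
            echo "unsupported protocol: $2 (expected azure-amqp or http)" >&2
            return 1
            ;;
    esac
}

# Example using the function URL shown in the instructions above
webhook_payload_url \
    "https://github-webhook-subscriber.azurewebsites.net/gitHubWebhookSubscriberFunction/index.js" \
    azure-amqp
```

Validating the value up front avoids a misconfigured webhook whose deliveries would only fail later, when the first push event occurs.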

Function Apps

In the following instructions we use the ontopop-azure-function-app-data-ontology-validator child Maven module as an example with which to demonstrate how to deploy OntoPop's event-driven data pipeline Spring Boot applications to Azure Function apps. However these instructions can be equally applied to deploy any and all of the data pipeline Spring Boot applications listed in the Data Pipeline section above.

  1. Create a new Azure Function app configured with the Java 11 runtime via the Azure Portal as detailed in the Setup section above. We shall call this Azure Function app ontology-validator-service for the purposes of these instructions.
  2. Assuming that you are integrating OntoPop with the Azure Key Vault secrets engine, to integrate the OntoPop Spring bootstrap context with Azure Key Vault we configure bootstrap.yml as follows:
spring:
    application:
        name: ontopop
    cloud:
        vault:
            enabled: false
            host:
            port:
            scheme:
            authentication:
            token:
            kv:
                enabled:
                backend:
                default-context:
azure:
    keyvault:
        enabled: true
        client-id: ${AZURE_KEYVAULT_CLIENT_ID}
        client-key: ${AZURE_KEYVAULT_CLIENT_SECRET}
        tenant-id: ${AZURE_KEYVAULT_TENANT_ID}
        uri: ${AZURE_KEYVAULT_URI}
aws:
    secretsmanager:
        enabled: false
        name:
        prefix:
        defaultContext:
        failFast:
        region:

The Azure Key Vault client ID (i.e. the app ID of the service principal object with privileges to read secrets from the relevant Azure Key Vault instance), client secret (i.e. the password of the service principal object with privileges to read secrets from the relevant Azure Key Vault instance), tenant ID and URI properties should be set as environment variables and NOT stored as plaintext in bootstrap.yml. Thus these environment variables should be set in our Azure Function app as application settings. To do this, navigate to the relevant Azure Function app via the Azure Portal, then select "Configuration" and "Application Settings". Select "New Application Setting" and create all the environment variables defined in bootstrap.yml.

  3. When deploying Java-based functions to Azure Function apps, Azure requires us to explicitly provide the main Java class to invoke as an environment variable called MAIN_CLASS. To do this, navigate to the relevant Azure Function app via the Azure Portal, then select "Configuration" and "Application Settings". Select "New Application Setting" and create a new application setting called MAIN_CLASS with the relevant value from the following table:

| Maven Module | Main Class |
| --- | --- |
| ontopop-azure-function-app-data-ontology-ingestor | ai.hyperlearning.ontopop.apps.azure.functions.data.ontology.ingestor.OntologyIngestorAzureFunctionApp |
| ontopop-azure-function-app-data-ontology-validator | ai.hyperlearning.ontopop.apps.azure.functions.data.ontology.validator.OntologyValidatorAzureFunctionApp |
| ontopop-azure-function-app-data-ontology-loader-triplestore | ai.hyperlearning.ontopop.apps.azure.functions.data.ontology.loader.triplestore.OntologyTriplestoreLoaderAzureFunctionApp |
| ontopop-azure-function-app-data-ontology-parser | ai.hyperlearning.ontopop.apps.azure.functions.data.ontology.parser.OntologyParserAzureFunctionApp |
| ontopop-azure-function-app-data-ontology-modeller-graph | ai.hyperlearning.ontopop.apps.azure.functions.data.ontology.modeller.graph.OntologyGraphModellerAzureFunctionApp |
| ontopop-azure-function-app-data-ontology-loader-graph | ai.hyperlearning.ontopop.apps.azure.functions.data.ontology.loader.graph.OntologyGraphLoaderAzureFunctionApp |
| ontopop-azure-function-app-data-ontology-indexer-graph | ai.hyperlearning.ontopop.apps.azure.functions.data.ontology.indexer.graph.OntologyGraphIndexerAzureFunctionApp |

  4. Next we need to configure another environment variable called JAVA_OPTS configured with the JVM memory parameters for our Azure Function. To do this, navigate to the relevant Azure Function app via the Azure Portal, then select "Configuration" and "Application Settings". Select "New Application Setting" and create a new application setting called JAVA_OPTS with the value -Xms512m -Xmx2g.
  5. All the OntoPop event-driven data pipeline Azure Function Spring Boot applications are triggered by the publication of a message to the relevant Azure Service Bus topic. To configure the Azure Function app to subscribe to the relevant topic and subscription, we need to first set a new environment variable called AZURE_SERVICEBUS_CONNECTION_STRING with the connection string of the Azure Service Bus broker. To get the connection string of the Azure Service Bus broker, please refer to the SAS Policy of the Azure Service Bus instance. Next we need to define the name of the relevant Azure Service Bus topic and subscription to subscribe to. These must also be set as environment variables, called TOPIC_NAME and SUBSCRIPTION_NAME respectively. Make sure that the values of these environment variables exactly match the values of the destination and group properties of the relevant consumption channels defined in the spring.cloud.stream.bindings namespace in OntoPop's application context. The following table provides the default topic and subscription names for each of the OntoPop data pipeline services:

| Maven Module | Default Topic Name | Default Subscription Name |
| --- | --- | --- |
| ontopop-azure-function-app-data-ontology-ingestor | git.repository.updated | ontopop |
| ontopop-azure-function-app-data-ontology-validator | ontopop.data.ingested | ontopop |
| ontopop-azure-function-app-data-ontology-loader-triplestore | ontopop.data.validated | ontopop.loaders.triplestore |
| ontopop-azure-function-app-data-ontology-parser | ontopop.data.validated | ontopop.parsers |
| ontopop-azure-function-app-data-ontology-modeller-graph | ontopop.data.parsed | ontopop |
| ontopop-azure-function-app-data-ontology-loader-graph | ontopop.data.modelled | ontopop.loaders.graph |
| ontopop-azure-function-app-data-ontology-indexer-graph | ontopop.data.modelled | ontopop.indexers.graph |

  6. We have now set all the required Azure Function app environment variables. Press save to persist the new application settings (which will restart the Azure Function app).
  7. We are now ready to deploy the packaged Java Spring application artifact to the relevant Azure Function app. Assuming that you have followed the instructions detailed in the Setup section above, navigate to $ONTOPOP_BASE/ontopop-apps/ontopop-apps-azure/ontopop-azure-function-app-data-ontology-validator (in our example). Next open the pom.xml file inside this directory with your preferred text editor. In order to deploy the compiled Java Spring application artifact to the remote Azure Function app using the Azure Maven Plugin, we need to define four (4) environment variables in our deployment environment that are used in the pom.xml file, namely:

| Environment Variable | Description | Example Value |
| --- | --- | --- |
| ONTOPOP_AZURE_FUNCTION_APP_DATA_ONTOLOGY_VALIDATOR_SUBSCRIPTION_ID | The ID of the Azure subscription in which this Azure Function app is provisioned. | 12345678-1234-abcd-wxyz-987654321a |
| ONTOPOP_AZURE_FUNCTION_APP_DATA_ONTOLOGY_VALIDATOR_RESOURCE_GROUP_NAME | The name of the resource group in which this Azure Function app is provisioned. | my-resource-group |
| ONTOPOP_AZURE_FUNCTION_APP_DATA_ONTOLOGY_VALIDATOR_APP_NAME | The name of the Azure Function app. | ontology-validator-service |
| ONTOPOP_AZURE_FUNCTION_APP_DATA_ONTOLOGY_VALIDATOR_REGION | The region in which this Azure Function app is provisioned. For a list of current region names, please refer to https://azuretracks.com/2021/04/current-azure-region-names-reference/. | uksouth |

  8. Once these environment variables are set in our deployment environment, we can upload and deploy the packaged Java Spring application artifact to the relevant Azure Function app by executing the following commands via our command line:
# Login to the relevant Azure account
$ az login

# Explicitly set the name of the Azure subscription to use
$ az account set --subscription my-subscription

# Navigate to the relevant project folder
$ cd $ONTOPOP_BASE/ontopop-apps/ontopop-apps-azure/ontopop-azure-function-app-data-ontology-validator

# Deploy the packaged artifact to the remote Azure Function app
$ mvn azure-functions:deploy
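
Since the deployment depends on the four environment variables listed in the table above, it can be worth checking that they are set before invoking Maven. The following is a minimal sketch only: the helper name missing_deploy_vars is illustrative, and passing the prefix as an argument assumes that the other data pipeline modules follow the same naming pattern as the validator, which this page does not confirm.

```shell
#!/bin/sh
# Sketch only: confirm the deployment variables read by pom.xml are set.

# Usage: missing_deploy_vars <PREFIX>
# Prints the name of each unset or empty variable; returns non-zero if any is missing.
missing_deploy_vars() {
    status=0
    for suffix in SUBSCRIPTION_ID RESOURCE_GROUP_NAME APP_NAME REGION; do
        name="${1}_${suffix}"
        # Indirectly read the variable whose name is held in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}

if missing_deploy_vars ONTOPOP_AZURE_FUNCTION_APP_DATA_ONTOLOGY_VALIDATOR; then
    echo "All deployment variables are set; ready for mvn azure-functions:deploy"
else
    echo "Set the variables listed above before deploying" >&2
fi
```

In a CI/CD pipeline the same check can gate the deployment stage, so that a missing variable is reported by name rather than surfacing as a less obvious Maven plugin error.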

Now every time a message is published to the relevant Azure Service Bus topic, the respective Azure Function that is subscribed to it will be invoked and will proceed to execute its respective stage of the OntoPop data pipeline.
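
The non-secret application settings described above (MAIN_CLASS, JAVA_OPTS, TOPIC_NAME and SUBSCRIPTION_NAME) can also be applied from the command line with az functionapp config appsettings set instead of through the Portal. The following is a minimal sketch under stated assumptions: the resource group default is a hypothetical placeholder, the helper names run and set_pipeline_app_settings are illustrative, and DRY_RUN (on by default here) only prints the command. The Service Bus connection string and the Azure Key Vault credentials are deliberately left out, since secrets are better supplied through the Portal or a protected pipeline variable than typed into a script.

```shell
#!/bin/sh
# Sketch only: apply the non-secret application settings for one pipeline function app.
# The resource group default below is a hypothetical placeholder.

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
# The printed form is for review only: quoting of arguments is not preserved.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: set_pipeline_app_settings <app-name> <main-class> <topic> <subscription>
set_pipeline_app_settings() {
    run az functionapp config appsettings set \
        --name "$1" \
        --resource-group "${RESOURCE_GROUP:-my-resource-group}" \
        --settings \
            "MAIN_CLASS=$2" \
            "JAVA_OPTS=-Xms512m -Xmx2g" \
            "TOPIC_NAME=$3" \
            "SUBSCRIPTION_NAME=$4"
}

# Values for the ontology validator, taken from the tables above
set_pipeline_app_settings ontology-validator-service \
    ai.hyperlearning.ontopop.apps.azure.functions.data.ontology.validator.OntologyValidatorAzureFunctionApp \
    ontopop.data.ingested \
    ontopop
```

Calling the helper once per Maven module, with the main class, topic and subscription taken from the corresponding table rows, keeps the settings for all seven Java function apps in one reviewable place.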