
Developing a model using Python
Tor-Einar Skog, Senior developer, NIBIO
Updated: 2023-02-23
This page builds on A step-by-step introduction to implementing prediction models in VIPS
Preparations
The tools you need to develop a VIPS model in Python are:
- A coding tool, such as an IDE (Integrated Development Environment). This guide uses MS Visual Studio Code.
- VIPSCore-Python-Common, a Python package that contains the shared tools and classes for VIPS models.
You should be familiar with:
- The Python programming language, version 3
- Using Python's venv
Workflow example
The normal workflow is:
- You have some correctly formatted weather data in a file that you put in the project (for example in the tests folder).
- You combine this with the other configuration data.
- You develop the model based on these input data.
- You must have one main class that extends the VIPSModel Abstract Base Class, which is available in the VIPSCore-Python-Common package.
- The test framework can be used to test single methods that are part of the algorithms, or you can test the complete model.
- When you're happy with how the model works, you can test deploy it to the VIPSCore server (TODO: Document this).
We will take this step by step below.
Implementing a forecasting model
We are going to implement a forecasting model for a virtual fungus called «Fungus pilosus flavis» (please bear with me, any phytopathologists who might read this). Let's say that there is a forecasting model for it that states that:
- There is no infection risk until you have reached 500 day degrees (Celsius)
- After that, the risk multiplies by 2 for each consecutive hour of leaf wetness (starting at 1 on the first hour). When reaching the threshold of 24, there is serious risk of infection, and measures should be taken.
We'll be using MS Visual Studio Code for this example, but the process should be transferable to other IDEs.
Creating a new project for the forecasting model
We are going to create a Python package. There is a very good post about how to do this on Real Python, and Python's own documentation is also good (and a bit shorter/simpler).
Start by creating a folder named e.g. FungusPilosisFlavisModel. Enter it, and initialize git inside, like this (or however you normally do it!):
$ git init
It is recommended to create a .gitignore file with this standard content:
Create and activate a Python virtualenv for this project. This can be done outside or inside the project folder, just make sure that your IDE knows where to find it. If you create it inside the folder, the IDE may autodiscover it and suggest that it will use it as the default virtualenv in this project.
$ python3 -m venv venv
$ source venv/bin/activate
(venv)$
Open the folder with your IDE, and add these files and folders
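Judging from the files referred to on the rest of this page, the layout could be created with a small script like the one below. This is only a sketch: the exact set of files is an assumption pieced together from this guide, and the `create_skeleton` helper is made up for the example.

```python
from pathlib import Path

# Assumed layout, inferred from the file names used elsewhere in this guide
PROJECT_FILES = [
    ".gitignore",
    "LICENSE",
    "README.md",
    "pyproject.toml",
    "src/vips_fungus_pilosus_flavis_model/__init__.py",
    "src/vips_fungus_pilosus_flavis_model/fungus_pilosus_flavis_model.py",
    "tests/test_fungus_pilasus_flavis_model.py",
]


def create_skeleton(root: Path) -> list[Path]:
    """Create empty placeholder files (and their parent folders) under root."""
    created = []
    for relative in PROJECT_FILES:
        path = root / relative
        # Create src/..., tests/... as needed; existing folders are left alone
        path.parent.mkdir(parents=True, exist_ok=True)
        # touch() does not overwrite the contents of an existing file
        path.touch(exist_ok=True)
        created.append(path)
    return created
```

Calling `create_skeleton(Path("."))` from inside the project folder creates the empty files, which you then fill in as described in the following sections.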

Below you'll find example contents of the pyproject.toml file. Make sure to edit the data that are specific for your project, such as
- name
- version
- authors
- dependencies
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "vips_fungus_pilosus_flavis_model"
version = "0.0.1"
description = "Example VIPS model, showcasing functionality"
readme = "README.md"
authors = [{ name = "Foo Barson", email = "tor-einar.skog@nibio.no" }]
license = { file = "LICENSE" }
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: GNU Affero General Public License v3",
    "Operating System :: OS Independent",
]
dependencies = [
    "shapely",
    "pydantic",
    "pytz",
    "pandas",
    "vipscore_common @ git+https://gitlab.nibio.no/VIPS/vipscore-python-common.git",
]
requires-python = ">=3.9"

[project.optional-dependencies]
dev = ["pytest"]
Make the package editable locally
# Install your package in editable mode inside your virtualenv so that you can develop and test it
(venv)$ python -m pip install -e .
Create the main VIPSModel class
In the src folder, create a folder named e.g. vips_fungus_pilosus_flavis_model. Inside this folder, create the main module fungus_pilosus_flavis_model.py. Then add this content to the file:
from vipscore_common.vips_model import VIPSModel
from vipscore_common.entities import Result, ModelConfiguration, WeatherObservation, WeatherElements
from vipscore_common.data_utils import *


class FungusPilosisFlavisModel(VIPSModel):
    """
    This is the result of a VIPS Model implementation class
    """

    MODEL_ID = "FUNGUSPILO"
    COPYRIGHT = "(c) 2023 ACME Industries"
Make sure you add an __init__.py file in the same folder as your module.
Create the method for finding when 500 day degrees has been passed
To find out when 500 day degrees (since some date) have passed, you need the mean temperature of each day. All weather observations in VIPS are represented by an instance of the class WeatherObservation. This class has a few important properties:
- elementMeasurementTypeId: rain, mean temperature, leaf wetness etc.
- timeMeasured: the timestamp of the observation
- logIntervalId: hourly, daily or monthly measurement
- value: the numerical value of the weather observation
We need a list of one WeatherObservation with mean temperature per day. A simple approach could be:
- Loop through all the WeatherObservation objects, and add the value to the total day degree sum as we do so
- When the threshold of 500 has been reached, return the date of that WeatherObservation object.
So we could start by writing this method:
    THRESHOLD = 500.0

    def get_date_when_day_degree_limit_has_passed(self, observations: list):
        # Initialize the day degree counter
        day_degrees = 0.0
        # Iterate through the list of observations
        # !! Assuming the observations list is sorted chronologically!!
        for observation in observations:
            # Make sure it's only daily temperature observations that are used
            if observation.logIntervalId == WeatherObservation.LOG_INTERVAL_ID_1D and observation.elementMeasurementTypeId == WeatherElements.TEMPERATURE_MEAN:
                # Add to the day degree sum
                day_degrees = day_degrees + observation.value
                # If threshold is reached, return the date of the current temperature measurement
                if day_degrees >= self.THRESHOLD:
                    return observation.timeMeasured
        # We have finished looping through the observations, and day_degrees has
        # not passed 500. So we can't return a date, we must return None (nothing)
        return None
IMPORTANT NOTE: These kinds of operations are better solved using Pandas, but for people unfamiliar with using Pandas, we stick with the simplest form of Python.
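If you want to experiment with the accumulation logic before the VIPS packages are installed, a standalone sketch like the following can help. Note the assumptions: `FakeObservation` is a made-up stand-in for WeatherObservation that carries only the four attributes used above, and the two constant values are placeholders, not the real VIPS codes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Placeholder values, NOT the real constants from vipscore_common
LOG_INTERVAL_ID_1D = 2
TEMPERATURE_MEAN = "TM"


@dataclass
class FakeObservation:
    """Hypothetical stand-in for WeatherObservation."""
    timeMeasured: datetime
    logIntervalId: int
    elementMeasurementTypeId: str
    value: float


def date_when_limit_passed(observations, threshold: float = 500.0) -> Optional[datetime]:
    """Same logic as the method above; the list must be sorted chronologically."""
    day_degrees = 0.0
    for obs in observations:
        if obs.logIntervalId == LOG_INTERVAL_ID_1D and obs.elementMeasurementTypeId == TEMPERATURE_MEAN:
            day_degrees += obs.value
            if day_degrees >= threshold:
                return obs.timeMeasured
    return None


# 60 days with a daily mean temperature of 10 degrees
start = datetime(2016, 3, 1, tzinfo=timezone.utc)
observations = [
    FakeObservation(start + timedelta(days=i), LOG_INTERVAL_ID_1D, TEMPERATURE_MEAN, 10.0)
    for i in range(60)
]
print(date_when_limit_passed(observations))
```

With 10 day degrees added per day, the sum reaches 500 on the 50th observation, so that observation's timestamp is returned. Raising the threshold above 600 makes the function return None for this data set.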
Start testing
Now that we have a method, we need to start testing. We will use pytest. Start by adding a test module named test_fungus_pilasus_flavis_model.py in the tests folder, and add imports and a method declaration:
import unittest

import pytz

from src.vips_fungus_pilosus_flavis_model.fungus_pilosus_flavis_model import *
# Imported last, so that the star import above cannot shadow the datetime class
from datetime import datetime


class TestFungusPilasusFlavisModel(unittest.TestCase):
    def test_get_date_when_day_degree_limit_has_passed(self):
        # TODO: Get observations list
        observations = None
        # Instantiate the model
        instance = FungusPilosisFlavisModel()
        result = instance.get_date_when_day_degree_limit_has_passed(observations)
        expected_date = datetime(2016, 5, 25, 22, 0, tzinfo=pytz.timezone("UTC"))
        self.assertEqual(result, expected_date)
As you can tell, we lack a couple of important parts here:
- The list of weather observations (observations)
- The expected date
We need to get hold of weather data:
- Mean daily temperature
- Hourly leaf wetness
Test data can be obtained from NIBIO's Norwegian Agromet service. So to get daily temperature values for a period, you can run:
Save the returned results as e.g. tests/tm.json
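Now we add this helper function to the test module:
from vipscore_common.data_utils import get_weather_observations_from_json


def get_temperature_data():
    # Read the saved JSON file and convert it to a list of WeatherObservation objects
    with open("tests/tm.json") as f:
        return get_weather_observations_from_json(f.read())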
...and we call it from the test method:
class TestFungusPilasusFlavisModel(unittest.TestCase):
    def test_get_date_when_day_degree_limit_has_passed(self):
        # Get observations list
        observations = get_temperature_data()
        # Instantiate the model
        instance = FungusPilosisFlavisModel()
        result = instance.get_date_when_day_degree_limit_has_passed(observations)
        expected_date = datetime(2016, 5, 25, 22, 0, tzinfo=pytz.timezone("UTC"))
        self.assertEqual(result, expected_date)
Let's see if we can get this to work. Install pytest and run:
(venv) $ pip install pytest
(venv) $ pytest
If everything else is correctly set up, this will fail with the following error message:
============================================================ test session starts ============================================================
platform linux -- Python 3.10.6, pytest-7.2.1, pluggy-1.0.0
rootdir: /home/treinar/nextcloud/MaDiPHS/workshop_2023-02/FungusPilosusFlavisModel
collected 1 item
tests/test_fungus_pilasus_flavis_model.py F [100%]
================================================================= FAILURES ==================================================================
________________________________ TestFungusPilasusFlavisModel.test_get_date_when_day_degree_limit_has_passed ________________________________
self = <tests.test_fungus_pilasus_flavis_model.TestFungusPilasusFlavisModel testMethod=test_get_date_when_day_degree_limit_has_passed>
def test_get_date_when_day_degree_limit_has_passed(self):
# TODO: Get observations list
observations = None
# Instantiate the model
> instance = FungusPilosisFlavisModel()
E TypeError: Can't instantiate abstract class FungusPilosisFlavisModel with abstract methods copyright, get_model_description, get_model_name, get_model_usage, get_result, get_warning_status_interpretation, license, model_id, sample_config, set_configuration
tests/test_fungus_pilasus_flavis_model.py:15: TypeError
========================================================== short test summary info ==========================================================
FAILED tests/test_fungus_pilasus_flavis_model.py::TestFungusPilasusFlavisModel::test_get_date_when_day_degree_limit_has_passed - TypeError: Can't instantiate abstract class FungusPilosisFlavisModel with abstract methods copyright, get_model_description, get_model_n...
============================================================= 1 failed in 0.63s =============================================================
pytest fails because we have not implemented any of the abstract methods of VIPSModel. So let's do that, for now with bodies that contain nothing but pass:
    def set_configuration(self, model_configuration: ModelConfiguration):
        """
        Set the configuration object (with all its possible parameters)
        Must be done before you call get_result
        """
        pass

    def get_result(self) -> list[Result]:
        """Get the results as a list of Result objects (TODO ref)"""
        pass

    @property
    def model_id(self) -> str:
        """10-character ID of the model. Must be unique (at least in the current system)"""
        pass

    @property
    def sample_config(self) -> dict:
        """A sample configuration in JSON format (TODO check relation with Dict)"""
        pass

    @property
    def license(self) -> str:
        """Returns the license for this piece of software"""
        pass

    @property
    def copyright(self) -> str:
        """Name of person/organization that holds the copyright, and contact information"""
        pass

    def get_model_name(self, language: str) -> str:
        """Returns the model name in the specified language (<a href="http://www.loc.gov/standards/iso639-2/php/English_list.php">ISO-639-2</a>)"""
        pass

    def get_model_description(self, language: str) -> str:
        """Returns the model description in the specified language (<a href="http://www.loc.gov/standards/iso639-2/php/English_list.php">ISO-639-2</a>)"""
        pass

    def get_warning_status_interpretation(self, language: str) -> str:
        """How to interpret the warning status (red-yellow-green, what does it mean?) in the specified language (<a href="http://www.loc.gov/standards/iso639-2/php/English_list.php">ISO-639-2</a>)"""
        pass

    def get_model_usage(self, language: str) -> str:
        """Technical manual for this model, in the specified language (<a href="http://www.loc.gov/standards/iso639-2/php/English_list.php">ISO-639-2</a>)"""
        pass
Running pytest again should make the tests pass.
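The TypeError comes from Python's abc module rather than from anything specific to VIPS: a class that inherits abstract methods or properties cannot be instantiated until every one of them has been overridden. The sketch below shows the mechanism in isolation. `MiniModel` is a hypothetical, much smaller stand-in for VIPSModel, and the "en" default language is also an assumption.

```python
from abc import ABC, abstractmethod


class MiniModel(ABC):
    """Hypothetical stand-in for VIPSModel with just two abstract members."""

    @property
    @abstractmethod
    def model_id(self) -> str:
        ...

    @abstractmethod
    def get_model_name(self, language: str) -> str:
        ...


class Incomplete(MiniModel):
    """Overrides nothing, so it cannot be instantiated."""


class Complete(MiniModel):
    MODEL_ID = "FUNGUSPILO"

    @property
    def model_id(self) -> str:
        # Return the class constant instead of just passing
        return self.MODEL_ID

    def get_model_name(self, language: str = "en") -> str:
        return "Fungus pilosus flavis model"


try:
    Incomplete()
except TypeError as error:
    print("Cannot instantiate:", error)

model = Complete()
print(model.model_id, "-", model.get_model_name())
```

Overriding a member with a body of pass is enough to satisfy the check, which is why the tests can run before the real implementations exist. The `model_id` property in `Complete` shows one way such a stub could later be filled in.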
Create the method to calculate the infection risk
We can operate on hourly weather data for leaf wetness and calculate the infection risk. Data in will be a list of weather observations (leaf wetness, hourly). Output data will be a dictionary with timestamp as key, and the infection risk as value. So for instance for 24th July 2014 14:00 UTC there will be only one value.
To get hourly leaf wetness values for the same period and location, request:
An example of a solution can be:
    def get_infection_risk(self, observations: list):
        # Create the map with dates and infection risk values
        risk_map = {}
        # Counter for consecutive hours of leaf wetness
        consecutive_hours_with_leaf_wetness = 0
        # !! Assuming the observations list is sorted chronologically!!
        # Loop through the list of observations
        for observation in observations:
            # We define a lower threshold for leaf wetness to be 10 mins/hour
            if observation.value > 10.0:
                # Leaf wetness registered, add to consecutive hours counter
                consecutive_hours_with_leaf_wetness = consecutive_hours_with_leaf_wetness + 1
            else:
                # No leaf wetness, reset counter
                consecutive_hours_with_leaf_wetness = 0
            # We set the risk value (2 per consecutive hour of leaf wetness)
            risk_map[observation.timeMeasured] = consecutive_hours_with_leaf_wetness * 2
        # Return the map with all values
        return risk_map
Exercise: Write a test for get_infection_risk()
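One possible starting point for the exercise is sketched below. To keep it runnable on its own, it uses a standalone copy of the logic above and builds fake observations with SimpleNamespace; both are assumptions made for this example. In your real test you would call the method on a FungusPilosisFlavisModel instance and feed it WeatherObservation objects.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def get_infection_risk(observations):
    """Standalone copy of the method above, for experimenting."""
    risk_map = {}
    consecutive_hours = 0
    for observation in observations:
        if observation.value > 10.0:
            consecutive_hours += 1
        else:
            consecutive_hours = 0
        risk_map[observation.timeMeasured] = consecutive_hours * 2
    return risk_map


def make_observations(values):
    """Build hourly fake observations, one per value (minutes of leaf wetness)."""
    start = datetime(2014, 7, 24, 12, 0, tzinfo=timezone.utc)
    return [
        SimpleNamespace(timeMeasured=start + timedelta(hours=i), value=value)
        for i, value in enumerate(values)
    ]


def test_get_infection_risk():
    # One dry hour, three wet hours, one hour exactly at the limit, one wet hour
    observations = make_observations([0.0, 60.0, 45.0, 11.0, 10.0, 30.0])
    risks = list(get_infection_risk(observations).values())
    # A value of exactly 10.0 is not above the limit, so it resets the counter
    assert risks == [0, 2, 4, 6, 0, 2]


test_get_infection_risk()
```

The boundary value (exactly 10 minutes) is worth including in the test data, since it pins down whether the comparison is meant to be strict.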
Putting it together
We now have the most important methods created (and successfully tested). What we need to do now is to get data in (set configuration, get weather data etc) and get the results out in the expected format.
Data in
Input data are sent in a large lump called a ModelConfiguration. It's a key-based store of many different kinds of objects: numbers, strings, dates, WeatherObservations. This configuration object is sent to the model through the method set_configuration. So to get the weather data, we need to extract them from the configuration object in that method. An example of how to do this is as follows:
First, at the top of the class, declare the object that holds the weather data:
observations = None
This attribute will stay None until set_configuration does something about it. So let's do that, e.g.:
    def set_configuration(self, model_configuration: ModelConfiguration):
        """
        Set the configuration object (with all its possible parameters)
        Must be done before you call get_result
        """
        # Get the observation list, using the data_utils helper module
        self.observations = get_weather_observations_from_json_list(model_configuration.config_parameters["observations"])
So now we have the weather data in a list, and we can start using them.
Data out
Data out are sent as a list of Result objects. The method to get the data is called get_result(), unsurprisingly. An example of this method could be:
    CONTROLLED_INFECTION_RISK = "CONTROLLED_INFECTION_RISK"

    def get_result(self) -> list[Result]:
        """Get the results as a list of Result objects (TODO ref)"""
        # Initialize the list of results
        results = []
        # !! Assuming the observations list is sorted chronologically!! TODO Sort algorithm
        # On which date did the day degree sum reach 500?
        day_degree_limit_reach_date = self.get_date_when_day_degree_limit_has_passed(self.observations)
        # Get infection risk for the whole period
        uncontrolled_infection_risk = self.get_infection_risk(self.observations)
        # Get all dates from the map of infection risk
        date_list = sorted(uncontrolled_infection_risk.keys())
        for current_date in date_list:
            result = Result(
                validTimeStart=current_date,
                validTimeEnd=None,
                warningStatus=0  # Temporary, set it later
            )
            # Use the infection risk only from the date when the day degree sum
            # reached 500. If it was never reached (None), the risk stays at 0
            if day_degree_limit_reach_date is not None and current_date >= day_degree_limit_reach_date:
                controlled_infection_risk = uncontrolled_infection_risk[current_date]
            else:
                controlled_infection_risk = 0
            result.set_value(self.MODEL_ID, self.CONTROLLED_INFECTION_RISK, str(controlled_infection_risk))
            # Set the warning status
            # If controlled infection risk < 64, status is NO RISK
            # Otherwise it's HIGH RISK
            result.warning_status = Result.WARNING_STATUS_NO_RISK if controlled_infection_risk < 64 else Result.WARNING_STATUS_HIGH_RISK
            results.append(result)
        return results
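The comments in the methods above assume that the observations are sorted chronologically, and leave the sorting itself as a TODO. A minimal sketch of one way to do it follows, using Python's built-in sorted() with timeMeasured as the key. The `sort_chronologically` helper and the SimpleNamespace objects are made up for this example. In the model, the same call could be applied to the list of WeatherObservation objects in set_configuration.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def sort_chronologically(observations):
    """Return a new list ordered by timeMeasured, oldest first."""
    return sorted(observations, key=lambda observation: observation.timeMeasured)


# Fake observations, deliberately out of order
unsorted_observations = [
    SimpleNamespace(timeMeasured=datetime(2016, 5, 3, tzinfo=timezone.utc), value=12.1),
    SimpleNamespace(timeMeasured=datetime(2016, 5, 1, tzinfo=timezone.utc), value=9.4),
    SimpleNamespace(timeMeasured=datetime(2016, 5, 2, tzinfo=timezone.utc), value=10.7),
]

ordered = sort_chronologically(unsorted_observations)
print([observation.timeMeasured.day for observation in ordered])
```

sorted() leaves the input list untouched and is stable, so observations with the same timestamp (for example a temperature and a leaf wetness value) keep their relative order.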
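Now it's time to test the method. The test below assumes a helper get_lw_data() that reads the hourly leaf wetness data in the same way that get_temperature_data() reads the temperature data: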
    def test_get_result(self):
        """
        We get an infection risk of 10 at a certain point in the time series
        """
        tm_obs = get_temperature_data()
        lw_obs = get_lw_data()
        observations = tm_obs + lw_obs
        instance = FungusPilosisFlavisModel()
        model_config = ModelConfiguration(
            model_id=instance.MODEL_ID,
            config_parameters={"observations": observations}
        )
        instance.set_configuration(model_config)
        results = instance.get_result()
        self.assertIsNotNone(results)
        self.assertEqual(int(results[5094].get_value(instance.MODEL_ID, instance.CONTROLLED_INFECTION_RISK)), 10)
Implementing the meta information methods
So, now you have a forecasting model that produces the expected results. When this model is deployed to the VIPS core runtime, it is discovered automatically and added to the list of available models. In order for other systems (like VIPSLogic or another client) to be able to query and show information about the model, it needs to implement the methods that provide documentation:
- get_model_name() - the name of the model. For instance «Fungus pilosus flavis model»
- license - Open Source? Proprietary? Your pick
- copyright - For instance «(c) 2014 Bioforsk»
- get_model_description() - Detailed description of how the model works, from a biological perspective
- get_model_usage() - How to configure the model (what parameters are needed, what values they may have and so on)
- sample_config - A sample JSON configuration.
In the Python version, license, copyright and sample_config are properties, while the other members are methods that take a language argument. Translation in model documentation is part of a presently unwritten chapter. For now, you can use this as a general pattern if you want:
    def get_model_name(self, language = VIPSModel.default_language) -> str:
        """Returns the model name in the specified language (<a href="http://www.loc.gov/standards/iso639-2/php/English_list.php">ISO-639-2</a>)"""
        return "Fungus pilosus flavis model"
The Reference Model contains examples for translation and how to include images in the description text.
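If you need more than one language before looking at the Reference Model, a lookup table with a fallback is one simple option. The sketch below is not how the Reference Model does it. The language codes, the default language and the Norwegian text are all placeholders chosen for the example.

```python
# Hypothetical values: the codes, the default language and the Norwegian
# text are placeholders, not taken from VIPS
DEFAULT_LANGUAGE = "en"
MODEL_NAMES = {
    "en": "Fungus pilosus flavis model",
    "nb": "Fungus pilosus flavis-modell",
}


def get_model_name(language: str = DEFAULT_LANGUAGE) -> str:
    """Return the name in the requested language, or in the default language."""
    return MODEL_NAMES.get(language, MODEL_NAMES[DEFAULT_LANGUAGE])


print(get_model_name("nb"))
print(get_model_name("fr"))  # no French text, so the default language is used
```

Falling back to the default language means that a client asking for a language you have not translated yet still gets a usable text instead of an error.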