Skip to main content

A classification of application metrics

6 min read

When you're a product guy you love monitoring your metrics, don't you? How, otherwise, could you develop your product purposefully?

In the last weeks I was musing on a classification of application metrics for back-end data systems which do not provide interaction with (end) users / customers. I.e. classic and well known e-commerce metrics like average order value, conversion rate, etc. are absent. This is how I have come so far:

What is a metric?

My ultra short (Twitter compatible) and simple definition is:

A metric is a quantitative information which enables conclusion about quality.

Application metrics need a connection to business objectives to make an interpretation possible (positive? negative?). See Kaushik for a well defined drill down [1].
KPIs are a subset of metrics which - according to Kaushik - "helps you understand how you are doing against your objectives" [2]. Full ack.

Why metrics are important?

Metrics are essential for product managers to drive their products against the business objectives. Furthermore metrics are indispensable for efficient communication between the product division and the company management. Why? They are basic common sense and act like wormholes between these two worlds of mostly divergent thinking, acting, and talking.

The popular Lean Startup movement has even established the Build - Measure - Learn cycle as an axiomatic premise for building successful products.

Scope

The scope of this blog post comprises business data systems / services without direct interaction with (end) users / customers. Systems that collect or produce, transform, store, and provide e.g. product or customer data via APIs. True back-end.

Ensuring data integrity is a crucial requirement especially in distributed and / or microservice environments. Example: Receiving five new object events via the input stream must generate five new object events on the output stream regardless of complexity and distribution level of the process chain.

The metrics in question focus the application layer which is following (i.e. is above) the infrastructure layer. Therefore I call these metrics application metrics. The application layer is the layer where business logic resides. Datadog’s Alexis Lê-Quôc has also published a classification of metrics (which has inspired me) where he has identified "work metrics" [3] which look similar to what I refer to as application metrics.

Classification

The following classification of application metrics helps organizing your metrics as well as identifying your KPIs, and thus remembers you what kind of metrics is worth to have a look at.

Class: Integrity

Description: Covers metrics that indicate if processing the business objects along the process chain is successful or not. As integrity - especially in distributed and/or microservice environments - is crucial, metrics from this class should be casted to KPIs.

Example:
A process chain

  1. Import customer data from a message queue
  2. Transform customer data
  3. Save customer data in database
  4. Provide transformed customer data via RESTful API

Example values:
Calculate ratios

  • "No. of new customers imported per day" / "No. of new customers saved in database per day" = 1
  • ...
  • "No. of new customers saved in database per day" / "No. of new customers provided via API per day" = 1,25

The resulting ratios are metrics. A ratio = 1 indicates processing is successful; i.e. integrity is ensured. If ratios are != 1 errors must have occurred.

Of course, atomic "No. of new customers imported per day" is a metric, too. It is a quantitative information about the quality of efforts to generate new customers. However, the product owner of the business data system which processes, saves, and provides customer data to other internal systems has no influence on this metric. Therefore it cannot be one of his KPIs. It is a minor quantitative information (see next class).

-------------------------------------------------------------

Class: (Minor) Quantities

Description: Covers metrics of less interest the product owner cannot influence

Example:
A process chain

  1. Import customer data from a message queue
  2. Transform customer data
  3. Save customer data in database
  4. Provide transformed customer data via RESTful API

Example value: No. of unique customers in database

-------------------------------------------------------------

Class: Performance

Description: Self-explanatory. Performance metrics are fundamental indicators how an application is doing against its non-functional objectives. Therefore performance metrics should be casted to KPIs.

Example:
A process chain

  1. Import customer data from a message queue
  2. Transform customer data
  3. Save customer data in database
  4. Provide transformed customer data via RESTful API

Example value: 90th percentile end to end processing time of customer data (“a customer”) in seconds per day: 1.2

-------------------------------------------------------------
Class: Errors

Description: Self-explanatory. Error metrics could be casted to KPIs because errors indicate that an application is not meeting its objectives.

Example:
A process chain

  1. Import customer data from a message queue
  2. Transform customer data
  3. Save customer data in database
  4. Provide transformed customer data via RESTful API

Example value: No. of HTTP 500 per day: 2

-------------------------------------------------------------

Class: API analytics

Description: Covers metrics to analyse client server interaction

Example:
A process chain

  1. Import customer data from a message queue
  2. Transform customer data
  3. Save customer data in database
  4. Provide transformed customer data via RESTful API

Example value: No. of requests responded with HTTP 200 on resource /customers/city/Berlin

Actions

In the context of metrics some specific actions are relevant which are sometimes confused IMHO.

  • Logging: The act of producing raw data e.g. by Syslog to enable the generation of metrics (e.g. a count of specific log entries).
  • Monitoring: The act of watching metrics constantly; performed by a human or a computer (program).
  • Notifying: The act of notifying a specific person or a specific group of persons when a specific event in the context of metrics happens, e.g. the occurrence of an HTTP 500.
  • Analyzing: Best see https://en.wikipedia.org/wiki/Data_analysis

The support dimension

Some last words are dedicated to the support people doing a great job often 27/4. When collecting raw data for your metrics in class Error collect as much information as possible. "What good data looks like" in [3] is a great inspiration on this topic. Focus on these two objectives: A) identify the problem asap, b) fix the problem asap. And don’t forget to measure the total time an incident needs to be solved.

[1] http://www.kaushik.net/avinash/web-analytics-101-definitions-goals-metrics-kpis-dimensions-targets/

[2] http://www.kaushik.net/avinash/web-analytics-101-definitions-goals-metrics-kpis-dimensions-targets/#kpi

[3] https://www.datadoghq.com/blog/monitoring-101-collecting-data/