Introduction

If you’re here, you probably already know what an OpenTelemetry Collector is, right? If not, this is a good starting point, from which I’ll “steal” the nice one-line description:

Vendor-agnostic way to receive, process and export telemetry data.

Now we can dig deeper into its config to achieve “transformations” and “routing”.

Setting up a collector

We’ll use the opentelemetry-collector-contrib build, which includes a lot of components out of the box. You can also create your own binary/image that includes only the components you actually need. How to build your own collector is out of scope here, but you can find many blog posts to quick-start that process.

The OpenTelemetry Collector is a fast-moving project and its documentation, albeit good, still falls short on changing features: be aware of deprecations and differences in supported features across builds.

Some definitions and useful references

These are the main “sections” of the YAML file that configures the collector (here folded for readability):

exporters:
  [...folded...]
extensions:
  [...folded...]
processors:
  [...folded...]
receivers:
  [...folded...]
service:
  [...folded...]

No matter how that config gets put in place (a ConfigMap on Kubernetes or a file somewhere), those sections can be confusing at first. It’s worth repeating what they are for (don’t worry: I’ll link the official docs if you want to dig deeper):

Exporters

Here you define (emphasis on “define”) a list of target systems/methods that will receive your data (logs, metrics, traces). For example, you can decide to have a different destination for metrics and traces. The actual “use” of an exporter happens in the pipeline later on… Main docs are here.
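As a sketch of what “define” means here, a hypothetical OTLP exporter would look like the following (the endpoint is a placeholder, not the one from my setup):

exporters:
  otlp:
    endpoint: my-apm-backend:4317 # hypothetical backend address
    tls:
      insecure: true # placeholder; configure real TLS for anything non-local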

Extensions

Any generic extension that is “out of band” of your telemetry data. Here’s the long explanation. Extensions are not required for your observability data to flow from its source to its destination.
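For example, the health_check extension (used later in the service section) just exposes an HTTP endpoint for liveness/readiness probes; a minimal sketch, with its commonly used default port:

extensions:
  health_check:
    endpoint: 0.0.0.0:13133 # answers liveness/readiness probes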

Processors

This part is important: it’s where your observability data gets processed. Since it’s the core of this post, we’ll see the details later.

Receivers

Here we define our “inputs”, i.e. the ports/protocols on which we can receive data. A common one, and often the default, is gRPC on port 4317 (heads up: it needs special annotations on an nginx ingress and “its own” method of load balancing on the service).
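For reference, a minimal sketch of an OTLP receiver accepting both gRPC (4317) and HTTP (4318):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317 # gRPC, the "default" mentioned above
      http:
        endpoint: 0.0.0.0:4318 # HTTP/protobuf, used later by the local Flask example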

Processors and Service

The important sections in this case are the processors and service sections: the first one defines how to process data (for each of logs, traces, and metrics) and the second one is the actual pipeline definition (i.e. what needs to be done and in what order).
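As a minimal sketch (the component names here are just illustrative), a single traces pipeline wires receivers, processors, and exporters together, with processors executed in the order they are listed:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch] # run in this exact order
      exporters: [otlp]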

Example of a working config

Exporters

Just one exporter that is worth mentioning:

exporters:
  file:
    path: /dev/null # added as a destination for dropped data for services with a bad service.name
  [...other exporter definitions removed for brevity...]

You may see some examples with a “no op” nop exporter defined to drop data (see notes below).

Of course you’ll have all your other exporters defined; in my case an Elasticsearch APM backend which will store the actual data. Each exporter has its own fields, so it’s easier to just link to the list of exporters.

Processors

processors:
  [...other processor definitions removed for brevity...]
  resource:
    attributes:
    - action: upsert
      key: deployment.environment
      value: development
    - action: upsert
      key: service.environment
      value: development
  transform:
    error_mode: ignore
    trace_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], ConvertCase(attributes["service.name"], "lower")) # turn service.name to lowercase
          - replace_pattern(attributes["service.name"], "_", "-")                             # replace _ with - in the service.name
    metric_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], ConvertCase(attributes["service.name"], "lower")) # turn service.name to lowercase
          - replace_pattern(attributes["service.name"], "_", "-")                             # replace _ with - in the service.name
    log_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], ConvertCase(attributes["service.name"], "lower")) # turn service.name to lowercase
          - replace_pattern(attributes["service.name"], "_", "-")
  routing:
    default_exporters:
      - otlp
      - logging
    error_mode: ignore
    table:
      - statement: route() where IsMatch(resource.attributes["service.name"], "^[0-9]+.*") # drop data when service name starts with a number
        exporters: [file]
      - statement: route() where IsMatch(resource.attributes["service.name"], "^[_\\-].*") # drop data when service name starts with _ or -
        exporters: [file]
      - statement: route() where IsMatch(resource.attributes["service.name"], "^[^a-z].*") # drop data when service name does NOT start with a lowercase letter
        exporters: [file]
  spanmetrics:
    metrics_exporter: otlp

Service

service:
  extensions:
    - health_check
  pipelines:
    logs:
      exporters:
        [...list removed for brevity...]
      processors:
        [...list removed for brevity...]
      receivers:
        [...list removed for brevity...]
    metrics:
      exporters:
        [...list removed for brevity...]
      processors:
        - memory_limiter
        - batch
        - resource
        - transform
        - routing
      receivers:
        [...list removed for brevity...]
    traces:
      exporters:
        [...list removed for brevity...]
      processors:
        - memory_limiter
        - batch
        - resource
        - transform
        - routing
      receivers:
        [...list removed for brevity...]

Putting it all together

Let’s start the collector locally using docker:

❯ docker run --name otel-collector \
  --rm \
  -v $PWD/local-project-test-collector-config.yaml:/config.yaml \
  -p 4317:4317 \
  -p 4318:4318 \
  otel/opentelemetry-collector-contrib:0.91.0 --config=/config.yaml

Using this sample code (ref), we can run a sample Flask application with auto-instrumentation:

❯ opentelemetry-instrument --exporter_otlp_endpoint=http://localhost:4318 \
  --exporter_otlp_protocol=http/protobuf \
  --resource_attributes="service.name=python-example,service.namespace=local,deployment.environment=development" \
  flask run

This will allow you to test your transformations and routing.
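A simple way to actually see the transformed attributes is to raise the verbosity of the logging exporter (it already appears in the routing defaults above); a sketch:

exporters:
  logging:
    verbosity: detailed # print every resource and its attributes to the collector's stdout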

You’ll need to generate traffic to your locally-running application, for example with siege [1]:

❯ siege -c 2 -d 3 http://127.0.0.1:5000/rolldice

Please note: if you don’t specify -c 2 -d 3, expect a LOT of traces, as siege will hammer your service.

Notes

  • this configuration has been tested with version 0.91 (but should work in previous versions as well; always check the CHANGELOG to be sure that a feature you need is actually there)
  • /dev/null as a file destination to discard data sounds weird but a simple “nop” exporter defined as:
      exporters:
        nop:
    
    didn’t work at all (config error), even though it’s used somewhere in the examples (see this GH issue).

  [1] you may need to brew install siege