Introduction
If you’re here you probably already know what an OpenTelemetry collector is, right? If not, this is a good starting point, from which I’ll “steal” the nice one-line description:
Vendor-agnostic way to receive, process and export telemetry data.
So now we can dig deeper into its config to achieve “transformations” and “routing”.
Setting up a collector
We’ll use the opentelemetry-collector-contrib build, which includes a lot of components out of the box. You can create your own binary/image by including only the components you actually need. How to build your own collector is out of scope here, but you can find many blog posts to quick-start that process.
The OpenTelemetry collector is a fast-moving project and its documentation, albeit good, still falls short on changing features: be aware of deprecations and of differences in supported features across builds.
Some definitions and useful references
These are the main “sections” of the YAML file that configures the collector (here folded for readability):
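receivers: [...]
processors: [...]
exporters: [...]
extensions: [...]
service: [...]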
No matter how that config gets put in place (a ConfigMap on Kubernetes or a file somewhere), those sections can be confusing at first. It’s worth repeating what they are for (don’t worry: I’ll put in the links to the official docs if you want to dig deeper):
Exporters
Here you define (emphasis on “define”) a list of target systems/methods that will receive your data (logs, metrics, traces). For example, you can decide to have a different destination for metrics and for traces. The actual “use” of the exporter happens in the pipeline later on… Main docs are here.
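As a minimal sketch of what a definition looks like (the otlp exporter is a real component, the endpoint below is just a placeholder):

exporters:
  otlp:
    endpoint: my-backend:4317 # placeholder address of your OTLP-capable backend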
Extensions
Any generic capability that is “out of band” of your telemetry data. Here’s the long explanation. Extensions are not required to have your observability data flow from the source to its destination.
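For example, the health_check extension (used later in this post) can be enabled with its defaults:

extensions:
  health_check: {} # exposes an HTTP endpoint that can be used for liveness/readiness probes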
Processors
This part is important: it’s used to process your observability data. Since it’s the core of this post, we’ll see the details later.
Receivers
Here we define our “inputs”, i.e. the ports/protocols on which we can receive data. A common one, and often the default, is gRPC on port 4317 (heads up: it needs special annotations on an NGINX ingress and “its own” method of load balancing on the service).
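For example, a standard OTLP receiver accepting both gRPC and HTTP (the same two ports we’ll publish when running the collector later):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318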
Processors and Service
The important sections in this case are the processors and service sections: the first one defines how to process data (each one of logs, traces or metrics) and the second one is the actual pipeline definition (i.e. what needs to be done and in what order).
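As a minimal sketch of how the two relate (the component names here are just examples, the real config follows below):

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch] # executed in the order they are listed
      exporters: [otlp]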
Example of a working config
Exporters
Just one exporter that is worth mentioning:
exporters:
  file:
    path: /dev/null # added as a destination for dropped data for services with a bad service.name
  [...other exporters definitions removed for brevity...]
You may see some examples with a “no op” nop exporter defined to drop data (see notes below).
Of course you’ll have all the other exporters defined; in my case an Elasticsearch APM backend, which will store the actual data. Each exporter has its own fields, so it’s just easier to link the list of exporters.
Processors
processors:
  [...other processor definitions removed for brevity...]
  resource:
    attributes:
      - action: upsert
        key: deployment.environment
        value: development
      - action: upsert
        key: service.environment
        value: development
  transform:
    error_mode: ignore
    trace_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], ConvertCase(attributes["service.name"], "lower")) # turn service.name to lowercase
          - replace_pattern(attributes["service.name"], "_", "-") # replace _ with - in the service.name
    metric_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], ConvertCase(attributes["service.name"], "lower")) # turn service.name to lowercase
          - replace_pattern(attributes["service.name"], "_", "-") # replace _ with - in the service.name
    log_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], ConvertCase(attributes["service.name"], "lower")) # turn service.name to lowercase
          - replace_pattern(attributes["service.name"], "_", "-") # replace _ with - in the service.name
  routing:
    default_exporters:
      - otlp
      - logging
    error_mode: ignore
    table:
      - statement: route() where IsMatch(resource.attributes["service.name"], "^[0-9]+.*") # drop data when service name starts with a number
        exporters: [file]
      - statement: route() where IsMatch(resource.attributes["service.name"], "^[_\\-].*") # drop data when service name starts with _ or -
        exporters: [file]
      - statement: route() where IsMatch(resource.attributes["service.name"], "^[^a-z].*") # drop data when service name does NOT start with a lowercase letter
        exporters: [file]
  spanmetrics:
    metrics_exporter: otlp
Service
service:
  extensions:
    - health_check
  pipelines:
    logs:
      exporters:
        [...list removed for brevity...]
      processors:
        [...list removed for brevity...]
      receivers:
        [...list removed for brevity...]
    metrics:
      exporters:
        [...list removed for brevity...]
      processors:
        - memory_limiter
        - batch
        - resource
        - transform
        - routing
      receivers:
        [...list removed for brevity...]
    traces:
      exporters:
        [...list removed for brevity...]
      processors:
        - memory_limiter
        - batch
        - resource
        - transform
        - routing
      receivers:
        [...list removed for brevity...]
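Note the order of the processors: memory_limiter comes first so it can apply back-pressure before any other work is done, and routing comes last so its rules match the service.name values already normalized by the transform processor.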
Putting it all together
Let’s start the collector locally using Docker:
❯ docker run --name otel-collector \
--rm \
-v $PWD/local-project-test-collector-config.yaml:/config.yaml \
-p 4317:4317 \
-p 4318:4318 \
otel/opentelemetry-collector-contrib:0.91.0 --config=/config.yaml
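While it’s running, docker logs -f otel-collector lets you follow the collector’s own output (which is also where the logging exporter prints the data it receives).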
Using this sample code (ref), we can run a sample Flask application with auto-instrumentation:
❯ opentelemetry-instrument --exporter_otlp_endpoint=http://localhost:4318 \
--exporter_otlp_protocol=http/protobuf \
--resource_attributes="service.name=python-example,service.namespace=local,deployment.environment=development" \
flask run
This will allow you to test your transformations and routing.
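For example, starting the app with service.name=Python_Example should show the data arriving at the default exporters under the normalized name python-example, while a service.name starting with a digit (say 2dice) should end up in the file exporter, i.e. dropped.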
You’ll need to generate traffic to your locally-running application, for example with siege [1]:
❯ siege -c 2 -d 3 http://127.0.0.1:5000/rolldice
Please note: if you don’t specify -c 2 -d 3, expect a LOT of traces as siege will hammer your service.
Notes
- this configuration has been tested with version 0.91 (but it should work in previous versions as well; always check the CHANGELOG to be sure that a feature you need is actually there)
- /dev/null as a file destination to discard data sounds weird, but a simple “nop” exporter defined as:

  exporters:
    nop:

  didn’t work at all (error in config) even if it’s used somewhere in the examples (see this GH issue).
[1] You may need to brew install siege