Introduction to Vector
What is Vector?

❏ Gathers, transforms, and routes all of your log and metric data with one simple tool.

❏ Made up of three kinds of components (sources, transforms, sinks) that handle two event types (logs, metrics).

❏ Currently used on Redmine and the Quoin website to collect journald logs, transform them, and send them to AWS CloudWatch Logs.

Why Vector?

❏ It’s built in Rust, so it’s fast and memory-efficient.

❏ End-to-end: only one service is needed to get data from A to B.

❏ 100% open source

❏ Easy to configure and transform data

❏ Great documentation

Components

Sources

Where Vector gets its data from.

  • Examples: journald, docker_logs, file.
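As a sketch, a source is declared as a named TOML table in Vector's config file; the component ID (`app_logs`) and file path here are assumptions for illustration:

```toml
# Hypothetical source declaration in vector.toml.
# The ID ("app_logs") is how downstream components refer to this source.
[sources.app_logs]
type = "file"
include = ["/var/log/app/*.log"]  # assumed path
```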

Transforms

Used to transform an input into an output by parsing, filtering, sampling, or aggregating.

  • Examples: filter, remove_fields, add_fields.
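A minimal sketch of a transform, assuming an upstream component with the ID `app_logs`; the `inputs` list is what wires a transform to the components it reads from:

```toml
# Hypothetical transform: adds a static field to every log event.
[transforms.tag_env]
type = "add_fields"
inputs = ["app_logs"]             # assumed upstream component ID
fields.environment = "production"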

Sinks

The destination for an event.

  • Examples: aws_cloudwatch_logs, aws_s3, file.
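A sink is declared the same way, again naming its upstream components via `inputs` (the `tag_env` ID here is an assumption):

```toml
# Hypothetical sink: writes events from an upstream transform to stdout.
[sinks.console_out]
type = "console"
inputs = ["tag_env"]      # assumed upstream component ID
encoding.codec = "json"
```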

Events
  1. Logs: A structured representation of a point-in-time event, containing an arbitrary set of fields that describe it. Vector is schema-neutral, so it supports legacy and future schemas alike.
  2. Metrics: Represents a numerical operation performed on a time series.

A component handles either the Metric or the Log event type. Components can only be connected to other components of the same event type.

25 Source Types

apache_metrics (M)

aws_ecs_metrics (M)

aws_kinesis_firehose (L)

aws_s3 (L)

docker_logs (L)

file (L)

generator (L)

heroku_logs (L)

host_metrics (M)

http (L)

internal_logs (L)

internal_metrics (M)

journald (L)

kafka (L)

kubernetes_logs (L)

mongodb_metrics (M)

nginx_metrics (M)

prometheus_remote_write (M)

prometheus_scrape (M)

socket (L)

splunk_hec (L)

statsd (M)

stdin (L)

syslog (L)

vector (L)

30 Transform Types

add_fields (L)

add_tags (M)

ansi_stripper (L)

aws_cloudwatch_logs_subscription_parser (L)

aws_ec2_metadata (L)

coercer (L)

concat (L)

dedupe (L)

filter (M, L)

geoip (L)

grok_parser (L)

json_parser (L)

key_value_parser (L)

log_to_metric (M, L)

logfmt_parser (L)

lua (M)

merge (L)

metric_to_log (M)

reduce (L)

regex_parser (L)

remap (L)

remove_fields (L)

remove_tags (M)

rename_fields (L)

sampler (L)

split (L)

swimlanes (L)

tag_cardinality_limit (M)

tokenizer (L)

wasm (L)

38 Sink Types

aws_cloudwatch_logs (L)

aws_cloudwatch_metrics (M)

aws_kinesis_firehose (L)

aws_kinesis_streams (L)

aws_s3 (L)

aws_sqs (L)

azure_monitor_logs (L)

blackhole (L)

clickhouse (L)

console (M, L)

datadog_logs (L)

datadog_metrics (M)

elasticsearch (L)

file (L)

gcp_cloud_storage (L)

gcp_pubsub (L)

gcp_stackdriver_logs (L)

honeycomb (L)

http (L)

humio_logs (L)

humio_metrics (M)

influxdb_logs (L)

influxdb_metrics (M)

kafka (L)

logdna (L)

loki (L)

nats (L)

new_relic_logs (L)

papertrail (L)

prometheus_exporter (M)

prometheus_remote_write (M)

pulsar (L)

sematext_logs (L)

sematext_metrics (M)

socket (L)

splunk_hec (L)

statsd (M)

vector (M, L)

Topology Model

Vector's topology model is based on a directed acyclic graph (DAG) of components. Events must flow in a single direction from sources to sinks, and cannot create cycles. Each component in the graph can produce zero or more events.
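As a sketch, the DAG falls out of the `inputs` lists: fan-out happens when one component is named in several `inputs`, and fan-in when a component lists several upstreams. All component IDs and the filter condition syntax below are assumptions for illustration:

```toml
# Hypothetical DAG: one source fans out to two transforms,
# which fan back in to a single sink. No cycles are possible.
[sources.in]
type = "stdin"

[transforms.errors_only]
type = "filter"
inputs = ["in"]
[transforms.errors_only.condition]
type = "check_fields"
"message.contains" = "ERROR"   # assumed condition for this sketch

[transforms.tagged]
type = "add_fields"
inputs = ["in"]
fields.pipeline = "demo"

[sinks.out]
type = "console"
inputs = ["errors_only", "tagged"]  # fan-in at the sink
encoding.codec = "json"
```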

API

Endpoints

❏ POST /graphql: Main endpoint for receiving and processing GraphQL queries.

❏ GET /health: Health-check endpoint, useful for verifying that Vector is up and running.

❏ GET /playground: A bundled GraphQL playground that allows you to explore the available queries and manually run queries.
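The API is disabled by default and is enabled in the config file; the bind address below is an assumption:

```toml
# Enable the GraphQL API, health check, and playground.
[api]
enabled = true
address = "127.0.0.1:8686"  # assumed bind address
```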

Other Cool Things

❏ Two transforms, log_to_metric and metric_to_log, convert events from one type to the other. This lets a component of one event type feed a component of the opposite event type after transformation.

❏ Internal sources for logs and metrics (internal_logs, internal_metrics) allow monitoring of Vector itself.

❏ Sinks send events to downstream services in batched payloads. Batching can be configured in each sink.
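A hedged sketch of the three features above in one config; the component IDs, bucket, region, and batch values are all assumptions, and exact batch option names vary by sink and Vector version:

```toml
# log_to_metric bridges the two event types: count log lines as a counter.
[transforms.line_counts]
type = "log_to_metric"
inputs = ["app_logs"]          # assumed upstream log component
[[transforms.line_counts.metrics]]
type = "counter"
field = "message"
name = "log_lines_total"

# Vector monitoring itself via an internal source.
[sources.self_metrics]
type = "internal_metrics"

# Per-sink batching knobs.
[sinks.s3_out]
type = "aws_s3"
inputs = ["app_logs"]          # assumed
bucket = "example-bucket"      # assumed
region = "us-east-1"           # assumed
encoding.codec = "ndjson"
batch.max_bytes = 10000000     # assumed values
batch.timeout_secs = 300
```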

Redmine Example: Sources

Two sources of the same type are used. The journald_docker source only takes in data from the docker service, while the journald source takes its data from every unit other than the docker and vector services.
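A hypothetical reconstruction of those two sources, using the journald source's unit filters (the exact unit names are assumptions):

```toml
# Only the docker service's journal entries.
[sources.journald_docker]
type = "journald"
include_units = ["docker"]

# Every unit except docker and vector.
[sources.journald]
type = "journald"
exclude_units = ["docker", "vector"]
```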

Redmine Example: Transforms

This is an instance where one transform serves as the input for two other transforms. The first transform removes all unnecessary fields, and the next two filter the first transform's output by container name. Events can undergo multiple transformations before they reach their final destination, a sink.
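A hedged sketch of that fan-out, assuming the docker journal entries carry a CONTAINER_NAME field; the removed field names, container names, and condition syntax are assumptions:

```toml
# One transform feeding two filters.
[transforms.remove_extra_fields]
type = "remove_fields"
inputs = ["journald_docker"]
fields = ["_BOOT_ID", "_SOURCE_REALTIME_TIMESTAMP"]  # assumed fields

[transforms.container_nginx]
type = "filter"
inputs = ["remove_extra_fields"]
[transforms.container_nginx.condition]
type = "check_fields"
"CONTAINER_NAME.eq" = "nginx"      # assumed container name

[transforms.container_redmine]
type = "filter"
inputs = ["remove_extra_fields"]
[transforms.container_redmine.condition]
type = "check_fields"
"CONTAINER_NAME.eq" = "redmine"    # assumed container name
```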

Redmine Example: Sinks

There are three sinks for this project. The container_nginx_out and container_redmine_out sinks send their respective transform's output to CloudWatch Logs, and the journald_out sink sends its source's output to CloudWatch Logs.
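A hypothetical sketch of those three sinks; the group names, stream names, and region are assumptions:

```toml
[sinks.container_nginx_out]
type = "aws_cloudwatch_logs"
inputs = ["container_nginx"]
group_name = "redmine"        # assumed
stream_name = "nginx"         # assumed
region = "us-east-1"          # assumed
encoding.codec = "json"

[sinks.container_redmine_out]
type = "aws_cloudwatch_logs"
inputs = ["container_redmine"]
group_name = "redmine"
stream_name = "redmine"
region = "us-east-1"
encoding.codec = "json"

[sinks.journald_out]
type = "aws_cloudwatch_logs"
inputs = ["journald"]         # the non-docker journald source
group_name = "redmine"
stream_name = "journald"
region = "us-east-1"
encoding.codec = "json"
```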

Redmine Example DAG