Packages

  • package root
    Definition Classes
    root
  • package com
    Definition Classes
    root
  • package adform
    Definition Classes
    com
  • package streamloader

    The entry point of the stream loader library is the StreamLoader class, which requires a $KafkaSource and a $Sink.

    The entry point of the stream loader library is the StreamLoader class, which requires a $KafkaSource and a $Sink. Once started it will subscribe to the provided topics and will start polling and sinking records. The sink has to be able to persist records and to look up committed offsets (technically this is optional, but without it there would be no way to provide any delivery guarantees). A large class of sinks are batch based, implemented as $RecordBatchingSink. This sink accumulate batches of records using some $RecordBatcher and once ready, stores them to some underlying $RecordBatchStorage. A common type of batch is file based, i.e. a batcher might write records to a temporary file and once the file is full the sink commits the file to some underlying storage, such as a database or a distributed file system like HDFS.

    A sketch of the class hierarchy illustrating the main classes and interfaces can be seen below.



    For concrete storage implementations see the clickhouse, hadoop, s3 and vertica packages. They also contain more file builder implementations than just the $CsvFileBuilder included in the core library.

    Definition Classes
    adform
  • package sink
    Definition Classes
    streamloader
  • package batch
    Definition Classes
    sink
  • package storage
  • AppendableRecordBatch
  • RecordBatch
  • RecordBatchBuilder
  • RecordBatcher
  • RecordBatchingSink
  • RecordBatchingSinker
  • RecordFormatter
  • RecordPartitioner
  • RecordStreamWriter
  • package encoding
    Definition Classes
    sink
  • package file
    Definition Classes
    sink

package batch

Ordering
  1. Alphabetic
Visibility
  1. Public
  2. Protected

Package Members

  1. package storage

Type Members

  1. trait AppendableRecordBatch[B <: RecordBatch] extends RecordBatch

    Appendable sequential record batches can be chained together to form new merged batches.

  2. trait RecordBatch extends AnyRef

    A base trait representing a batch of records.

  3. abstract class RecordBatchBuilder[+B <: RecordBatch] extends AnyRef

    A record batch builder, the base implementation takes care of keeping track of contained record ranges.

    A record batch builder, the base implementation takes care of keeping track of contained record ranges. Concrete implementations should additionally implement actual batch construction.

    B

    Type of the batches being built.

  4. trait RecordBatcher[+B <: RecordBatch] extends AnyRef

    A record batcher that provides new record batch builders.

    A record batcher that provides new record batch builders.

    B

    Type of record batches being built.

  5. class RecordBatchingSink[+B <: RecordBatch] extends PartitionGroupingSink

    A Sink that uses a specified RecordBatcher to construct record batches and commit them to a specified RecordBatchStorage once they are ready.

    A Sink that uses a specified RecordBatcher to construct record batches and commit them to a specified RecordBatchStorage once they are ready. It is a partition grouping sink, meaning records from each partition group are added to different batches and processed separately. Batch commits to storage are queued up and performed asynchronously in the background for each partition group. The commit queue size is configurable and commits block if the queues get full.

    B

    Type of record batches.

  6. class RecordBatchingSinker[B <: RecordBatch] extends PartitionGroupSinker with Logging with Metrics

    A PartitionGroupSinker that accumulates records to batches and stores them to some storage once ready.

    A PartitionGroupSinker that accumulates records to batches and stores them to some storage once ready.

    B

    Type of record batches persisted to storage.

  7. trait RecordFormatter[+R] extends AnyRef

    A formatter for mapping source records to R typed records.

    A formatter for mapping source records to R typed records.

    R

    Type of records being formatted to.

  8. trait RecordPartitioner[-R, +P] extends AnyRef

    Base trait for defining a record partitioning strategy, e.g.

    Base trait for defining a record partitioning strategy, e.g. by day or by country, etc.

    R

    Type of formatted records, i.e. the one being written to storage.

    P

    Type of the partition.

  9. trait RecordStreamWriter[-R] extends AnyRef

    An abstract writer for a stream of records of type R, most probably backed by some storage, e.g.

    An abstract writer for a stream of records of type R, most probably backed by some storage, e.g. a file.

Value Members

  1. object RecordBatchingSink

Ungrouped