Packages

  • package root
    Definition Classes
    root
  • package com
    Definition Classes
    root
  • package adform
    Definition Classes
    com
  • package streamloader

    The entry point of the stream loader library is the StreamLoader class, which requires a $KafkaSource and a $Sink.

    The entry point of the stream loader library is the StreamLoader class, which requires a $KafkaSource and a $Sink. Once started it will subscribe to the provided topics and will start polling and sinking records. The sink has to be able to persist records and to look up committed offsets (technically this is optional, but without it there would be no way to provide any delivery guarantees). A large class of sinks are batch based, implemented as $RecordBatchingSink. This sink accumulate batches of records using some $RecordBatcher and once ready, stores them to some underlying $RecordBatchStorage. A common type of batch is file based, i.e. a batcher might write records to a temporary file and once the file is full the sink commits the file to some underlying storage, such as a database or a distributed file system like HDFS.

    A sketch of the class hierarchy illustrating the main classes and interfaces can be seen below.



    For concrete storage implementations see the clickhouse, hadoop, s3 and vertica packages. They also contain more file builder implementations than just the $CsvFileBuilder included in the core library.

    Definition Classes
    adform
  • package sink
    Definition Classes
    streamloader
  • package batch
    Definition Classes
    sink
  • package encoding
    Definition Classes
    sink
  • package file
    Definition Classes
    sink
  • PartitionGroupSinker
  • PartitionGroupingSink
  • RewindingPartitionGroupSinker
  • Sink
  • WrappedPartitionGroupSinker
  • WrappedSink

trait Sink extends AnyRef

Base trait that describes Kafka consumer record sinking to some underlying storage. Implementers need to define committed position lookup for a list of assigned partitions which are then used to rewind the streams. They are also responsible for implementing offset committing.

Implementers can chose to use Kafka itself for offset storage by using the provided KafkaContext, they can also store/retrieve offsets to/from the storage itself. The delivery guarantees of a sink depend on the offset management, e.g. storing offsets atomically with data guarantees exactly-once storage, whereas using only Kafka for offset storage will usually result in either at-least-once of at-most-once semantics.

Linear Supertypes
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. Sink
  2. AnyRef
  3. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. Protected

Abstract Value Members

  1. abstract def assignPartitions(partitions: Set[TopicPartition]): Map[TopicPartition, Option[StreamPosition]]

    Assigns a set of partitions to the sink and returns their committed stream positions.

    Assigns a set of partitions to the sink and returns their committed stream positions. Called during the initial partitions subscription and on subsequent rebalance events.

    partitions

    Set of partitions assigned.

    returns

    Committed stream position map for all the assigned partitions.

  2. abstract def close(): Unit

    Closes the sink and performs any necessary clean up.

  3. abstract def heartbeat(): Unit

    Notifies the sink that record consumption is still active, called when no records get polled from Kafka.

    Notifies the sink that record consumption is still active, called when no records get polled from Kafka. Gives the sink an opportunity to perform flushing when working with very low traffic streams.

  4. abstract def initialize(kafkaContext: KafkaContext): Unit

    Initializes the sink with a KafkaContext.

    Initializes the sink with a KafkaContext.

    kafkaContext

    The context provided by the source, can be used for offset management in Kafka.

  5. abstract def revokePartitions(partitions: Set[TopicPartition]): Map[TopicPartition, Option[StreamPosition]]

    Revokes a set of partitions from the sink, called during rebalance events.

    Revokes a set of partitions from the sink, called during rebalance events. Returns committed stream positions for partitions that need rewinding.

    partitions

    Set of partitions revoked.

    returns

    Committed stream position map for partitions that need rewinding.

  6. abstract def write(record: StreamRecord): Unit

    Writes a record to the underlying storage.

    Writes a record to the underlying storage.

    record

    Stream record to write.

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  8. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  9. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  11. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  12. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  13. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  14. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  15. def toString(): String
    Definition Classes
    AnyRef → Any
  16. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  17. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  18. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated

    (Since version 9)

Inherited from AnyRef

Inherited from Any

Ungrouped