Packages

  • package root
    Definition Classes
    root
  • package com
    Definition Classes
    root
  • package adform
    Definition Classes
    com
  • package streamloader

    The entry point of the stream loader library is the StreamLoader class, which requires a $KafkaSource and a $Sink.

    The entry point of the stream loader library is the StreamLoader class, which requires a $KafkaSource and a $Sink. Once started it will subscribe to the provided topics and will start polling and sinking records. The sink has to be able to persist records and to look up committed offsets (technically this is optional, but without it there would be no way to provide any delivery guarantees). A large class of sinks are batch based, implemented as $RecordBatchingSink. This sink accumulate batches of records using some $RecordBatcher and once ready, stores them to some underlying $RecordBatchStorage. A common type of batch is file based, i.e. a batcher might write records to a temporary file and once the file is full the sink commits the file to some underlying storage, such as a database or a distributed file system like HDFS.

    A sketch of the class hierarchy illustrating the main classes and interfaces can be seen below.



    For concrete storage implementations see the clickhouse, hadoop, s3 and vertica packages. They also contain more file builder implementations than just the $CsvFileBuilder included in the core library.

    Definition Classes
    adform
  • package sink
    Definition Classes
    streamloader
  • package batch
    Definition Classes
    sink
  • package encoding
    Definition Classes
    sink
  • package file
    Definition Classes
    sink
  • PartitionGroupSinker
  • PartitionGroupingSink
  • RewindingPartitionGroupSinker
  • Sink
  • WrappedPartitionGroupSinker
  • WrappedSink
t

com.adform.streamloader.sink

PartitionGroupingSink

trait PartitionGroupingSink extends Sink with Logging

An abstract sink that implements partition assignment/revocation by grouping partitions to sets using a user defined function and delegates the actual data sinking to instances of PartitionGroupSinker, which operate with fixed sets of partitions and thus do not need to worry about rebalance events.

Implementers must define a partition grouping, which is a mapping from a topic partition to a group name string, the simplest grouping would simply map all partitions to a "root" group, a more elaborate grouping could create separate groups for smaller and larger topics, each group receiving a separate PartitionGroupSinker. E.g., if the sinker writes all data to a file, you would end up with separate files for smaller and larger topics.

Linear Supertypes
Known Subclasses
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. PartitionGroupingSink
  2. Logging
  3. Sink
  4. AnyRef
  5. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. Protected

Abstract Value Members

  1. abstract def groupForPartition(topicPartition: TopicPartition): String

    Maps a given topic partition to a group name.

    Maps a given topic partition to a group name.

    returns

    A group name, e.g. "root" or topicPartition.topic().

  2. abstract def sinkerForPartitionGroup(groupName: String, partitions: Set[TopicPartition]): PartitionGroupSinker

    Creates a new instance of a PartitionGroupSinker for a given partition group.

    Creates a new instance of a PartitionGroupSinker for a given partition group. These instances are closed and re-created during rebalance events.

    groupName

    Name of the partition group.

    partitions

    Partitions in the group.

    returns

    A new instance of a sinker for the given group.

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. final def assignPartitions(partitions: Set[TopicPartition]): Map[TopicPartition, Option[StreamPosition]]

    Assigns a set of partitions to the sink and returns their committed stream positions.

    Assigns a set of partitions to the sink and returns their committed stream positions. Called during the initial partitions subscription and on subsequent rebalance events.

    partitions

    Set of partitions assigned.

    returns

    Committed stream position map for all the assigned partitions.

    Definition Classes
    PartitionGroupingSinkSink
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  7. def close(): Unit

    Closes all active sinkers.

    Closes all active sinkers.

    Definition Classes
    PartitionGroupingSinkSink
  8. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  9. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  10. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def heartbeat(): Unit

    Forwards the heartbeat to all sinkers.

    Forwards the heartbeat to all sinkers.

    Definition Classes
    PartitionGroupingSinkSink
  13. def initialize(context: KafkaContext): Unit

    Initializes the sink with a KafkaContext.

    Initializes the sink with a KafkaContext.

    Definition Classes
    PartitionGroupingSinkSink
  14. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  15. val kafkaContext: KafkaContext
    Attributes
    protected
  16. val log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  17. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  18. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  19. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  20. final def revokePartitions(partitions: Set[TopicPartition]): Map[TopicPartition, Option[StreamPosition]]

    Revokes a set of partitions from the sink, called during rebalance events.

    Revokes a set of partitions from the sink, called during rebalance events. Returns committed stream positions for partitions that need rewinding.

    partitions

    Set of partitions revoked.

    returns

    Committed stream position map for partitions that need rewinding.

    Definition Classes
    PartitionGroupingSinkSink
  21. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  22. def toString(): String
    Definition Classes
    AnyRef → Any
  23. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  24. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  25. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  26. def write(record: StreamRecord): Unit

    Forwards a consumer record to the correct partition sinker.

    Forwards a consumer record to the correct partition sinker.

    record

    Stream record to write.

    Definition Classes
    PartitionGroupingSinkSink

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated

    (Since version 9)

Inherited from Logging

Inherited from Sink

Inherited from AnyRef

Inherited from Any

Ungrouped