Packages

  • package root
    Definition Classes
    root
  • package com
    Definition Classes
    root
  • package adform
    Definition Classes
    com
  • package streamloader

    The entry point of the stream loader library is the StreamLoader class, which requires a KafkaSource and a Sink. Once started, it subscribes to the provided topics and starts polling and sinking records. The sink has to be able to persist records and to look up committed offsets (technically the latter is optional, but without it there would be no way to provide any delivery guarantees). A large class of sinks are batch based, implemented as RecordBatchingSink. This sink accumulates batches of records using some RecordBatcher and, once a batch is ready, stores it to some underlying RecordBatchStorage. A common type of batch is file based, i.e. a batcher might write records to a temporary file, and once the file is full the sink commits it to some underlying storage, such as a database or a distributed file system like HDFS.

    A sketch of the class hierarchy illustrating the main classes and interfaces can be seen below.



    For concrete storage implementations see the clickhouse, hadoop, s3 and vertica packages. They also contain more file builder implementations than just the CsvFileBuilder included in the core library.

    Definition Classes
    adform
  • package source
    Definition Classes
    streamloader
  • KafkaContext
  • KafkaSource
  • LockingKafkaContext
  • MaxWatermarkProvider
  • WatermarkProvider
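
The batch-based sink flow described in the streamloader package overview (accumulate records in a batcher, commit full batches to a storage, look up committed offsets) can be pictured with a small toy model. This is only a sketch under stated assumptions: ToyBatcher, ToyStorage, ToySink, Record and Batch are hypothetical stand-ins written for this illustration and are not the library's RecordBatcher, RecordBatchStorage or RecordBatchingSink interfaces.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for illustration only; NOT the library's interfaces.
object BatchingSinkSketch {
  final case class Record(partition: Int, offset: Long, value: String)
  final case class Batch(records: Vector[Record])

  // Accumulates records and emits a batch once a size threshold is reached.
  final class ToyBatcher(maxRecords: Int) {
    private val buffer = mutable.ArrayBuffer.empty[Record]

    def add(record: Record): Option[Batch] = {
      buffer += record
      if (buffer.size >= maxRecords) {
        val batch = Batch(buffer.toVector)
        buffer.clear()
        Some(batch)
      } else None
    }
  }

  // Persists batches and remembers the highest committed offset per partition.
  final class ToyStorage {
    private val stored = mutable.ArrayBuffer.empty[Batch]
    private val committed = mutable.Map.empty[Int, Long]

    def commit(batch: Batch): Unit = {
      stored += batch
      batch.records.foreach { r =>
        committed(r.partition) = math.max(committed.getOrElse(r.partition, -1L), r.offset)
      }
    }

    def committedOffset(partition: Int): Option[Long] = committed.get(partition)
    def batchCount: Int = stored.size
  }

  // Forwards records to the batcher and commits every completed batch.
  final class ToySink(batcher: ToyBatcher, storage: ToyStorage) {
    def write(record: Record): Unit = batcher.add(record).foreach(storage.commit)
  }
}
```

In this toy model, records that arrive after the last full batch stay uncommitted. A loader that can ask the storage for the committed offset could resume from the record following it after a restart, which is roughly why the offset lookup matters for delivery guarantees.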

class KafkaSource extends Metrics

A source of data for stream loading that is backed by Kafka.

Not thread safe; create a separate instance for each thread when loading data from multiple threads.
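
The one-instance-per-thread pattern implied by the thread-safety note can be sketched as follows. ToySource is a hypothetical stand-in with unsynchronized mutable state, used here so the example runs without Kafka; it is not the KafkaSource class itself.

```scala
import java.util.concurrent.{Callable, Executors}
import java.util.concurrent.atomic.AtomicInteger

object PerThreadSourceSketch {
  // Hypothetical stand-in for a source that is not thread safe.
  final class ToySource(val id: Int) {
    private var polled = 0 // unsynchronized mutable state
    def poll(): Int = { polled += 1; polled }
  }

  // Each worker creates and uses its own source; no instance is shared.
  def run(threads: Int, pollsPerThread: Int): Seq[Int] = {
    val nextId = new AtomicInteger(0)
    val pool = Executors.newFixedThreadPool(threads)
    try {
      val futures = (1 to threads).map { _ =>
        pool.submit(new Callable[Int] {
          def call(): Int = {
            val source = new ToySource(nextId.getAndIncrement())
            (1 to pollsPerThread).map(_ => source.poll()).last
          }
        })
      }
      futures.map(_.get())
    } finally pool.shutdown()
  }
}
```

Because every worker owns its source, each one sees exactly its own poll count, with no locking required.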

Linear Supertypes
Metrics, AnyRef, Any

Instance Constructors

  1. new KafkaSource(consumerProperties: Properties, topics: Either[Seq[String], Pattern], pollTimeout: Duration, watermarkProviderFactory: () => WatermarkProvider)(implicit timeProvider: TimeProvider = TimeProvider.system)

    consumerProperties

    Kafka consumer properties to use.

    topics

    Topics to subscribe to, either a list or a pattern of topics.

    pollTimeout

    Timeout when polling data from Kafka.

    watermarkProviderFactory

    Factory for constructing watermark providers.
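
As a rough illustration of the constructor parameters above, the sketch below only prepares argument values and does not construct a KafkaSource, so it runs without the library on the classpath. The property values, topic names and the use of java.time.Duration for pollTimeout are assumptions made for this example; a real consumer configuration would likely need further settings.

```scala
import java.time.Duration
import java.util.Properties
import java.util.regex.Pattern

object KafkaSourceArgsSketch {
  // Standard Kafka consumer configuration keys; the values are placeholders.
  val consumerProperties: Properties = {
    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092")
    props.setProperty("group.id", "example-loader")
    props
  }

  // Either an explicit list of topics (Left) or a topic name pattern (Right).
  val topicList: Either[Seq[String], Pattern] = Left(Seq("events", "clicks"))
  val topicPattern: Either[Seq[String], Pattern] = Right(Pattern.compile("events-.*"))

  // Assumption: the Duration in the signature is java.time.Duration.
  val pollTimeout: Duration = Duration.ofSeconds(1)

  // Small helper (hypothetical) showing how the Either can be inspected.
  def describe(topics: Either[Seq[String], Pattern]): String = topics match {
    case Left(names)    => s"topics: ${names.mkString(", ")}"
    case Right(pattern) => s"pattern: ${pattern.pattern()}"
  }
}
```

These values would then be passed, together with a watermark provider factory, to the constructor documented above.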

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  6. def close(): Unit

    Closes the source and all underlying resources.

  7. def createCounter(name: String, tags: Seq[MetricTag] = Seq()): Counter
    Attributes
    protected
    Definition Classes
    Metrics
  8. def createDistribution(name: String, tags: Seq[MetricTag] = Seq()): DistributionSummary
    Attributes
    protected
    Definition Classes
    Metrics
  9. def createGauge[T <: AnyRef](name: String, metric: T, tdf: ToDoubleFunction[T], tags: Seq[MetricTag] = Seq()): Gauge
    Attributes
    protected
    Definition Classes
    Metrics
  10. def createTimer(name: String, tags: Seq[MetricTag] = Seq(), maxDuration: Duration = null): Timer
    Attributes
    protected
    Definition Classes
    Metrics
  11. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  12. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  13. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  14. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  15. def initialize(): KafkaContext

    Initializes the source by creating a Kafka consumer using the provided configuration.

    returns

    A KafkaContext that can later be used for offset committing.

  16. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  17. def metricsRoot: String

    A common prefix for all created metrics.

    Attributes
    protected
    Definition Classes
    KafkaSource → Metrics
  18. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  19. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  20. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  21. def poll(): Iterable[StreamRecord]

    Polls Kafka for new records.

    returns

    An iterable of polled records.

  22. def removeMeters(meters: Meter*): Unit
    Attributes
    protected
    Definition Classes
    Metrics
  23. def seek(partition: TopicPartition, position: StreamPosition): Unit

    Resets the consumer offsets to the given stream position. Should be called in the subscription callback.

    partition

    Kafka topic partition to seek.

    position

    Stream position to seek to.

  24. def subscribe(listener: ConsumerRebalanceListener): Unit

    Subscribes to the Kafka topics provided in the constructor.

    listener

    The callback for partition assignment/revocation; users should perform any needed seeking here using seek().

  25. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  26. def toString(): String
    Definition Classes
    AnyRef → Any
  27. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  28. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  29. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
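
The call order suggested by the member descriptions above (initialize, then subscribe with a listener that seeks on assignment, then repeated poll calls, and finally close) can be sketched with an in-memory toy. Everything here is a hypothetical stand-in: ToySource, ToyPartition, ToyRecord and ToyRebalanceListener replace KafkaSource, TopicPartition, StreamRecord and ConsumerRebalanceListener so the example runs without Kafka. Resuming one past the last committed offset is likewise an assumption of this toy model.

```scala
import scala.collection.mutable

object SourceLifecycleSketch {
  // Hypothetical, simplified stand-ins for the real Kafka and library types.
  final case class ToyPartition(topic: String, partition: Int)
  final case class ToyRecord(partition: ToyPartition, offset: Long, value: String)
  trait ToyRebalanceListener { def onAssigned(partitions: Seq[ToyPartition]): Unit }

  // In-memory source mirroring the documented method names.
  final class ToySource(data: Map[ToyPartition, Vector[String]]) {
    private val positions = mutable.Map.empty[ToyPartition, Long]
    private var initialized = false
    private var closed = false

    def initialize(): Unit = { initialized = true }

    def subscribe(listener: ToyRebalanceListener): Unit = {
      require(initialized, "initialize() must be called first")
      data.keys.foreach(p => positions(p) = 0L)
      listener.onAssigned(data.keys.toSeq) // seeking happens in this callback
    }

    def seek(partition: ToyPartition, offset: Long): Unit = positions(partition) = offset

    // Returns at most one record per partition and advances the positions.
    def poll(): Iterable[ToyRecord] = {
      require(!closed, "source is closed")
      positions.toList.flatMap { case (p, pos) =>
        val values = data(p)
        if (pos < values.size) {
          positions(p) = pos + 1
          Some(ToyRecord(p, pos, values(pos.toInt)))
        } else None
      }
    }

    def close(): Unit = { closed = true }
  }

  // Drains the source, resuming after the given committed offsets.
  def load(source: ToySource, committed: Map[ToyPartition, Long]): Vector[ToyRecord] = {
    source.initialize()
    source.subscribe(new ToyRebalanceListener {
      def onAssigned(partitions: Seq[ToyPartition]): Unit =
        partitions.foreach(p => committed.get(p).foreach(o => source.seek(p, o + 1)))
    })
    try Iterator.continually(source.poll()).takeWhile(_.nonEmpty).flatMap(_.toList).toVector
    finally source.close()
  }
}
```

A real loader would poll indefinitely rather than stop at the first empty result; the toy stops so that the example terminates.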

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated

    (Since version 9)
