com.adform.streamloader.vertica
ExternalOffsetVerticaFileStorage
Companion object ExternalOffsetVerticaFileStorage
class ExternalOffsetVerticaFileStorage extends InDataOffsetBatchStorage[ExternalOffsetVerticaFileRecordBatch] with Logging
A Vertica file storage implementation that loads data to some table and commits offsets to a separate dedicated offset table. The commit happens in a single transaction; offsets and data can be joined using a file ID that is stored in both the data table and the offset table. The offset table contains only the ranges of offsets from each topic partition contained in the file. Its structure should look as follows (all names can be customized):
CREATE TABLE file_offsets (
    _file_id INT NOT NULL,
    _consumer_group VARCHAR(128) NOT NULL,
    _topic VARCHAR(128) NOT NULL,
    _partition INT NOT NULL,
    _start_offset INT NOT NULL,
    _start_watermark TIMESTAMP NOT NULL,
    _end_offset INT NOT NULL,
    _end_watermark TIMESTAMP NOT NULL
);
Compared to the InRowOffsetVerticaFileStorage, this implementation does not preserve individual row offsets; however, it is much less expensive in terms of licensed data usage, as the Vertica license is calculated from the size the data occupies when converted to strings, ignoring compression and encoding. Thus, while physically storing the topic name, partition and offset next to each row costs little (this data compresses very well), it can be significant when data usage is audited for licensing.
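The single-transaction commit pattern described above can be illustrated with a self-contained sketch. The in-memory "tables" and helper names below are hypothetical stand-ins for the Vertica data and offset tables; the real implementation writes both over JDBC in one transaction:

```scala
import scala.collection.mutable.ListBuffer

// A minimal sketch of the commit pattern: data rows and offset ranges are
// tagged with the same file ID and written together, all-or-nothing.
object ExternalOffsetCommitSketch {
  final case class OffsetRange(consumerGroup: String, topic: String, partition: Int,
                               startOffset: Long, endOffset: Long)

  // Hypothetical stand-ins for the Vertica data table and offset table.
  val dataTable: ListBuffer[(Long, String)] = ListBuffer.empty        // (_file_id, row)
  val offsetTable: ListBuffer[(Long, OffsetRange)] = ListBuffer.empty // (_file_id, range)
  private var nextFileId = 0L

  // Commits a batch of rows plus its offset ranges under one shared file ID.
  // In the real storage this happens inside a single Vertica transaction,
  // so either both tables see the new file or neither does.
  def commit(rows: Seq[String], ranges: Seq[OffsetRange]): Long = {
    val fileId = { nextFileId += 1; nextFileId }
    dataTable ++= rows.map(r => (fileId, r))
    offsetTable ++= ranges.map(r => (fileId, r))
    fileId
  }
}
```

Since both tables carry the file ID, the data rows belonging to a committed offset range can be recovered by joining the two tables on that column.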
- By Inheritance
- ExternalOffsetVerticaFileStorage
- InDataOffsetBatchStorage
- Logging
- RecordBatchStorage
- AnyRef
- Any
Instance Constructors
- new ExternalOffsetVerticaFileStorage(dbDataSource: DataSource, table: String, offsetTable: String, fileIdColumnName: String, consumerGroupColumnName: String, topicColumnName: String, partitionColumnName: String, startOffsetColumnName: String, startWatermarkColumnName: String, endOffsetColumnName: String, endWatermarkColumnName: String)
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- final def commitBatch(batch: ExternalOffsetVerticaFileRecordBatch): Unit
Stores a given batch to storage and commits offsets, preferably in a single atomic transaction.
- Definition Classes
- InDataOffsetBatchStorage → RecordBatchStorage
- def commitBatchWithOffsets(batch: ExternalOffsetVerticaFileRecordBatch): Unit
Stores a given batch to storage together with the offsets.
- Definition Classes
- ExternalOffsetVerticaFileStorage → InDataOffsetBatchStorage
- def committedPositions(topicPartitions: Set[TopicPartition]): Map[TopicPartition, Option[StreamPosition]]
Gets the latest committed stream positions for the given partitions where streams should be sought to, i.e. this should be the last stored offset + 1.
- Definition Classes
- ExternalOffsetVerticaFileStorage → RecordBatchStorage
- def committedPositions(connection: Connection): Map[TopicPartition, StreamPosition]
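The seek semantics of committedPositions can be modeled as follows: for each requested partition, take the maximum committed end offset for the consumer group and resume from one past it, or None if nothing was committed yet. This is a hypothetical, self-contained model of the query against the offset table; all names are illustrative:

```scala
object CommittedPositionsSketch {
  // One row of the offset table, reduced to the fields needed for seeking.
  final case class OffsetRow(consumerGroup: String, topic: String,
                             partition: Int, endOffset: Long)

  // Returns, per (topic, partition), the position streams should be sought to:
  // last stored offset + 1, or None if the partition has no committed offsets.
  def committedPositions(rows: Seq[OffsetRow], group: String,
                         partitions: Set[(String, Int)]): Map[(String, Int), Option[Long]] =
    partitions.map { tp =>
      val committed = rows.filter(r => r.consumerGroup == group && (r.topic, r.partition) == tp)
      tp -> (if (committed.isEmpty) None else Some(committed.map(_.endOffset).max + 1))
    }.toMap
}
```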
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def initialize(context: KafkaContext): Unit
Initializes the storage with a Kafka context, which can be used to lookup/commit offsets, if needed.
- Definition Classes
- RecordBatchStorage
- def isBatchCommitted(batch: ExternalOffsetVerticaFileRecordBatch): Boolean
Checks whether a given batch was successfully committed to storage by comparing committed positions with the record ranges in the batch.
- batch
Batch to check.
- returns
Whether the batch is successfully stored.
- Definition Classes
- RecordBatchStorage
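The check can be modeled as: a batch counts as committed if, for every partition it touches, the committed position (last stored offset + 1) is already past the batch's end offset. The following is a hypothetical self-contained sketch, not the library's actual code:

```scala
object BatchCommittedSketch {
  // A batch records, per partition, the range of offsets it contains.
  final case class BatchRange(topic: String, partition: Int,
                              startOffset: Long, endOffset: Long)

  // The batch is committed iff every partition's committed position
  // is strictly beyond the batch's end offset; a missing or empty
  // committed position means the batch was not (fully) stored.
  def isBatchCommitted(ranges: Seq[BatchRange],
                       committed: Map[(String, Int), Option[Long]]): Boolean =
    ranges.forall { r =>
      committed.getOrElse((r.topic, r.partition), None).exists(_ > r.endOffset)
    }
}
```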
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- val kafkaContext: KafkaContext
- Attributes
- protected
- Definition Classes
- RecordBatchStorage
- val log: Logger
- Attributes
- protected
- Definition Classes
- Logging
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- def recover(topicPartitions: Set[TopicPartition]): Unit
Performs any needed recovery upon startup, e.g. rolling back or completing transactions. Can fail, users should handle any possible exceptions.
- Definition Classes
- InDataOffsetBatchStorage → RecordBatchStorage
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()