SparkReceiver IO

SparkReceiverIO 是一个从 Apache Spark Receiver 读取数据的无界源转换。

Spark Receiver 支持

SparkReceiverIO 目前支持 Apache Spark Receiver.

Spark Receiver 的要求

有关更多详细信息,请参见 SparkReceiverIO 自述文件.

使用 SparkReceiverIO 进行流式读取

为了从 Spark Receiver 读取数据,你需要传入

你可以通过传入以下参数轻松创建 receiverBuilder 对象

例如

//In this example, MyReceiver accepts a MyConfig object as its only constructor parameter.
MyConfig myPluginConfig = new MyConfig(authToken, apiServerUrl);
Object[] myConstructorArgs = new Object[] {myConfig};
ReceiverBuilder<String, MyReceiver<String>> myReceiverBuilder =
  new ReceiverBuilder<>(MyReceiver.class)
    .withConstructorArgs(myConstructorArgs);

然后,你可以将此 receiverBuilder 对象传入 SparkReceiverIO

例如

SparkReceiverIO.Read<String> readTransform =
  SparkReceiverIO.<String>read()
    .withGetOffsetFn(Long::valueOf)
    .withSparkReceiverBuilder(myReceiverBuilder)
p.apply("readFromMyReceiver", readTransform);

使用可选参数读取数据

你还可以选择传入以下可选参数

例如

SparkReceiverIO.Read<String> readTransform =
  SparkReceiverIO.<String>read()
    .withGetOffsetFn(Long::valueOf)
    .withSparkReceiverBuilder(myReceiverBuilder)
    .withPullFrequencySec(1L)
    .withStartOffset(1L)
    .withTimestampFn(Instant::parse);
p.apply("readFromReceiver", readTransform);

特定 Spark Receiver 的示例

CDAP Hubspot Receiver

ReceiverBuilder<String, HubspotReceiver<String>> hubspotReceiverBuilder =
  new ReceiverBuilder<>(HubspotReceiver.class)
    .withConstructorArgs(hubspotConfig);
SparkReceiverIO.Read<String> readTransform =
  SparkReceiverIO.<String>read()
    .withGetOffsetFn(GetOffsetUtils.getOffsetFnForHubspot())
    .withSparkReceiverBuilder(hubspotReceiverBuilder)
p.apply("readFromHubspotReceiver", readTransform);