Apache SeaTunnel

New release! Support for Kubernetes, multiple connectors added, SeaTunnel 2.1.2 is here!

In the month or so since the release of Apache SeaTunnel (Incubating) 2.1.1, the community has accepted nearly 100 PRs from teams and individuals around the world to bring you version 2.1.2. This release improves stability and adds new features, documentation, examples, and other optimizations.

This article introduces the Apache SeaTunnel (Incubating) 2.1.2 update in detail.

  • Release Note:

https://github.com/apache/incubator-seatunnel/blob/2.1.2/release-note.md

01 Major feature updates

Two connectors, Webhook and Http, have been added to enhance HTTP-related data handling capabilities.

Special thanks to tmljob for his contribution.

01 Webhook

This connector allows users to implement a variety of useful functions such as task scheduling, event scheduling, data pushing, etc., as long as the output side provides support for Http service capabilities.

See https://seatunnel.apache.org/docs/2.1.2/connector/source/Webhook for details.
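
As a rough sketch of the data-pushing use case, a client could send a JSON record with an HTTP POST to the address a Webhook source listens on. The endpoint in the usage comment (localhost, port 9999, path "/") and the function names are assumptions made for illustration, not values taken from this release; check the connector documentation linked above for the settings of your own job.

```python
import json
import urllib.request


def build_push_request(url: str, record: dict) -> urllib.request.Request:
    """Build an HTTP POST request that carries one JSON-encoded record."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def push(url: str, record: dict, timeout: float = 5.0) -> int:
    """Send the record and return the HTTP status code of the response."""
    request = build_push_request(url, record)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Usage (hypothetical endpoint, requires a running listener):
#   push("http://localhost:9999/", {"event": "order_created", "id": 42})
```

Building the request separately from sending it keeps the payload easy to inspect before anything goes over the network.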

02 Http

The new version supports reading data from HTTP interfaces, so upstream systems can hand data to SeaTunnel for further processing over HTTP. Because HTTP is a common standard interface, it gives SeaTunnel access to a wide variety of business systems. It is used as shown below.

Http {
  url = "http://date.jsontest.com/"
  result_table_name = "response_body"
}

The Kafka and ElasticSearch connectors have been added to the Flink SQL module, so SeaTunnel can now use SQL to read data from and write data to these data sources.

UUID and Replace transforms have been added, allowing more flexibility for simple data processing. Support for registering custom functions has also been added to help users implement their own business logic.
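
To make the idea concrete, here is a minimal conceptual sketch of what a UUID-style and a Replace-style transform do to a single row. It is plain Python for illustration only, not SeaTunnel's implementation or configuration syntax; the function and parameter names are hypothetical.

```python
import uuid


def add_uuid(row: dict, result_field: str = "uuid") -> dict:
    """Return a copy of the row with a random UUID stored in result_field."""
    return {**row, result_field: str(uuid.uuid4())}


def replace_in_field(row: dict, field: str, pattern: str, replacement: str) -> dict:
    """Return a copy of the row with every occurrence of pattern replaced in one string field."""
    return {**row, field: row[field].replace(pattern, replacement)}


# Chain the two steps on one input row; the input row itself is left unchanged.
source_row = {"name": "sea-tunnel"}
result_row = replace_in_field(add_uuid(source_row), "name", "-", "_")
```

Each step returns a new row rather than mutating its input, which mirrors how a transform stage passes records downstream.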

03 Support for running SeaTunnel on Kubernetes

As Kubernetes has become a must-have component in the cloud-native era, SeaTunnel naturally needs to provide the corresponding support.

The official tutorial for running SeaTunnel on Kubernetes is available at:
https://seatunnel.apache.org/docs/2.1.2/start/kubernetes

02 Specific updates

01 [Connector]

  • Added support for Spark webhook connector
  • Optimized the Jar package structure of Connector
  • Added Spark Replace transform component
  • Added Spark Uuid transform component
  • Added Oracle adaptation to Flink's JDBC source
  • Added support for Flink HTTP connector
  • Added Flink registration of custom functions
  • Flink SQL module adds support for Kafka and ElasticSearch connectors

02 [Core]

  • Add support for Flink application runtime mode
  • Support for dynamic addition of Flink configuration

03 [Bug Fix]

  • Fix some type conversion issues with the Clickhouse Sink component
  • Fix the problem that the Spark runtime script fails on the first run in some cases
  • Fix the problem that Spark on yarn cluster mode cannot get the configuration file in some cases
  • Fix the problem that Spark extraJavaOptions cannot be empty
  • Fix the problem that internal files cannot be decompressed in Spark standalone cluster mode
  • Fix the problem that Clickhouse Sink cannot handle multi-node configuration properly
  • Fix the Flink SQL configuration parsing error
  • Fix the incomplete matching of MySQL types in Flink JDBC
  • Fix the problem that variables cannot be set in Flink mode
  • Fix the problem that the SeaTunnel configuration cannot be checked in Flink mode

04 Optimization

  • Upgrade Jackson version to 2.12.6
  • Add wizard for deploying SeaTunnel to Kubernetes
  • Tweak some generic type code
  • Added Flink SQL e2e module
  • Flink JDBC connector added pre SQL and post SQL features
  • Use @AutoService to generate SPI files
  • Flink FakeSourceStream support for mock data
  • Support reading Hive data via Flink JDBC connector
  • ClickhouseFile support for ReplicatedMergeTree engine
  • Hive sink support saving ORC format data
  • Support for Spark Redis sink with custom expiration times
  • Add Spark JDBC transaction isolation level configuration
  • Replace Fastjson in code with Jackson

03 Acknowledgements

Thanks to the dedication and hard work of the following contributors (GitHub IDs, in no particular order), we were able to get this release out quickly. We welcome more people to join and contribute to the Apache SeaTunnel (Incubating) community.

v-wx-v, GezimSejdiu, zhongjiajie, CalvinKirs, ruanwenjun, tmljob, Hisoka-X, 1996fanrui, wuchunfu, legendtkl, mans2singh, whb-bigdata, xpleaf, wuzhenhua01, chang-wd, quanzhian, taokelu, gleiyu, chenhu, dijiekstra, tobezhou33, LingangJiang, mosence, asdf2014, waywtdcc, Emor-nj, dik111, forecasted

About SeaTunnel

SeaTunnel (formerly Waterdrop) is an easy-to-use, ultra-high-performance distributed data integration platform that supports real-time synchronization of massive amounts of data. It can synchronize hundreds of billions of records per day stably and efficiently.

Why do we need SeaTunnel?

SeaTunnel does everything it can to solve the problems you may encounter in synchronizing massive amounts of data.

  • Data loss and duplication
  • Task buildup and latency
  • Low throughput
  • Long application-to-production cycle time
  • Lack of application status monitoring

SeaTunnel Usage Scenarios

  • Data synchronization
  • Massive data integration
  • ETL of large volumes of data
  • Massive data aggregation
  • Multi-source data processing

Features of SeaTunnel

  • Rich components
  • High scalability
  • Easy to use
  • Mature and stable

How to get started with SeaTunnel quickly?

Want to experience SeaTunnel quickly? SeaTunnel 2.1.0 takes 10 seconds to get you up and running.

https://seatunnel.apache.org/docs/2.1.0/developement/setup

How can I contribute?

We invite all partners who are interested in making local open-source global to join the SeaTunnel contributors family and foster open-source together!

Submit an issue:

https://github.com/apache/incubator-seatunnel/issues

Contribute code to:

https://github.com/apache/incubator-seatunnel/pulls

Subscribe to the community development mailing list:

dev-subscribe@seatunnel.apache.org

Development mailing list:

dev@seatunnel.apache.org

Join Slack:

https://join.slack.com/t/apacheseatunnel/shared_invite/zt-10u1eujlc-g4E~ppbinD0oKpGeoo_dAw

Follow Twitter:

https://twitter.com/ASFSeaTunnel

Come and join us!