Data Eng Weekly Issue #264

13 May 2018

Lots of great content this week, including a new project for MapReduce using AWS Lambda called Corral, coverage of the Onyx and Wallaroo stream processing systems, distributed tracing with Jaeger, and tips for working with Kafka Connect. In news, there's an article warning that the prevalence of open source could change and a podcast episode covering the Great Expectations library for testing data pipelines. In releases, Apache Impala 3.0.0 is out, and there are several other interesting projects to catch up on like the Mara Python ETL framework.

Data Eng Jobs

There are three listings for data engineering jobs in New York, San Francisco, Mountain View, and Paris. Check them out or add your own!

https://dataengweekly.com/jobs

News

Pyrostore is a new product that provides Kafka-like semantics but uses object storage (Amazon S3 or Google Cloud Storage) to decouple the storage from compute.

http://pyrostore.io/blog/2018/05/10/kafka-potential-past-present.html

The Call for Presentations for Flink Forward Berlin, which takes place in September, is open now through June 4th.

https://flink-forward.org/call-for-presentations-submit-talk/

I've noticed recently that a number of newer distributed system projects/companies are using a combo of open-source and freemium software licenses. This article is a great analysis of the trend—it highlights that we are starting to take open-source licenses for granted, and we may need to accept new software built using a different model.

http://redmonk.com/sogrady/2018/05/11/taking-open-source-for-granted/

Podcast.init has an interview this week with the creators of Great Expectations, which is a Python library for modeling assumptions in your data pipelines and performing automated tests to verify them.

https://www.podcastinit.com/great-expectations-with-abe-gong-and-james-campbell-episode-161/

Events

Curated by Datadog ( http://www.datadog.com )

UNITED STATES

California

BASM @ Bloomberg (San Francisco) - Tuesday, May 15
https://www.meetup.com/spark-users/events/250221273/

Building Data Pipelines x2 + ML and Go (San Francisco) - Wednesday, May 16
https://www.meetup.com/golangsf/events/249469374/

Colorado

Building a Real-Time Streaming Platform (Englewood) - Tuesday, May 15
https://www.meetup.com/Front-Range-Apache-Kafka/events/250217845/

Utah

Monthly Spark, Big Data, and Data Engineering Meetup (Salt Lake City) - Monday, May 14
https://www.meetup.com/apache-spark-slc/events/248643902/

Virginia

High Speed Data Visualization: Kafka Meets Elasticsearch (Ashburn) - Wednesday, May 16
https://www.meetup.com/Apache-Kafka-DC/events/250336706/

BELGIUM

Intro to & Experiences Implementing Databricks (Zaventem) - Monday, May 14
https://www.meetup.com/Microsoft-Advanced-Analytics-User-Group/events/248414092/

BULGARIA

Uber Engineering Sofia (Sofia) - Thursday, May 17
https://www.meetup.com/Uber-Engineering-Events-Sofia/events/250449225/

ISRAEL

KSQL Deep Dive, with Kai Waehner of Confluent (Tel Aviv) - Monday, May 14
https://www.meetup.com/ApacheKafkaTLV/events/249561540/

Async with Akka, Spark Pitfalls (Tel Aviv-Yafo) - Tuesday, May 15
https://www.meetup.com/underscore/events/249827607/

RUSSIA

From Queues to Stream Processing (Moscow) - Thursday, May 17
https://www.meetup.com/Moscow-Kafka-Meetup/events/250138402/

AUSTRALIA

Stream All Things with Kafka and KSQL (Sydney) - Tuesday, May 15
https://www.meetup.com/apache-kafka-sydney/events/250052794/

Streaming ETL with Apache Kafka and KSQL/Apache Metron (Melbourne) - Wednesday, May 16
https://www.meetup.com/KafkaMelbourne/events/250373412/

Fast Data: Stream All the Things! (Sydney) - Thursday, May 17
https://www.meetup.com/Sydney-Apache-Spark-User-Group/events/249667854/

NEW ZEALAND

Streaming ETL with Apache Kafka and KSQL (Wellington) - Thursday, May 17
https://www.meetup.com/Kafka-Wellington/events/250078551/
