Writing Big Data Pipelines: the Apache Beam Project – Wellington – 14 March 2019

What

Apache Beam (https://beam.apache.org/) is an open-source project for writing big-data pipelines.

In the first half of this talk, I’ll describe Beam from a non-technical perspective: what it is, why you would use it, and how it compares to other technologies in the big data space.

In the second half of the talk, I will give a high-level overview of the technical aspects of Beam. At its heart is a programming model that unifies batch and stream processing, allowing the programmer to separate the what, where, when, and how of processing: what processing is actually performed on the data; where in event time that processing is done (how event times are windowed); when in processing time results are materialised; and how updates to results (due, for example, to late data) are combined. Beam also provides several language-specific SDKs that instantiate the model for particular languages. Currently Java and Python are available, and Go is under development.
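As a rough illustration of the "what" and "where" questions, here is a minimal sketch in plain Python. It does not use the Beam SDK; the function names, the 60-second window size, and the sample events are all assumptions made up for this example. The "what" is a sum, and the "where" is a fixed window in event time, so events that arrive out of order still land in the window their timestamp belongs to. The "when" and "how" questions (triggers and accumulation) are not modelled here.

```python
from collections import defaultdict

# Hypothetical window size in seconds, chosen only for this sketch.
WINDOW_SECONDS = 60


def window_start(event_time, size=WINDOW_SECONDS):
    """'Where': map an event timestamp to the start of its fixed window."""
    return event_time - (event_time % size)


def sum_per_window(events, size=WINDOW_SECONDS):
    """'What': sum the values, grouped by event-time window.

    `events` is an iterable of (event_time_seconds, value) pairs in
    arrival order, which may differ from event-time order.
    """
    totals = defaultdict(int)
    for event_time, value in events:
        totals[window_start(event_time, size)] += value
    return dict(totals)


# Made-up events: the ones stamped 30 and 59 arrive after the one stamped 62.
events = [(5, 1), (62, 10), (30, 2), (125, 100), (59, 4)]
result = sum_per_window(events)
print(result)
```

The late arrivals are still counted in the first window because grouping uses the event timestamp rather than the arrival order.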

Beam also provides a portability framework that allows pipelines to be run on a variety of execution technologies. Beam itself provides a reference runner. There are also efforts to develop runners based on Apache Flink and Apache Spark. Google provides a commercial managed runner on its Google Cloud. Beam builds on the work of MapReduce, Hadoop, Flume, Spark, and Flink.

Speaker Bio

Neal Glew is a software engineer in the Flume project at Google, where he mostly works on the shuffle system. He previously worked at Intel on parallel programming models within Intel Labs. He has a PhD in computer science from Cornell University and a BSc (Hons) in computer science from Victoria University of Wellington.

Data Driven Wellington Meetup Group

There’s so much going on in the world of data that it can be hard to keep up with what’s happening in your own speciality area, let alone make connections with others who might have complementary skills or interests. This Meetup is intended to make it easier to stay informed and to make those connections. Its focus is on what people working with data in Wellington-based public, private, non-profit, and academic organisations are doing, what challenges they’re experiencing, and what they need help with. It welcomes members who spend their days capturing, storing, manipulating, and analysing data, as well as those who use data generated by others for decision- and policy-making.

When and Where

Thursday, 14 March 2019
5:30pm – 7:30pm

Rutherford House
23 Lambton Quay
Wellington

Rutherford House is the tall building between the Beehive, Railway Station, and Old Government Building. The Meetup will be in VicBooks Cafe, on the Bunny Street side of the ground floor.

How Much

Free

More

Writing Big Data Pipelines: the Apache Beam Project
Data Driven Wellington Meetup

SQLSaturday 809 – Wellington – 22 – 23 February 2019

What

SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. Admittance to this event is free; all costs are covered by donations and sponsorships. Please register soon, as seating is limited, and let friends and colleagues know about the event.

SQLSaturday Wellington 2019 Pre-Cons
We are planning to run the following Pre-Cons on Friday, 22nd February 2019, 9:00am – 5:00pm. The Pre-Con venue will be confirmed shortly.

Creating a Database Deployment Pipeline using DevOps Processes by Hamish Watson
SQL Server Performance Tuning for DBAs by Amit Bansal
Data Science with Microsoft Cloud by Leila Etaati

When and Where

Friday 22 February 2019 – Pre-Cons
Saturday 23 February 2019 – Main Event

Whitireia New Zealand,
3 Wi Neera Dr,
Elsdon,
Porirua,
Wellington

How Much

Main Event – Free
Pre-Cons: Student – $150
Pre-Cons: Regular – $250

More

Website and registration