
Tip of the Apache Iceberg

by Jay Cuthrell

Fudge Sunday readers will recall my use of songs as inspiration. While the newsletter is not going back to the series format, the lyrics of “Mother Necessity” from Schoolhouse Rock! are appropriate.

The topic of ever more responsive, effective, and efficient data analytics is not new to the Fudge Sunday newsletter. Today, there are rapidly evolving cloud native projects, along with stories from the communities and companies involved in Hudi, Iceberg, Presto, Spark, Superset, Trino, and many more.

Shot

Just 132 days ago, Compute.AI was mentioned in Fudge Sunday #214 “Are You Gonna Go Parquet”.

A few readers were curious about how to “kick the tires” and get a demo of Compute.AI.

Chaser

Last week, the first public GitHub repository appeared for Compute.AI. Practitioners can now launch a containerized deployment of the ComputeAI SQL engine.

The Compute.AI team has posted updates on LinkedIn for the announcement itself as well as a video demo on the integration with Jupyter Notebook.

Also, there is a video demo on the integration with Apache Superset.

As such, it’s time to add to the topics list. 🤓

Getting Informed

When reading about cloud native projects, it is common to see references to the Apache Software Foundation (ASF). In almost a quarter of a century, the ASF has grown to nearly 300 projects.

One such project, Apache Iceberg, began in 2017 within Netflix and was donated to the ASF in 2018 to promote the efficacy and longevity of a modern approach to low-level standards involving very large tables. The project's rapid progress within the ASF reflects a community focused on ensuring Iceberg does certain things very well.

Then, in 2021, a commercial entity known as Tabular was formed and funded to simplify, secure, and streamline the adoption of Iceberg.

https://www.techmeme.com/230919/p43#a230919p43
https://www.youtube.com/watch?v=_GW3GYZK66U

Commercial options for Iceberg besides Tabular are growing. There are recognizable companies like Cloudera and Snowflake, and smaller companies like ClickHouse and Starburst deserve attention as well, not to mention the Hudi play from companies like Onehouse.

https://www.techmeme.com/211028/p21#a211028p21
https://www.techmeme.com/220209/p15#a220209p15
https://www.techmeme.com/230203/p27#a230203p27

In 2024, expect to see more community activities, including practitioner challenges that promote the art of the possible by drawing upon the various projects mentioned in this newsletter.

https://www.morling.dev/blog/one-billion-row-challenge/
https://ftisiot.net/posts/1brows/

So, what will be the next big thing in Apache Iceberg and related projects?

Until then… Place your bets!

Disclosure

I am linking to my disclosure.


p.s. As I’ve gotten older, I have come to appreciate getting snail mail. If you have time to drop me a postcard, that would be amazing.
