Tweets from @schrockn
Pinned Tweet
1/ Today we at Elementl are excited to launch an early release of Dagster, an open-source Python library for building data applications. Here's a post about what Dagster is, why I moved to data infra, why data is hard, and why we need a new system. https://medium.com/p/dbd28442b2b7
24/ We are also looking for additional founding team members! Everyone from full stack and dev tools/PL folks to data eng/science. Must have a passion for tools and a belief that abstractions can reshape not just dev workflow, but orgs and industries. DMs open or email (see above)
23/ We are early with this project and looking for just a few additional design partners/adopters to work with. The idea is to directly work/embed with your team and get into a fast feedback cycle etc to ensure that you are successful. DMs open or email hello at elementl dot com
22/ For the GraphQL-aware: Structurally this serves a similar role in data as GraphQL in the API domain. A software abstraction backed by arbitrary compute that one can build shared tooling on top of and deploy to any infrastructure. Type system, metadata etc software-defined.
21/ We believe that these issues are best addressed with a software abstraction. In this case, we believe there should be a layer that can describe and model a data app regardless of programming language, computational runtime, orchestration engine etc. pic.twitter.com/EKO02T3o8t
20/ We’re not claiming to “solve” testability, but providing a software structure makes it more possible. We’re not claiming to make the impossible easy; we are claiming that we can make the impossible possible.
19/ High-latency, computationally intensive jobs make for extraordinarily long developer feedback loops. Can be hours when it ideally should be seconds. Changing the system is very high cost. Can easily result in poorly structured systems with low code quality and low productivity.
18/ Really hard to test. They have dependencies on external, hosted services (e.g. Redshift, Snowflake) or heavyweight runtimes (e.g. Spark). Business logic encoded in these systems. Cannot faithfully mock out or fake. Doing so is too much effort.
17/ Data apps are multi-tool and -persona. Often you have analysts, eng, data eng/science all collaborating on the same logical app. They use a variety of tools (spark, data warehouse, notebooks, python etc). Massive amount of context lost as data flows across tool boundaries.
16/ First, data apps don’t control their inputs. A normal app can reject invalid input from users. Not true with data apps. Incoming data changes all the time. Can't update data so you have to update the code. Data apps must account for this unfortunate reality.
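One way to picture "accounting for this reality" is a step that cannot refuse its input and so sorts it instead. The sketch below is only an illustration and not anything from Dagster: the schema `EXPECTED_FIELDS`, the helper `partition_rows`, and the sample rows are all hypothetical.

```python
# Hypothetical schema for incoming rows: field name -> expected Python type.
EXPECTED_FIELDS = {"user_id": int, "amount": float}


def partition_rows(rows):
    """Split rows into (valid, quarantined) instead of rejecting the batch."""
    valid, quarantined = [], []
    for row in rows:
        # A row is valid only if every expected field is present with the right type.
        ok = all(isinstance(row.get(name), typ) for name, typ in EXPECTED_FIELDS.items())
        (valid if ok else quarantined).append(row)
    return valid, quarantined


rows = [
    {"user_id": 1, "amount": 9.5},
    {"user_id": "2", "amount": 3.0},  # upstream started sending ids as strings
    {"user_id": 3},                   # amount is missing
]
valid, quarantined = partition_rows(rows)
```

The quarantined rows stay available for inspection, so when upstream data drifts it is the code (here, the schema) that gets updated.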
15/ We define data applications as graphs of functional computations that produce and consume data assets. They are increasingly complex and mission-critical to businesses today. They also require unique approaches because they have unique properties.
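A minimal sketch of that definition in plain Python, assuming nothing about Dagster's actual API: each node is a function, edges name the upstream outputs it consumes, and running the graph produces one data asset per node. The names `GRAPH`, `run_graph`, and the three example functions are hypothetical; ordering uses the standard library's `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter


def load_events():
    # Stand-in for reading from a real source.
    return [
        {"user": "a", "amount": 3},
        {"user": "b", "amount": 5},
        {"user": "a", "amount": 4},
    ]


def total_by_user(events):
    totals = {}
    for event in events:
        totals[event["user"]] = totals.get(event["user"], 0) + event["amount"]
    return totals


def top_user(totals):
    return max(totals, key=totals.get)


# Node name -> (function, names of upstream nodes whose outputs it consumes).
GRAPH = {
    "events": (load_events, []),
    "totals": (total_by_user, ["events"]),
    "top_user": (top_user, ["totals"]),
}


def run_graph(graph):
    """Run every node after its dependencies and return all produced assets."""
    deps_only = {name: deps for name, (_, deps) in graph.items()}
    assets = {}
    for name in TopologicalSorter(deps_only).static_order():
        fn, deps = graph[name]
        assets[name] = fn(*(assets[dep] for dep in deps))
    return assets


assets = run_graph(GRAPH)
```

Because the graph is plain data, tooling can inspect it (list nodes, walk dependencies) without executing any of the functions.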
14/ We believe that ETL, ELT, ML Pipelines, data integration, etc are a single category of software. ETL produces a file/table; ML pipeline produces a model. Other than that structurally similar/identical: They are data applications.
13/ We believe the data domain is on the cusp of a similar transition, and we want to help drive that. Data engs/scientists should no longer be stitching together disconnected jobs. They should be building full data applications.
12/ React also respected the discipline. Devs were not scripting web pages; they were building full apps. React acknowledged the *essential* complexity of this domain and built constructs to match that complexity. JS used to be considered eng backwater. No longer true.
11/ React got a lot of things right. Defined its domain well, nailed the abstraction for that domain, adapted formal comp sci constructs to frontend and made them approachable, and was both a step function improvement and incrementally adoptable.
10/ Fast forward 10 years, and no one says that anymore in frontend. Browsers got better but it is the software abstractions that proved decisive, especially but not exclusively React. People still complain, but no one really says they waste 80% of their time.
9/ Reminded me of the frontend ecosystem circa a decade ago. Back then engineers would say they spend "80% of their time fighting the browser, and 20% of their time building their app". Again they said one thing and meant another. The problem was primarily software abstraction.
8/ Taking this statement literally, one would work exclusively on making data cleaning faster. However that is not what people *mean*. They mean they waste lots of time. Building one-off infra, doing systemically repetitive things, unable to truly build on others' work, etc.
7/ The most direct expression of this is when people say “I spent 80% of my time cleaning the data, and 20% of my time doing my job.” While they say that, they are actually describing deeper pathologies.
6/ Origin: I left FB in Feb ‘17 and started looking for my next challenge. I kept on hearing from people that their biggest tech problem was "their data is totally broken". I didn't understand what that meant initially.
5/ These computational graphs are (a) abstract and (b) queryable and operable over an API. They can be deployed to arbitrary compute targets, e.g. Airflow, Dask, FaaS, k8s-based engines. Dagster tools are shared regardless of physical compute substrates.

