Analytics engineering with dbt Labs
If the mention of "the oldest federal cultural institution" in the United States triggers thoughts of Nicolas Cage, know that it's not a coincidence. The main reading room and several areas of the Thomas Jefferson Building of the Library of Congress provide a backdrop for the hunt for the 'book of secrets' in the National Treasure movies.
It is one of the largest libraries in the world. Its collections are universal, not limited by subject, format, or national boundary, and include research materials from all parts of the world and in every language known to man.
And yet, like all libraries in the world, the Library of Congress's collection of books is ever-changing and ever-evolving. A team of curators ensures that while the stacks that hold up the books on the walls of the Thomas Jefferson Building show no sign of fault or change, the books that line the walls capture the zeitgeist and protect cultural heritage.
For modern data teams, the equivalent of a 'library curator' is a term that's been gathering some momentum recently: the analytics engineer. Much like the curators at a library, analytics engineers curate the collection on display so that the rest of an analytics-driven organization can consume it with ease, rather than stare down the abyss of data. Analytics engineers have started owning the data stack in modern data teams. They sit between the data engineering and business analytics layers, transforming data and making it ready for analysis at the other end. And no tool has had more say in the genesis of this newly defined category than dbt, built by dbt Labs.
👨🏻‍🏫 A mature analytics workflow
In 2016, a few folks at RJ Metrics, a big-data analytics firm, set up an open-source project called the 'Analytics Collective'. The project sought to enable a new workflow for analytics, one that would bring best practices from software engineering into data analysis, a field whose development increasingly mimicked that of software. It drew inspiration from the collaborative nature of software development and the processes that enable it:
- version control,
- testing and code review,
- package managers for reusing models,
- development and production environments,
- and deployment automation (via DAGs).
With these in place, the project sought to build a programming environment for databases.
The core component of this project was a command-line tool called dbt (data build tool), which combines SQL (the mother tongue of all data analysts) with the templating language Jinja.
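To make the "SQL plus templating" idea concrete, here is a minimal sketch in Python. It is not dbt's implementation and does not use Jinja itself: a small regular expression stands in for the template engine, and the `RELATIONS` mapping and `render` helper are hypothetical names invented for illustration. It only shows how a `{{ ref('...') }}` placeholder inside a SQL model could be swapped for a concrete table name before the query is run.

```python
import re

# Hypothetical mapping from model names to warehouse relations (illustrative only).
RELATIONS = {"raw_orders": "analytics.raw_orders"}

# Matches placeholders of the form {{ ref('model_name') }}.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def render(sql: str, relations: dict[str, str]) -> str:
    """Replace each {{ ref('name') }} placeholder with its relation name."""
    return REF_PATTERN.sub(lambda match: relations[match.group(1)], sql)


model_sql = "select order_id, amount from {{ ref('raw_orders') }} where amount > 0"
rendered = render(model_sql, RELATIONS)
print(rendered)
```

The analyst writes plain SQL and refers to other models by name; the templating step decides which physical table that name resolves to, which is what lets the same model run in both development and production environments.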
When RJ Metrics was acquired by Magento, VP Marketing Tristan Handy resigned to set up a lifestyle data consultancy business in the Fishtown neighborhood of Philadelphia. At 'Fishtown Analytics', Tristan and co-founder Drew Banin advised Series A/B ventures on setting up advanced analytics. The duo, later joined by third co-founder and CTO Connor McArthur, also took on the responsibility of building out dbt, a tool that had followed Tristan from the Analytics Collective and had found its way into all of their client engagements!
Today, dbt is a Modern Data Stack standard for running transformations. It is a toolset that lets data analysts and engineers connect to databases using adapters, build data models (SQL queries that convert raw data into user-friendly formats), auto-document the process, test models across development and production environments, and execute them in the right sequence (as a DAG).
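The "right sequence" part can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not how dbt is implemented: the model names and SQL in `MODELS` are made up, dependencies are discovered by scanning for `{{ ref('...') }}` calls with a regular expression, and the ordering comes from the standard library's `graphlib.TopologicalSorter`.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: name -> SQL. Dependencies are expressed via {{ ref('...') }}.
MODELS = {
    "stg_orders": "select order_id, customer_id, amount from raw.orders",
    "stg_customers": "select customer_id, name from raw.customers",
    "customer_orders": (
        "select c.name, o.amount from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c using (customer_id)"
    ),
    "revenue_report": (
        "select name, sum(amount) as revenue "
        "from {{ ref('customer_orders') }} group by name"
    ),
}

REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def build_run_order(models: dict[str, str]) -> list[str]:
    """Return model names ordered so every model runs after the models it refs."""
    # Map each model to the set of models it depends on (its predecessors).
    graph = {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


run_order = build_run_order(MODELS)
print(run_order)
```

Because the dependency graph is derived from the models themselves, nobody has to maintain the execution order by hand: the two staging models run first (in either order), then `customer_orders`, then `revenue_report`.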
The tool is accessible in two forms:
- dbt Core, the open-source project (used over command-line interfaces)
- dbt Cloud, a managed cloud service (used over a cloud UI and IDE)
Core is used by 9,000 companies while Cloud has 1,800 paying customers. That traction has catapulted dbt Labs to a mind-blowing $4.2B+ valuation after raising $222M from marquee investors like Altimeter, Sequoia Capital, a16z, Coatue, and the venture wings of data giants Snowflake and Databricks! 🤯🚀
By empowering SQL-savvy data analysts to set up modeling workflows on their own, dbt Labs pioneered the analytics engineer role, and dbt is the primary tool in the nascent discipline's toolkit.
👨🏻‍🔬 Analytics engineers assemble
"....Second [reason for investing], it has an incredibly vibrant community of users who absolutely love the product. The traction really is best of breed for bottoms up, open source projects. Over the last year, they’ve managed double digit MoM growth." - Martin Casado from a16z in his investment memo
In 6 years, dbt Labs has not only co-invented its own ICP (ideal customer profile) but has also created a massive community of practitioners around the role. The dbt Community on Slack is 25,000+ strong and boasts 12+ meet-up groups spread across 8 countries. For an open-source tool, an engaged community also becomes a flywheel of value generation, with contributions bettering the tool with each increment. dbt Labs' community-led growth strategy can be broken down into 3 well-timed phases:
🎪 Setting up camp
The dbt community in its most nascent stage was a Slack #general channel that early adopters could use to provide product feedback to the founding team at Fishtown Analytics.
These conversations extended into general discussions about the modern data stack, turning the Slack workspace into a forum for all things data analysis.
When companies like Casper and HubSpot started trying out dbt, their users joined the Slack workspace. When practitioners moved jobs, they took dbt and the Slack workspace along. Some of these champion users also set up local dbt meet-up groups. The result? dbt was able to build and manage city-based communities across major cities including NY, SF, Sydney, and London, becoming the de facto forum for all things analytics in these hubs.
🔥 Growth and marketing as fuel
The content, product, field, experiential, and organic marketing verticals led by Janessa Lantz create awareness and fill the top of the funnel for both the product and the community.
- The Analytics Engineering Roundup newsletter and podcast series on Substack strengthen dbt Labs' position as subject matter experts and also funnel readers into the community.
- Event marketing activities, ranging from Coalesce (a global conference attended by 7,000+) to virtual meetups, engage the community and acquire new members.
- dbt Learn, offered as a free-to-access module and also provided as a professional service, brings new users into the community while equipping existing users with the know-how required to thrive as analytics engineers. This initiative is owned by the customer success vertical led by Erin Vaughan.
⚙️ The community machinery
Rapid member growth can be a double-edged sword for online communities. As engagement from facilitators gets diluted across a growing base of members, communities can spiral downwards very quickly into a breeding ground for bots and one-way promotion.
Sustained value, filtration from spam, and healthy engagement are must-haves for growing communities, and the dbt Labs community team led by Anna Filipova has hit it out of the park in facilitating this.
- developer advocates who engage and moderate,
- strict onboarding checks for spam prevention,
- documentation tooling to collate community discussions and contributions,
- the 'learn' program,
- offline meetups and local channels
...together maintain community health even at 25,000+ members! 🤩
✨ Self-serve 'aha' moments
dbt Labs makes its money from (1) dbt Cloud, its hosted and managed service for accessing dbt, and (2) its professional services wing, wherein analytics engineers set up data infrastructure for clients, Fishtown Analytics style (more about that later).
In a bottom-up growth model, an analytics engineer who has learned about dbt from the community, via word of mouth, or from Janessa's marketing campaigns first tests out Core on the CLI and later moves up the convenience tree onto Cloud. Cloud, plush with features including a browser-based IDE, job scheduling, and more, is a no-brainer upgrade for our user because it's free to access. The user who then experiences the aha moment of running a transformation job via the IDE HAS to introduce the tool to their team, at which point $50 per seat per month becomes a reasonable price to pay. The team further shares a free-to-use read-only seat with their leadership and cross-vertical colleagues, increasing reach within the organization.
🖇 Integrations and partnerships
dbt Labs also enjoys a strong product-led acquisition engine courtesy of API-based integrations with other tools in the data stack. Users of other data stack tools like Airbyte and Snowflake are offered plug-and-play integrations and well-documented playbooks to trial Cloud, a win-win for the data ecosystem. As part of the revenue team, Nikhil Kothari heads technology partnerships.
Free access, combined with meticulous documentation and content, 24x7 customer support, and community resources, eliminates friction points in this self-serve funnel, creating a world-class product-led growth flywheel for the company! 🏅
This flow and the team offering, however, fall short as customers get larger...
🏢 Selling to enterprises
As larger enterprises adopt dbt Cloud for their transformation needs, the bundle of offerings has to extend to meet enterprise-specific requirements like security, role management, and custom pricing. Customers for these 'Enterprise' and 'Commercial' suites are serviced by the Revenue team (~50) led by Nicholas Erdenberger, and its activities, combined with marketing, can be broken down as:
1. Lead generation and prospecting 🔍
"In any given month, we generate about 75% of our sales pipeline from what we elegantly call “inbound magic”. These are incredibly qualified companies who reach out to us through the Contact Us form on our website and say they would like to speak with someone on our sales team, please." - Janessa Lantz, VP Marketing @ dbt Labs (as of Aug 2021)
Inbound leads are prospected and qualified by Sales Development Representatives (~2-3), while the revenue marketing and field marketing teams generate qualified outbound leads for the sales team.
2. Sales cycle 💰
Sales directors for enterprise accounts (~10) and Account Executives for commercial accounts (~5) own the complete sales cycle from lead to close. These sales professionals report to the Director of Sales and carry significant quotas, with compensation split equally between fixed and variable pay.
3. Solutions architecture 🛠
The direct-sales resources are supported by solutions architects (~15) to execute custom client requirements, build pipelines and even deploy dbt in private clouds when required.
4. Professional services and customer success 🤝
These enterprise clients can opt in for professional services in the form of dedicated analytics engineers for turn-key projects across the data stack, and they also have round-the-clock support access.
dbt Labs also engages service partners like Cognizant to implement dbt for, and co-sell to, enterprise clients.
⏩ Next steps
As the data management market continues its shift to the cloud, there is a lot more value that dbt Labs stands to capture with dbt as the standard for data transformation. With Margaret Francis stepping in as CPO and $222M added to the bank, dbt Labs is expected to pick up an ambitious roadmap that will explore new features like the dbt metrics layer and strengthen existing offerings like the dbt Cloud enterprise suite.
With the risk of being replaced by the 'platforms' of the data stack still looming, dbt Labs has chosen to build better and build faster. They have the numbers to show that this strategy is working, and we are sure 2022 will witness new peaks in dbt Labs' customer, community, and revenue metrics! 🚀❤️