Sourcegraph: Google for code

Nov 16, 2021 · 4 min read

Marc Andreessen was right. Software has eaten the world.

And now companies are drowning in code.

This is the age of Big Code, and developers are dealing with hundreds of times the volume of code in their codebases compared to ten years ago. Every company is becoming a tech company, and an ever-diversifying product portfolio, coupled with recent inconveniences caused by a bat, means that along with sheer volume, the complexity of code has shot through the roof.

Earlier this year, a bug in the flight systems provided by Google took down the American, Delta, and United flight booking systems. Around 200 flights were canceled, and about the same number were delayed before the bug was found and fixed. In many companies today, lines of code written decades ago are still deployed in user-facing applications and are often barely holding massive Jenga blocks together.

Small codebases are a thing of the past, and the open-source explosion has compounded the dependencies in every codebase, all of which developers must understand if they are to add to, debug, or even recycle code efficiently.

Quinn Slack and Beyang Liu were no exception. Being developers themselves, they both faced this very itch at Palantir, where they realized that a lot of code they were writing had either already been written by someone else at the company, or existed in an open-source library that they weren’t aware of.

Drowning in the inefficiencies of duplicated and dependent code, Quinn and Beyang set out to build something that developers like themselves could use to boost productivity by acting as a 'second brain'. They set out to build the equivalent of Google search, but for code.

Sourcegraph was built to reduce the redundancies that its CEO Quinn Slack and CTO Beyang Liu faced at Palantir. Fast forward to today, and the "Google for code" is valued at $2.625 billion on the back of its $125 million Series D funding in July 2021.

Today, a billion people are using products that were built by developers using Sourcegraph.

🌎 Bringing Google code-search to the rest of the world

“If you meet a software engineer who works on the main Google codebase and ask them what they think of Google code search, they will tell you it is the best thing since sliced bread.” - Beyang Liu, CTO

Having worked at Google, Beyang Liu had used Google’s own internal code search before and seen the value that it delivered to developers. Beyang and Quinn wanted to bring this to the outside world for all code - both open-source and code within organizations. They wanted developers to be able to “stand on the shoulders of giants.”

Sourcegraph does this using srclib, the core analysis library that powers the code-search. The library works by graphing the underlying essence of code onto a language-agnostic schema, into which translators can be plugged depending on the language the code is written in. Once the source code is graphed, the tool can understand code at a semantic level and can then jump to definitions globally, find references, and look up documentation across all code, private or public.
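To make the idea of a language-agnostic schema with pluggable translators concrete, here is a minimal sketch in Python. It is not srclib's actual API: the names `Def`, `Ref`, `PythonTranslator`, and `CodeGraph` are hypothetical, and Python's standard `ast` module stands in for a real language-specific analyzer. The point is only that once every language is reduced to the same definition and reference records, "jump to definition" and "find references" become simple lookups.

```python
import ast
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Def:
    """A definition in the shared schema (hypothetical, not srclib's format)."""
    name: str
    file: str
    line: int


@dataclass(frozen=True)
class Ref:
    """A reference to a name in the shared schema (hypothetical)."""
    name: str
    file: str
    line: int


class PythonTranslator:
    """Toy translator: maps Python source onto Def/Ref records."""

    def translate(self, file, source):
        defs, refs = [], []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
                defs.append(Def(node.name, file, node.lineno))
            elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                refs.append(Ref(node.func.id, file, node.lineno))
        return defs, refs


@dataclass
class CodeGraph:
    translators: dict = field(default_factory=dict)  # file extension -> translator
    defs: dict = field(default_factory=dict)         # name -> list of Def
    refs: dict = field(default_factory=dict)         # name -> list of Ref

    def add_file(self, file, source):
        # Pick the translator by file extension, then merge into the graph.
        ext = file.rsplit(".", 1)[-1]
        file_defs, file_refs = self.translators[ext].translate(file, source)
        for d in file_defs:
            self.defs.setdefault(d.name, []).append(d)
        for r in file_refs:
            self.refs.setdefault(r.name, []).append(r)

    def definitions(self, name):
        return self.defs.get(name, [])

    def references(self, name):
        return self.refs.get(name, [])


graph = CodeGraph(translators={"py": PythonTranslator()})
graph.add_file("billing.py", "def charge(amount):\n    return amount * 2\n")
graph.add_file("checkout.py", "from billing import charge\n\ntotal = charge(21)\n")

print(graph.definitions("charge"))  # where charge is defined
print(graph.references("charge"))   # where charge is called
```

Supporting another language in this sketch would mean registering one more translator under its file extension; the graph and the lookups stay unchanged, which is the appeal of the design described above.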

With (1) code-search, developers can onboard to a new codebase, find answers faster, and identify security risks. With (2) batch changes, they can then keep their code up to date, fix critical security issues, and pay down tech debt across all of their repositories. The (3) code intelligence tool provides advanced code navigation features that let developers explore source code.

For a developer, using Sourcegraph extensions means that when a site is down and they get a call at 3 AM, Sourcegraph can point to the faulty code, tell them where else it is deployed, who last touched it, and show them all the error logs, reducing debugging time to minutes from hours.

"Just call it code-search!" Experimenting with positioning 🧑‍🔬

Within the first two weeks of their launch in 2013, Quinn and Beyang were using Sourcegraph to find code that someone had already written, to build... Sourcegraph. They were using Sourcegraph for Sourcegraph and already saving valuable time and effort. The advantages were immediately evident, and they wanted to get the tool into the hands of every developer right away.

It sounds almost astonishing now, but they weren’t going to find true PMF for another five years. Here's why:

"If you think you're building a category, you're probably fooling yourself." Sourcegraph's early positioning made the mistake of taking this advice to heart and shying away from the reality of their code-search platform: they were out to create a new category of product. So when they pitched themselves as "a new developer platform" or "code intelligence tool", they were putting themselves in competition with other existing products. In reality, their only competition was an open-source code-search tool created by Oracle (then Sun Microsystems) back in 2004. It was only later, after persistence from their early adopters, that they repositioned as a code-search tool.

"Don't talk about what your product is; that's an implementation detail. Instead, talk about the value it offers." Here was another piece of advice that Quinn and Beyang wish they could go back in time and take less seriously. After iterating through "developer value creation engine" and, among others, "a tool that helps developers leave work at 5 PM", they ultimately left the value for the developers using their platform to decide.

"We are building code-search." End of.

Cloud or self-hosted? The conventional wisdom at the time (keep in mind that this was 2013) was to default to a cloud offering. Sourcegraph was launched on the cloud, and this meant that to use their code-search, companies had to upload all of their code first by clicking on a big green button. For security reasons, not many were eager to click that button, especially not companies with massive code repositories.

The early adopters of Sourcegraph were smaller companies with low volumes of code and a handful of developers maintaining them. They just did not have the big-code problems that Sourcegraph was built to solve. Quinn was certain that adoption would come once they built a name that the bigger companies could trust. For the first five years, Sourcegraph had limited traction, and only with small companies that probably didn’t need code-search anyway. The pivotal change that made the product take off came only after those five years, when they finally moved to a self-hosted platform.

After early adoption by Uber, Lyft, and Yelp, Sourcegraph code-search now has 800,000 annual users including 3 out of 5 FAANG companies.

🚀 Accelerated growth post-PMF

From 30 employees in January 2020 to 220 in September 2021, Sourcegraph added roughly one new person to the team every three days. They grew 4x in revenue year-over-year in that time. They have never lost a customer. At the core of their powerful PLG motion is a tool that developers just cannot part with once they use it. Before outbound sales was layered in, developers discovered Sourcegraph through word of mouth, marketing, or OSS (API docs).

Their growth accelerated through some powerful drivers and market forces:

  • Sourcegraph is free for up to ten users. Developers discovering the tool usually brought the rest of their team on board, driving viral adoption within organizations that eventually moved to enterprise plans.
Enterprise plans come with all the free features plus batch changes, user roles, training sessions, etc.

  • The diaspora of developers who used and loved Sourcegraph advocated strongly for the tool. When they switched jobs and moved, especially from smaller companies to larger ones, they brought the tool with them to the new team of developers.
  • Ex-Google employees who had Google's internal code-search tool open in a tab every day sought out Sourcegraph as the natural replacement when they left Google and advocated its broader use at their new organizations. (Twitter's adoption of Sourcegraph with Scala support scaled through an engineer who had previously worked at Google.)
  • Before the pandemic, a developer could tap on someone's shoulder or go to lunch with them to understand a code library that predated their stint at the company. With all systems going remote, this void was now a huge opportunity that Sourcegraph could fill.
  • With all physical touchpoints moving online in the wake of the pandemic, companies had to onboard thousands of developers into new roles very quickly as they made increasingly larger bets on software and tech. Sourcegraph grew very quickly during this time.

💵 Adding Sales to the flywheel

The first Sales teams at Sourcegraph were set up towards the end of 2019. Gregg Stone, who was head of US Sales at Segment, was brought in as VP of Sales, along with Kacie Jenkins (previously at Fastly) as VP of Marketing. Today, they're a team of ~20 across three regions: East, West, and International.

Here's the current Sourcegraph PLG flywheel, with Sales layered in:

The current state of the Sourcegraph PLG flywheel

Inbound and outbound SDRs identify leads for the AEs to set up meetings with if they fit the target opportunity profile:

  • Potential champion or coach who is aware of or understands code search in some capacity, probably from prior experience with Sourcegraph or another code search tool like OpenGrok, or from having worked for Google/Facebook and used their tool

OR

  • Title/role includes developer productivity, developer experience, API services, distinguished eng, platform engineer, director+ eng

AND

  • Works at a company that has 50+ devs or is a Commercial account (<250 employees)
The Sourcegraph GTM org (source: Sourcegraph handbook)
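
The target opportunity profile above reads as a small boolean rule: (knows code search OR has a target title) AND works at a qualifying company. The sketch below spells that out in Python. The grouping of the OR and AND is one reading of the list, and everything named here (`Lead`, its fields, `TARGET_TITLE_KEYWORDS`, `fits_target_profile`) is hypothetical and for illustration only, not something taken from the Sourcegraph handbook. Matching "director+ eng" with a plain "director" keyword is a deliberate simplification.

```python
from dataclasses import dataclass

# Hypothetical keyword list derived from the titles named in the profile.
TARGET_TITLE_KEYWORDS = (
    "developer productivity",
    "developer experience",
    "api services",
    "distinguished eng",
    "platform engineer",
    "director",
)


@dataclass
class Lead:
    title: str
    knows_code_search: bool      # prior Sourcegraph, OpenGrok, or big-tech tool experience
    developer_count: int         # developers at the lead's company
    is_commercial_account: bool  # account segment, as classified by the sales team


def fits_target_profile(lead):
    """Return True when (champion OR target title) AND qualifying company."""
    title = lead.title.lower()
    has_target_title = any(keyword in title for keyword in TARGET_TITLE_KEYWORDS)
    right_person = lead.knows_code_search or has_target_title
    right_company = lead.developer_count >= 50 or lead.is_commercial_account
    return right_person and right_company


# An ex-Google engineer at a 120-developer company qualifies.
print(fits_target_profile(Lead("Staff Engineer", True, 120, False)))
# A platform engineer at a 10-developer, non-Commercial company does not.
print(fits_target_profile(Lead("Platform Engineer", True, 10, False)))
```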


🧘‍♂ Dev-happiness and the flow state

"Quinn and Beyang have been at it since 2013, fueled by their conviction that they could build the tool they needed as developers and by the mission that code search should be as universal as Google Search." - a16z blog

The Sourcegraph founders want to put code-search in the hands of every developer in the world. They say that their biggest challenge isn't competition; it's that most developers haven't used code-search and aren't aware of the value it adds.

Fresh from their Series D funding, they are hiring for 40+ open positions on their global all-remote team, aiming to serve companies of every size.

Around 800,000 developers are using Sourcegraph today.

That number is going up every time you hit refresh.
