Democratizing search technology — by bringing it to Webflow

A look at how — and why — we built site search with some of the best search technology available.

Written by
Bryant Chou

When we set out to build site search for Webflow, we knew it had to be more than “just search.” In part, that meant building it on a foundation backed by the best available technology: Elasticsearch. But that was just the beginning.

With that in mind, we focused on delivering three key things:

  1. Relevance. Just because a page includes the words you’re searching for doesn’t mean it’s relevant to your search. That’s why we chose Elasticsearch — a leading open-source search technology far more capable of handling relevance than simple database search.
  2. Performance. Our already robust hosting platform nicely handles scaling for both querying and indexing — saving you from having to set up and scale your own search backend.
  3. Indexing controls. We give you fine-grained controls over what’s searchable, so you can omit what’s unhelpful, and get users to their goal that much faster.

All of this comes in addition to, of course, the level of visual design control you expect from Webflow.

How search works

Search has always been a hard problem on the web, and it only gets harder as the mass of content available grows. From a computational standpoint, a lot goes on behind the scenes of a simple “please show me all content related to x” search.

Here’s what that looks like:

  1. First, the content you want to index needs to be “parsed” into individual words, which are then “stemmed” into their root forms. For example, “walking” would be parsed, then stemmed into the word “walk.”
  2. Next, the search engine needs to store this data in a way that makes retrieving results efficient (with complex file structures and binary formats optimized for lookups).
  3. Then, when a query is initiated, the search engine has to quickly scan through all the indexed documents and determine what pages or content relate to the search term.
  4. From there, it needs to take this mass of information and sort it all based on a variety of relevancy signals, such as the position of the search term, whether or not articles (“the”, “a”) are useful, and even the frequency of the term in the overall corpus of text.
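
The four steps above can be sketched in a few lines of Python. This is a toy illustration, not how Elasticsearch or Webflow implements search: the suffix-stripping `stem` function, the `STOP_WORDS` list, and the simple term-frequency and rarity scoring are all assumptions made for the example, and it ignores signals like term position entirely.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; real engines ship much longer, per-language lists.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in"}


def stem(word):
    """Very naive suffix stripping; real engines use proper stemming algorithms."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    """Step 1: parse text into lowercase words, drop stop words, stem the rest."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(token) for token in tokens if token not in STOP_WORDS]


def build_index(docs):
    """Step 2: store an inverted index mapping term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(analyze(text)).items():
            index[term][doc_id] = count
    return index


def search(index, total_docs, query):
    """Steps 3 and 4: look up each query term, then rank documents by score."""
    scores = defaultdict(float)
    for term in analyze(query):
        postings = index.get(term, {})
        if not postings:
            continue
        # Terms that appear in fewer documents count for more.
        rarity = math.log(1 + total_docs / len(postings))
        for doc_id, frequency in postings.items():
            scores[doc_id] += frequency * rarity
    return sorted(scores, key=scores.get, reverse=True)


docs = {
    "a": "Walking the dog in the park",
    "b": "Dogs love walks and long walks",
    "c": "Cooking pasta at home",
}
index = build_index(docs)
results = search(index, len(docs), "walking")
```

Here a search for “walking” matches both documents that mention “walking” or “walks”, because all of those forms stem to “walk”, and the document that uses the term more often ranks first.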

This algorithmic challenge has been approached in many different ways over the years. One of the most influential approaches is Lucene, an open-source full-text search library that first appeared in 1999.

Through the years, many expansions of this basic technology (as well as other, completely different approaches) have been released, with Elasticsearch now leading as the most widespread platform using Lucene — and serving as the foundation for our implementation.

Choosing Elasticsearch

It’s hardly controversial to say that Elasticsearch offers industry-leading relevance technology for search. There’s a reason it’s the core technology behind some of the biggest search implementations on the web, including SoundCloud, The New York Times, GitHub, and many more.

Aside from its core, Lucene-based relevance technology, Elasticsearch also comes with many natural language processing libraries for handling common spelling errors and language inconsistencies. This makes search “human-friendly” and error-tolerant in the way that people have come to expect.
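
To give a feel for what “error-tolerant” means in practice, here is a minimal sketch of one common technique: matching a query term against known terms by edit distance (the number of single-character insertions, deletions, or substitutions needed to turn one word into another). The `fuzzy_lookup` helper and the sample vocabulary are hypothetical, and this is not a description of which Elasticsearch features Webflow enables.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute, or keep a match
                )
            )
        previous = current
    return previous[-1]


def fuzzy_lookup(term, vocabulary, max_edits=1):
    """Return the known terms within max_edits of a possibly misspelled term."""
    return [word for word in vocabulary if edit_distance(term, word) <= max_edits]


# Hypothetical vocabulary of indexed terms.
matches = fuzzy_lookup("serch", ["search", "webflow", "design"])
```

With a tolerance of one edit, the misspelled “serch” still finds “search”. In Elasticsearch itself, this kind of tolerance is exposed through the `fuzziness` option on queries such as `match`.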

That said, Elasticsearch is highly complex to install, maintain, and scale. But we knew that by integrating it with our existing hosting stack and devops experience, we could make it accessible to every Webflow user.

Scaling performance

Because search is so computationally demanding, we built ours on AWS, just like the rest of our hosting stack, and have everything managed through Elastic Cloud, the hosted service from the makers of Elasticsearch.

Similar to how we handle page views and content distribution in our existing hosting, this foundation ensures that search stays fast, even when your site is getting heavy traffic.

Providing precise indexing controls

We also wanted to set our search apart by giving Webflow users fine-tuned controls over the content of their search engines. For context, most “built-in search” makes assumptions about how you want to use search, and locks you into using it in one way.

To break from this approach and let you create your own custom search engine, we introduced controls in Webflow that allow you to:

  • Exclude specific pages
  • Exclude specific Collections
  • Exclude specific elements

This level of control allows you to create whatever search experience your site demands, instead of forcing you to work around built-in constraints and assumptions.
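
Conceptually, these three controls act as filters applied before content reaches the index. The sketch below is purely illustrative: the `Page`, `Block`, and `SearchSettings` types, and the idea of identifying elements by an `element_id`, are assumptions made for the example rather than Webflow’s actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Block:
    element_id: str  # hypothetical identifier for an element on the page
    text: str


@dataclass
class Page:
    path: str
    collection: Optional[str]  # None for static pages outside any Collection
    blocks: List[Block]


@dataclass
class SearchSettings:
    excluded_pages: Set[str] = field(default_factory=set)
    excluded_collections: Set[str] = field(default_factory=set)
    excluded_elements: Set[str] = field(default_factory=set)


def indexable_text(page, settings):
    """Return the text to index for a page, or None if the whole page is excluded."""
    if page.path in settings.excluded_pages:
        return None
    if page.collection is not None and page.collection in settings.excluded_collections:
        return None
    kept = [
        block.text
        for block in page.blocks
        if block.element_id not in settings.excluded_elements
    ]
    return " ".join(kept)


settings = SearchSettings(
    excluded_pages={"/thank-you"},
    excluded_collections={"Authors"},
    excluded_elements={"footer"},
)
post = Page(
    path="/blog/site-search",
    collection="Blog Posts",
    blocks=[Block("article", "How search works"), Block("footer", "Privacy policy")],
)
text = indexable_text(post, settings)
```

In this example the blog post is indexed without its footer text, while a page at `/thank-you` or any item in the `Authors` Collection would be skipped entirely.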

Comparing Webflow search to the market

All this sets Webflow’s site search apart from out-of-the-box, database search options on platforms like WordPress and Drupal, which don’t offer the level of refined relevance technology we gain from Elasticsearch.

Our site search tech also covers most of the bases that prominent third-party tools might be used to compensate for. For example, Swiftype, which is owned by Elastic and built on top of Elasticsearch, offers dashboard controls for customizing search results and relevance, but requires hours of developer integration and starts at a cost well above the price of Webflow’s hosting.

Even more technical is Algolia, which, while developer-friendly, still requires a level of technical knowledge that makes it inaccessible (or too expensive) for many.

Delivering relevance, performance, indexing control — and visual styling

On top of all that, Webflow site search gives you a level of control over the search experience that no other option out there matches.

Thoughts? We'd love to hear them in the comments!

Last Updated
January 11, 2018